Convert Hex to Integer
-
wrote on 18 Jul 2017, 02:52 last edited by
When converting a QString text value (in hex) to an integer I noticed a problem that doesn't make a lot of sense:
    int value;
    QString text("FFFFFFFF");
    bool valid;
    value = text.toLong(&valid,16); // okay
    value = text.toInt(&valid,16);  // not okay
If the text is "7FFFFFFF" or smaller then the member function .toInt(...) works fine. The option .toLong(...) works fine but this is something else that doesn't make sense (on a 64 bit system type long is 8 bytes, int is 4 bytes so I would expect only FFFFFFFFFFFFFFFF to be -1).
Any thoughts on why it works this way?
-
@Rondog
As far as I know, int is a signed 32-bit type with a range of −2,147,483,648 to 2,147,483,647, so FFFFFFFF needs at least uint32_t. It seems that toInt() doesn't simply return the overflowed value; it doesn't convert at all if the number is bigger than the target type can hold.
Your example returns 0 and false for me in both cases. With toLongLong, which is int64_t, it returns the correct value.
-
wrote on 18 Jul 2017, 11:20 last edited by
@J-Hilk It is interesting that on your system both examples failed to return a value. I assume you have a 64 bit OS and type long is 64 bit (if type long is 32 bit then this makes sense).
It is odd that longlong works in your case (and long in my case which is 64 bit). It is the same idea as treating FFFF as -1 instead of 65535 for a 32 bit integer (for a 16 bit integer those values are correct).
I guess this means the signed conversion functions should be used very carefully.
-
@Rondog
I'm using a 64-bit system, on Windows 10. However, I'm not sure Qt uses the system's typedefs by default; we have QLocale::toInt after all. But I'm open to correction.
Taking a quick look into qglobal.h, I find these:
    typedef signed char qint8;          /* 8 bit signed */
    typedef unsigned char quint8;       /* 8 bit unsigned */
    typedef short qint16;               /* 16 bit signed */
    typedef unsigned short quint16;     /* 16 bit unsigned */
    typedef int qint32;                 /* 32 bit signed */
    typedef unsigned int quint32;       /* 32 bit unsigned */
    #if defined(Q_OS_WIN) && !defined(Q_CC_GNU)
    #  define Q_INT64_C(c) c ## i64     /* signed 64 bit constant */
    #  define Q_UINT64_C(c) c ## ui64   /* unsigned 64 bit constant */
    typedef __int64 qint64;             /* 64 bit signed */
    typedef unsigned __int64 quint64;   /* 64 bit unsigned */
    #else
    #  define Q_INT64_C(c) static_cast<long long>(c ## LL)             /* signed 64 bit constant */
    #  define Q_UINT64_C(c) static_cast<unsigned long long>(c ## ULL)  /* unsigned 64 bit constant */
    typedef long long qint64;           /* 64 bit signed */
    typedef unsigned long long quint64; /* 64 bit unsigned */
    #endif

    typedef qint64 qlonglong;
    typedef quint64 qulonglong;

    /*
       Useful type definitions for Qt
    */

    QT_BEGIN_INCLUDE_NAMESPACE
    typedef unsigned char uchar;
    typedef unsigned short ushort;
    typedef unsigned int uint;
    typedef unsigned long ulong;
    QT_END_INCLUDE_NAMESPACE
There's no specific definition of long. However, from the documentation:
typedef qint64: Typedef for long long int (__int64 on Windows). This type is guaranteed to be 64-bit on all platforms supported by Qt.
=> long is only 32 bits!?
-
wrote on 18 Jul 2017, 13:26 last edited by
@J-Hilk That was the first thing I checked:
    sizeof(int);  // 4 bytes
    sizeof(long); // 8 bytes
I am using OSX and not Windows. Likely type long is different for these two OS's. In the old days type long was 32 bit when type int was 16 so I expected it is 64 bit when int is 32. It is good to know this may not always be true.
-
wrote on 18 Jul 2017, 15:15 last edited by
Okay, I think I understand this a bit more and can explain a few things I saw. To get this to work I need to use the unsigned conversion member functions of QString and ignore the signed versions:
    // target is 32 bit signed integer
    int value;
    QString text("FFFFFFFF");
    bool valid;
    value = text.toUInt(&valid,16); // okay, value is -1

    // target is 16 bit signed integer
    short short_value;
    text = "7777FFFF";              // reuse text and valid from above
    short_value = text.toUInt(&valid,16); // okay, short_value is -1
The returned bit pattern is cast to the target variable (signed int, signed short, ...) so the extra bits, if any, are truncated at that point. This is the key part that turns the unsigned value into a signed value. I am not casting the value but this is done automatically in this case.
For example, this works:
    char byte_value;
    QString text("7F001234567890FF");
    bool valid;
    byte_value = text.toULongLong(&valid,16); // okay, byte_value is -1 based only on 'FF'
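For anyone who wants to try the same idea outside Qt, here is a minimal sketch in plain C++, with std::string standing in for QString. The helper name hexToTruncated and its ok parameter are made up for illustration and are not part of any Qt API. It reads the text as an unsigned 64-bit pattern and then narrows it to the target type, which is roughly what the implicit conversions above end up doing.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not a Qt function): parse hex digits into an unsigned
// 64-bit pattern, then keep only the low-order bits that fit into T.
template <typename T>
T hexToTruncated(const std::string &text, bool *ok = nullptr)
{
    std::uint64_t bits = 0;
    // Accept 1 to 16 hex digits, i.e. at most 64 bits of input.
    bool good = !text.empty() && text.size() <= 16;
    for (std::size_t i = 0; good && i < text.size(); ++i) {
        const char c = text[i];
        unsigned digit = 0;
        if (c >= '0' && c <= '9')      digit = static_cast<unsigned>(c - '0');
        else if (c >= 'a' && c <= 'f') digit = static_cast<unsigned>(c - 'a') + 10u;
        else if (c >= 'A' && c <= 'F') digit = static_cast<unsigned>(c - 'A') + 10u;
        else                           good = false;
        bits = (bits << 4) | digit;
    }
    if (ok)
        *ok = good;
    // The narrowing conversion keeps the low bits. For signed T this is
    // guaranteed to wrap since C++20; before that it was implementation-defined,
    // although mainstream compilers behave the same way.
    return good ? static_cast<T>(bits) : T(0);
}
```

With this sketch, hexToTruncated<std::int32_t>("FFFFFFFF") and hexToTruncated<std::int16_t>("7777FFFF") both give -1, matching the results above. Making the cast explicit also documents that the truncation is intended.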
-
Here's another quirk (bug?) of QString's number conversion functions:
Code:
    qDebug() << QString::number(static_cast<qint8>(-1), 16);
    qDebug() << QString::number(static_cast<qint16>(-1), 16);
    qDebug() << QString::number(static_cast<qint32>(-1), 16);
    qDebug() << QString::number(static_cast<qint64>(-1), 16);
Expected output:
    "ff"
    "ffff"
    "ffffffff"
    "ffffffffffffffff"
Actual output:
    "ffffffffffffffff"
    "ffffffffffffffff"
    "ffffffffffffffff"
    "ffffffffffffffff"
Digging into the source code for QString::number(int n, int base), we can see why it behaves this way: The number is always converted to qlonglong first! QString::toInt(bool*, int) does something similar.
@Rondog said in Convert Hex to Integer:
I am using OSX and not Windows. Likely type long is different for these two OS's. In the old days type long was 32 bit when type int was 16 so I expected it is 64 bit when int is 32. It is good to know this may not always be true.
https://stackoverflow.com/questions/29748189/c-sizeof-integral-types
The standard does not specify an exact size, only the minimum size. The constraint is sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long), and sizeof(long) is indeed 4 on 64-bit Windows.
-
@JKSH said in Convert Hex to Integer:
Here's another quirk (bug?) of QString's number conversion functions
Not QString's fault, it's the compiler's: it fails to follow overload resolution strictly. What is the compiler?
@Rondog said in Convert Hex to Integer:
This is the key part that turns the unsigned value into a signed value. I am not casting the value but this is done automatically in this case.
I don't follow. There's no difference between signed and unsigned values memory-wise; they use the same layout. It's just the interpretation of the data that make them signed or not. As for the casting, well it's the compiler that's doing it. You have an implicit narrowing:
    short_value = text.toUInt(&valid,16); // short = uint
This usually gives you a warning, or at least it should.
@JKSH said in Convert Hex to Integer:
and sizeof(long) is indeed 4 on 64-bit Windows.
Yep. It has been for as long as I can remember. ;)
-
wrote on 19 Jul 2017, 12:44 last edited by
@JKSH I noticed this as well: any value I convert to a hex string always comes out at a fixed width. I deal with this by keeping only the last characters up to the expected length. For example, a short is 2 bytes long, so the hex version should be no more than four characters in length.
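An alternative to trimming the string is to reinterpret the value as the unsigned type of the same width before it gets widened. Here is a rough sketch in plain C++ (no Qt, so it uses std::ostringstream instead of QString::number). The name toHexNativeWidth is hypothetical and only for illustration.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical helper (not a Qt function): format an integer in hex using
// only as many bits as its own type has.
template <typename T>
std::string toHexNativeWidth(T value)
{
    static_assert(std::is_integral<T>::value, "integer types only");
    using U = typename std::make_unsigned<T>::type;
    // Convert to the same-width unsigned type first, so -1 as a 16-bit value
    // becomes 0xffff. Widening to unsigned long long afterwards zero-extends
    // instead of sign-extending.
    const unsigned long long bits = static_cast<U>(value);
    std::ostringstream out;
    out << std::hex << bits;
    return out.str();
}
```

With this, a 16-bit -1 formats as "ffff" and a 32-bit -1 as "ffffffff". Note that it does not zero-pad, so small positive values still come out shorter than the full width.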
Type long is a 64 bit value on my OS and compiler and I didn't realize it was only 32 bit on Windows. I don't use this too often (I don't remember ever using it outside of the example in this thread actually) but I am aware it exists. Type long is used quite a bit in Windows so maybe it was kept at 32 bit for compatibility reasons (?).
@kshegunov said in Convert Hex to Integer:
@Rondog said in Convert Hex to Integer:
This is the key part that turns the unsigned value into a signed value. I am not casting the value but this is done automatically in this case.
I don't follow. There's no difference between signed and unsigned values memory-wise; they use the same layout. It's just the interpretation of the data that make them signed or not. As for the casting, well it's the compiler that's doing it. You have an implicit narrowing:
I wrote it in the program like this:
command_id = static_cast<int>(text.toUInt(&status,16));
The cast would be unnecessary if I used the function .toInt(...), but this doesn't work in all cases. The names of the QString member functions are misleading, as the input text must meet the requirements (sort of) of the output type and not just what the output type will be. For example, using .toShort(...) means the input text must be no greater than 7FFF, whereas with .toUShort(...) the input must be no greater than FFFF. This looks like a bug on the surface, but it is probably more related to some values using a sign character (base 10) where others don't have a sign character (base 16), and for some it is unknown what they might look like (base 21, 22, ...).
The cast prevents a compiler warning (which I should see without the cast) but it works either way. The fact these are identical memory-wise is the only reason this works.
-
@Rondog
It actually makes sense. Take a look at the toInt() function:
    int QString::toInt(bool *ok, int base) const
    {
        qint64 v = toLongLong(ok, base);
        if (v < INT_MIN || v > INT_MAX) {
            if (ok)
                *ok = false;
            v = 0;
        }
        return v;
    }
and toUInt:
    uint QString::toUInt(bool *ok, int base) const
    {
        quint64 v = toULongLong(ok, base);
        if (v > UINT_MAX) {
            if (ok)
                *ok = false;
            v = 0;
        }
        return (uint)v;
    }
and toLong:
    long QString::toLong(bool *ok, int base) const
    {
        qint64 v = toLongLong(ok, base);
        if (v < LONG_MIN || v > LONG_MAX) {
            if (ok)
                *ok = false;
            v = 0;
        }
        return (long)v;
    }
with
- INT_MAX = 2147483647
- INT_MIN = -2147483648
- UINT_MAX = 4294967295
- LONG_MAX = 2147483647
- LONG_MIN = -2147483648
Totally expected behavior...
But looking at the source code, in your case it might be more useful/faster to call toLongLong and cast the result to int. That skips a whole bunch of steps.
-
wrote on 19 Jul 2017, 21:38 last edited by Rondog
@J.Hilk said in Convert Hex to Integer:
    int QString::toInt(bool *ok, int base) const
    {
        qint64 v = toLongLong(ok, base);
        if (v < INT_MIN || v > INT_MAX) {
            if (ok)
                *ok = false;
            v = 0;
        }
        return v;
    }
The above code is where the problem exists, because 0xFFFFFFFF as a 64-bit integer is always a positive value (and outside the range of a 32-bit integer). To get anything with a negative value for a 64-bit integer you would need to have a value from 0x8000000000000000 to 0xFFFFFFFFFFFFFFFF.
So, yeah, this will always fail for any signed 32 bit integer with a negative value (unless the hex value has a negative sign in front written like -01 instead of FF). And, if you need to rely on casting the return value, then using the function .toLongLong(...) directly makes sense (why not cut out the middle man).
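To make the effect easy to reproduce without Qt, here is a small stand-in in plain C++ that follows the same pattern as the quoted function: parse into a 64-bit value first, then range-check against int. The name hexToIntChecked is invented for this sketch, it uses std::strtoll rather than Qt's own parser, and it assumes int is 32 bits.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical stand-in (not Qt's code): parse hex text as a signed 64-bit
// value, then reject anything outside the range of int.
int hexToIntChecked(const std::string &text, bool *ok = nullptr)
{
    errno = 0;
    char *end = nullptr;
    const long long v = std::strtoll(text.c_str(), &end, 16);
    // Require non-empty input, no trailing characters, and no 64-bit overflow.
    const bool parsed = !text.empty() && *end == '\0' && errno == 0;
    // "FFFFFFFF" parses to 4294967295, which is above INT_MAX, so it fails here.
    const bool inRange = v >= INT_MIN && v <= INT_MAX;
    if (ok)
        *ok = parsed && inRange;
    return (parsed && inRange) ? static_cast<int>(v) : 0;
}
```

In this sketch "7FFFFFFF" and "-1" are accepted, while "FFFFFFFF" is rejected with a return value of 0, which is the same pattern as in the examples at the start of the thread.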
-
I finally understood what the problem is. Okay, just use the unsigned variants toU... and cast the integer to a signed one explicitly. I imagine the function works that way because 0xFFFFFFFF is -1 only in a specific memory layout (and byte order).
So if you think about it, the original author of the function would need to assume a specific byte order and a specific integer memory layout implementation if he were to directly return the integer with that memory representation. Too much assuming gets people in trouble. ;)