Unsolved How can QChar::LastValidCodePoint fit inside an unsigned short int?
-
I was just looking through the QChar documentation and saw the LastValidCodePoint member of QChar::SpecialCharacter. I spotted that its value was, not unreasonably, 0x10FFFF, but then I remembered that, and I quote:

"The QChar class provides a 16-bit Unicode character."

"In Qt, Unicode characters are 16-bit entities without any markup or structure. This class represents such an entity. It is lightweight, so it can be used everywhere. Most compilers treat it like an unsigned short."

However, an unsigned short integer only has to be at least 16 bits, according to Wikipedia, which directs the curious to the ISO/IEC 9899:1999 specification, TC3 (PDF), p. 22, § 5.2.4.2.1 Sizes of integer types <limits.h>. So how can Qt guarantee that a (lightweight) QChar can contain 0x10FFFF when the underlying type may only be big enough to hold 0xFFFF?

I thought the whole reason QString has to be handled so carefully with non-BMP characters was that High and Low Surrogates may be in there, the worst of both worlds that is UTF-16. If a QChar can hold all 21 bits needed to represent a Unicode code point such as the highest legitimate one, why do we not have a QWideString class that works on simple arrays of them and deals directly in UTF-32?
-
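For reference, the surrogate arithmetic the question alludes to can be sketched in plain C++ without Qt. The names below (kLastValidCodePoint, needsSurrogates, highSurrogate, lowSurrogate) are hypothetical stand-ins chosen for illustration, not Qt's own declarations. The sketch only shows the standard UTF-16 rule that a code point above 0xFFFF is carried by two 16-bit code units.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical constant for illustration; mirrors the value quoted in the question.
constexpr std::uint32_t kLastValidCodePoint = 0x10FFFF;

// A code point above the BMP cannot fit in one 16-bit code unit.
constexpr bool needsSurrogates(std::uint32_t cp) { return cp > 0xFFFF; }

// Standard UTF-16 encoding: subtract 0x10000, then split the remaining
// 20 bits into a high half (top 10 bits) and a low half (bottom 10 bits).
constexpr std::uint16_t highSurrogate(std::uint32_t cp) {
    return static_cast<std::uint16_t>(0xD800 + ((cp - 0x10000) >> 10));
}
constexpr std::uint16_t lowSurrogate(std::uint32_t cp) {
    return static_cast<std::uint16_t>(0xDC00 + ((cp - 0x10000) & 0x3FF));
}

// unsigned short is only guaranteed to have at least 16 bits.
static_assert(sizeof(unsigned short) * CHAR_BIT >= 16, "guaranteed by the C and C++ standards");

// 0x10FFFF needs two code units: 0xDBFF followed by 0xDFFF.
static_assert(needsSurrogates(kLastValidCodePoint), "outside the BMP");
static_assert(highSurrogate(kLastValidCodePoint) == 0xDBFF, "high surrogate");
static_assert(lowSurrogate(kLastValidCodePoint) == 0xDFFF, "low surrogate");
```

So under this encoding the highest code point occupies two 16-bit units rather than one, which is why a string of 16-bit units still needs surrogate-aware handling.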
@SlySven I didn't check the QChar implementation, but I guess QChar uses int/uint internally. Many methods in QChar expect or return uint.
-
@jsulm said in How can QChar::LastValidCodePoint fit inside an unsigned short int?:
but I guess QChar uses int/uint internally
It does not: https://code.woboq.org/qt5/include/qt/QtCore/qchar.h.html#87
It looks like the value just gets truncated to 0xFFFF.
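A minimal sketch of that truncation, in plain C++ rather than Qt. It assumes, as the linked header suggests, that the enumerator lives in an ordinary enum (which is wide enough for 0x10FFFF) and is narrowed to a 16-bit unsigned value only when stored. SpecialCharacterSketch, its enumerators, and toCodeUnit are hypothetical names for illustration, not Qt's declarations.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical enum for illustration. The compiler picks an underlying
// type wide enough for every enumerator, so 0x10FFFF is fine here.
enum SpecialCharacterSketch {
    NullSketch = 0x0000,
    LastValidCodePointSketch = 0x10FFFF
};

// Assumed storage step: narrow the enumerator to one 16-bit code unit.
// Conversion to an unsigned type keeps only the low 16 bits.
constexpr std::uint16_t toCodeUnit(SpecialCharacterSketch s) {
    return static_cast<std::uint16_t>(s);
}

// The enum itself can hold all 21 bits of the value...
static_assert(sizeof(SpecialCharacterSketch) * CHAR_BIT >= 21, "enum is wide enough");
// ...but the 16-bit storage keeps only 0xFFFF.
static_assert(toCodeUnit(LastValidCodePointSketch) == 0xFFFF, "truncated to 16 bits");
```

In other words, the constant itself never has to fit in an unsigned short. Only a value stored in 16 bits loses its upper bits.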