How can QChar::LastValidCodePoint fit inside an unsigned short int?

  • I was just looking through the QChar documentation and spotted QChar::LastValidCodePoint, a member of the QChar::SpecialCharacter enum. Its value is, not unreasonably, 0x10FFFF, but then I remembered that, and I quote:

    "The QChar class provides a 16-bit Unicode character."

    "In Qt, Unicode characters are 16-bit entities without any markup or structure. This class represents such an entity. It is lightweight, so it can be used everywhere. Most compilers treat it like an unsigned short."

    However, according to Wikipedia, an unsigned short integer only has to be at least 16 bits wide (it points the curious to the ISO/IEC 9899:1999 specification, TC3 (PDF), p. 22, § Sizes of integer types <limits.h>). So how can Qt guarantee that a (lightweight) QChar can contain 0x10FFFF when the underlying type may only be big enough to hold 0xFFFF?

    I thought the whole reason QString has to be handled so carefully with non-BMP characters was the high and low surrogates that come with UTF-16, the worst of both worlds. If a QChar can hold all 21 bits needed to represent any Unicode code point, including the highest legitimate one, why do we not have a QWideString class that works on simple arrays of them and deals directly in UTF-32?
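For anyone following along, the surrogate arithmetic the question refers to can be sketched in plain C++ without Qt. The names `SurrogatePair`, `needsSurrogates`, `toSurrogates` and `fromSurrogates` are hypothetical helpers made up for this illustration. QChar offers static functions for the same job (`QChar::requiresSurrogates()`, `QChar::highSurrogate()`, `QChar::lowSurrogate()`, `QChar::surrogateToUcs4()`), so treat this only as a picture of the arithmetic, not as Qt's implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: one code point outside the BMP
// expressed as two 16-bit UTF-16 code units.
struct SurrogatePair {
    std::uint16_t high;  // 0xD800..0xDBFF
    std::uint16_t low;   // 0xDC00..0xDFFF
};

// A code point above 0xFFFF does not fit in a single 16-bit unit.
inline bool needsSurrogates(std::uint32_t codePoint) {
    return codePoint > 0xFFFF;
}

// Split a code point in the range 0x10000..0x10FFFF into a surrogate pair.
// The caller is assumed to have checked the range first.
inline SurrogatePair toSurrogates(std::uint32_t codePoint) {
    const std::uint32_t offset = codePoint - 0x10000;  // 20-bit value
    SurrogatePair pair;
    pair.high = static_cast<std::uint16_t>(0xD800 + (offset >> 10));    // top 10 bits
    pair.low  = static_cast<std::uint16_t>(0xDC00 + (offset & 0x3FF));  // bottom 10 bits
    return pair;
}

// Recombine a valid high/low surrogate pair into the original code point.
inline std::uint32_t fromSurrogates(std::uint16_t high, std::uint16_t low) {
    return 0x10000u
         + ((static_cast<std::uint32_t>(high) - 0xD800u) << 10)
         + (static_cast<std::uint32_t>(low) - 0xDC00u);
}
```

With these helpers, `toSurrogates(0x10FFFF)` yields the pair 0xDBFF, 0xDFFF, and `fromSurrogates(0xDBFF, 0xDFFF)` gives back 0x10FFFF, which is why the highest code point needs two 16-bit units rather than one.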

  • Qt Champions 2018

    @SlySven I didn't check QChar implementation, but I guess QChar uses int/uint internally. Many methods in QChar expect or return uint.

  • Qt Champions 2018

    @jsulm said in How can QChar::LastValidCodePoint fit inside an unsigned short int?:

    but I guess QChar uses int/uint internally

    it does not:

    looks like it just gets truncated to 0xFFFF
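The truncation described above can be reproduced without Qt. The sketch below uses hypothetical stand-ins (`SpecialCodePoint`, `Char16`), not Qt's actual declarations, and assumes the constructor narrows the enum value to a 16-bit unit with a plain cast. The enum constant itself can hold 0x10FFFF because an enum's underlying type is wide enough for all of its values; the loss only happens when the value is stored in the 16-bit member.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an enum that lists special code points.
// The compiler picks an underlying type wide enough for 0x10FFFF.
enum SpecialCodePoint {
    ReplacementCharacter = 0xFFFD,
    LastValidCodePoint   = 0x10FFFF
};

// Hypothetical stand-in for a 16-bit character class.
struct Char16 {
    std::uint16_t unit;  // the only storage: one UTF-16 code unit

    // Assumption for this sketch: the value is narrowed with a cast,
    // so anything above 0xFFFF loses its upper bits (modulo 2^16).
    explicit Char16(SpecialCodePoint cp)
        : unit(static_cast<std::uint16_t>(cp)) {}
};
```

Here `Char16(LastValidCodePoint).unit` is 0xFFFF, while `Char16(ReplacementCharacter).unit` stays 0xFFFD. In this sketch the constant is only useful as a 32-bit value for range checks such as `codePoint <= LastValidCodePoint`, not as something to store in a 16-bit character.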
