How can QChar::LastValidCodePoint fit inside an unsigned short int?

3 Posts 3 Posters 919 Views
    SlySven
    wrote
    #1

    I was looking through the QChar documentation and noticed the QChar::LastValidCodePoint member of QChar::SpecialCharacters. Its value is, not unreasonably, 0x10FFFF, but then I remembered that, and I quote:

    "The QChar class provides a 16-bit Unicode character."

    "In Qt, Unicode characters are 16-bit entities without any markup or structure. This class represents such an entity. It is lightweight, so it can be used everywhere. Most compilers treat it like a unsigned short."

    However, an unsigned short is only required to be at least 16 bits wide, according to Wikipedia, which directs the curious to the ISO/IEC 9899:1999 specification, TC3 (PDF), p. 22, § 5.2.4.2.1 "Sizes of integer types <limits.h>". So how can Qt guarantee that a (lightweight) QChar can contain 0x10FFFF when the underlying type may only be big enough to hold 0xFFFF?

    I thought the whole reason QString has to be handled so carefully with non-BMP characters was the high and low surrogates that come with UTF-16, which is the worst of both worlds. If a QChar can hold all 21 bits needed to represent a Unicode code point, including the highest legitimate one, why do we not have a QWideString class that works on simple arrays of them, so that we can work directly with UTF-32?
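To make the surrogate concern concrete, here is a minimal sketch in plain C++ (no Qt) of how a code point above 0xFFFF is split into two 16-bit units under UTF-16. The names kLastValidCodePoint, needsSurrogates and toSurrogatePair are hypothetical and chosen only for this illustration; the sketch also assumes the platform provides std::uint16_t.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Highest code point Unicode defines; the same value quoted above for
// QChar::LastValidCodePoint.
const std::uint32_t kLastValidCodePoint = 0x10FFFF;

// True when a code point lies outside the BMP and therefore cannot be
// stored in a single 16-bit unit.
inline bool needsSurrogates(std::uint32_t codePoint)
{
    return codePoint > 0xFFFF;
}

// Splits a non-BMP code point into a (high, low) UTF-16 surrogate pair.
inline std::pair<std::uint16_t, std::uint16_t> toSurrogatePair(std::uint32_t codePoint)
{
    assert(needsSurrogates(codePoint) && codePoint <= kLastValidCodePoint);
    const std::uint32_t offset = codePoint - 0x10000;  // 20 significant bits remain
    const std::uint16_t high =
        static_cast<std::uint16_t>(0xD800 + (offset >> 10));    // top 10 bits
    const std::uint16_t low =
        static_cast<std::uint16_t>(0xDC00 + (offset & 0x3FF));  // bottom 10 bits
    return std::make_pair(high, low);
}
```

With this sketch, toSurrogatePair(0x10FFFF) yields the pair (0xDBFF, 0xDFFF), so the highest code point occupies two 16-bit units rather than one.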

      jsulm
      Lifetime Qt Champion
      wrote
      #2

      @SlySven I didn't check the QChar implementation, but I guess QChar uses int/uint internally. Many methods in QChar expect or return uint.
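One plausible reason for uint appearing in those signatures is that a full code point needs more than 16 bits, so anything that combines a surrogate pair has to return a wider type. The following is a sketch in plain C++ with hypothetical names (isHighSurrogate, isLowSurrogate, fromSurrogatePair); Qt's own static helper for this should be QChar::surrogateToUcs4(), but the code below is an illustration, not Qt's implementation.

```cpp
#include <cassert>
#include <cstdint>

// A high (leading) surrogate lies in the range 0xD800..0xDBFF.
inline bool isHighSurrogate(std::uint16_t unit)
{
    return (unit & 0xFC00) == 0xD800;
}

// A low (trailing) surrogate lies in the range 0xDC00..0xDFFF.
inline bool isLowSurrogate(std::uint16_t unit)
{
    return (unit & 0xFC00) == 0xDC00;
}

// Recombines a surrogate pair into one code point. The result can need up
// to 21 bits, which is why the return type is wider than 16 bits.
inline std::uint32_t fromSurrogatePair(std::uint16_t high, std::uint16_t low)
{
    assert(isHighSurrogate(high) && isLowSurrogate(low));
    return 0x10000u
         + ((static_cast<std::uint32_t>(high) - 0xD800u) << 10)
         + (static_cast<std::uint32_t>(low) - 0xDC00u);
}
```

Here fromSurrogatePair(0xDBFF, 0xDFFF) gives 0x10FFFF, a value that a 16-bit return type could not carry.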

        VRonin
        wrote
        #3

        @jsulm said in How can QChar::LastValidCodePoint fit inside an unsigned short int?:

        but I guess QChar uses int/uint internally

        It does not: https://code.woboq.org/qt5/include/qt/QtCore/qchar.h.html#87

        It looks like the value just gets truncated to 0xFFFF.
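The arithmetic behind that truncation can be sketched in plain C++, assuming the value is narrowed with something like a cast to a 16-bit unsigned type. The helper names truncateTo16Bits, survivesTruncation and narrowChecked are hypothetical, and the sketch assumes std::uint16_t exists on the platform.

```cpp
#include <cassert>
#include <cstdint>

// Keeps only the low 16 bits, which is what converting a wider unsigned
// value to a 16-bit unsigned type does.
inline std::uint16_t truncateTo16Bits(std::uint32_t value)
{
    return static_cast<std::uint16_t>(value);
}

// True when no bits are lost, i.e. the value already fits in 16 bits.
inline bool survivesTruncation(std::uint32_t value)
{
    return truncateTo16Bits(value) == value;
}

// Narrowing that refuses to drop bits silently (checked in debug builds only).
inline std::uint16_t narrowChecked(std::uint32_t value)
{
    assert(survivesTruncation(value));
    return truncateTo16Bits(value);
}
```

Under that assumption, truncateTo16Bits(0x10FFFF) is 0xFFFF: the upper bits (0x10) are discarded and the low 16 bits happen to be all ones.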