Effective Qt Strings

12 Posts 4 Posters 7.2k Views 1 Watching
    reactive
    wrote on last edited by
    #1

    I'm trying to write a simple parser, but QString has me a bit confused.
    In Java, char is Unicode, String is a simple wrapper over char[] and you can iterate with charAt(pos). No fuss.

    QString's at() and [] wrap each character in a QChar object that I don't need, so should I use utf16()?

    @
    const ushort *QString::utf16() const
    {
    if (d->data != d->array) {
    ... realloc() ...
    }
    return d->array;
    }
    @

    It seems to return a pointer to the ushort array QString uses internally, but I'm not sure what would
    cause that realloc() and copying in that if statement?

    Should/could I ditch QString altogether and load the text file as ushort* somehow?
    Thanks.

      giesbert
      wrote on last edited by
      #2

      If you want to access each single character, use at() or []. QChar is a wrapper of a unicode char.

      But how do you want to parse? Character by character? By searching? Word by word?

      That influences how to do it.

      Nokia Certified Qt Specialist.
      Programming Is Like Sex: One mistake and you have to support it for the rest of your life. (Michael Sinz)

        goetz
        wrote on last edited by
        #3

        Depends on what you want to do. In general, I would strongly recommend sticking to QString/QChar, since Qt then handles all the Unicode details for you and you do not have to deal with endianness and the like.

        If you really need the numeric value, "QChar":http://doc.qt.nokia.com/latest/qchar.html has "unicode()":http://doc.qt.nokia.com/latest/qchar.html#unicode methods that return the ushort value (a UTF-16 code unit).

        And no, you almost never want to handle the encoding/decoding of text files yourself when your library already provides convenient means for it.

        http://www.catb.org/~esr/faqs/smart-questions.html

          reactive
          wrote on last edited by
          #4

          > QChar is a wrapper of a unicode char.

          Yea, that's my issue. Since QChar just holds a ushort, why can't I just get it from QString?
          Ironically, it seems weird to do all that boxing coming from Java.

            goetz
            wrote on last edited by
            #5

            [quote author="reactive" date="1296747071"]> QChar is a wrapper of a unicode char.

            Yea, that's my issue. Since QChar just holds a ushort, why can't I just get it from QString?
            Ironically, it seems weird to do all that boxing coming from Java.[/quote]

            Because Java strings are dumb :-)

            Java's char only contains the character value, much like C/C++'s char; the only improvement is that it is 16 bits wide and can hold Unicode (UTF-16) values, not just 8 bits as is usual in C/C++.

            QChar, on the other hand, provides much more information about the character (which may not be of interest in a plain GUI-agnostic language like Java).

            As stated in the API docs of QChar, it is lightweight and does not create any overhead, so it does not harm the performance or memory footprint of your application.

              giesbert
              wrote on last edited by
              #6

              If you need to compare characters while parsing, you could do it this way; I think it adds no overhead (only in characters to type :-) ):

              @
              QString myString = ...;
              for (int i = 0; i < myString.length(); ++i)
              {
                  // at() returns the QChar at index i by value
                  const QChar c = myString.at(i);
                  if (c == QChar('a'))
                      ....
              }
              @

                reactive
                wrote on last edited by
                #7

                > does not create any overhead, thus does not harm the performance or memory footprint of your application.

                Java Strings in source are interned. If the above is true does that mean this is equivalent:

                @
                if(text[0] == QUOTE)
                if(text[0] == QChar('"'))
                @

                or would it create a new QChar every time?

                Thanks for both your help! QChar it is.

                [EDIT: code formatting, please use @-tags, Volker]

                  giesbert
                  wrote on last edited by
                  #8

                  It will create a new object each time you request it.
                  But it will be fast,
                  and even faster if you use:

                  @
                  if(text[0] == QChar(L'"'))
                  @

                    reactive
                    wrote on last edited by
                    #9

                    Haha, weird! Is that a C++ thing or a Qt thing? What does adding the letter "L" do?

                    EDIT:
                    "A character literal that begins with the letter L, such as L'x', is a wide-character literal. A wide-character literal has type wchar_t."

                      giesbert
                      wrote on last edited by
                      #10

                      That's C++.

                      L'x' --> wide character literal of type wchar_t (typically used for Unicode)
                      'x' --> ordinary character literal of type char (e.g. ASCII)
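
To make the literal prefixes above concrete, here is a minimal sketch in standard C++11 (no Qt required) that lets the compiler confirm the type of each literal. The helper names `narrowSize` and `wideSize` are made up for this illustration; the `u` prefix is a C++11 addition that the thread does not mention.

```cpp
#include <cstddef>
#include <type_traits>

// The type of a character literal is fixed by its prefix.
static_assert(std::is_same<decltype('x'), char>::value,
              "an ordinary character literal has type char");
static_assert(std::is_same<decltype(L'x'), wchar_t>::value,
              "an L-prefixed character literal has type wchar_t");
static_assert(std::is_same<decltype(u'x'), char16_t>::value,
              "a u-prefixed character literal (C++11) has type char16_t");

// Hypothetical helpers: size in bytes of each literal's type.
// Only sizeof(char) == 1 is guaranteed; sizeof(wchar_t) is implementation-defined.
constexpr std::size_t narrowSize() { return sizeof('x'); }
constexpr std::size_t wideSize()   { return sizeof(L'x'); }
```

If the file compiles, all three `static_assert` checks hold on that compiler. The value returned by `wideSize()` differs between platforms, which is one reason to be careful about assuming what a `wchar_t` can hold.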

                        goetz
                        wrote on last edited by
                        #11

                        [quote author="reactive" date="1296747811"]> does not create any overhead, thus does not harm the performance or memory footprint of your application.

                        Java Strings in source are interned. If the above is true does that mean this is equivalent:

                        @
                        if(text[0] == QUOTE)
                        if(text[0] == QChar('"'))
                        @

                        or would it create a new QChar every time?

                        Thanks for both your help! QChar it is.[/quote]

                        It depends on the compiler, but to be on the safe side, assume a new object is created. It is constructed on the stack and destroyed as soon as it goes out of scope, though, so there is no memory penalty. You can definitely avoid this by defining static objects for your constants.
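
The two variants discussed above can be sketched without Qt. The `CodeUnit` struct below is a hypothetical stand-in for a thin 16-bit character wrapper, not Qt's QChar, and `Quote`, `isQuoteViaConstant` and `isQuoteViaTemporary` are names invented for this example. It only shows that, for a wrapper of this shape, comparing against a shared constant and against a temporary built at the point of use give the same result, and that copying the wrapper involves no heap allocation.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a thin character wrapper; this is NOT Qt's QChar.
struct CodeUnit
{
    std::uint16_t value;
    constexpr explicit CodeUnit(std::uint16_t v) : value(v) {}
};

constexpr bool operator==(CodeUnit a, CodeUnit b) { return a.value == b.value; }

// Copying the wrapper is a plain bitwise copy, like copying a raw integer.
static_assert(std::is_trivially_copyable<CodeUnit>::value,
              "CodeUnit can be copied like a raw integer");

// A constant created once and reused (the "static object" approach).
static const CodeUnit Quote(u'"');

// Variant 1: compare against the shared constant.
bool isQuoteViaConstant(CodeUnit c) { return c == Quote; }

// Variant 2: build a temporary on the stack at the point of use.
bool isQuoteViaTemporary(CodeUnit c) { return c == CodeUnit(u'"'); }
```

Whether a compiler emits identical machine code for both variants is not something this sketch can prove; that would have to be measured or checked in the generated assembly for the real QChar.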

                          dangelog
                          wrote on last edited by
                          #12

                          Java's java.lang.String and Qt's QStrings use very similar internals. Both are Unicode-compliant and encoded in UTF-16, thus require surrogate pairs to be able to represent code points outside the BMP.
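
As an illustration of what surrogate pairs mean for a character-by-character parser, here is a minimal sketch in standard C++11 that uses `std::u16string` as a stand-in for any UTF-16 string (it is not QString, and the function names are chosen for this example only). It decodes one code point at a time, consuming two code units when it meets a valid surrogate pair.

```cpp
#include <cstddef>
#include <string>

// True if the code unit is the first (high) half of a surrogate pair.
constexpr bool isHighSurrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }

// True if the code unit is the second (low) half of a surrogate pair.
constexpr bool isLowSurrogate(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Decode the code point starting at index i and advance i past it.
// Precondition: i < s.size(). An unpaired surrogate is returned unchanged.
char32_t nextCodePoint(const std::u16string &s, std::size_t &i)
{
    const char16_t first = s[i++];
    if (isHighSurrogate(first) && i < s.size() && isLowSurrogate(s[i])) {
        const char16_t second = s[i++];
        return 0x10000 + ((char32_t(first) - 0xD800) << 10)
                       + (char32_t(second) - 0xDC00);
    }
    return first;
}

// Count code points rather than code units.
std::size_t countCodePoints(const std::u16string &s)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++n)
        nextCodePoint(s, i);
    return n;
}
```

For example, the string `u"a\U0001D11E"` holds three code units but only two code points, so indexing by code unit (as `charAt` in Java or `at()` on a UTF-16 string does) can land in the middle of a character outside the BMP.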

                          The little problem is that Java's "char" is defined as a UTF-16 code unit and is therefore 16 bits wide, while in the C/C++ world a char has an implementation-defined number of bits (at least 8) given by the CHAR_BIT macro (usually: 8).
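
The width difference can be checked at compile time. This is a small sketch in standard C++11; `bitWidth` is a hypothetical helper written for this example, and `char16_t` is used here only as the standard type that is wide enough for a UTF-16 code unit.

```cpp
#include <climits>
#include <cstddef>

// The standard requires a char to have at least 8 bits.
static_assert(CHAR_BIT >= 8, "char has at least 8 bits");

// char16_t must be able to hold a 16-bit UTF-16 code unit.
static_assert(sizeof(char16_t) * CHAR_BIT >= 16,
              "char16_t can hold a 16-bit code unit");

// Hypothetical helper: width in bits of a type on the current platform.
template <typename T>
constexpr std::size_t bitWidth() { return sizeof(T) * CHAR_BIT; }
```

On common desktop platforms `bitWidth<char>()` is 8, which is why a plain `char` cannot hold a UTF-16 code unit.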

                          That's why in Qt you have QChar, which is nothing more than a UTF-16 code unit.

                          More info:
                          http://en.wikipedia.org/wiki/UTF-16/UCS-2
                          http://doc.qt.nokia.com/latest/qstring.html#details
                          http://doc.qt.nokia.com/latest/qchar.html#details
                          http://download.oracle.com/javase/6/docs/api/java/lang/Character.html#unicode
                          http://download.oracle.com/javase/6/docs/api/java/lang/CharSequence.html
                          http://java.sun.com/javase/technologies/core/basic/intl/faq.jsp#core-textrep

                          Software Engineer
                          KDAB (UK) Ltd., a KDAB Group company
