[Solved] QString unicode conversion to utf8

  • changsheng230 wrote on 18 Jul 2011, 06:51 (#1)

    Hi,

    Given a GBK-encoded QByteArray, I need to convert it to a UTF-8 QString. process_line_method1 did NOT work, while process_line_method2 did. I have no idea why method 1 fails. Does anyone know the reason?

    @
    #include <QByteArray>
    #include <QDebug>
    #include <QString>
    #include <QTextCodec>

    // Did not work: utfStr displays as garbled text.
    void process_line_method1(QByteArray line)
    {
        QTextCodec *codec = QTextCodec::codecForName("GBK");
        QString uc = codec->toUnicode(line);

        QByteArray data = uc.toUtf8();
        QString utfStr = codec->toUnicode(data);
        qDebug() << utfStr;
    }
    @

    @
    #include <QTextStream>

    // Works well: utfStr outputs the readable Chinese string.
    void process_line_method2(QByteArray line)
    {
        QTextCodec *codec = QTextCodec::codecForName("GBK");
        QString uc = codec->toUnicode(line);
        QString utfStr;
        QTextStream streamFileOut(&uc);
        streamFileOut.setCodec("UTF-8");
        streamFileOut >> utfStr; // reads one whitespace-delimited word
        qDebug() << utfStr;
    }
    @

    Chang Sheng
    常升

    • Franzk wrote on 18 Jul 2011, 07:21 (#2)

      Have a look at the comments I added.

      [quote author="changsheng230" date="1310971873"]
      @
      void process_line_method1(QByteArray line)
      {
          QTextCodec *codec = QTextCodec::codecForName("GBK");
          QString uc = codec->toUnicode(line); // here you already have UTF-16 encoded data

          QByteArray data = uc.toUtf8(); // here you have a UTF-8 encoded string in the byte array
          QString utfStr = codec->toUnicode(data); // here you are interpreting a UTF-8 encoded string as GBK, which results in the mess you mention
          qDebug() << utfStr; // this prints the mess
      }
      @
      [/quote]
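
To make the last two comments concrete, here is a minimal sketch in plain C++ (no Qt) of why the bytes from toUtf8() cannot be fed back to the GBK codec. The helper names encodeUtf8 and gbkBytesOfZhong are made up for this illustration, the character U+4E2D (中) is just a sample, and its GBK bytes D6 D0 are hard-coded rather than computed.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Encode one code point from the Basic Multilingual Plane as UTF-8.
// Hypothetical helper; surrogates and code points above U+FFFF are
// deliberately left out to keep the sketch short.
std::string encodeUtf8(std::uint32_t cp)
{
    assert(cp <= 0xFFFF);
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// The GBK encoding of U+4E2D, hard-coded for the comparison.
std::string gbkBytesOfZhong()
{
    return std::string("\xD6\xD0");
}
```

encodeUtf8(0x4E2D) yields the three bytes E4 B8 AD, while the GBK form of the same character is the two bytes D6 D0. A GBK decoder handed E4 B8 AD would read E4 B8 as one two-byte character and then hit a dangling AD, which is the kind of mess method 1 prints.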

      "Horse sense is the thing a horse has which keeps it from betting on people." -- W.C. Fields

      http://www.catb.org/~esr/faqs/smart-questions.html

      • changsheng230 wrote on 18 Jul 2011, 07:30 (#3)

        Thanks a lot! Method 1 works well after following your comments.

        @
        // Works well again, thanks Franzk :)
        void process_line(QByteArray line)
        {
            QTextCodec *codec = QTextCodec::codecForName("GBK");
            QString uc = codec->toUnicode(line);
            QByteArray data = uc.toUtf8();
            QTextCodec *codec2 = QTextCodec::codecForName("UTF-8");
            QString utfStr = codec2->toUnicode(data);
            qDebug() << utfStr;
        }
        @
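
The fix works because the bytes are now decoded with the same encoding that produced them. A rough plain C++ sketch of the decoding half follows (no Qt). decodeUtf8 is a hypothetical helper that assumes well-formed input and handles only 1- to 3-byte sequences, which is enough for the characters discussed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decode well-formed UTF-8 (1- to 3-byte sequences only) into code points.
std::vector<std::uint32_t> decodeUtf8(const std::string &bytes)
{
    std::vector<std::uint32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        if (b0 < 0x80) {
            // Single byte: plain ASCII.
            out.push_back(b0);
            i += 1;
        } else if ((b0 & 0xE0) == 0xC0) {
            // Two-byte sequence: 110xxxxx 10xxxxxx.
            assert(i + 1 < bytes.size());
            const unsigned char b1 = static_cast<unsigned char>(bytes[i + 1]);
            out.push_back(((b0 & 0x1Fu) << 6) | (b1 & 0x3Fu));
            i += 2;
        } else {
            // Three-byte sequence: 1110xxxx 10xxxxxx 10xxxxxx.
            assert((b0 & 0xF0) == 0xE0 && i + 2 < bytes.size());
            const unsigned char b1 = static_cast<unsigned char>(bytes[i + 1]);
            const unsigned char b2 = static_cast<unsigned char>(bytes[i + 2]);
            out.push_back(((b0 & 0x0Fu) << 12) | ((b1 & 0x3Fu) << 6) | (b2 & 0x3Fu));
            i += 3;
        }
    }
    return out;
}
```

Decoding the bytes E4 B8 AD this way gives back the single code point U+4E2D. As far as I can tell, the same holds for the Qt code above: the trip through toUtf8() and the UTF-8 codec returns the same text that uc already held, so printing uc directly should give the same output.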
        • aliosa_sbbv wrote on 22 Aug 2014, 19:19 (#4)

          Hello,
          Can you help me with a simple issue?
          I do not know how to work with Unicode strings.
          For example, I want to set the title of a dialog to a Chinese string.

          setWindowTitle( tr("國家") ) did not work!

          How do I use plain Unicode strings in source code with the Qt framework?

          I did not find any example with international Unicode strings!

          Thank you in advance.

          • aliosa_sbbv wrote on 22 Aug 2014, 19:30 (#5)

            I think I found a solution:

            @
            QByteArray encodedString = "國家";
            QTextCodec *codec = QTextCodec::codecForName( "UTF-8" );
            QString string = codec->toUnicode( encodedString );
            setWindowTitle( string );
            @

            But I do not like it.

            I want to use plain Unicode strings in my source code for a test.

            • Franzk wrote on 22 Aug 2014, 20:38 (#6)

              tl;dr:
              [quote author="aliosa_sbbv" date="1408735194"]How do I use plain Unicode strings in source code with the Qt framework?[/quote]

              Don't. Keeping encodings in check is hard. Write in English and stick to ASCII. Then use Qt's (or any other) translation system for the resulting strings.

              The long answer:
              [quote author="aliosa_sbbv" date="1408735194"]
              For example, I want to set the title of a dialog to a Chinese string.

              @setWindowTitle( tr("國家") )@

              did not work![/quote]

              Your compiler does not know which encoding these characters are stored in, and it may not interpret them the same way your code editor does. On top of that, in Qt 4 tr() decodes its argument with the codec set through QTextCodec::setCodecForTr(), which is Latin-1 unless you change it, so UTF-8 bytes in the literal come out garbled.

              This also has the problem that I, a westerner with barely any knowledge of Chinese characters, wouldn't have the slightest clue what your string means.

              [quote author="aliosa_sbbv" date="1408735830"]I think I found a solution

              @
              QByteArray encodedString = "國家";
              QTextCodec *codec = QTextCodec::codecForName( "UTF-8" );
              QString string = codec->toUnicode( encodedString );
              setWindowTitle( string );
              @

              But I do not like it.[/quote]

              Neither do I. Here absolutely no one will understand what you mean. If you want to use Unicode strings safely and directly in your code (while still suffering from some of the aforementioned problems), use

              @setWindowTitle(trUtf8("\u570b\u5bb6")); /* 國家 */@

              You basically shouldn't use any characters outside the ASCII range in your C or C++ code (comments are somewhat OK).
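
As a small illustration of keeping the source file pure ASCII, the sketch below spells the same two characters with escapes in a standard C++11 UTF-16 literal. titleFromEscapes is a hypothetical name used only here, and the code points U+570B and U+5BB6 are the ones for 國 and 家.

```cpp
#include <cassert>
#include <string>

// Returns the title as UTF-16, spelled with universal-character-name
// escapes so that the source file contains ASCII characters only.
std::u16string titleFromEscapes()
{
    const std::u16string title = u"\u570b\u5bb6"; // U+570B U+5BB6
    // Both characters are in the Basic Multilingual Plane, so each one
    // occupies exactly one UTF-16 code unit.
    assert(title.size() == 2);
    return title;
}
```

Because a u"" literal is UTF-16 by definition, its content does not depend on the compiler's guess about the source or execution character set. Whether your Qt version accepts char16_t data directly (for example through QString::fromUtf16) is something to check against its documentation; I am not assuming it here.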

              Your encodedString solution may well come back to haunt you as soon as you switch compilers, and the encoding of the source file itself can interfere with proper compilation. Even comments are tricky: the compiler will not be bothered by them much, but your version control system may have trouble with them, and so will readers who do not know the language.

              The best solution, as practically always in programming, is to stick with pure English in your code:

              @setWindowTitle(tr("My window text"));@

              Then use Qt's translation system to turn "My window text" into your Chinese text. This has two advantages. The first is that almost anyone on this planet is going to be able to understand the text. The second is that you're making your application portable to other languages as well. Don't worry too much about making spelling mistakes in your code strings. The English should also be covered by a translation for serious applications.
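
The idea of keeping English in the code and looking up the translation at run time can be sketched with a toy lookup table in plain C++. This is not Qt's translation API: Catalog, makeChineseCatalog and translate are hypothetical names, and the catalog entry simply reuses the 國家 string from this thread (as UTF-8 bytes written with escapes) as a placeholder, not as a real translation of the English text.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// ASCII source text as key, UTF-8 bytes of the translated text as value.
using Catalog = std::unordered_map<std::string, std::string>;

// Hypothetical stand-in for a loaded translation file.
Catalog makeChineseCatalog()
{
    return Catalog{{"My window text", "\xE5\x9C\x8B\xE5\xAE\xB6"}};
}

// Look up the source text; fall back to the source text itself when the
// catalog has no entry for it.
std::string translate(const Catalog &catalog, const std::string &source)
{
    assert(!source.empty()); // an empty source string is almost certainly a caller bug
    const auto it = catalog.find(source);
    return it != catalog.end() ? it->second : source;
}
```

With this table, translate(makeChineseCatalog(), "My window text") returns the Chinese bytes, while a string without an entry, such as "Cancel", comes back unchanged. As I understand it, tr() behaves similarly in that it returns the source text when no translation is installed, which is why the English in the code still matters.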

              Hope this clears things up.

              See also:

              • qt-project.org/wiki/Strings_and_encodings_in_Qt