QString Turkish Character Problem

4 Posts 4 Posters 1.6k Views
Ceng0 wrote (post #1):

    Hi,
    I code in Visual Studio. The first snippet below is from my actual project; the second is from a trial project I created in Visual Studio. When I use Turkish characters, my own project shows different (wrong) characters. In the trial project, 'QTextCodec::codecForLocale()->toUnicode(data.c_str());' gives the correct result. What causes this difference, and how can I solve it?

    Project Code:

    std::string dataString = "üöçİış";

    QString dataQString = QTextCodec::codecForLocale()->toUnicode(dataString.c_str());
    QString dataQString2 = QString::fromStdString(dataString.c_str());
    QString dataQString3 = QString::fromLatin1(dataString.c_str());
    QString dataQString4 = QString::fromLocal8Bit(dataString.c_str());

    qDebug() << dataQString;
    qDebug() << dataQString2;
    qDebug() << dataQString3;
    qDebug() << dataQString4;
    

    Output:

    [Screenshot: Ekran görüntüsü 2024-03-22 102409.png]

    Trial Code:

        std::string data = "üöçİış";
        QString dataQString = QTextCodec::codecForLocale()->toUnicode(data.c_str());
        QString dataQString2 = QString::fromStdString(data.c_str());
        QString dataQString3 = QString::fromLatin1(data.c_str());
        QString dataQString4 = QString::fromLocal8Bit(data.c_str());
    
        ui.labeltoUnicode->setText(dataQString);
        ui.labelfromStdString->setText(dataQString2);
        ui.labelfromLatin1->setText(dataQString3);
        ui.labelfromLocal8Bit->setText(dataQString4);
    

    Output:

    [Screenshot: Ekran görüntüsü 2024-03-22 102516.png]

ChrisW67 wrote (post #2, last edited by ChrisW67):

      @Ceng0 said in QString Turkish Character Problem:

      std::string data = "üöçİış";
      

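      This places the characters in the string as bytes, using the encoding the compiler applies to narrow string literals. With MSVC's default settings that is your system's code page.
      There's a reasonable chance this is the Windows-1254 8-bit encoding on Turkish machines.
      Bytes in hex (Windows-1254): FC F6 E7 DD FD FE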
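To check which encoding actually ended up in the std::string, it can help to dump its bytes. This is a minimal sketch using only the standard library; the helper name toHex is hypothetical and not part of Qt or the original code.

```cpp
#include <string>

// Returns the bytes of `s` as space-separated uppercase hex, e.g. "FC F6".
std::string toHex(const std::string& s)
{
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char byte : s) {   // unsigned avoids sign extension for bytes >= 0x80
        if (!out.empty())
            out += ' ';
        out += digits[byte >> 4];    // high nibble
        out += digits[byte & 0x0F];  // low nibble
    }
    return out;
}
```

Printing toHex(dataString) in both projects should show whether they differ: six bytes (FC F6 E7 DD FD FE) would suggest Windows-1254, while twelve bytes starting with C3 BC would suggest UTF-8.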

      QString dataQString = QTextCodec::codecForLocale()->toUnicode(data.c_str());
      

      Makes an educated guess at your system's locale, converts the bytes from that locale's encoding into their Unicode equivalents, and places them in the QString. The guessed encoding will usually match the encoding of data.
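To illustrate what a locale codec does when the locale encoding is Windows-1254, here is a rough standard-library sketch of that decoding step. It is not how Qt implements it; the function name decodeCp1254 is hypothetical, and bytes in the 0x80-0x9F range are deliberately left unmapped to keep the example short.

```cpp
#include <string>

// Decodes bytes assumed to be Windows-1254 into Unicode code points.
// Sketch only: bytes 0x80-0x9F are replaced with U+FFFD instead of being mapped.
std::u32string decodeCp1254(const std::string& bytes)
{
    std::u32string out;
    for (unsigned char b : bytes) {
        char32_t cp;
        switch (b) {
        // The six positions where Windows-1254 differs from Latin-1 above 0xA0.
        case 0xD0: cp = 0x011E; break; // Ğ
        case 0xDD: cp = 0x0130; break; // İ
        case 0xDE: cp = 0x015E; break; // Ş
        case 0xF0: cp = 0x011F; break; // ğ
        case 0xFD: cp = 0x0131; break; // ı
        case 0xFE: cp = 0x015F; break; // ş
        default:
            // Everything else in ASCII and 0xA0-0xFF has the same value as its byte.
            cp = (b >= 0x80 && b <= 0x9F) ? 0xFFFD : b;
        }
        out += cp;
    }
    return out;
}
```

Feeding it the bytes FC F6 E7 DD FD FE yields U+00FC, U+00F6, U+00E7, U+0130, U+0131, U+015F, which is the original "üöçİış".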

      QString dataQString2 = QString::fromStdString(data.c_str());
      

      From the docs: The given string is assumed to be encoded in UTF-8, and is converted to QString using the fromUtf8() function.
      If it was not encoded as UTF-8 in the first place this risks corrupting the string.
      Your string as UTF-8 should be bytes: c3 bc c3 b6 c3 a7 c4 b0 c4 b1 c5 9f
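The UTF-8 byte sequence quoted above can be reproduced with a small encoder. This is a sketch of the standard UTF-8 bit layout, not Qt code; encodeUtf8 is a hypothetical helper and it does no validation of surrogates or out-of-range values.

```cpp
#include <string>

// Encodes Unicode code points as UTF-8 (1 to 4 bytes each).
// Sketch: assumes every input value is a valid code point.
std::string encodeUtf8(const std::u32string& codePoints)
{
    std::string out;
    for (char32_t cp : codePoints) {
        if (cp < 0x80) {                       // 0xxxxxxx
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {               // 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 1110xxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

All six Turkish characters fall in the two-byte range, which is why the UTF-8 form is twelve bytes long while the Windows-1254 form is six.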

      QString dataQString3 = QString::fromLatin1(data.c_str());
      

      The given string is assumed to be encoded in Latin-1 (ISO 8859-1; Windows-1252 is a close superset that differs in the 0x80-0x9F range).
      If it was not encoded as ISO 8859-1 in the first place, this risks corrupting the string.
      The last three characters in your string (İ, ı, ş) do not exist in ISO 8859-1.
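Latin-1 decoding is simple enough to sketch directly: each byte becomes the code point with the same numeric value. The helper name decodeLatin1 below is hypothetical and only illustrates why the last three characters come out wrong when Windows-1254 bytes are read as Latin-1.

```cpp
#include <string>

// Decodes bytes as ISO 8859-1: every byte maps to the code point of the same value.
std::u32string decodeLatin1(const std::string& bytes)
{
    std::u32string out;
    for (unsigned char b : bytes)
        out += static_cast<char32_t>(b);
    return out;
}
```

Applied to the Windows-1254 bytes FC F6 E7 DD FD FE, this gives U+00FC, U+00F6, U+00E7, U+00DD, U+00FD, U+00FE, that is "üöçÝýþ": the first three survive, while İ, ı, ş turn into Ý, ý, þ.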

      QString dataQString4 = QString::fromLocal8Bit(data.c_str());
      

      The given string is assumed to be encoded in the default 8-bit encoding for your locale. This is likely to match the encoding of your source file and string.

      Have a play with this tool to see what the various encodings/decodings produce.

      How you deal with this depends on exactly what you want in your std::string
      Worst case, you can insert Unicode characters into C++ string literals using "\uxxxx":

      std::string data = "\u00fc\u00f6\u00e7\u0130\u0131\u015f";
      

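      Tedious, but independent of the source file's encoding. Note that the bytes stored in the std::string still depend on the compiler's execution character set; a u8 literal is the way to get UTF-8 bytes specifically.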
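If UTF-8 bytes are what you want in the std::string, one option is a u8 literal combined with the escapes above. This is a sketch under that assumption; the function name turkishUtf8 is hypothetical.

```cpp
#include <string>

// Returns the test string as UTF-8 bytes, independent of the source file's
// encoding and of the compiler's execution character set.
std::string turkishUtf8()
{
    // u8 literals are always UTF-8. The cast keeps this compiling under C++20,
    // where the literal's element type is char8_t rather than char.
    return reinterpret_cast<const char*>(u8"\u00fc\u00f6\u00e7\u0130\u0131\u015f");
}
```

The result can then be passed to QString::fromStdString() or QString::fromUtf8(), both of which expect UTF-8 input.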

JonB wrote (post #3):

        @ChrisW67 Doubtless correct answer. Just confirms to me how ridiculously complicated language encoding is in C++!


Christian Ehrlicher (Lifetime Qt Champion) wrote (post #4):
          @JonB said in QString Turkish Character Problem:

          how ridiculously complicated language encoding is in C++!

          It's more of a Windows/MSVC problem: they use something other than UTF-8 for source files, and even at runtime.
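One way to see which situation a given build is in is a compile-time check on a narrow literal. This is a sketch, not a Qt facility; narrowLiteralsAreUtf8 is a hypothetical name, and it assumes the compiler can represent U+00FC in its execution character set.

```cpp
// True when the compiler encodes narrow string literals as UTF-8.
constexpr bool narrowLiteralsAreUtf8()
{
    // U+00FC is two bytes (C3 BC) in UTF-8 but a single byte (FC) in
    // Windows-1254 and other single-byte code pages.
    return sizeof("\u00fc") == 3
        && static_cast<unsigned char>("\u00fc"[0]) == 0xC3
        && static_cast<unsigned char>("\u00fc"[1]) == 0xBC;
}
```

Wrapping this in static_assert(narrowLiteralsAreUtf8(), "...") would make a build fail when the setting is wrong. With MSVC, the /utf-8 compiler option sets both the source and execution character sets to UTF-8; GCC and Clang use UTF-8 for narrow literals by default.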

          Qt Online Installer direct download: https://download.qt.io/official_releases/online_installers/
          Visit the Qt Academy at https://academy.qt.io/catalog
