QString Turkish Character Problem

    Ceng0 wrote (#1):

    Hi,
    I use Visual Studio for coding. The first piece of code below is from my own project; the second is from a trial project I created in Visual Studio. When I use Turkish characters, I get different (wrong) characters in my own project. In the trial project, 'QTextCodec::codecForLocale()->toUnicode(data.c_str());' gives the correct result. What is the cause of this problem? How can I solve it?

    Project Code:

    std::string dataString = "üöçİış";

    QString dataQString = QTextCodec::codecForLocale()->toUnicode(dataString.c_str());
    QString dataQString2 = QString::fromStdString(dataString.c_str());
    QString dataQString3 = QString::fromLatin1(dataString.c_str());
    QString dataQString4 = QString::fromLocal8Bit(dataString.c_str());

    qDebug() << dataQString;
    qDebug() << dataQString2;
    qDebug() << dataQString3;
    qDebug() << dataQString4;
    

    Output:

    Ekran görüntüsü 2024-03-22 102409.png

    Trial Code:

        std::string data = "üöçİış";
        QString dataQString = QTextCodec::codecForLocale()->toUnicode(data.c_str());
        QString dataQString2 = QString::fromStdString(data.c_str());
        QString dataQString3 = QString::fromLatin1(data.c_str());
        QString dataQString4 = QString::fromLocal8Bit(data.c_str());
    
        ui.labeltoUnicode->setText(dataQString);
        ui.labelfromStdString->setText(dataQString2);
        ui.labelfromLatin1->setText(dataQString3);
        ui.labelfromLocal8Bit->setText(dataQString4);
    

    Output:
    Ekran görüntüsü 2024-03-22 102516.png


      ChrisW67 wrote (#2, last edited by ChrisW67):

      @Ceng0 said in QString Turkish Character Problem:

      std::string data = "üöçİış";
      

      This places the characters in the string as bytes, encoded using your system's locale (with MSVC's default settings, the system code page).
      On a Turkish machine there's a reasonable chance this is the Windows-1254 8-bit encoding.
      Bytes in hex: FC F6 E7 DD FD FE

      QString dataQString = QTextCodec::codecForLocale()->toUnicode(data.c_str());
      

      Makes an educated guess at your system's locale, converts the bytes from that locale's encoding into their Unicode equivalents, and places them in the string. The result will usually match the characters you typed in data.

      QString dataQString2 = QString::fromStdString(data.c_str());
      

      From the docs: The given string is assumed to be encoded in UTF-8, and is converted to QString using the fromUtf8() function.
      If it was not encoded as UTF-8 in the first place this risks corrupting the string.
      Your string as UTF-8 should be bytes: c3 bc c3 b6 c3 a7 c4 b0 c4 b1 c5 9f

      QString dataQString3 = QString::fromLatin1(data.c_str());
      

      The given string is assumed to be encoded in Latin-1 (ISO 8859-1, which Windows-1252 extends).
      If it was not encoded as ISO 8859-1 in the first place, this risks corrupting the string.
      The last three characters in your string do not exist in ISO8859-1.

      QString dataQString4 = QString::fromLocal8Bit(data.c_str());
      

      The given string is assumed to be encoded in the default 8-bit encoding for your locale. This is likely to match the encoding of your source file and string.

      Have a play with this tool to see what the various encodings/decodings produce.

      How you deal with this depends on exactly what you want in your std::string.
      Worst case, you can insert Unicode characters into C++ string literals using "\uxxxx":

      std::string data = "\u00fc\u00f6\u00e7\u0130\u0131\u015f";
      

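      Tedious, but independent of the encoding of the source file. The bytes stored in a narrow literal still follow the compiler's execution character set, so build with a UTF-8 execution character set (MSVC: the /utf-8 option) if you want UTF-8 bytes in the string.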


        JonB wrote (#3):

        @ChrisW67 Doubtless correct answer. Just confirms to me how ridiculously complicated language encoding is in C++!


          Christian Ehrlicher (Lifetime Qt Champion) wrote (#4):

          @JonB said in QString Turkish Character Problem:

          how ridiculously complicated language encoding is in C++!

          It's more a Windows/MSVC problem: they use something other than UTF-8 for source files, and even at runtime.

