toLatin() method replaces apostrophe and double quotes with ?

    nitingera wrote (#1):

    Hi,
    I am using toLatin1() to convert a QString to a QByteArray. The string contains an apostrophe and double quotes, but on the console they are replaced with '?'.
    I am using Qt 6.4.2.

    This is my code
    QString str1 = "That’s good. He is going to “Canada”";

    qDebug() << "Unchanged:" << str1;
    qDebug() << "Latin1:" << str1.toLatin1();
    qDebug() << "Utf8" << str1.toUtf8();
    

    This is the output
    Unchanged: "That’s good. He is going to “Canada”"
    Latin1: "That?s good. He is going to ?Canada?"
    Utf8 "That\xE2\x80\x99s good. He is going to \xE2\x80\x9C""Canada\xE2\x80\x9D"

    If I replace every ’ with ' and every “ ” with ", it works fine.
    This file comes from a client and, sadly, I cannot modify it. Is there any way to display these characters without modifying the input file?

      Paul Colby wrote (#3):

      Hi @nitingera,

      The string contains apostrophe and double quotes.

      That's actually subtly wrong. Your string does not contain an apostrophe (U+0027), but a right-single-quote (U+2019).

      Although they may look very similar (or even identical) depending on your screen font, there is no Latin-1 representation for the right single quote, so, as per the QString::toLatin1() docs:

      The returned byte array is undefined if the string contains non-Latin1 characters. Those characters may be suppressed or replaced with a question mark.

      The same goes for your left and right double-quotes.
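
To make the Latin-1 limit concrete: Latin-1 only covers code points U+0000 to U+00FF, so U+2019, U+201C and U+201D have no byte to map to. The following is a minimal sketch in plain standard C++, not Qt's actual implementation. `toLatin1Sketch` is a hypothetical name, and substituting '?' is only one of the outcomes the docs allow, not a guarantee.

```cpp
#include <string>

// Hypothetical helper, not Qt's implementation: narrow UTF-32 text to
// Latin-1 bytes, substituting '?' for every code point above U+00FF.
std::string toLatin1Sketch(const std::u32string &text)
{
    std::string out;
    out.reserve(text.size());
    for (char32_t cp : text) {
        if (cp <= 0xFF) {
            // Latin-1 maps code points 0x00..0xFF directly onto single bytes.
            out.push_back(static_cast<char>(static_cast<unsigned char>(cp)));
        } else {
            // No Latin-1 byte exists for this character (e.g. U+2019).
            out.push_back('?');
        }
    }
    return out;
}
```

With this sketch, `toLatin1Sketch(U"That\u2019s good")` gives `That?s good`, which matches the pattern in the output reported in the first post.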

      Try, for example:

          const QString str1 = QString::fromUtf8("That’s good. He is going to “Canada”");
          qDebug().noquote() << "str1 Unchanged:" << str1;
          qDebug().noquote() << "str1 Latin1:" << str1.toLatin1();
          qDebug().noquote() << "str1 Utf8" << str1.toUtf8();
      
          const QString str2 = QString(str1)
              .replace(QString::fromUtf8("’"),QStringLiteral("'"))
              .replace(QString::fromUtf8("“"),QStringLiteral("\""))
              .replace(QString::fromUtf8("”"),QStringLiteral("\""));
          qDebug().noquote() << "str2 Unchanged:" << str2;
          qDebug().noquote() << "str2 Latin1:" << str2.toLatin1();
          qDebug().noquote() << "str2 Utf8" << str2.toUtf8();
      

      Output:

      str1 Unchanged: That’s good. He is going to “Canada”
      str1 Latin1: That?s good. He is going to ?Canada?
      str1 Utf8 That’s good. He is going to “Canada”
      str2 Unchanged: That's good. He is going to "Canada"
      str2 Latin1: That's good. He is going to "Canada"
      str2 Utf8 That's good. He is going to "Canada"
      

      Why do you want to convert to Latin-1, and assuming you do, what do you want to happen to those non-Latin-1 characters?

      Cheers.

        Chops wrote (#2):

        @nitingera said in toLatin() method replaces apostrophe and double quotes with ?:

        Any way that these can be displayed without modifying the input file

        What does "displayed" mean? Looks like you can display them without any trouble.

        qDebug() << "Unchanged:" << str1;
        

        That line of code displays it fine, does it not?

        Any way that these can be displayed without modifying the input file

        You could read in from the input file and then modify the in-memory string you read.
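
One way to read that suggestion: leave the client's file untouched and normalise only the copy held in memory. Below is a minimal sketch in plain standard C++ that works on the raw UTF-8 bytes. The helper names `replaceAll` and `asciiQuotes` are hypothetical, and it assumes the file really is UTF-8, which the thread does not confirm.

```cpp
#include <string>

// Replace every occurrence of `from` in `text` with `to`, in place.
static void replaceAll(std::string &text, const std::string &from, const std::string &to)
{
    for (std::size_t pos = text.find(from); pos != std::string::npos;
         pos = text.find(from, pos + to.size())) {
        text.replace(pos, from.size(), to);
    }
}

// Hypothetical helper: map typographic quotes (given as UTF-8 byte
// sequences) to their ASCII look-alikes. Takes a copy, so the caller's
// original data is left unchanged.
std::string asciiQuotes(std::string text)
{
    replaceAll(text, "\xE2\x80\x98", "'");  // U+2018 left single quote
    replaceAll(text, "\xE2\x80\x99", "'");  // U+2019 right single quote
    replaceAll(text, "\xE2\x80\x9C", "\""); // U+201C left double quote
    replaceAll(text, "\xE2\x80\x9D", "\""); // U+201D right double quote
    return text;
}
```

After this step the text contains only characters that Latin-1 can represent, so a later Latin-1 conversion has nothing to replace. The trade-off is that the typographic distinction is lost in the processed copy.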

            nitingera wrote (#4):

            @Chops Thanks for your reply.
            I received a file from a client who is trying to import it into the application.
            I read that file, and the strings are converted to Latin-1 for some processing before being displayed.
            During that conversion, all such characters (right single quote, left and right double quotes) become question marks.
            I am looking for a way to prevent that.

              jsulm (Lifetime Qt Champion) wrote (#5):

              @nitingera But why do you need to convert to Latin-1?

              https://forum.qt.io/topic/113070/qt-code-of-conduct

                nitingera wrote (#6):

                @Paul-Colby
                Thanks for your response. My aim is to read a string from a file that contains a right single quote and left and right double quotes, convert it to a QByteArray for processing, and then display the output.
                Converting to Latin-1 is not mandatory, but when I tried UTF-8 instead, these characters were still not displayed properly.
                The exact same code works fine with Qt 4 but not with Qt 6.

                  nitingera wrote (#7):

                  @jsulm I want to convert it to a QByteArray.
                  My issue is that even if I convert it to UTF-8, the right single quote and the left and right double quotes are still not displayed.

                    jsulm (Lifetime Qt Champion) wrote (#8):

                    @nitingera One comment: do not trust qDebug() output in such cases! qDebug is only for debugging! Better use std::cout or a widget to display the text.
                    What is the encoding of the string you get?
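
On the UTF-8 point: the escaped bytes in the first post's output (`\xE2\x80\x99`, `\xE2\x80\x9C`, `\xE2\x80\x9D`) are the correct UTF-8 encodings of U+2019, U+201C and U+201D. When quoting is enabled, QDebug prints non-ASCII bytes of a QByteArray as hex escapes, so the data itself is intact. The extra `""` after `\x9C` most likely keeps the following `C` of "Canada" from being read as part of the escape. A minimal standard C++ sketch to check the bytes by hand follows; `encodeUtf8` is a hypothetical helper that assumes a valid code point.

```cpp
#include <string>

// Encode one Unicode code point as UTF-8.
// Assumes `cp` is a valid scalar value (no surrogate or range checks).
std::string encodeUtf8(char32_t cp)
{
    std::string out;
    if (cp < 0x80) {
        // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

`encodeUtf8(0x2019)` produces the three bytes E2 80 99, the same sequence shown in the "Utf8" line of the original output.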

                      JonB wrote (#9):

                      @jsulm
                      I wanted to say that too. But when I look at @Paul-Colby's post above using qDebug() he shows

                      str1 Unchanged: That’s good. He is going to “Canada”
                      

                      so I think that means qDebug() to console window does manage to show them?

                      • N nitingera has marked this topic as solved on
                        Paul Colby wrote (#10):

                        so I think that means qDebug() to console window does manage to show them?

                        By default (this can be overridden with qInstallMessageHandler()), qDebug() messages go through qDefaultMessageHandler(), which, when writing to a console (as opposed to other outputs such as syslog), ends up using QString::toLocal8Bit(), like this:

                        fprintf(stderr, "%s\n", formattedMessage.toLocal8Bit().constData());
                        

                        (see stderr_message_handler for example)

                        And as per the QString::toLocal8Bit() docs:

                        Returns the local 8-bit representation of the string as a QByteArray. The returned byte array is undefined if the string contains characters not supported by the local 8-bit encoding.
                        On Unix systems this is equivalent to toUtf8(), on Windows the systems current code page is being used.
                        If this string contains any characters that cannot be encoded in the locale, the returned byte array is undefined. Those characters may be suppressed or replaced by another.

                        So in my case it was fine to use qDebug() to demonstrate the differences in the strings before and after replacing various characters, because my local console handles Unicode with no problems, but as @jsulm wrote, that's not something you should rely on for user-facing output.
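
To illustrate why the result depends on the local 8-bit encoding: Windows-1252, a common Windows code page, does assign bytes to the typographic quotes (0x91 to 0x94), whereas Latin-1 does not. The sketch below is plain standard C++ and assumes code page 1252, which the thread never confirms for anyone's machine. `toCp1252Sketch` is a hypothetical helper that handles only the four quote characters from the 0x80 to 0x9F block, not the full code page.

```cpp
#include <string>

// Hypothetical helper: encode text for Windows-1252. Only the four
// typographic quotes from the 0x80..0x9F block are covered here; the
// real code page defines more characters in that block.
std::string toCp1252Sketch(const std::u32string &text)
{
    std::string out;
    for (char32_t cp : text) {
        switch (cp) {
        case 0x2018: out += '\x91'; break; // left single quote
        case 0x2019: out += '\x92'; break; // right single quote
        case 0x201C: out += '\x93'; break; // left double quote
        case 0x201D: out += '\x94'; break; // right double quote
        default:
            // ASCII and U+00A0..U+00FF match Latin-1 byte for byte.
            if (cp < 0x80 || (cp >= 0xA0 && cp <= 0xFF))
                out += static_cast<char>(static_cast<unsigned char>(cp));
            else
                out += '?'; // not handled by this sketch
        }
    }
    return out;
}
```

So the same QString can survive one 8-bit conversion and lose characters in another, which is why console output alone is a weak test of whether the data is correct.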

                        Cheers.

                        Edit: If I remember correctly, older Qt versions used to use qUtf8Printable() for qDebug() output, but as per the docs:

                        This is equivalent to str.toUtf8().constData().

                        So it ends up the same anyway :)

                          kkoehne (Moderators) wrote (#11):

                          @Paul-Colby said in toLatin() method replaces apostrophe and double quotes with ?:

                          Edit: If I remember correctly, older Qt versions used to use qUtf8Printable() for qDebug() output, but as per the docs:

                          This is equivalent to str.toUtf8().constData().

                          So it ends up the same anyway :)

                          qUtf8Printable() is actually still used with qDebug(), but in a different place: use it with the printf-style API of qDebug(), which expects UTF-8 encoded strings for %s. E.g.

                          QString str = "...";
                          qDebug("Output: %s", qUtf8Printable(str));
                          

                          This is different from the system printf(), which expects the local 8-bit encoding by default, so you'd better use qPrintable() or .toLocal8Bit().constData():

                          QString str = "...";
                          printf("Output: %s", qPrintable(str));
                          

                          But yeah, there's still no guarantee that the printed string will actually show up correctly on the console, as Windows has its own limitations there ...

                          Director R&D, The Qt Company
