Unexpected result from QTextCodec::canEncode(QString&)

    DDEH wrote on 22 Aug 2018, 12:03 (#1):

    Dear Qt-ists,

    I have created a QTextCodec instance, named codec, for the encoding "US-ASCII" using QTextCodec::codecForName.

    I also have a QString object, str, containing symbols that this encoding cannot represent (© and é, to be precise). As expected, these symbols are replaced with the character ? in the QByteArray returned by codec->fromUnicode(str).

    However, codec->canEncode(str) returns true, which I find counterintuitive.

    Is this expected behaviour? If so, I suppose the documentation of canEncode should be expanded.

      SGaist (Lifetime Qt Champion) wrote on 22 Aug 2018, 21:10 (#2):

      Hi and welcome to devnet,

      What version of Qt are you using?
      On what platform?

      Can you post a minimal, compilable code sample that reproduces this?

      Interested in AI ? www.idiap.ch
      Please read the Qt Code of Conduct - https://forum.qt.io/topic/113070/qt-code-of-conduct

        JKSH (Moderators) wrote on 23 Aug 2018, 04:48 (#3):

        @DDEH said in Unexpected result from QTextCodec::canEncode(QString&):

        > I have created a QTextCodec instance, named codec, for the encoding "US-ASCII", using the method QTextCodec::codecForName.

        What do you get when you do qDebug() << codec->mibEnum() << codec->name(); ?

        > Indeed, these symbols are replaced with character ? in the QByteArray resulting from a call to codec->fromUnicode(str).

        1. How did you create the unicode string?
        2. How did you display the encoded string?
        3. Call toHex() on your QByteArray. Are the ? characters 0x3F?
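
As a rough illustration of the check in item 3, the sketch below scans the text that toHex() would return and lists the byte offsets holding 0x3f. It is plain C++ rather than Qt code, and questionMarkOffsets is a hypothetical helper name, not a Qt function. Note that a literal ? in the original text also encodes as 0x3f, so a hit only points to a replacement when the source character was something else.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Qt): given the text produced by
// QByteArray::toHex(), return the byte offsets whose value is 0x3f ('?').
std::vector<std::size_t> questionMarkOffsets(const std::string &hex)
{
    std::vector<std::size_t> offsets;
    // toHex() emits two lowercase hex digits per byte, so step by two
    // to stay aligned on byte boundaries.
    for (std::size_t i = 0; i + 1 < hex.size(); i += 2) {
        if (hex.compare(i, 2, "3f") == 0)
            offsets.push_back(i / 2);
    }
    return offsets;
}
```

For a dump such as 413f, the helper reports offset 1.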

        Qt Doc Search for browsers: forum.qt.io/topic/35616/web-browser-extension-for-improved-doc-searches

          DDEH wrote on 23 Aug 2018, 10:17, last edited by DDEH (#4):

          @SGaist said in Unexpected result from QTextCodec::canEncode(QString&):

          > Hi and welcome to devnet,

          Hi,

          Thanks for the welcome.

          > What version of Qt are you using?

          5.5.1

          > On what platform?

          linux-x86_64

          > Can you post a minimal compilable sample code that reproduces that?

          Yes, kind of.

          main.cpp

          #include "monediteur.h"
          #include <QApplication>
          
          int main(int argc, char *argv[])
          {
              QApplication a(argc, argv);
              MonEditeur w;
              w.show();
          
              return a.exec();
          }
          

          monediteur.h

          #ifndef MONEDITEUR_H
          #define MONEDITEUR_H
          
          #include <QMainWindow>
          
          namespace Ui {
          class MonEditeur;
          }
          
          class MonEditeur : public QMainWindow
          {
              Q_OBJECT
          
          public:
              explicit MonEditeur(QWidget *parent = 0);
              ~MonEditeur();
          
          private:
              Ui::MonEditeur *ui;
          
          private slots:
              void process();
          };
          
          #endif // MONEDITEUR_H
          

          monediteur.cpp

          #include "monediteur.h"
          #include "ui_monediteur.h"
          
          #include <QTextCodec>
          #include <QDebug>
          
          MonEditeur::MonEditeur(QWidget *parent) :
              QMainWindow(parent),
              ui(new Ui::MonEditeur)
          {
              ui->setupUi(this);
              connect(ui->pushButton, SIGNAL(clicked(bool)), this, SLOT(process()));
          }
          
          MonEditeur::~MonEditeur()
          {
              delete ui;
          }
          
          void MonEditeur::process()
          {
              const QString codecId("US-ASCII");
              const QString contents = ui->textEditor->toPlainText();

              // codecForName() returns a null pointer when the codec is unknown.
              QTextCodec *codec = QTextCodec::codecForName(codecId.toLatin1());
              if (!codec) {
                  qDebug() << "No codec found for" << codecId;
                  return;
              }

              qDebug() << "Some attributes of this codec:";
              qDebug() << codec->mibEnum() << codec->name();

              // Expected to be false for text containing non-ASCII characters.
              if (codec->canEncode(contents)) {
                  qDebug() << codecId << " can encode the contents";
                  qDebug() << contents;
                  // Round trip: encode to bytes, then decode back for comparison.
                  QByteArray ba = codec->fromUnicode(contents);
                  QString check = codec->toUnicode(ba);
                  qDebug() << "check: ";
                  qDebug() << check;
                  qDebug() << "--";
                  QByteArray hex = ba.toHex();
                  qDebug() << "toHex: ";
                  qDebug() << hex;
              }
          }
          

          monediteur.ui

          <?xml version="1.0" encoding="UTF-8"?>
          <ui version="4.0">
           <class>MonEditeur</class>
           <widget class="QMainWindow" name="MonEditeur">
            <property name="geometry">
             <rect>
              <x>0</x>
              <y>0</y>
              <width>381</width>
              <height>324</height>
             </rect>
            </property>
            <property name="windowTitle">
             <string>MonEditeur</string>
            </property>
            <widget class="QWidget" name="centralWidget">
             <widget class="QPlainTextEdit" name="textEditor">
              <property name="geometry">
               <rect>
                <x>0</x>
                <y>0</y>
                <width>381</width>
                <height>241</height>
               </rect>
              </property>
             </widget>
             <widget class="QPushButton" name="pushButton">
              <property name="geometry">
               <rect>
                <x>130</x>
                <y>250</y>
                <width>80</width>
                <height>25</height>
               </rect>
              </property>
              <property name="text">
               <string>Process</string>
              </property>
             </widget>
            </widget>
            <widget class="QToolBar" name="mainToolBar">
             <attribute name="toolBarArea">
              <enum>TopToolBarArea</enum>
             </attribute>
             <attribute name="toolBarBreak">
              <bool>false</bool>
             </attribute>
            </widget>
            <widget class="QStatusBar" name="statusBar"/>
           </widget>
           <layoutdefault spacing="6" margin="11"/>
           <resources/>
           <connections/>
          </ui>
          
            DDEH wrote on 23 Aug 2018, 10:28 (#5):

            Thanks for taking the time to look at this issue.

            My answers are embedded in your post.

            @JKSH said in Unexpected result from QTextCodec::canEncode(QString&):

            > @DDEH said in Unexpected result from QTextCodec::canEncode(QString&):
            >
            >> I have created a QTextCodec instance, named codec, for the encoding "US-ASCII", using the method QTextCodec::codecForName.
            >
            > What do you get when you do qDebug() << codec->mibEnum() << codec->name(); ?

            3 "US-ASCII"

            >> Indeed, these symbols are replaced with character ? in the QByteArray resulting from a call to codec->fromUnicode(str).
            >
            > 1. How did you create the unicode string?

            The string is the result of toPlainText() from a QPlainTextEdit instance.

            > 2. How did you display the encoded string?

            With qDebug(), for instance.

            > 3. Call toHex() on your QByteArray. Are the ? characters 0x3F?

            Yes, they are.

            The output of the program I posted in my previous post is the following:

            Some attributes of this codec:
            3 "US-ASCII"
            "US-ASCII"  can encode the contents
            "© André Cymone"
            check: 
            "? Andr? Cymone"
            --
            toHex: 
            "3f20416e64723f2043796d6f6e65"
            
              JKSH (Moderators) wrote on 23 Aug 2018, 13:50 (#6):

              @DDEH said in Unexpected result from QTextCodec::canEncode(QString&):

              The output of the program I posted in my previous post is the following:

              Some attributes of this codec:
              3 "US-ASCII"
              "US-ASCII"  can encode the contents
              "© André Cymone"
              check: 
              "? Andr? Cymone"
              --
              toHex: 
              "3f20416e64723f2043796d6f6e65"
              

              Looks like you found some incorrect behaviour; I agree that canEncode() should return false in your example.

              If it still behaves the same way in the latest release (Qt 5.11.1), you can submit a bug report at https://bugreports.qt.io/. However, I'm guessing that the report will be given low priority, since US-ASCII is not a recommended encoding nowadays. (The devs are already putting all their time and energy into fixing much more serious bugs and adding new features.)
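
For US-ASCII specifically, the answer one would expect from canEncode() can be computed by hand, since a character is encodable only if its code point is below 0x80. The following is a minimal sketch in plain C++ on UTF-16 code units, not the Qt implementation; isPureAscii and toAsciiLossy are hypothetical names. In Qt code the same test could be written by looping over the QString and comparing each QChar::unicode() value with 0x80.

```cpp
#include <string>

// Hypothetical stand-in for the result one would expect from canEncode()
// with US-ASCII: true only if every UTF-16 code unit is below 0x80.
bool isPureAscii(const std::u16string &text)
{
    for (char16_t unit : text) {
        if (unit >= 0x80)
            return false;
    }
    return true;
}

// Lossy conversion mirroring the output observed in this thread:
// every code unit outside ASCII becomes '?' (0x3f).
// Assumption: a surrogate pair is treated as two units, so it yields two '?'.
std::string toAsciiLossy(const std::u16string &text)
{
    std::string out;
    out.reserve(text.size());
    for (char16_t unit : text)
        out.push_back(unit < 0x80 ? static_cast<char>(unit) : '?');
    return out;
}
```

With the string from this thread, isPureAscii returns false while toAsciiLossy yields ? Andr? Cymone, which matches the observed output.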

                DDEH wrote on 24 Aug 2018, 15:13 (#7):

                @JKSH Thanks.

                I would not bet that the bug is restricted to this encoding. I will investigate this later and possibly report the bug through the proper channel.

                Thanks again for your time and attention.
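
Because the problem may not be limited to US-ASCII, an encoding-independent way to double-check is a round trip: encode, decode, and compare with the original, much as the posted program does informally with its check output. The sketch below is plain C++ under stated assumptions: survivesRoundTrip, encodeLatin1, and decodeLatin1 are hypothetical helpers, and the Latin-1 pair only stands in for whatever codec is under test. With QTextCodec, the equivalent comparison would be codec->toUnicode(codec->fromUnicode(str)) == str.

```cpp
#include <functional>
#include <string>

using Encoder = std::function<std::string(const std::u16string &)>;
using Decoder = std::function<std::u16string(const std::string &)>;

// Encoding-agnostic check: the text counts as encodable if decoding the
// encoded bytes gives back exactly the original text.
bool survivesRoundTrip(const std::u16string &text,
                       const Encoder &encode, const Decoder &decode)
{
    return decode(encode(text)) == text;
}

// Illustrative Latin-1 (ISO-8859-1) encoder; anything above U+00FF
// is replaced with '?', as a lossy codec would do.
std::string encodeLatin1(const std::u16string &text)
{
    std::string out;
    for (char16_t unit : text)
        out.push_back(unit <= 0xFF ? static_cast<char>(unit) : '?');
    return out;
}

// Matching decoder: each byte maps to the code point of the same value.
std::u16string decodeLatin1(const std::string &bytes)
{
    std::u16string out;
    for (char byte : bytes)
        out.push_back(static_cast<char16_t>(static_cast<unsigned char>(byte)));
    return out;
}
```

Under Latin-1, the string from this thread survives the round trip, whereas text containing a euro sign does not, because the replacement ? no longer equals the original character.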
