Unexpected result from QTextCodec::canEncode(QString&)

    DDEH wrote (#1):

    Dear Qt-ists,

    I have created a QTextCodec instance, named codec, for the encoding "US-ASCII", using the method QTextCodec::codecForName.

    Now, I have a QString object str that contains symbols that cannot be encoded using this encoding (© and é, precisely). Indeed, these symbols are replaced with character ? in the QByteArray resulting from a call to codec->fromUnicode(str).

    However, the call to codec->canEncode(str) yields true, which I find rather counterintuitive.

    Is this expected behaviour? If so, then I suppose that the documentation of the method canEncode should be expanded.

      SGaist (Lifetime Qt Champion) wrote (#2):

      Hi and welcome to devnet,

      What version of Qt are you using?
      On what platform?

      Can you post a minimal compilable sample code that reproduces that?

        JKSH (Moderators) wrote (#3):

        @DDEH said in Unexpected result from QTextCodec::canEncode(QString&):

        I have created a QTextCodec instance, named codec, for the encoding "US-ASCII", using the method QTextCodec::codecForName.

        What do you get when you do qDebug() << codec->mibEnum() << codec->name(); ?

        Indeed, these symbols are replaced with character ? in the QByteArray resulting from a call to codec->fromUnicode(str).

        1. How did you create the unicode string?
        2. How did you display the encoded string?
        3. Call toHex() on your QByteArray. Are the ? characters 0x3F?

          DDEH wrote (#4, last edited by DDEH):

          @SGaist said in Unexpected result from QTextCodec::canEncode(QString&):

          Hi and welcome to devnet,

          Hi,

          Thanks for the welcome.

          What version of Qt are you using?

          5.5.1

          On what platform?

          linux-x86_64

          Can you post a minimal compilable sample code that reproduces that?

          Yes, kind of.

          main.cpp

          #include "monediteur.h"
          #include <QApplication>
          
          int main(int argc, char *argv[])
          {
              QApplication a(argc, argv);
              MonEditeur w;
              w.show();
          
              return a.exec();
          }
          

          monediteur.h

          #ifndef MONEDITEUR_H
          #define MONEDITEUR_H
          
          #include <QMainWindow>
          
          namespace Ui {
          class MonEditeur;
          }
          
          class MonEditeur : public QMainWindow
          {
              Q_OBJECT
          
          public:
              explicit MonEditeur(QWidget *parent = 0);
              ~MonEditeur();
          
          private:
              Ui::MonEditeur *ui;
          
          private slots:
              void process();
          };
          
          #endif // MONEDITEUR_H
          

          monediteur.cpp

          #include "monediteur.h"
          #include "ui_monediteur.h"
          
          #include <QTextCodec>
          #include <QDebug>
          
          MonEditeur::MonEditeur(QWidget *parent) :
              QMainWindow(parent),
              ui(new Ui::MonEditeur)
          {
              ui->setupUi(this);
              connect(ui->pushButton, SIGNAL(clicked(bool)), this, SLOT(process()));
          }
          
          MonEditeur::~MonEditeur()
          {
              delete ui;
          }
          
          void MonEditeur::process()
          {
              QString codecId("US-ASCII");
              const QString contents = ui->textEditor->toPlainText();
              QTextCodec *codec = QTextCodec::codecForName(codecId.toLatin1());
              if (!codec) {
                  // codecForName() returns a null pointer when no matching codec is found.
                  qDebug() << "No codec found for" << codecId;
                  return;
              }
              qDebug() << "Some attributes of this codec:";
              qDebug() << codec->mibEnum() << codec->name();
              if (codec->canEncode(contents)) {
                  qDebug() << codecId << " can encode the contents";
                  qDebug() << contents;
                  // Round trip: encode to bytes, then decode back for comparison.
                  QByteArray ba = codec->fromUnicode(contents);
                  QString check = codec->toUnicode(ba);
                  qDebug() << "check: ";
                  qDebug() << check;
                  qDebug() << "--";
                  // Hex dump of the encoded bytes, to inspect the replacement characters.
                  QByteArray hex = ba.toHex();
                  qDebug() << "toHex: ";
                  qDebug() << hex;
              }
          }
          

          monediteur.ui

          <?xml version="1.0" encoding="UTF-8"?>
          <ui version="4.0">
           <class>MonEditeur</class>
           <widget class="QMainWindow" name="MonEditeur">
            <property name="geometry">
             <rect>
              <x>0</x>
              <y>0</y>
              <width>381</width>
              <height>324</height>
             </rect>
            </property>
            <property name="windowTitle">
             <string>MonEditeur</string>
            </property>
            <widget class="QWidget" name="centralWidget">
             <widget class="QPlainTextEdit" name="textEditor">
              <property name="geometry">
               <rect>
                <x>0</x>
                <y>0</y>
                <width>381</width>
                <height>241</height>
               </rect>
              </property>
             </widget>
             <widget class="QPushButton" name="pushButton">
              <property name="geometry">
               <rect>
                <x>130</x>
                <y>250</y>
                <width>80</width>
                <height>25</height>
               </rect>
              </property>
              <property name="text">
               <string>Process</string>
              </property>
             </widget>
            </widget>
            <widget class="QToolBar" name="mainToolBar">
             <attribute name="toolBarArea">
              <enum>TopToolBarArea</enum>
             </attribute>
             <attribute name="toolBarBreak">
              <bool>false</bool>
             </attribute>
            </widget>
            <widget class="QStatusBar" name="statusBar"/>
           </widget>
           <layoutdefault spacing="6" margin="11"/>
           <resources/>
           <connections/>
          </ui>
          
            DDEH wrote (#5):

            Thanks for taking some of your time to look at this issue.

            My answers are embedded in your post.
            @JKSH said in Unexpected result from QTextCodec::canEncode(QString&):

            @DDEH said in Unexpected result from QTextCodec::canEncode(QString&):

            I have created a QTextCodec instance, named codec, for the encoding "US-ASCII", using the method QTextCodec::codecForName.

            What do you get when you do qDebug() << codec-> mibEnum() << codec->name(); ?

            3 "US-ASCII"

            Indeed, these symbols are replaced with character ? in the QByteArray resulting from a call to codec->fromUnicode(str).

            1. How did you create the unicode string?

            The string is the result of toPlainText() from a QPlainTextEdit instance (the textEditor widget in monediteur.ui).

            2. How did you display the encoded string?

            With qDebug() for instance.

            3. Call toHex() on your QByteArray. Are the ? characters 0x3F?

            Yes, they are.

            The output of the program I posted in my previous post is the following:

            Some attributes of this codec:
            3 "US-ASCII"
            "US-ASCII"  can encode the contents
            "© André Cymone"
            check: 
            "? Andr? Cymone"
            --
            toHex: 
            "3f20416e64723f2043796d6f6e65"
            
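The hex dump above can be checked by hand. The following is a minimal sketch in plain C++ (no Qt) that decodes such a dump and reports where the byte 0x3F ('?') occurs. The helper names hexValue, decodeHex and questionMarkOffsets are hypothetical and only serve this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Convert one hex digit to its value; returns -1 for a non-hex character.
static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decode a hex dump into raw bytes. Malformed input yields an empty string.
std::string decodeHex(const std::string& hex) {
    if (hex.size() % 2 != 0) return {};
    std::string out;
    out.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        const int hi = hexValue(hex[i]);
        const int lo = hexValue(hex[i + 1]);
        if (hi < 0 || lo < 0) return {};
        out.push_back(static_cast<char>(hi * 16 + lo));
    }
    assert(out.size() == hex.size() / 2);  // one byte per pair of hex digits
    return out;
}

// Byte offsets at which the replacement character '?' (0x3F) appears.
std::vector<std::size_t> questionMarkOffsets(const std::string& bytes) {
    std::vector<std::size_t> offsets;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (bytes[i] == 0x3F) offsets.push_back(i);
    }
    return offsets;
}
```

For the dump posted above, decodeHex yields the 14 bytes of "? Andr? Cymone", and questionMarkOffsets reports offsets 0 and 6, which are the positions of © and é in the input string.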
              JKSH (Moderators) wrote (#6):

              @DDEH said in Unexpected result from QTextCodec::canEncode(QString&):

              The output of the program I posted in my previous post is the following:

              Some attributes of this codec:
              3 "US-ASCII"
              "US-ASCII"  can encode the contents
              "© André Cymone"
              check: 
              "? Andr? Cymone"
              --
              toHex: 
              "3f20416e64723f2043796d6f6e65"
              

              Looks like you found some incorrect behaviour; I agree that canEncode() should return false in your example.

              If it still behaves the same in the latest release (Qt 5.11.1), then you can submit a bug report at https://bugreports.qt.io/. However, I'm guessing that the report will be given low priority since US-ASCII is not a recommended encoding nowadays. (The devs are already putting all their time and energy into fixing much more serious bugs and adding new features)
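
Until the behaviour is confirmed or fixed, one possible workaround for US-ASCII specifically is to test the text directly instead of relying on canEncode(). The sketch below is plain C++ without Qt and is an assumption of this write-up, not something proposed in the thread: isAsciiEncodable is a hypothetical helper that accepts a string only if every UTF-16 code unit lies in the 7-bit range. In a Qt program the same test could be run over the QChar::unicode() values of a QString.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// True when every UTF-16 code unit is in the 7-bit US-ASCII range (0x00..0x7F).
// Surrogate halves are above 0x7F, so characters outside the BMP are rejected too.
bool isAsciiEncodable(const std::u16string& text) {
    return std::all_of(text.begin(), text.end(),
                       [](char16_t unit) { return unit <= 0x7F; });
}
```

With this check, the string from the example ("© André Cymone") is rejected because © (U+00A9) and é (U+00E9) are above 0x7F, while a pure ASCII string such as "Andre Cymone" is accepted.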

                DDEH wrote (#7):

                @JKSH Thanks.

                I would not bet that the bug is restricted to this encoding. But I will investigate this later and possibly report a bug to the correct venue.

                Thanks again for your time and attention.
