Can I read/write Windows-1252 and other legacy encodings in Qt 6?

  • AndyBrice wrote (#1):

    My application uses Qt 5 on Windows and Mac. Via QTextStream::setCodec() it is able to access 59 different text encodings.

    I have mostly ported my application to Qt 6. But it seems that Qt 6 only supports a handful of text encodings. Qt 6.7 on Windows supports just:

    UTF-8
    UTF-16
    UTF-16LE
    UTF-16BE
    UTF-32
    UTF-32LE
    UTF-32BE
    ISO-8859-1

    My customers want to be able to read and write using encodings such as Windows-1252 and Windows-1256. This is stopping me moving to Qt 6.

    Does anyone know if these legacy encodings will be added back into Qt 6? There are some open issues, but I don't see anything definitive:

    https://bugreports.qt.io/browse/QTBUG-109254
    https://bugreports.qt.io/browse/QTBUG-117362
    https://codereview.qt-project.org/c/qt/qtbase/+/393373
    https://codereview.qt-project.org/c/qt/qtbase/+/429820

    Any information gratefully received.

    • ChrisW67 wrote (#2, last edited by ChrisW67):

      I cannot answer the primary question.

      Are the files of a size manageable in RAM, or are we talking multi gigabyte monsters?
      If they are "small" then you could possibly use QStringDecoder to process the source file into a QByteArray. Wrap the byte array with QBuffer and feed that to your existing QTextStream logic.

      • AndyBrice wrote (#3):

        But doesn't QStringDecoder only support 8 of the original 59 encodings? If so, I don't see how that would help.

        • aha_1980 (Lifetime Qt Champion) wrote (#4):

          @AndyBrice

          If that feature is important to you, then comment on the bug reports stating the need and/or ask what's needed to finish the open patches.

          If I understand it correctly, most of the work is already done and what's missing is the connection of the ICU library to QTextStream.

          Qt has to stay free or it will die.

          • Bonnie wrote (#5):

            From QStringConverter::encodingForName, it says

            Such a name may, none the less, be accepted by the QStringConverter constructor when Qt is built with ICU, if ICU provides a converter with the given name.

            So it should be possible to load an icu-supported codec name. (Seems to start from 6.6 according to this SO post.)
            But I'm not sure if the prebuilt Qt is built with ICU or not.

            • ankou29666 wrote (#6):

              @AndyBrice said in Can I read/write Windows-1252 and other legacy encodings in Qt 6?:

              My customers want to be able to read and write using encodings such as Windows-1252 and Windows-1256. This is stopping me moving to Qt 6.

              I can understand the need to read existing data in old formats, but is there really any point in writing to those old formats?

              • aha_1980 (Lifetime Qt Champion) wrote (#7):

                @ankou29666

                I can understand the need to read existing data in old formats, but is there really any point in writing to those old formats?

                Interoperability with existing, old applications?

                • AndyBrice wrote (#8):

                  @ankou29666 said in Can I read/write Windows-1252 and other legacy encodings in Qt 6?:

                  I can understand the need to read existing data in old formats, but is there really any point in writing to those old formats?

                  In an ideal world, no. But people have to work with all sorts of legacy systems.

                  • AndyBrice wrote (#9):

                    @Bonnie said in Can I read/write Windows-1252 and other legacy encodings in Qt 6?:

                    So it should be possible to load an icu-supported codec name. (Seems to start from 6.6 according to this SO post.)
                    But I'm not sure if the prebuilt Qt is built with ICU or not.

                    As far as I can make out, it isn't possible to access these additional Codecs from the Qt 6.7 or 6.8 binaries.

                    Years ago, I used to build my own Qt binaries from source. But it just got too difficult.

                    • AndyBrice wrote (#10):

                      I guess my other option is to build a command line encoding converter in Qt 5 and call it from my Qt 6 application. Hardly ideal though.

                      • ankou29666 wrote (#11):

                        It seems that QTextCodec in the Qt 5 compatibility module (Core5Compat) supports the Windows-1250 to Windows-1258 encodings.
                        https://doc.qt.io/qt-6/qtextcodec.html

                        • Bonnie wrote (#12, last edited by Bonnie):

                          @AndyBrice Or you can link to ICU and use its API yourself, just as Qt does in its internal code, or even use another third-party codec library.
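
For the single-byte Windows code pages, the "do it yourself" route can be as small as a lookup table. What follows is a minimal, Qt-free sketch in plain C++ of decoding Windows-1252 to UTF-8; it is not taken from any post in this thread. The names `kCp1252High`, `appendUtf8`, and `cp1252ToUtf8` are made up for illustration, and passing the five unassigned bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) through as C1 control code points is an assumption that mirrors common practice.

```cpp
#include <string>

// Unicode code points for the Windows-1252 bytes 0x80..0x9F.
// ASSUMPTION: the five unassigned bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D)
// are passed through as the C1 control code points of the same value.
static const char16_t kCp1252High[32] = {
    0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
    0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
};

// Append one code point (at most U+FFFF here) to `out` as UTF-8.
static void appendUtf8(std::string& out, unsigned cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Decode a Windows-1252 byte string into a UTF-8 string.
std::string cp1252ToUtf8(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (unsigned char b : in) {
        // Bytes outside 0x80..0x9F map to the code point of the same value.
        unsigned cp = (b >= 0x80 && b <= 0x9F) ? kCp1252High[b - 0x80] : b;
        appendUtf8(out, cp);
    }
    return out;
}
```

The resulting UTF-8 string could then be handed to Qt with QString::fromStdString(). Other code pages such as Windows-1256 would each need their own table, which is where a full library like ICU starts to pay off.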

                          • AndyBrice wrote (#13):

                            @ankou29666 said in Can I read/write Windows-1252 and other legacy encodings in Qt 6?:

                            It seems that QTextCodec in the Qt 5 compatibility module (Core5Compat) supports the Windows-1250 to Windows-1258 encodings.
                            https://doc.qt.io/qt-6/qtextcodec.html

                            Ok, I didn't spot that. Thanks.

                            So I can handle these encodings, but to read the windows-1252 encoding, where I used to do this in Qt 5:

                            QFile f( path );
                            if ( f.open( QIODevice::ReadOnly ) )
                            {
                              QTextStream t( &f );
                              QTextCodec* codec = QTextCodec::codecForName( "windows-1252" );
                              t.setCodec( codec );
                              ...
                            }
                            
                            

                            I have to do this in Qt 6:

                            QByteArray encodedString = "..."; // read from file
                            QTextCodec* codec = QTextCodec::codecForName("windows-1252");
                            QString unencodedString = codec->toUnicode(encodedString);
                            

                            Is that right?

                            • IgKh wrote (#14):

                              If you are willing to use a subprocess, you can always use an iconv binary to pre/post-process input or output to/from UTF-16. There are many ways to tackle this issue, which is probably part of the reason no one has been motivated enough so far to push the patch to extend Qt 6's QTextCodec to the finish line.

                              • AndyBrice wrote (#15):

                                @Bonnie said in Can I read/write Windows-1252 and other legacy encodings in Qt 6?:

                                Or you can link to ICU and use its API yourself, just as Qt does in its internal code, or even use another third-party codec library.

                                Are there prebuilt ICU binaries for Windows and Mac? I had a quick look on https://unicode-org.github.io/icu/, but didn't see them.

                                Do you know what the licensing of the binaries is? If they are GPL, I won't be able to use them in my commercial product.

                                • AndyBrice wrote (#16):

                                  @IgKh said in Can I read/write Windows-1252 and other legacy encodings in Qt 6?:

                                  If you are willing to use a subprocess, you can always use an iconv binary to pre/post-process input or output to/from UTF-16.

                                  Is iconv related to the ICU libraries, or completely different?

                                  • IgKh wrote (#17):

                                    @AndyBrice iconv isn't related to ICU. It is a very old POSIX API, with a corresponding CLI binary, that is included in every UNIX-like/Linux system, and compatible Windows versions are not hard to find. It can usually work with pretty much any text encoding known to mankind.

                                    It can be integrated using QProcess, i.e. something like:

                                    QProcess* proc = new QProcess(parent);
                                    proc->setStandardInputFile("path/to/input/file");
                                    proc->start("path/to/iconv", QStringList() << "-f" << "WINDOWS-1252" << "-t" << "UTF16");
                                    

                                    And then the QProcess can be used as the source device for QTextStream, since it is a kind of QIODevice. Likewise for the output.
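
Because iconv is a library API as well as a command-line tool, the same conversion can also be done in-process rather than through a subprocess. The following is a rough sketch under assumptions: it uses the POSIX `iconv_open`, `iconv`, and `iconv_close` functions from `<iconv.h>`, the helper name `convertEncoding` is hypothetical, and which encoding names are accepted (for example "WINDOWS-1252") depends on the iconv implementation installed on the system.

```cpp
#include <iconv.h>

#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <string>

// Convert `input` from encoding `from` to encoding `to` using POSIX iconv.
// Throws std::runtime_error if the conversion is unsupported or the input
// contains bytes that are invalid in the source encoding.
std::string convertEncoding(const std::string& input, const char* from, const char* to)
{
    iconv_t cd = iconv_open(to, from);  // note the order: target first, then source
    if (cd == reinterpret_cast<iconv_t>(-1))
        throw std::runtime_error("iconv_open failed: unsupported conversion");

    std::string output;
    char buffer[4096];
    char* inPtr = const_cast<char*>(input.data());  // iconv does not modify the input
    std::size_t inLeft = input.size();

    while (inLeft > 0) {
        char* outPtr = buffer;
        std::size_t outLeft = sizeof(buffer);
        std::size_t rc = iconv(cd, &inPtr, &inLeft, &outPtr, &outLeft);
        output.append(buffer, sizeof(buffer) - outLeft);
        // E2BIG only means the output buffer filled up; loop and continue.
        if (rc == static_cast<std::size_t>(-1) && errno != E2BIG) {
            iconv_close(cd);
            throw std::runtime_error("iconv failed: invalid or incomplete input");
        }
    }

    // Flush any pending shift state (a no-op for stateless targets such as UTF-8).
    char* outPtr = buffer;
    std::size_t outLeft = sizeof(buffer);
    iconv(cd, nullptr, nullptr, &outPtr, &outLeft);
    output.append(buffer, sizeof(buffer) - outLeft);

    iconv_close(cd);
    return output;
}
```

Calling `convertEncoding(bytes, "WINDOWS-1252", "UTF-8")` would cover reading, and swapping the two encoding names would cover writing. On Windows this approach needs a separately obtained iconv library, so the licensing and packaging questions raised above for ICU apply here too.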

                                    • AndyBrice wrote (#18):

                                      @IgKh Ok, thanks for the explanation.

                                      • SimonSchroeder wrote (#19):

                                        @AndyBrice said in Can I read/write Windows-1252 and other legacy encodings in Qt 6?:

                                        Are there prebuilt ICU binaries for Windows and Mac? I had a quick look on https://unicode-org.github.io/icu/, but didn't see them.

                                        Binaries are located on their GitHub page under release: https://github.com/unicode-org/icu/releases/tag/release-76-rc. Upon a quick glance I'm not sure if any of these are for macOS, though.

                                        @AndyBrice said in Can I read/write Windows-1252 and other legacy encodings in Qt 6?:

                                        Do you know what the licensing of the binaries is? If they are GPL, I won't be able to use them in my commercial product.

                                        They have a list of all the licenses (including 3rd party) that apply: https://github.com/unicode-org/icu?tab=License-1-ov-file. ICU itself seems to be very permissive. Some of the 3rd-party libs seem to require a mention with their copyright notice. Overall it should be usable for commercial products.

                                        • AndyBrice wrote (#20):

                                          Thanks, Simon.
