Can I read/write Windows-1252 and other legacy encodings in Qt 6?

20 Posts 7 Posters 5.4k Views 2 Watching
    ChrisW67
    wrote on last edited by ChrisW67
    #2

    I cannot answer the primary question.

    Are the files of a size manageable in RAM, or are we talking multi-gigabyte monsters?
    If they are "small", you could use QStringDecoder to decode the source file and re-encode the result (as UTF-8, say) into a QByteArray. Wrap that byte array in a QBuffer and feed it to your existing QTextStream logic.

      AndyBrice
      wrote on last edited by
      #3

      But doesn't QStringDecoder only support 8 of the original 59 encodings? If so, I don't see how that would help.

        aha_1980
        Lifetime Qt Champion
        wrote on last edited by
        #4

        @AndyBrice

        If that feature is important to you, then comment on the bug reports stating the need and/or ask what's needed to finish the open patches.

        If I understand correctly, most of the work is already done and what's missing is the connection of the ICU library to QTextStream.

        Qt has to stay free or it will die.

          Bonnie
          wrote on last edited by
          #5

          The documentation for QStringConverter::encodingForName says:

          Such a name may, none the less, be accepted by the QStringConverter constructor when Qt is built with ICU, if ICU provides a converter with the given name.

          So it should be possible to load an icu-supported codec name. (Seems to start from 6.6 according to this SO post.)
          But I'm not sure if the prebuilt Qt is built with ICU or not.

            AndyBrice wrote (original post):

            My application uses Qt 5 on Windows and Mac. Via QTextStream::setCodec() it is able to access 59 different text encodings.

            I have mostly ported my application to Qt 6. But it seems that Qt 6 only supports a handful of text encodings. Qt 6.7 on Windows supports just:

            UTF-8
            UTF-16
            UTF-16LE
            UTF-16BE
            UTF-32
            UTF-32LE
            UTF-32BE
            ISO-8859-1

            My customers want to be able to read and write using encodings such as Windows-1252 and Windows-1256. This is stopping me moving to Qt 6.

            Does anyone know if these legacy encodings will be added back into Qt 6? There are some open issues, but I don't see anything definitive:

            https://bugreports.qt.io/browse/QTBUG-109254
            https://bugreports.qt.io/browse/QTBUG-117362
            https://codereview.qt-project.org/c/qt/qtbase/+/393373
            https://codereview.qt-project.org/c/qt/qtbase/+/429820

            Any information gratefully received.

            ankou29666
            wrote on last edited by
            #6

            @AndyBrice said in Can I read/write Windows-1252 and other legacy encodings in Qt 6?:

            My customers want to be able to read and write using encodings such as Windows-1252 and Windows-1256. This is stopping me moving to Qt 6.

            I can understand the need to read existing data in old formats, but is there really any point in writing to those old formats?

              aha_1980
              Lifetime Qt Champion
              wrote on last edited by
              #7

              @ankou29666

              I can understand the need to read existing data in old formats, but is there really any point in writing to those old formats?

              Interoperability with existing, old applications?

                AndyBrice
                wrote on last edited by
                #8

                @ankou29666 said in Can I read/write Windows-1252 and other legacy encodings in Qt 6?:

                I can understand the need to read existing data in old formats, but is there really any point in writing to those old formats?

                In an ideal world, no. But people have to work with all sorts of legacy systems.

                  AndyBrice
                  wrote on last edited by
                  #9

                  @Bonnie said in Can I read/write Windows-1252 and other legacy encodings in Qt 6?:

                  So it should be possible to load an icu-supported codec name. (Seems to start from 6.6 according to this SO post.)
                  But I'm not sure if the prebuilt Qt is built with ICU or not.

                  As far as I can make out, it isn't possible to access these additional codecs from the Qt 6.7 or 6.8 binaries.

                  Years ago, I used to build my own Qt binaries from source. But it just got too difficult.

                    AndyBrice
                    wrote on last edited by
                    #10

                    I guess my other option is to build a command line encoding converter in Qt 5 and call it from my Qt 6 application. Hardly ideal though.

                      ankou29666
                      wrote on last edited by
                      #11

                      It seems the QTextCodec class in the Qt 5 compatibility module (Core5Compat) supports the Windows-1250 to Windows-1258 encodings.
                      https://doc.qt.io/qt-6/qtextcodec.html

                        Bonnie
                        wrote on last edited by Bonnie
                        #12

                        @AndyBrice Or you can link to ICU and use its API yourself, just as Qt does in its internal code, or even use other third-party codec libraries.

                          AndyBrice
                          wrote on last edited by
                          #13

                          @ankou29666 said in Can I read/write Windows-1252 and other legacy encodings in Qt 6?:

                          seems like the QTextCodec in Qt5 compatibility module supports Win1250 to 1258 encodings.
                          https://doc.qt.io/qt-6/qtextcodec.html

                          Ok, I didn't spot that. Thanks.

                          So I can handle these encodings, but to read Windows-1252 where I used to do this in Qt 5:

                          QFile f( path );
                          if ( f.open( QIODevice::ReadOnly ) )
                          {
                            QTextStream t( &f );
                            QTextCodec* codec = QTextCodec::codecForName( "windows-1252" );
                            t.setCodec( codec );
                            ...
                          }
                          
                          

                          I have to do this in Qt 6:

                          QByteArray encodedString = "..."; // read from file
                          QTextCodec* codec = QTextCodec::codecForName("windows-1252");
                          QString unencodedString = codec->toUnicode(encodedString);
                          

                          Is that right?

                            IgKh
                            wrote on last edited by
                            #14

                            If you are willing to use a subprocess, you can always use an iconv binary to pre/post-process input or output to/from UTF-16. There are many ways to tackle this issue, which is probably part of the reason no one has been motivated enough so far to push the patch to extend Qt 6's QTextCodec to the finish line.

                              AndyBrice
                              wrote on last edited by
                              #15

                              @Bonnie said in Can I read/write Windows-1252 and other legacy encodings in Qt 6?:

                              Or you can link to ICU and use its API yourself, just as Qt does in its internal code, or even use other third-party codec libraries.

                              Are there prebuilt ICU binaries for Windows and Mac? I had a quick look on https://unicode-org.github.io/icu/, but didn't see them.

                              Do you know what the licensing of the binaries is? If they are GPL, I won't be able to use them in my commercial product.

                                AndyBrice
                                wrote on last edited by
                                #16

                                @IgKh said in Can I read/write Windows-1252 and other legacy encodings in Qt 6?:

                                If you are willing to use a subprocess, you can always use an iconv binary to pre/post-process input or output to/from UTF-16.

                                Is iconv related to the ICU libraries, or completely different?

                                  IgKh
                                  wrote on last edited by
                                  #17

                                  @AndyBrice iconv isn't related to ICU. It is a very old POSIX API, with a corresponding CLI binary, that's included in every UNIX-like/Linux system, and it is not hard to find compatible Windows versions of it. It can usually work with any text encoding ever known to mankind.

                                  It can be integrated using QProcess, i.e. something like:

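                                  QProcess* proc = new QProcess(parent);
                                  // Feed the legacy-encoded file to iconv on its standard input
                                  proc->setStandardInputFile("path/to/input/file");
                                  // Convert Windows-1252 to UTF-16; iconv writes the result to standard output
                                  proc->start("path/to/iconv", QStringList() << "-f" << "WINDOWS-1252" << "-t" << "UTF-16");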
                                  

                                  Then the QProcess can be used as the source device for QTextStream, since it is a kind of QIODevice. Likewise for the output.
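Since iconv is also a library API, an in-process alternative to spawning the CLI binary is possible. A minimal sketch, assuming a platform that provides <iconv.h> (some platforms need an extra link against libiconv, and Windows would need a port of it). The function name convertEncoding is hypothetical, and UTF-8 is used as the target in the usage note only to keep the result independent of byte order.

```cpp
#include <iconv.h>

#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <string>

// Convert `input` from one named encoding to another using POSIX iconv(3).
// Throws std::runtime_error if the conversion is unsupported or the input
// contains bytes that are invalid or incomplete in the source encoding.
std::string convertEncoding(const std::string &input, const char *from, const char *to)
{
    iconv_t cd = iconv_open(to, from);   // note the argument order: (to, from)
    if (cd == (iconv_t)-1)
        throw std::runtime_error("iconv_open: unsupported conversion");

    std::string output;
    char buffer[4096];
    char *in = const_cast<char *>(input.data());   // iconv does not modify the input
    std::size_t inLeft = input.size();
    bool flushing = false;

    for (;;) {
        char *out = buffer;
        std::size_t outLeft = sizeof buffer;
        // After all input is consumed, a null input pointer flushes any
        // pending shift state of the target encoding.
        std::size_t rc = flushing ? iconv(cd, nullptr, nullptr, &out, &outLeft)
                                  : iconv(cd, &in, &inLeft, &out, &outLeft);
        const int err = errno;   // capture before any other call can change it
        output.append(buffer, sizeof buffer - outLeft);

        if (rc == (std::size_t)-1) {
            if (err == E2BIG)
                continue;        // output buffer was full: go round again
            iconv_close(cd);
            throw std::runtime_error("iconv: invalid or incomplete input");
        }
        if (flushing)
            break;
        flushing = true;
    }

    iconv_close(cd);
    return output;
}
```

With this, convertEncoding(bytes, "WINDOWS-1252", "UTF-8") yields UTF-8 that QString::fromUtf8() can take, and swapping the two encoding names gives the write direction.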

                                    AndyBrice
                                    wrote on last edited by
                                    #18

                                    @IgKh Ok, thanks for the explanation.

                                      SimonSchroeder
                                      wrote on last edited by
                                      #19

                                      @AndyBrice said in Can I read/write Windows-1252 and other legacy encodings in Qt 6?:

                                      Are there prebuilt ICU binaries for Windows and Mac? I had a quick look on https://unicode-org.github.io/icu/, but didn't see them.

                                      Binaries are located on their GitHub page under Releases: https://github.com/unicode-org/icu/releases/tag/release-76-rc. At a quick glance, I'm not sure if any of these are for macOS, though.

                                      @AndyBrice said in Can I read/write Windows-1252 and other legacy encodings in Qt 6?:

                                      Do you know what the licensing of the binaries is? If they are GPL, I won't be able to use them in my commercial product.

                                      They have a list of all the licenses (including 3rd party) that apply: https://github.com/unicode-org/icu?tab=License-1-ov-file. ICU itself seems to be very permissive. Some of the 3rd-party libs seem to require a mention with their copyright notice. Overall it should be usable for commercial products.

                                        AndyBrice
                                        wrote on last edited by
                                        #20

                                        Thanks, Simon.
