Can't access text codecs of ICU (Windows)
-
Hi all,
with the end of the qt5compat module in sight, the urgency to get rid of QTextCodec comes into focus. Our software is used all over the world, so it needs to support stable codec conversion. Unfortunately the Qt people decided to reduce the number of non-Unicode codecs directly supported to 1.
The documentation says that a QStringEncoder/Decoder can be created for any codec installed on the system. I've tried this to check whether any codec is found:
QStringList additionalEnc = {"ISO8859_2", "ISO8859_3", "CP256", "CP273", "CP276", "CP290", "CP297", "CP878", "CP1386"};
for (const QString &encName : additionalEnc) {
    QStringEncoder enc(encName.toLatin1().data());
    qDebug() << "- " << encName << (enc.isValid() ? " valid" : " invalid");
}
But I only get invalid results. I searched the web to find out whether the provided binaries of Qt 6.7.3 are built with ICU support but couldn't find any information. For our software, which contains a text editor, this is a major threat.
I even tried to build Qt from scratch on Windows, but after adding a lot of GNU tools I got stuck on building the WebEngine (we use the WebEngine too). I never reached the point where I could possibly add the "icu" parameter.
Does anyone know if it's really necessary to build Qt from scratch to get codec support? Or did I misunderstand something? I'm puzzled that hints on this topic are so rare on the web.
Regards
Jazzco
-
The prebuilt Qt6 for Windows does not support ICU. You can check mkspecs\modules\qt_lib_core_private.pri and find "icu" in QT.core_private.disabled_features.
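The check described above can also be scripted. The following is only a minimal sketch in plain C++ (no Qt needed); the function name priListContains is made up for illustration, and it assumes the .pri file uses the usual "KEY = value1 value2 ..." layout with all values on one line.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if `feature` appears in the value list
// assigned to `key` (e.g. "QT.core_private.disabled_features") in the text
// of a .pri file. Assumes "KEY = value1 value2 ..." with values on one line.
bool priListContains(const std::string& priText,
                     const std::string& key,
                     const std::string& feature)
{
    assert(!key.empty() && !feature.empty());
    std::istringstream lines(priText);
    std::string line;
    while (std::getline(lines, line)) {
        const auto eq = line.find('=');
        if (eq == std::string::npos)
            continue;                       // not an assignment line
        // Left of '=': the key, possibly padded with spaces.
        std::istringstream lhs(line.substr(0, eq));
        std::string name;
        lhs >> name;
        if (name != key)
            continue;
        // Right of '=': whitespace-separated feature names.
        std::istringstream rhs(line.substr(eq + 1));
        std::string word;
        while (rhs >> word) {
            if (word == feature)
                return true;
        }
    }
    return false;
}
```

Calling priListContains(text, "QT.core_private.disabled_features", "icu") on the contents of qt_lib_core_private.pri should return true for a Qt build without ICU, provided the file really has the layout assumed here.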
When building Qt from source, to add ICU support, you need to add the -icu option when you run configure.bat and make sure cmake can find the ICU packages. (To see the full list of options, run configure -h.)
If you have difficulties in building Qt, maybe you can consider using other 3rd-party libraries to handle the text codec part, such as libiconv, which I think should be more lightweight than ICU.
Or you can try to build the Qt core module only and install it to the prebuilt Qt folder (to replace the prebuilt one with it). But you need to configure your newly built Qt to have all the features that the prebuilt one has. Not sure whether this can work though.
-
@jazzco2 I know this seems to be the easiest, but the biggest challenge is to make it work. This case is rare and undocumented, so you may need to test repeatedly (remember to keep a backup so you can undo your replacement).
I haven't built Qt6 myself yet (I only have experience in building Qt5), so I'm not sure about the configure options.
As I said, you can run configure -h to see the full option list and decide what to add. There is also the -submodules option, which may be what you need (or not...)
What may also be helpful is Qt's CI configuration: https://wiki.qt.io/Qt_6.7_Tools_and_Versions. I think the prebuilt ones come from some of those.
-
There might be an alternative to building a new Qt6 from scratch, since all of the encodings in your list ("ISO8859_2", "ISO8859_3", "CP256", "CP273", "CP276", "CP290", "CP297", "CP878", "CP1386") are 8-bit, 256-character encodings.
Edit: I first suggested using ISO8859_1 as the "gateway" codepage, since it also exists in Qt6, but I realized that ISO8859_1 lacks a lot of the characters from the 9 codepages in your list (not good).
Instead try:
1) Write a Qt5 app that converts all the values 0x00 to 0xFF in each of the 9 encodings in your list to a table of QChars (i.e. UTF-16 entries). That should give you 9 tables of 256 QChar entries each; save them to an .h file.
2) For Qt6, #include the .h file from 1) and use it to translate from/to the 9 codecs.
For example, say you have a text file encoded in CP290: use the appropriate table to translate each byte into a QChar and then further into QStrings.
For translating from QStrings to one of the 9 codecs, step through the array from 1) for each QChar and look for a match.
It will not be the world's fastest encoder/decoder but it would work :-)
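To make the table idea concrete, here is a minimal sketch in plain C++17, using char16_t and std::u16string in place of QChar and QString so it compiles without Qt. The names CodePageTable, makeToyTable, decode and encode are made up for illustration, and the toy table is NOT a real code page; it only stands in for one of the tables the Qt5 helper app would generate. The approach only fits single-byte encodings.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

// One table per legacy code page: byte value -> UTF-16 code unit.
using CodePageTable = std::array<char16_t, 256>;

// Toy stand-in for a generated table: identity mapping plus one override.
// Assumption for illustration only; real tables come from the Qt5 helper.
CodePageTable makeToyTable()
{
    CodePageTable t{};
    for (std::size_t i = 0; i < t.size(); ++i)
        t[i] = static_cast<char16_t>(i);
    t[0xA1] = static_cast<char16_t>(0x0104);   // illustrative override
    assert(t[0x41] == u'A');                   // ASCII range left untouched
    return t;
}

// Decoding: each input byte indexes directly into the table.
std::u16string decode(const std::string& bytes, const CodePageTable& table)
{
    std::u16string out;
    out.reserve(bytes.size());
    for (unsigned char b : bytes)
        out.push_back(table[b]);
    return out;
}

// Encoding: linear search through the table for each character.
// Returns std::nullopt if a character has no slot in this code page.
std::optional<std::string> encode(const std::u16string& text,
                                  const CodePageTable& table)
{
    std::string out;
    out.reserve(text.size());
    for (char16_t ch : text) {
        bool found = false;
        for (std::size_t i = 0; i < table.size(); ++i) {
            if (table[i] == ch) {
                out.push_back(static_cast<char>(i));
                found = true;
                break;
            }
        }
        if (!found)
            return std::nullopt;
    }
    return out;
}
```

If the linear search in encode turns out to be too slow, a reverse map built once per table would be the obvious next step; the sketch keeps the search to stay close to the description above.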
-
@hskoglund The encodings were from a test of whether ICU is available. I picked some encodings with a good chance of being supported on my European PC. In fact, we also need to support Cyrillic, Japanese, Arabic, and Chinese encodings, as many as possible.
And that all worked fine with Qt5. And I really cannot understand how the Qt guys could kill a working feature without having a usable replacement.
-
@jazzco2 I've had some free time so I tried to build.
I used the latest ICU Win64-MSVC2022 package from the official GitHub, so I tested with Qt 6.7.3 MSVC2022.
Here's what I've done:
- Open x64 Native Tools Command Prompt for VS 2022
- set PATH to make cmake, ninja and python available
- cd to a clean build folder
- path\to\source\configure.bat -debug-and-release -force-debug-info -headersclean -nomake examples -qt-zlib -submodules qtbase -icu -- -DFEATURE_msvc_obj_debug_info=ON -DOPENSSL_ROOT_DIR=path\to\OpenSSLv3 -DICU_ROOT=path\to\icu
- cmake --build . --target Core
- After building successfully, compare qtbase\mkspecs\modules\qt_lib_core_private.pri with the original qt_lib_core_private.pri in the prebuilt Qt folder. Make sure that all the QT.core_private.* properties are the same, except that icu has moved from disabled_features to enabled_features. qt_lib_core.pri should be exactly the same.
- copy these files from the build folder to the prebuilt Qt folder and replace the old ones:
  qtbase\bin\Qt6Core.dll
  qtbase\bin\Qt6Core.pdb
  qtbase\bin\Qt6Cored.dll
  qtbase\bin\Qt6Cored.pdb
  qtbase\lib\Qt6Core.lib
  qtbase\lib\Qt6Core.prl
  qtbase\lib\Qt6Cored.lib
  qtbase\lib\Qt6Cored.prl
  qtbase\mkspecs\modules\qt_lib_core.pri
  qtbase\mkspecs\modules\qt_lib_core_private.pri
- copy icudt**.dll icuin**.dll icuuc**.dll from the ICU bin folder to the Qt bin folder
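The comparison of the two qt_lib_core_private.pri files in the steps above could be automated roughly like this. It is only a sketch in plain C++ under the assumption that both files consist of "KEY = value1 value2 ..." lines; parsePri and differingKeys are hypothetical names. Values are compared as sets, so order and duplicates are ignored, and the sketch only flags differences other than the ignored feature; it does not verify that icu actually ended up in enabled_features.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

using PriValues = std::map<std::string, std::set<std::string>>;

// Parse "KEY = a b c" lines into key -> set of values (assumed layout).
PriValues parsePri(const std::string& text)
{
    PriValues result;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        const auto eq = line.find('=');
        if (eq == std::string::npos)
            continue;                       // skip non-assignment lines
        std::istringstream lhs(line.substr(0, eq));
        std::string key;
        if (!(lhs >> key))
            continue;
        std::istringstream rhs(line.substr(eq + 1));
        std::string word;
        auto& values = result[key];
        while (rhs >> word)
            values.insert(word);
    }
    return result;
}

// Returns the keys whose value sets differ between the two files once
// `ignored` (e.g. "icu") has been removed from both sides.
std::vector<std::string> differingKeys(const std::string& prebuiltText,
                                       const std::string& rebuiltText,
                                       const std::string& ignored)
{
    assert(!ignored.empty());
    PriValues a = parsePri(prebuiltText);
    PriValues b = parsePri(rebuiltText);
    std::set<std::string> keys;
    for (const auto& kv : a) keys.insert(kv.first);
    for (const auto& kv : b) keys.insert(kv.first);

    std::vector<std::string> out;
    for (const auto& key : keys) {
        std::set<std::string> va = a[key];   // empty set if key is missing
        std::set<std::string> vb = b[key];
        va.erase(ignored);
        vb.erase(ignored);
        if (va != vb)
            out.push_back(key);
    }
    return out;
}
```

An empty result from differingKeys(prebuilt, rebuilt, "icu") would suggest the two configurations match apart from ICU; any key it returns is worth inspecting by hand before replacing files.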
I ran some simple tests and it seems to be working. Using your test code above gives these results:
- "ISO8859_2" valid
- "ISO8859_3" valid
- "CP256" invalid
- "CP273" valid
- "CP276" invalid
- "CP290" valid
- "CP297" valid
- "CP878" valid
- "CP1386" valid
-