Qt5 default text encoding has changed
-
Hello,
I am using Qt 5.7.0 (MinGW 5.3.0 32-bit) with QtCreator 4.9.0 on Windows 10. It worked fine for years, but a few days ago QtCreator crashed.
Since then, the first time I open an existing project, QtCreator tells me that the .pro.user file is not valid and asks me to reconfigure the project.
This is not a problem for me because I use the default configuration, so it's very fast. On the other hand, in some of the projects the program reads a text file with this simple function: fileToList.
Before the crash, Qt considered by default that the file was encoded in UTF-8.
Now it considers by default that the file is encoded in ANSI.
To read the text file correctly, I now need to add: flux.setCodec("UTF-8");
I have uninstalled and reinstalled QtCreator and Qt but the problem persists.
Please, do you know how to restore the normal behavior?

    bool MyClass::fileToList(QStringList &lst, QString fileName)
    {
        lst.clear();
        QTextStream flux;
        QFile f;
        f.setFileName(fileName);
        if(! f.open(QIODevice::ReadOnly | QIODevice::Text)){
            _erreur.append("Open fail " + fileName);
            return false;
        }
        flux.setDevice(&f);
        flux.setCodec("UTF-8"); // NOW I MUST ADD THIS LINE
        while(!flux.atEnd()){
            lst.append(flux.readLine());
        }
        f.close();
        return true;
    }
-
@djentx From documentation: "By default, QTextCodec::codecForLocale() is used for reading and writing, but you can also set the codec by calling setCodec()" (https://doc.qt.io/archives/qt-5.7/qtextstream.html).
I would say it is safer anyway to set the codec explicitly if you know which codec was used to write the file, instead of depending on guesswork done by the framework.
-
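To illustrate why depending on the default codec described above is fragile, here is a minimal sketch using only the standard library (no Qt). The helper name `latin1ToUtf8` is hypothetical, and Latin-1 merely stands in for a single-byte "ANSI" code page; this is not what Qt does internally, just a model of reading UTF-8 bytes with the wrong single-byte codec.

```cpp
#include <string>

// Re-encode `bytes` as UTF-8 while treating every input byte as one
// Latin-1 (ISO-8859-1) character. Hypothetical helper for illustration:
// it models a reader that assumes a single-byte encoding.
std::string latin1ToUtf8(const std::string& bytes)
{
    std::string out;
    for (unsigned char c : bytes) {
        if (c < 0x80) {
            out += static_cast<char>(c);                  // ASCII is unchanged
        } else {
            out += static_cast<char>(0xC0 | (c >> 6));    // UTF-8 lead byte
            out += static_cast<char>(0x80 | (c & 0x3F));  // UTF-8 continuation byte
        }
    }
    return out;
}
```

Feeding it the UTF-8 bytes of "é" (C3 A9) yields the bytes C3 83 C2 A9, which display as "Ã©". Pure ASCII text passes through unchanged, which is why the problem only shows up with accented characters.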
Hi, I sometimes have the same problem on Windows 10 when I have edited a file in Notepad and the file contains one or more letters that are not A-Z (for example ú, é, ö).
Notepad then adds a BOM (byte order mark) at the beginning of the file; for UTF-8 this is the 3-byte sequence EF BB BF, which marks the file as UTF-8 (a file saved as ANSI has no BOM).
Perhaps you also opened and saved your file in Notepad by mistake?
If so, either remove the BOM using a binary editor, or open the file again in Notepad, choose Save As, and select UTF-8.
-
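Regarding the BOM mentioned above: the UTF-8 BOM is the three bytes EF BB BF, so it can also be detected and skipped in code instead of with a binary editor. The following is a minimal sketch using only the standard library; the function names `hasUtf8Bom` and `stripUtf8Bom` are hypothetical, and it assumes the file content has already been read into a `std::string`.

```cpp
#include <string>

// Returns true if `data` starts with the UTF-8 BOM (bytes EF BB BF).
bool hasUtf8Bom(const std::string& data)
{
    return data.size() >= 3
        && static_cast<unsigned char>(data[0]) == 0xEF
        && static_cast<unsigned char>(data[1]) == 0xBB
        && static_cast<unsigned char>(data[2]) == 0xBF;
}

// Returns a copy of `data` without a leading UTF-8 BOM, if one is present.
std::string stripUtf8Bom(const std::string& data)
{
    return hasUtf8Bom(data) ? data.substr(3) : data;
}
```

A file without a BOM is returned unchanged, so the helper can be applied unconditionally before decoding the content.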
Hi, sorry for my bad English.
Thank you for your answers.
@jsulm I'm trying to restore the normal behavior because I'm afraid it will affect other things (like loading ini files...).
I use notepad++ to edit text files.
In notepad++ you can find out what the current encoding is, and convert it. I have used it dozens of times and it has always worked.
However, I just launched an old program and its behavior changed too (even though I didn't recompile it).
I wonder whether it's the Windows settings that have changed.
-
@djentx said in Qt5 default text encoding has changed:
However, I just launched an old program and its behavior changed too (even though I didn't recompile it).
This would seem to indicate that this isn't a Qt/QtCreator change that's causing the different behavior you are seeing, but a change in one of your computer's settings.
-
@djentx said in Qt5 default text encoding has changed:
Beta: Use UTF-8 for worldwide language support
Wow, after 15 years they finally found out that local codepages are a mess...
Why do you call it a problem? It's a real step forward, albeit decades too late.
-
For anyone who needs to formulate cogent arguments with customers or teammates regarding a preference for UTF-8, I just want to give a free "plug" for this page: http://utf8everywhere.org/
I refer to it anytime I need help convincing a team to adopt UTF8 as the common encoding.
If you don't like UTF8, then I'm not here to shoot you down. I'm just saying that if anyone needs to gather together the arguments in favor, then don't reinvent the wheel because someone (creators of utf8everywhere.org) did that work for us already :)
-
Hi, thanks for the link, there is some good info on that page, but I think 11.16 (How to write UTF-8 string literals in C++ code) is slightly incorrect. For years I've had these 3 lines in my utils lib:
    QString rightArrow() { return u8"\u2192"; }
    QString leftArrow() { return u8"\u2190"; }
    QString upDownArrows() { return u8"\u21C5"; }
but when I recently updated my MSVC 2019 compiler I got the error (for all 3 lines):
Utils.h:173: error: C2440: 'return': cannot convert from 'const char8_t [4]' to 'QString'
Those 3 lines still compile fine on macOS with clang and on Ubuntu 20.04 with g++, but since I sometimes work on Windows I switched to:
    QString rightArrow() { return QByteArray::fromHex("e2 86 92"); }
    QString leftArrow() { return QByteArray::fromHex("e2 86 90"); }
    QString upDownArrows() { return QByteArray::fromHex("e2 87 85"); }
It doesn't look as nice, but now those lines compile with MSVC 2019 version 16.11.9 :-)
-
@hskoglund said in Qt5 default text encoding has changed:
Utils.h:173: error: C2440: 'return': cannot convert from 'const char8_t [4]' to 'QString'
Because of QT_NO_CAST_FROM_ASCII which should be enabled for all projects by default - a QString and a char* are two completely different types which should not implicitly convert into each other.
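As additional context for the `char8_t` in the error quoted above: since C++20, a `u8"..."` literal has type `const char8_t[]` instead of `const char[]`, which suggests the updated compiler was treating the code as C++20. The following is a minimal sketch, using only the standard library, of one way to get the UTF-8 bytes of such a literal under both the old and the new rules. The helper name `utf8Bytes` is hypothetical and not part of Qt or the standard library.

```cpp
#include <cstddef>
#include <string>

// Copies the bytes of a u8 string literal into a std::string.
// CharT is `char` before C++20 and `char8_t` from C++20 on; reading the
// bytes through `const char*` is allowed in both cases.
template <typename CharT, std::size_t N>
std::string utf8Bytes(const CharT (&literal)[N])
{
    static_assert(sizeof(CharT) == 1, "expects a u8 string literal");
    // N includes the terminating null character, so copy N - 1 bytes.
    return std::string(reinterpret_cast<const char*>(literal), N - 1);
}
```

For example, `utf8Bytes(u8"\u2192")` produces the bytes E2 86 92, the same ones used in the `QByteArray::fromHex` workaround. In a Qt project, bytes obtained this way could presumably be handed to `QString::fromUtf8()`, which makes the conversion explicit rather than implicit.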