Replacing calls to QTextStream::setCodec() in Qt 6
-
I am looking to port Easy Data Transform to Qt 6, but I have run into issues with QTextStream and QTextCodec. It seems that:
- Qt 6 has replaced QTextCodec with a new class (QStringConverter) that supports far fewer encodings.
- QTextCodec is available in the Qt 5 compatibility module, but QTextStream::setCodec() isn't.
So what is the Qt 6 equivalent of this?
    QFile f( m_path );
    if ( f.open( QIODevice::ReadOnly ) )
    {
        QTextStream t( &f );
        QTextCodec* codec = QTextCodec::codecForName( "KOI8-R" );
        t.setCodec( codec );
        QString s = t.readAll();
    }
Do I have to do something like this?
    QFile f( m_path );
    if ( f.open( QIODevice::ReadOnly ) )
    {
        QTextStream t( &f );
        QTextCodec* codec = QTextCodec::codecForName( "KOI8-R" );
        QString s = t.readAll();
        s = codec->toUnicode( s );
    }
-
Since you are using QTextStream only to read the entire file as a single blob then you can just read it into a QByteArray and avoid any possible conversion that QTextStream may attempt:
    QTextCodec *codec = QTextCodec::codecForName("KOI8-R");
    QByteArray encodedString = f.readAll();
    QString string = codec->toUnicode(encodedString);
-
Hi, I don't know about "KOI8-R", but if you want UTF-8 you can try:
    QFile f( m_path );
    if ( f.open( QIODevice::ReadOnly ) )
    {
        QTextStream t( &f );
        t.setEncoding(QStringConverter::Utf8);
        QString s = t.readAll();
    }
-
@hskoglund
UTF-8 is supported by the new class, so that isn't a problem. It is encodings like "KOI8-R", "windows-1252" and "windows-1256" that are the issue.
-
Since you are using QTextStream only to read the entire file as a single blob then you can just read it into a QByteArray and avoid any possible conversion that QTextStream may attempt:
    QTextCodec *codec = QTextCodec::codecForName("KOI8-R");
    QByteArray encodedString = f.readAll();
    QString string = codec->toUnicode(encodedString);
And does readLine() work as well? Can QTextStream tell where the line boundaries are without the correct encoding?
    QFile f( m_path );
    if ( f.open( QIODevice::ReadOnly ) )
    {
        QTextStream t( &f );
        QTextCodec *codec = QTextCodec::codecForName("KOI8-R");
        while ( !t.atEnd() )
        {
            QString line = t.readLine();
            line = codec->toUnicode( line );
        }
    }
-
@AndyBrice said in Replacing calls to QTextStream::setCodec() in Qt 6:
UTF-8 is supported by the new class, so that isn't a problem. It is encodings like "KOI8-R", "windows-1252" and "windows-1256" that are the issue.
It seems Qt 6.5 will get support for more encodings via ICU (again). Meanwhile, if you need proper support for KOI8-R, you can directly call ICU.
And does readLine() work as well? Can QTextStream tell where the line boundaries are without the correct encoding?
The definition of a line ending is not part of the encoding; it is an operating system convention. Files encoded in UTF-8 might, for example, use either \r\n or \n as the line ending. If you check out qtextstream.cpp, you can see that the logic for detecting the end of a line is indeed hardcoded to \r\n and \n (see QTextStreamPrivate::scan).
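To illustrate the point about line endings, here is a minimal sketch in plain C++ (no Qt) that splits raw, still-encoded bytes into lines before any decoding takes place. The helper name splitLines is hypothetical. It assumes an ASCII-compatible encoding such as KOI8-R, windows-1252 or UTF-8, where the bytes 0x0A and 0x0D can only mean \n and \r; it would not be correct for UTF-16 or UTF-32 data.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not a Qt API: split raw (undecoded) bytes into lines.
// Assumes an ASCII-compatible encoding, so '\n' and '\r' are unambiguous.
std::vector<std::string> splitLines(const std::string& bytes)
{
    std::vector<std::string> lines;
    std::string::size_type start = 0;
    while (start < bytes.size()) {
        // Find the next '\n'; the last line may not have one.
        std::string::size_type end = bytes.find('\n', start);
        if (end == std::string::npos)
            end = bytes.size();

        std::string line = bytes.substr(start, end - start);
        if (!line.empty() && line.back() == '\r')
            line.pop_back(); // treat "\r\n" the same as "\n"

        lines.push_back(line);
        start = end + 1;
    }
    return lines;
}
```

Each returned line is still in the file's original encoding and can then be handed to whatever decoder is available. In Qt, reading lines as QByteArray from the QFile itself rather than through QTextStream should give a similar byte-level split.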
-
ICU (https://icu.unicode.org/) is what Qt uses underneath, at least on Linux. It supports various exotic encodings.
-
@kkoehne
Do you know if/when Qt 6 (without the Qt 5 compatibility layer) is going to get support for "KOI8-R", "windows-1252", "windows-1256", etc. on Windows and Mac?
-
@AndyBrice said in Replacing calls to QTextStream::setCodec() in Qt 6:
Do you know if/when Qt 6 (without the Qt 5 compatibility layer) is going to get support for "KOI8-R", "windows-1252", "windows-1256", etc. on Windows and Mac?
The patch for supporting other encodings via ICU is in https://codereview.qt-project.org/c/qt/qtbase/+/393373 and should be part of Qt 6.4. However, this requires Qt to be compiled with ICU support, which so far is the case on macOS but not on Windows. Unless this changes, you might be out of luck there.
Note though that the mentioned encodings are already supported in Qt 6.3 if they are the default Windows encoding.
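For a Qt build without ICU where the compatibility module is not an option either, one possible stopgap for a single-byte encoding is a hand-written lookup table. The sketch below does this for windows-1252 in plain C++. The function name decodeWindows1252 is hypothetical and not part of Qt, and mapping the five bytes that windows-1252 leaves undefined to U+FFFD is an assumption; other decoders handle those bytes differently.

```cpp
#include <string>

// Hypothetical helper, not a Qt API: decode windows-1252 bytes to UTF-16.
std::u16string decodeWindows1252(const std::string& bytes)
{
    // Code points for bytes 0x80..0x9F, the only range where windows-1252
    // differs from Latin-1. Undefined bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D)
    // are mapped to U+FFFD here; that choice is an assumption.
    static const char16_t high[32] = {
        0x20AC, 0xFFFD, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
        0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0xFFFD, 0x017D, 0xFFFD,
        0xFFFD, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
        0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0xFFFD, 0x017E, 0x0178
    };

    std::u16string out;
    out.reserve(bytes.size());
    for (char c : bytes) {
        const unsigned char b = static_cast<unsigned char>(c);
        if (b >= 0x80 && b <= 0x9F)
            out.push_back(high[b - 0x80]);
        else
            out.push_back(static_cast<char16_t>(b)); // same as Latin-1
    }
    return out;
}
```

The result can be turned into a QString with QString::fromStdU16String(). Encodings such as KOI8-R or windows-1256 would need their own, larger tables, so this only makes sense as a fallback for one or two encodings.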