Problem with encoding

Saskic

Hello all,

I'm developing a mobile application for Symbian.
Since the application has to be in Croatian (ISO 8859-2), I need to display characters like "č", "š" and "ć". To that end, I have set up the following in the main window:

@
QTextCodec::setCodecForCStrings(QTextCodec::codecForName("ISO 8859-2"));
QTextCodec::setCodecForTr(QTextCodec::codecForName("ISO 8859-2"));

QTranslator qtTranslator;
qtTranslator.load("qt_" + QLocale::system().name(), QLibraryInfo::location(QLibraryInfo::TranslationsPath));

a.installTranslator(&qtTranslator);
@

But when I put code like this in MainWindow.cpp:

@
QTextCodec* codec = QTextCodec::codecForName("ISO 8859-2");
QByteArray temp2("ššš");
QString string = codec->toUnicode(temp2);
QTableWidgetItem* item1 = new QTableWidgetItem(string);
table->setItem(i, 0, item1);
@

the QTableWidget in the UI displays "ššš" as three square symbols, both in the mobile simulator and on my phone (the phone's default language is set to Croatian).
What am I doing wrong? Do I need to set the encoding somewhere on the QTableWidget itself?
Thank you very much in advance.

goetz

If you set QTextCodec::setCodecForCStrings() correctly then you should instantiate your string like this:

@
QString s = QString::fromLocal8Bit("ššš");
@

There is no need to go via QByteArray.

You might also add
@
CODECFORTR = ISO-8859-2
CODECFORSRC = ISO-8859-2
@
to your .pro file.

And, of course, your source file must be in Latin-2 :-)

Saskic

Thanks for the quick reply! Unfortunately,

QString s = QString::fromLocal8Bit("ššš");

and

CODECFORTR = ISO-8859-2
CODECFORSRC = ISO-8859-2

added to the .pro file only change how the character "š" is rendered: now it looks like the letter "a" with "something" underneath. I think this comes from the Portuguese language.
What do you mean by "And, of course, your source file must be in Latin-2"?
The .cpp file in Latin-2? How do you do that :-)?

goetz

Your source files (.h and .cpp) must be saved in the right encoding (ISO 8859-2 in your case). Otherwise the text codec will fail to decode the non-ASCII characters. In Qt Creator you can set the encoding in the project view, on the editor settings tab. You should check the setting and adjust it if needed. It uses the system encoding by default, which might be something other than Latin-2 (UTF-8 is likely on Unix/Linux/Mac).
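The symptom described above (an "a"-like letter with a hook underneath) is what one would expect if the .cpp file is actually saved as UTF-8 while the codec is ISO 8859-2. In UTF-8, "š" is the two bytes 0xC5 0xA1, and ISO 8859-2 maps those bytes to "Ĺ" and "Ą". Here is a minimal sketch in plain C++ (no Qt), assuming that is what happened; the function names are made up for illustration and the lookup table covers only the three bytes used here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Partial ISO 8859-2 table covering only the bytes used below.
// Returns U+FFFD for bytes this sketch does not map.
char32_t latin2ToUnicode(unsigned char byte) {
    if (byte < 0x80) return byte;  // ASCII is identical in Latin-2
    switch (byte) {
        case 0xA1: return 0x0104;  // A with ogonek
        case 0xB9: return 0x0161;  // s with caron
        case 0xC5: return 0x0139;  // L with acute
        default:   return 0xFFFD;
    }
}

// Decode a byte string as ISO 8859-2, one code point per byte.
std::vector<char32_t> decodeLatin2(const std::string& bytes) {
    std::vector<char32_t> out;
    for (unsigned char b : bytes) out.push_back(latin2ToUnicode(b));
    return out;
}

// Hypothetical demo: the same literal as stored under two editor settings.
void demoEncodingMismatch() {
    const std::string savedAsLatin2 = "\xB9";      // s-caron saved as ISO 8859-2
    const std::string savedAsUtf8   = "\xC5\xA1";  // s-caron saved as UTF-8

    // Matching encodings: one byte becomes one correct character.
    assert(decodeLatin2(savedAsLatin2) == std::vector<char32_t>{0x0161});
    // Mismatch: two UTF-8 bytes become two wrong characters.
    assert((decodeLatin2(savedAsUtf8) == std::vector<char32_t>{0x0139, 0x0104}));
}
```

So the codec settings can be entirely correct and the output still wrong, simply because the bytes the compiler sees are not the bytes the codec expects.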

BTW - you might also consider switching to UTF-8. Most IDEs (including Creator and Visual Studio) support it without problems (check the settings!).
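If you are not sure which encoding a file is really in, a quick structural check can help. The following is only a sketch in plain C++: `looksLikeUtf8` is a hypothetical helper, not part of Qt, and it is simplified (it does not reject overlong forms or surrogate code points). The idea is that Latin-2 text containing "š" (byte 0xB9) will almost never pass as UTF-8, because 0xB9 cannot start a UTF-8 sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true if `bytes` is structurally valid UTF-8.
// Simplified: overlong encodings and surrogates are not rejected.
bool looksLikeUtf8(const std::string& bytes) {
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                extra = 0;  // ASCII
        else if ((lead & 0xE0) == 0xC0) extra = 1;  // 2-byte sequence
        else if ((lead & 0xF0) == 0xE0) extra = 2;  // 3-byte sequence
        else if ((lead & 0xF8) == 0xF0) extra = 3;  // 4-byte sequence
        else return false;  // stray continuation byte or invalid lead

        if (i + extra >= bytes.size()) return false;  // sequence is truncated

        // Every continuation byte must have the bit pattern 10xxxxxx.
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return false;
        }
        i += extra + 1;
    }
    return true;
}
```

Pure ASCII passes this check too, which is fine: ASCII-only files are valid in both encodings. A file that passes is very likely UTF-8, but the check is a heuristic, not proof.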

blex

Maybe the example of putting "ššš" directly in the source code is not realistic. Use tr() for all strings that are presented to the user, and there is no problem.

You can even embed the *.qm files in the application resources so that there is only one file to distribute, but this approach has other limitations.

goetz

Even when using tr(), the encoding of the source files must be correct if they contain non-ASCII characters. Coding the app in English and translating it afterwards to avoid encoding problems is not always a solution, e.g. for an application that is developed for a non-English-speaking audience (or market :-)). Not to mention the problems that non-native English speakers might have with the correct wording in the first place, and the money the additional translation process costs.

blex

[quote author="Volker" date="1291318532"]Even when using tr() the encoding of the source files must be correct if they contain non-ASCII characters.[/quote]

Sure. And I agree with the note about translation cost.

But even in this case it is better to move all messages out of the code and into resources. The developer can use IDs in tr(), or imperfect English in the sources, and quickly translate the messages to the native language. The messages and the application code are then separated, which is a benefit during the verification phase of the project.

So, I recommend moving all non-ASCII characters into the translation files.
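To illustrate the idea of ASCII-only IDs in the code with the real text kept elsewhere, here is a rough sketch in plain C++. It is not how Qt implements tr(); in a Qt project the .ts/.qm files and QTranslator play the role of the map. The names `Catalog`, `translate`, `croatianCatalog` and the message IDs are all invented for this example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical message catalog: ASCII IDs in the code, UTF-8 text as data.
using Catalog = std::map<std::string, std::string>;

// Look up a message by ID. Falls back to the ID itself when there is no
// entry, so a missing translation stays visible instead of showing nothing.
std::string translate(const Catalog& catalog, const std::string& id) {
    const auto it = catalog.find(id);
    return it != catalog.end() ? it->second : id;
}

// Example catalog for Croatian. The non-ASCII letter is written as UTF-8
// hex escapes, so the string literals contain only ASCII characters and
// do not depend on the encoding the source file is saved in.
Catalog croatianCatalog() {
    return {
        {"menu.search", "Pretra\xC5\xBEi"},  // z with caron (U+017E) = C5 BE
        {"menu.close",  "Zatvori"},
    };
}
```

With this split, the application code never contains a non-ASCII byte; only the catalog data does, and its encoding is fixed in one place.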

goetz

As always, there is no single right way of doing things; it always depends, e.g. on the number of developers. In small teams it may not even be possible to manage the translation overhead.

Fortunately we're in the 2010s, and non-ASCII source code is no longer a big problem :-) We happily converted all of our sources to UTF-8 some two years ago and have had no problems so far using them on Windows, Mac and Linux.