[Solved] QString unicode conversion to utf8
-
wrote on 18 Jul 2011, 06:51 last edited by
Hi,
Given a GBK-encoded QByteArray, I need to convert it to a UTF-8 QString. process_line_method1 did NOT work, while process_line_method2 did. I have no idea why method 1 fails. Does anyone know the reason?
@
// did not work: utfStr displays as garbled text
void process_line_method1(QByteArray line)
{
QTextCodec *codec = QTextCodec::codecForName("GBK");
QString uc = QTextCodec::codecForName("GBK")->toUnicode(line);
QByteArray data = uc.toUtf8();
QString utfStr = codec->toUnicode(data);
qDebug() << utfStr;
}
@
@
// works well: utfStr outputs the readable Chinese string
void process_line_method2(QByteArray line)
{
QTextCodec *codec = QTextCodec::codecForName("GBK");
QString uc =QTextCodec::codecForName( "GBK")->toUnicode(line);
QString utfStr;
QTextStream streamFileOut(&uc);
streamFileOut.setCodec("UTF-8");
streamFileOut >> utfStr;
qDebug() << utfStr;
}
@ -
wrote on 18 Jul 2011, 07:21 last edited by
Have a look at the comments I added.
[quote author="changsheng230" date="1310971873"]
@
void process_line_method1(QByteArray line)
{
QTextCodec *codec = QTextCodec::codecForName("GBK");
QString uc = QTextCodec::codecForName("GBK")->toUnicode(line); // here you already have UTF-16 encoded data
QByteArray data = uc.toUtf8(); // here you have a UTF-8 encoded string in the byte array
QString utfStr = codec->toUnicode(data); // here you are interpreting a UTF-8 encoded string as GBK, which results in the mess you mention
qDebug() << utfStr; // this prints the mess
}
@
[/quote] -
wrote on 18 Jul 2011, 07:30 last edited by
Thanks a lot! Method 1 works well following your comments.
@
// works well again, thanks Franzk :)
void process_line(QByteArray line)
{
QTextCodec *codec = QTextCodec::codecForName("GBK");
QString uc =QTextCodec::codecForName( "GBK")->toUnicode(line);
QByteArray data = uc.toUtf8();
QTextCodec *codec2 = QTextCodec::codecForName("UTF-8");
QString utfStr = codec2->toUnicode(data);
qDebug() << utfStr;
}
@
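For anyone who wants to see the effect outside Qt, here is a minimal standard-C++ sketch of why method 1 produced garbage: the same bytes give different characters depending on which codec decodes them. The function names `decodeUtf8` and `decodeLatin1` are hypothetical, and Latin-1 stands in for GBK only because a GBK decoder needs a large lookup table; the mismatch works the same way. The UTF-8 decoder assumes well-formed input and does no validation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Decode bytes as UTF-8 into Unicode code points.
// Sketch only: assumes well-formed input, performs no validation.
std::vector<char32_t> decodeUtf8(const std::string &bytes)
{
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;   // number of continuation bytes to read
        char32_t cp = lead;      // plain ASCII byte by default
        if (lead >= 0xF0)      { extra = 3; cp = lead & 0x07; }
        else if (lead >= 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if (lead >= 0xC0) { extra = 1; cp = lead & 0x1F; }
        ++i;
        for (std::size_t k = 0; k < extra && i < bytes.size(); ++k, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i]) & 0x3F);
        out.push_back(cp);
    }
    return out;
}

// Decode the same bytes with the "wrong" codec: Latin-1 maps every byte
// to one code point, so each multi-byte character falls apart.
std::vector<char32_t> decodeLatin1(const std::string &bytes)
{
    std::vector<char32_t> out;
    for (char c : bytes)
        out.push_back(static_cast<unsigned char>(c));
    return out;
}
```

Feeding the six UTF-8 bytes of 國家 (`"\xE5\x9C\x8B\xE5\xAE\xB6"`) to `decodeUtf8` gives the two code points U+570B and U+5BB6, while `decodeLatin1` gives six unrelated characters. Decoding `uc.toUtf8()` with the GBK codec fails in the same way.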
-
wrote on 22 Aug 2014, 19:19 last edited by
Hello,
Can you help me with a simple issue?
I do not know how to work with Unicode strings.
For example, I want to set a dialog's title to a Chinese string, but @setWindowTitle( tr("國家") )@ did not work!
How do I use plain Unicode strings in source code with the Qt framework?
I did not find any example with international Unicode strings!
Thank you in advance.
-
wrote on 22 Aug 2014, 19:30 last edited by
I think I found a solution
@
QByteArray encodedString = "國家";
QTextCodec *codec = QTextCodec::codecForName( "UTF-8" );
QString string = codec->toUnicode( encodedString );
setWindowTitle( string );
@
But I do not like it.
I want to use plain unicode strings in my source code for a test.
-
wrote on 22 Aug 2014, 20:38 last edited by
tl;dr:
[quote author="aliosa_sbbv" date="1408735194"]Hello
How to work with clear unicode strings in source code and with qt framework ?[/quote]
Don't. Keeping encodings in check is hard. Write in English and stick to ASCII, then use Qt's (or any other) translation system for the resulting strings.
The long answer:
[quote author="aliosa_sbbv" date="1408735194"]
For example I want to write the name of dialog a chinese string. @setWindowTitle( tr("國家") )@ did not work ![/quote]
Your compiler has no clue what to do with these characters. It may not even be interpreting the characters in the same way your code editor does.
This also has the problem that I, a westerner with barely any knowledge of Chinese characters, wouldn't have the slightest clue what your string means.
[quote author="aliosa_sbbv" date="1408735830"]I think I found a solution
@
QByteArray encodedString = "國家";
QTextCodec *codec = QTextCodec::codecForName( "UTF-8" );
QString string = codec->toUnicode( encodedString );
setWindowTitle( string );
@
But I do not like it.[/quote]
Neither do I. Here absolutely no one will understand what you mean. If you want to use unicode strings safely and directly in your code (while still suffering from some of the aforementioned problems), use
@setWindowTitle(trUtf8("\u570b\u5bb6")); /* 國家 */@
You basically shouldn't use any characters outside the ASCII range in your C or C++ code (comments are somewhat OK).
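A quick way to double-check a `\u` escape is to encode the code point by hand and compare the result with the UTF-8 bytes you expect. The sketch below does that in standard C++; `encodeUtf8` is a hypothetical helper, not a Qt function, and it does not reject surrogates or values above U+10FFFF. For reference, 國 is U+570B and 家 is U+5BB6.

```cpp
#include <string>

// Encode one Unicode code point as UTF-8.
// Sketch only: no checks for surrogates or out-of-range values.
std::string encodeUtf8(char32_t cp)
{
    std::string out;
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

`encodeUtf8(0x570B)` yields the bytes `E5 9C 8B` and `encodeUtf8(0x5BB6)` yields `E5 AE B6`, which are the bytes a UTF-8 saved source file contains for 國家.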
Your encodedString solution may well come back to haunt you as soon as you switch compilers, and the file encoding itself may interfere with proper compilation. Even comments are tricky: the compiler is not bothered by them much, but your version control system may mishandle them, and readers who don't know the language won't understand them.
The best solution, as practically always in programming, is to stick with pure English in your code:
@setWindowTitle(tr("My window text"));@
Then use Qt's translation system to turn "My window text" into your Chinese text. This has two advantages. The first is that almost anyone on this planet is going to be able to understand the text. The second is that you're making your application portable to other languages as well. Don't worry too much about making spelling mistakes in your code strings. The English should also be covered by a translation for serious applications.
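To make the idea concrete without pulling in Qt, here is a conceptual sketch of a translation lookup in standard C++. It is not how Qt implements `tr()`; `Catalog` and `translate` are hypothetical names, and the catalog contents are made up for illustration. It only shows the principle: the source code carries ASCII text, and the translated bytes live in data that is looked up at run time, with the source text as the fallback when no translation exists.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical catalog type: ASCII source text -> UTF-8 translation.
using Catalog = std::unordered_map<std::string, std::string>;

// Return the translation for `source`, or `source` itself when the
// catalog has no entry, so untranslated strings still show readable text.
std::string translate(const Catalog &catalog, const std::string &source)
{
    const auto it = catalog.find(source);
    return it != catalog.end() ? it->second : source;
}
```

With a catalog such as `{{"Country", "\xE5\x9C\x8B\xE5\xAE\xB6"}}`, `translate(catalog, "Country")` returns the UTF-8 bytes of 國家, and any string missing from the catalog comes back unchanged. In a real Qt application the catalog would be a translation file produced with Qt's translation tools rather than a hand-written map.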
Hope this clears something up.