Solved : Reading ASCII/UTF-8 file
-
Hi all,
I am using Qt to read a file generated by the store method of the Java Properties class. This method generates an ASCII file in which special characters are written as Unicode escape sequences. In the end, the ASCII file looks like this:
firstname = g\u00C9rard
lastname = normand
Now, when I read and display the data, I would like to display the symbols themselves (gérard normand). I tried using the toUtf8() method, but nothing convincing came out. So how do I read this ASCII file while handling the escaped symbols?
Thanks for your help
Thibaut
-
Hi, the \u00C9 is a Unicode escape (code point U+00C9) for the capital letter É, so getting the small letter is going to be troublesome. I think, as Andre mentioned, you need to write your own conversion class to handle this one.
-
Thanks for the reply. The capital letter É is what I want. I went to http://www.fileformat.info/info/unicode/char/c9/index.htm and it says that \u00C9 is the Java/C++ source code for the capital É character. So I was thinking there might be a way around it.
-
Finally found what I was looking for.
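I need to use the following routine:
@
// Match a literal backslash, then 'u', then four hex digits (e.g. \u00C9)
QRegExp rx("(\\\\u[0-9a-fA-F]{4})");
int pos = 0;
while ((pos = rx.indexIn(str, pos)) != -1) {
    // Replace the six-character escape with the single character it encodes
    str.replace(pos++, 6, QChar(rx.cap(1).right(4).toUShort(0, 16)));
}
@
Thanks a lot for your replies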
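For anyone who wants to see the same replacement spelled out without a regular expression, here is a minimal Qt-free sketch in plain C++. The names hexValue and decodeUnicodeEscapes are made up for this example and are not part of Qt. It decodes into a std::u16string because QString also stores UTF-16 code units, so a surrogate pair written as two escapes should come out right. It assumes the input is pure ASCII and, like the routine above, it does not treat a doubled backslash specially.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Qt): value of a hex digit, or -1 if invalid.
static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: replaces each \uXXXX escape in an ASCII string with the
// UTF-16 code unit it names. Malformed escapes are copied through unchanged.
std::u16string decodeUnicodeEscapes(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        // An escape needs six characters: backslash, 'u', four hex digits.
        if (in[i] == '\\' && i + 5 < in.size() && in[i + 1] == 'u') {
            unsigned value = 0;
            bool ok = true;
            for (std::size_t k = 2; k < 6; ++k) {
                int h = hexValue(in[i + k]);
                if (h < 0) { ok = false; break; }
                value = value * 16 + static_cast<unsigned>(h);
            }
            if (ok) {
                out.push_back(static_cast<char16_t>(value));
                i += 6;
                continue;
            }
        }
        // Ordinary character: widen it unchanged.
        out.push_back(static_cast<char16_t>(static_cast<unsigned char>(in[i])));
        ++i;
    }
    return out;
}
```

Calling decodeUnicodeEscapes on the firstname value from the file above should give a string whose second code unit is U+00C9. In a Qt program the result could then be handed to QString, which has a fromStdU16String() function in recent Qt 5 versions.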
-
Did you check out "fromUnicode?":http://qt-project.org/doc/qt-4.8/qtextcodec.html#fromUnicode
At least from the name and description it seems to be fitting. However, Andre may have more experience with this.
-
[quote author="koahnig" date="1342687524"]Did you check out "fromUnicode?":http://qt-project.org/doc/qt-4.8/qtextcodec.html#fromUnicode
At least from the name and description it seems to be fitting. However, Andre may have more experience with this. [/quote]
The problem is that the codec used is not a codec in the normal sense. It is a unicode escape sequence in an otherwise ASCII-encoded file. So to use this method, you first have to actually implement a codec that does that translation back and forth. If you have to work with these files, it is probably a good idea to implement such a codec. Doesn't seem all that hard to me...
Edit: though, I admit, I did not try doing it myself, so it might be harder than it seems by just looking at the docs...
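To give an idea of the writing direction of such a translation, here is a rough sketch in plain C++. It is not a QTextCodec subclass, and the name encodeUnicodeEscapes is invented for this example. It takes UTF-16 code units (the same unit QString uses) and writes every unit outside the ASCII range as a \uXXXX escape. Which characters really get escaped by the Java side is an assumption here; this sketch simply escapes everything at or above 0x80.

```cpp
#include <string>

// Hypothetical helper (not part of Qt): writes a UTF-16 string as ASCII,
// turning every code unit at or above 0x80 into a \uXXXX escape.
std::string encodeUnicodeEscapes(const std::u16string& in) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (char16_t unit : in) {
        if (unit < 0x80) {
            // ASCII passes through unchanged.
            out.push_back(static_cast<char>(unit));
        } else {
            // Emit backslash, 'u', then the four hex digits of the code unit.
            out += "\\u";
            out.push_back(digits[(unit >> 12) & 0xF]);
            out.push_back(digits[(unit >> 8) & 0xF]);
            out.push_back(digits[(unit >> 4) & 0xF]);
            out.push_back(digits[unit & 0xF]);
        }
    }
    return out;
}
```

Because it works on code units rather than code points, a character outside the Basic Multilingual Plane comes out as two escapes, one per surrogate. A real codec would wrap this and the matching reading direction behind the QTextCodec interface.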
-
Andre, thanks for clarification