[SOLVED] [Qt 5.0.1 Multi Platform] How can I convert QNetworkReply::readAll() to UTF-8?
The title says it all:
I fetch the HTML source of a German website with QNetworkReply::readAll(), which first lands in a QByteArray.
This is then converted to a QString, and the QString is parsed.
The application was originally developed under Qt 4.8.1, where everything worked fine; but after porting it to 5.0.1 the German characters are displayed incorrectly.
Is there any option while receiving the stream to get the characters decoded correctly?
Or can I convert the QByteArray to UTF-8 somehow?
Thanks in advance for every hint.
Well, the title doesn't say it all actually ;)
First of all, what is the original encoding of the website (I suspect UTF-8)? Second, how do you convert the QByteArray to a QString? And third, why do you want a UTF-8 QString in the first place?
To construct a QString from a QByteArray you need to know how to interpret the bytes: are they single-byte characters, double-byte, or variable length, and what is the encoding? Assuming the website is UTF-8, the QByteArray will contain the UTF-8 encoded bytes of the Unicode characters.
To create a QString that interprets those bytes as UTF-8 you can use QString::fromUtf8(yourByteArray). But let's be precise. This is not converting the QByteArray to UTF-8, as you wrote. It is converting bytes that represent UTF-8 data into a QString, which has its own internal representation (16-bit UTF-16 code units).
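To make the "bytes in, 16-bit units out" idea concrete, here is a minimal sketch in plain standard C++. It is not Qt's implementation; decodeUtf8 is a hypothetical helper written only for illustration, and it is simplified (malformed input becomes U+FFFD, overlong forms and encoded surrogates are not rejected).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not a Qt function): interprets `bytes` as UTF-8 and
// returns the text as UTF-16 code units. Simplified for illustration.
std::u16string decodeUtf8(const std::string &bytes)
{
    const char16_t kReplacement = 0xFFFD; // U+FFFD REPLACEMENT CHARACTER
    std::u16string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::uint32_t cp = 0;   // code point being assembled
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(kReplacement); ++i; continue; } // not a valid lead byte

        // All continuation bytes must exist and look like 10xxxxxx.
        bool ok = (i + extra < bytes.size());
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok || cp > 0x10FFFF) { out.push_back(kReplacement); ++i; continue; }
        i += extra + 1;

        if (cp >= 0x10000) {
            // Outside the 16-bit range: emit a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
    }
    return out;
}
```

With this sketch, the two UTF-8 bytes 0xC3 0xBC decode to the single unit U+00FC ("ü"), while a lone byte 0xFC (which is "ü" in Latin-1) is not valid UTF-8 and comes out as U+FFFD. That is the kind of damage you would expect when bytes in some other encoding are decoded as UTF-8.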
First of all: the issue is solved!
The encoding of the website is Latin-1 (which I learned a few minutes ago... ;-)), but as long as QString in the 4.x.x versions handled it well, everything was fine, i.e. there was no need to think about such things.
OK, those times are over now, and I find myself forced to write better code...
As we say in Cologne: "Who knows what it is good for?" (Cologne's Fundamental Law; Chapter 5)
After reading your posting it was an easy task to transfer the QByteArray into a QString using QString::fromLatin1().
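For anyone wondering why the Latin-1 route is so simple: every Latin-1 byte has the same numeric value as its Unicode code point, so decoding is a plain widening of each byte. The following is only a sketch of that idea in standard C++; decodeLatin1 is a hypothetical helper for illustration and not the Qt function itself.

```cpp
#include <string>

// Hypothetical helper (not a Qt function): interprets `bytes` as Latin-1
// (ISO-8859-1) and returns the text as UTF-16 code units.
std::u16string decodeLatin1(const std::string &bytes)
{
    std::u16string out;
    out.reserve(bytes.size());
    for (char b : bytes) {
        // Go through unsigned char first so bytes >= 0x80 are not sign-extended.
        out.push_back(static_cast<char16_t>(static_cast<unsigned char>(b)));
    }
    return out;
}
```

For example, the four Latin-1 bytes 'K', 0xF6, 'l', 'n' become the four code units of "Köln", with 0xF6 mapped straight to U+00F6.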
The reason to need a QString is very simple:
Some time ago I wrote a little library that is able to parse a stream of text data to extract any phrase needed.
The easiest way was to use QString and its convenient member functions.
...and I just did not want to reinvent the wheel... ;-)
Thanks a lot for that very valuable hint!!!
Great, glad I could help.
I didn't mean you shouldn't use QString. On the contrary, that's what it's for! I just meant that there's no such thing as a UTF-8 QString. The internal QString encoding is UTF-16.