QtWebKit: access to the final guessed encoding
I'm trying to get the encoding of a page loaded with a QWebPage component, but it's hard to get the same result that a WebKit-based browser (Safari or Chrome) reports. Basically, I would like to get the encoding string displayed in "View -> Text Encoding" in Safari or "View -> Encoding" in Chrome.
I have tested several methods:
- reading the meta tags directly with metaData() // but this is not always the real encoding
- reading the QNetworkRequest::ContentTypeHeader from my own NetworkAccessManager class // but this is just the encoding from the server's response, not the final one guessed by the browser (and it isn't always present)
- using QTextCodec to analyze the string encoding // but again, this is not the final encoding guessed by QWebPage
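For reference, the second method boils down to pulling the charset parameter out of the Content-Type value. Below is a minimal sketch of that step in plain C++ (std::string instead of Qt types so it stands alone). The function name charsetFromContentType is my own, not a Qt API, and the parsing is deliberately simplistic. It returns an empty string when the server sends no charset, which is exactly the case where this method tells you nothing.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not part of Qt): returns the lower-cased charset
// parameter of a Content-Type header value, or an empty string when the
// header carries no charset at all.
std::string charsetFromContentType(const std::string& headerValue)
{
    // Header parameter names are case-insensitive, so work on a lower-cased copy.
    std::string lower(headerValue);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    const std::string key = "charset=";
    std::string::size_type pos = lower.find(key);
    if (pos == std::string::npos)
        return std::string();          // no charset sent by the server
    pos += key.size();

    // Skip an optional opening quote around the value.
    if (pos < lower.size() && (lower[pos] == '"' || lower[pos] == '\''))
        ++pos;

    // The value ends at a closing quote, a separator, or whitespace.
    const std::string::size_type end = lower.find_first_of("\"'; \t", pos);
    return lower.substr(pos, end == std::string::npos ? std::string::npos : end - pos);
}
```

In a Qt application the input would be the string obtained from the reply's ContentTypeHeader. Whatever it returns is still only the server's claim, not the encoding WebKit finally settles on.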
Digging into WebKit's source, I found QtSources/4.7.4/src/3rdparty/webkit/WebCore/loader/TextResourceDecoder.cpp. Is it possible to use it in my own Qt application, and if so, how? Or am I missing some API for getting the real QWebPage encoding?
You can retrieve the character encoding from the HTML meta tags of each page:
- HTML 4
@<meta http-equiv="content-type" content="text/html; charset=UTF-8">@
- HTML 5
@<meta charset="UTF-8">@
Did you read my post, Leon? (See the first bullet, by the way.) You can't trust the encoding you get from just reading the meta tags. They are certainly one factor WebKit uses when guessing the encoding, but not the only one. I'm asking for the final encoding chosen by WebKit.
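To illustrate what I mean by "one factor among several", here is a rough sketch of a precedence-based decision. This is not WebKit's code: the struct names, the field names, and the exact order (byte-order mark, then user override, then HTTP header, then meta tag, then auto-detection, then a default) are my assumptions, loosely modeled on how browsers are generally described as sniffing encodings.

```cpp
#include <string>

// Hypothetical inputs: each field is empty when that source gave no answer.
struct EncodingHints {
    std::string bom;           // encoding implied by a byte-order mark
    std::string userOverride;  // encoding picked manually in an encoding menu
    std::string httpHeader;    // charset from the Content-Type response header
    std::string metaTag;       // charset declared in a <meta> tag
    std::string autoDetected;  // result of content-based guessing
};

// The chosen encoding together with the source that supplied it.
struct EncodingDecision {
    std::string name;
    std::string source;
};

// Assumed precedence, highest first; the first non-empty hint wins.
EncodingDecision chooseEncoding(const EncodingHints& h, const std::string& fallback)
{
    if (!h.bom.empty())          return {h.bom, "bom"};
    if (!h.userOverride.empty()) return {h.userOverride, "user"};
    if (!h.httpHeader.empty())   return {h.httpHeader, "http-header"};
    if (!h.metaTag.empty())      return {h.metaTag, "meta-tag"};
    if (!h.autoDetected.empty()) return {h.autoDetected, "auto-detected"};
    return {fallback, "default"};
}
```

With both httpHeader and metaTag set, this sketch picks the header value, so reading only the meta tags would report the wrong encoding for such a page.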
Considering how hard it is to guess encodings correctly, I would be surprised if WebKit really chose to ignore the HTML meta tags that provide that information... but I am no expert there and cannot offer more than a guess. Sorry.