QtWebKit: access to the final guessed encoding



  • I'm trying to get the encoding of a page loaded with a QWebPage component, but it is hard to get the same result that a WebKit-based browser (Safari or Chrome) reports. Basically, I would like to get the encoding string displayed under "View -> Text Encoding" in Safari or "View -> Encoding" in Chrome.

    I have tested several methods:

    • this->mainFrame()->evaluateJavaScript("window.document.characterSet"); // but I don't always get the same result as a standalone browser
    • reading the meta tags directly with metaData() // but this is not always the real encoding
    • reading the QNetworkRequest::ContentTypeHeader from my own NetworkAccessManager class // but this is only the encoding from the server's response (and it isn't always present), not the final one guessed by the browser
    • using QTextCodec to analyze the string encoding // but again, this is not the final encoding guessed by QWebPage

    Digging into WebKit's source, I found QtSources/4.7.4/src/3rdparty/webkit/WebCore/loader/TextResourceDecoder.cpp. Is it possible to use it in my own Qt application, and if so, how? Or am I missing some API for getting the real QWebPage encoding?



  • You can retrieve the character encoding from the HTML meta tags of each page:

    • HTML 5
      @<meta charset="UTF-8">@
    • HTML 4
      @<meta http-equiv="content-type" content="text/html; charset=UTF-8">@
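As a rough illustration of reading the declared label: the HTML 4 form (and an HTTP Content-Type header, which uses the same syntax) carries the encoding in a `charset=` parameter. The sketch below pulls that label out with plain standard C++. The helper name `charsetFromContentType` is hypothetical and not part of Qt or WebKit, and the parsing is deliberately simplified.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not a Qt API): returns the lower-cased charset label
// found in a Content-Type style value, or an empty string if none is present.
std::string charsetFromContentType(const std::string& value)
{
    // Lower-case a copy so the search is case-insensitive.
    std::string lower(value);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    const std::string key = "charset=";
    std::string::size_type pos = lower.find(key);
    if (pos == std::string::npos)
        return std::string();
    pos += key.size();

    // Skip an optional opening quote.
    if (pos < lower.size() && (lower[pos] == '"' || lower[pos] == '\''))
        ++pos;

    // The label ends at a quote, a semicolon, or whitespace.
    const std::string::size_type end = lower.find_first_of("\"'; \t", pos);
    return lower.substr(pos, end == std::string::npos ? std::string::npos : end - pos);
}
```

For example, the `content` value of the HTML 4 tag above could be passed in directly. Keep in mind that this only recovers what the page declares, which may differ from what the engine finally uses.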


  • Did you read my post, Leon? (See bullet 2, by the way.) You can't trust the encoding just by reading the meta tags. They are certainly a factor when WebKit guesses the encoding, but not the only one. I'm asking for the final encoding chosen by WebKit.
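To make the "one factor among several" point concrete, here is a minimal sketch of a precedence chain in standard C++. It is not WebKit's actual code: the order (byte-order mark, then HTTP header, then meta tag, then a fallback) is an assumption modelled loosely on the HTML encoding-sniffing rules, and real engines also consider things like a user override, the parent frame, and content heuristics. The function names `encodingFromBom` and `pickEncoding` are hypothetical.

```cpp
#include <string>

// Returns the encoding implied by a leading byte-order mark, or "" if none.
std::string encodingFromBom(const std::string& bytes)
{
    if (bytes.size() >= 3 && bytes.compare(0, 3, "\xEF\xBB\xBF") == 0)
        return "utf-8";
    if (bytes.size() >= 2 && bytes.compare(0, 2, "\xFE\xFF") == 0)
        return "utf-16be";
    if (bytes.size() >= 2 && bytes.compare(0, 2, "\xFF\xFE") == 0)
        return "utf-16le";
    return std::string();
}

// Assumed precedence (a simplification, not WebKit's implementation):
// BOM > HTTP Content-Type charset > meta tag charset > fallback.
std::string pickEncoding(const std::string& bytes,
                         const std::string& httpCharset,
                         const std::string& metaCharset,
                         const std::string& fallback)
{
    const std::string bom = encodingFromBom(bytes);
    if (!bom.empty())
        return bom;
    if (!httpCharset.empty())
        return httpCharset;
    if (!metaCharset.empty())
        return metaCharset;
    return fallback;
}
```

With a chain like this, a page whose meta tag says one thing can still end up decoded as something else if a higher-priority source disagrees, which is why reading the meta tags alone does not give the final answer.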


  • Moderators

    Considering how hard it is to guess encodings correctly, I would be surprised if WebKit really chose to ignore the HTML meta tags providing that information... but I am no expert there and cannot offer more than a guess. Sorry.

