Unicode/UTF-8 encoding while getting data from site via QWebElement

    slesher wrote:

    Hello,
    I am scraping a list of films from a website; each film title sits inside a <div class="header">...</div> element.
    With earlier Qt versions (I now use Qt 5.5) the retrieved text was displayed correctly, just as on the site.
    Now only Latin characters are shown as-is, while the others (Cyrillic) appear in their escaped form.
    From other topics and posts about similar problems I gathered that this is related to Unicode: the page is UTF-8 encoded, and some people suggested that it should be parsed with a JSON parser.
    My code:

    Seeker::Seeker(QWidget *parent) :
        QDialog(parent),
        ui(new Ui::Seeker)
    {
        ui->setupUi(this);

        // Parent the view to the dialog so it is destroyed together with it.
        webView = new QWebView(this);

        // Connect before starting the load so the signal cannot be missed.
        connect(webView, &QWebView::loadFinished, this, &Seeker::parse);
        webView->setUrl(QUrl("http://kino-butterfly.com.ua/cinema.php?sTheater=cosmopolite"));
    }

    void Seeker::parse()
    {
        qDebug() << "loading finished";
        qDebug() << webView->title();

        QWebFrame *frame = webView->page()->mainFrame();
        QWebElement document = frame->documentElement();

        // Every film title is inside a <div class="header"> element.
        QWebElementCollection elements = document.findAll("div.header");

        // A stack-allocated list avoids leaking a heap-allocated QStringList.
        QStringList filmList;
        foreach (const QWebElement &element, elements)
            filmList.append(element.toPlainText());

        // This is the output in question: Cyrillic shows up as \uXXXX escapes.
        foreach (const QString &film, filmList)
            qDebug() << film;
    }

    For example, I get "\u041B\u044E\u0434\u0438\u043D\u0430-\u043C\u0443\u0440\u0430\u0445\u0430 3D" instead of Людина-Мураха 3D. How do I get the actual text?

    UPD: I found out that these escapes are simply the Unicode code points of the Ukrainian letters, as listed in the Unicode standard's tables.
    Can anyone tell me whether QString has a function or method that gives me the plain letters?
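
A note on the output above, offered as a sketch rather than a definitive answer: as far as I know, QDebug escapes non-ASCII characters of a QString as \uXXXX when streaming it, so the QString itself most likely already holds the Cyrillic text. If that assumption holds, `qDebug().noquote() << film` (available since Qt 5.4) or showing the string in a widget should display it unescaped. To illustrate that the escaped form carries the same code points, the plain C++ snippet below (no Qt) decodes literal \uXXXX sequences into UTF-8. `appendUtf8` and `unescapeToUtf8` are hypothetical helpers written for this example, not Qt API, and they handle only the Basic Multilingual Plane (no surrogate pairs).

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Append one code point (BMP only, which covers Cyrillic) as UTF-8 bytes.
static void appendUtf8(std::string &out, std::uint32_t cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: replace each literal "\uXXXX" (backslash, 'u',
// four hex digits) with the UTF-8 encoding of that code point.
// Anything that does not match this pattern is copied unchanged.
std::string unescapeToUtf8(const std::string &in)
{
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        bool isEscape = in[i] == '\\' && i + 6 <= in.size() && in[i + 1] == 'u';
        for (std::size_t k = 2; isEscape && k < 6; ++k)
            isEscape = std::isxdigit(static_cast<unsigned char>(in[i + k])) != 0;

        if (isEscape) {
            const unsigned long cp = std::stoul(in.substr(i + 2, 4), nullptr, 16);
            appendUtf8(out, static_cast<std::uint32_t>(cp));
            i += 6;
        } else {
            out += in[i++];
        }
    }
    return out;
}
```

Passing the escaped string from the post to `unescapeToUtf8` gives `Людина-мураха 3D`. Note the lowercase м: \u043C is the lowercase letter, so the capital М seen on the site presumably comes from the page's styling rather than from the text itself.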
