Can't load full html code from page

    soloma_lviv
    wrote on last edited by
    #1

    If we look at the plain source of a website, we see that all or most of the ads (Flash, Google, others) are inserted by JavaScript code. But if you inspect the page in, for example, Firebug for Firefox, you will see that the JavaScript has been replaced by the HTML of the ad.
    I want to load and parse this "full" HTML, and I believed that Qt WebKit could do this.

    I tried to do it this way:

    @
    #include <QFile>
    #include <QTextStream>
    #include <QWebFrame>
    #include <QWebPage>
    #include <QWebSettings>

    PageLoader::PageLoader(const QUrl &url)
    {
        // Parent the page to this object so it is deleted together with the loader.
        mWebPage = new QWebPage(this);
        QWebSettings *settings = mWebPage->settings();
        settings->setAttribute(QWebSettings::JavascriptEnabled, true);
        settings->setAttribute(QWebSettings::PluginsEnabled, false);
        settings->setAttribute(QWebSettings::AutoLoadImages, false);
        settings->setAttribute(QWebSettings::JavascriptCanOpenWindows, false);
        connect(mWebPage->mainFrame(), SIGNAL(loadFinished(bool)), this, SLOT(processPage()));
        mWebPage->mainFrame()->load(url);
    }

    void PageLoader::processPage()
    {
        // Serialize the main frame's document as it is when loadFinished() fires.
        QString webHtml = mWebPage->mainFrame()->toHtml();
        QFile file("/home/ostap/output.txt");
        if (!file.open(QIODevice::WriteOnly | QIODevice::Text)) {
            qWarning("Cannot open the output file for writing");
            emit finished();
            return;
        }
        QTextStream out(&file);
        out << webHtml;
        emit finished();
    }
    @

    But the output file contains only the plain HTML, with links to *.js files in the script tags.

    What am I doing wrong?

    Sorry for my terrible English...

      Jake007
      wrote on last edited by
      #2

      You download and display only the HTML of the file that you request from the server. All other files are merely linked from that HTML and are kept somewhere in RAM or in temporary files.
      You'll have to parse the HTML yourself to find the other files and download them in the same way.
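
A minimal sketch of what that manual parsing step might look like, assuming a simple regular-expression scan is acceptable. The function name `extractScriptSources` is made up for this illustration and is not part of Qt. It only handles quoted `src` attributes in simple markup and would also match attributes such as `data-src`, so a real HTML parser would be more robust.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the src value of every <script ... src="..."> tag.
// Inline scripts (no src attribute) are skipped.
std::vector<std::string> extractScriptSources(const std::string& html)
{
    // Case-insensitive: matches <script ...> up to a quoted src attribute value.
    static const std::regex scriptSrc(
        R"re(<script\b[^>]*\bsrc\s*=\s*["']([^"']+)["'])re",
        std::regex::icase);

    std::vector<std::string> sources;
    for (std::sregex_iterator it(html.begin(), html.end(), scriptSrc), end;
         it != end; ++it) {
        sources.push_back((*it)[1].str());  // capture group 1 is the URL
    }
    return sources;
}
```

Each returned URL could then be fetched in the same way as the main page. Relative URLs would still need to be resolved against the page address first.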


      Code is poetry

        soloma_lviv
        wrote on last edited by
        #3

        But when I tried to render the loaded page from mWebPage:

        @
        void PageLoader::render()
        {
            // Make the viewport as large as the whole document so nothing is clipped.
            mWebPage->setViewportSize(mWebPage->mainFrame()->contentsSize());
            QImage image(mWebPage->viewportSize(), QImage::Format_ARGB32);
            image.fill(Qt::transparent); // a new QImage holds uninitialized pixel data
            QPainter painter(&image);
            mWebPage->mainFrame()->render(&painter);
            painter.end();
            QImage thumbnail = image.scaled(400, 400);
            thumbnail.save("thumbnail.png");
            emit finished();
        }
        @

        I get a normal, full view of the web page in thumbnail.png. I think this means that the QWebPage object holds, somewhere, the full version of the HTML with the JavaScript executed, and is able to render it.

          Jake007
          wrote on last edited by
          #4

          It definitely has all the JS, image, CSS, etc. files somewhere (in RAM or in temporary internet files). But when I looked through the docs, I didn't find any useful functions for accessing that data.

          So you'll probably have to write your own program that extracts the references to those files, downloads the files, and changes the links between them to match the copies on your hard drive.

          And what you get is the entire HTML file produced by the server. WebKit does the same for all the other files (JS, CSS, images) that are linked from your "main" file (note: files can include other files, and so on, recursively).
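
As a rough sketch of the link-changing step, assuming the files have already been downloaded and a table from remote URL to local path is available. `rewriteLinks` and the table are made up for this illustration. Only double-quoted attribute values that match a table key exactly are replaced.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: replaces every double-quoted occurrence of a remote URL
// in `html` with the local path it was saved under.
std::string rewriteLinks(std::string html,
                         const std::map<std::string, std::string>& localPaths)
{
    for (const auto& entry : localPaths) {
        // Include the quotes so that only whole attribute values are matched.
        const std::string from = "\"" + entry.first + "\"";
        const std::string to = "\"" + entry.second + "\"";
        std::size_t pos = 0;
        while ((pos = html.find(from, pos)) != std::string::npos) {
            html.replace(pos, from.size(), to);
            pos += to.size();  // continue after the inserted text
        }
    }
    return html;
}
```

Because linked files can themselves link to other files, the same rewrite would have to be applied to each downloaded CSS or HTML file as well.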

          Regards,
          Jake


          Code is poetry

            soloma_lviv
            wrote on last edited by
            #5

            Thank you! I'll try that :)
