QWebpage tohtml() get the wrong html

    tkks wrote (#1):

    My English is poor. I want to get a web page's HTML and save it to disk, using QWebEngineView::page()->toHtml().
    But I only get "<html> <body></body> </html>". The page uses AJAX to load its data.
    How can I get the HTML that includes the loaded data?
    I tried two ways:

    1. Calling page()->toHtml() right after setting the URL returns "<html> <body></body> </html>".
    2. Calling page()->toHtml() when loadFinished is emitted returns nothing.

    The test page is https://dangdang.tmall.com/search.htm?search=y&orderType=newOn_desc&pageNo=88

    Version: 5.7

      raven-worx (Moderators) wrote (#2):

      @tkks
      Please post some code.
      QWebEnginePage::toHtml() should return the HTML you are expecting.


        ThatDud3 wrote (#3):

        If nothing else works, you can always pull the HTML out through JavaScript:

        document.getElementsByTagName('html')[0].innerHTML

        See "Get the html of the javascript-rendered page (after interacting with it)".
        If any IFRAMEs or other parts of the page are missing, pass the argument "--disable-web-security" to the application.
        See "QtWebEngine: “Not allowed to load local resource” for iframe, how to disable web security?"

          tkks wrote (#4):

          @raven-worx

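          File: ca.h

          #ifndef CA_H
          #define CA_H

          #include <QObject>
          #include <string>
          #include <QtWebEngineWidgets/qwebengineview.h>
          #include <QtWebEngineWidgets/qwebenginesettings.h>

          // Loads a URL in a QWebEngineView and dumps the page HTML once loading finishes.
          class CA : public QObject
          {
              Q_OBJECT
          public:
              CA();
              // Starts loading the given URL; creates the view on first use.
              void spider(const std::string &str);
          public slots:
              // Connected to QWebEngineView::loadFinished.
              void finish(bool is_ok);

          private:
              QWebEngineView *view;
          };

          #endif // CA_H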

          File: ca.cpp

          #include "ca.h"
          #include <QDebug>   // needed for the qDebug() << ... stream syntax
          #include <QString>
          #include <QUrl>

          CA::CA()
          {
              view = nullptr;
          }

          void CA::spider(const std::string &str)
          {
              qDebug("spider");
              if (view == nullptr)
              {
                  qDebug("new QWebEngineView");
                  view = new QWebEngineView;
                  // Connect and configure the view before the load starts.
                  QObject::connect(view, &QWebEngineView::loadFinished, this, &CA::finish);
                  QWebEngineSettings *setting = view->page()->settings();
                  setting->setAttribute(QWebEngineSettings::AutoLoadImages, false);
                  view->resize(1024, 750);
              }
              view->setUrl(QUrl(QString::fromStdString(str)));
              //view->show();
          }

          void CA::finish(bool is_ok)
          {
              if (is_ok)
              {
                  qDebug("load successed!");
                  // toHtml() is asynchronous: the HTML is delivered to the callback.
                  view->page()->toHtml([](const QString &html) {
                      qDebug() << html;
                  });
              }
              else
              {
                  qDebug("load error!");
              }
          }

          File: main.cpp

          #include <QApplication>
          #include "ca.h"

          int main(int argc, char *argv[])
          {
              QApplication a(argc, argv);

              CA ca;
              ca.spider("https://dangdang.tmall.com/search.htm?search=y&orderType=newOn_desc&pageNo=88");

              // Run the event loop so loadFinished and the toHtml() callback can be delivered.
              return a.exec();
          }

          When the slot function runs, the HTML is never printed. The only output is:

          load successed!
          