
Pyside2 toHtml()?

    v.n.lee
    wrote on last edited by
    #1

    def _loadFinished(self):
        self.page().toHtml(self.callable)

    def callable(self, data):
        self.html = data
        self.app.quit()
    

    I can find this function in PyQt5, but it does not exist in PySide2!

    How can I get the source code of the loaded web page?

    I tried using the JS code "var innerText = document.getElementsByTagName('html')[0].innerHTML", but how can I get innerText into Python?

      JonB
      wrote on last edited by JonB
      #2

      @v-n-lee
      As far as I can see, for the Python side, let's start with:

      • QWebEnginePage::loadFinished(bool ok), https://doc.qt.io/qt-5/qwebenginepage.html#loadFinished, takes an argument stating whether the load was successful. Your slot should accept & check that argument.

      • Is your _loadFinished(self) indeed attached as a slot to the signal? Have you checked it is actually being called?

      • It probably won't make any difference, but perhaps you should decorate your slot with @Slot?

      EDIT: Hmm, I see what you mean now. https://doc.qt.io/qtforpython/PySide2/QtWebEngineWidgets/QWebEnginePage.html#qwebenginepage seems to have setHtml() and mentions toHtml(), but unlike the C++ docs it does not show that the latter has been supplied at all. Did you try to see whether it works, even though it is not documented? Similarly, they don't show the overloads of QWebEnginePage.runJavaScript() which can get at JS results. Everything which takes a QWebEngineCallback seems not to be implemented.

      This is sad news for me, as I am moving from PyQt5 to PySide2, and I hoped PySide2 would by now have all the methods. You could raise this issue at the Qt bug board for PySide2 and see what the PySide2 folks have to say. They may be very helpful; it looks like there must be a reason these callbacks have not been implemented.

      Bad news I'm afraid :( I found https://bugreports.qt.io/browse/PYSIDE-474?focusedCommentId=365367&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-365367, from July 2017, stating:

      I think we should just blacklist or skip the test.

      QWebEngineCallback is a poor man's std::function, and thus not supported by PySide yet.

      You should perhaps lobby for an update on this. I have just made a post there at https://bugreports.qt.io/browse/PYSIDE-474?focusedCommentId=494236&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-494236, though I don't know whether the devs will look at new comments on old issues.

        v.n.lee
        wrote on last edited by
        #3
        def show_web_code(self):
            t = '''var test=document.getElementsByTagName('html')[0].innerHTML;alert(test);'''
            self._tab_widget.currentWidget().page().runJavaScript(t)
        

        The result:
        [screenshot not preserved]
        But I don't know how to transmit the code to Python

          JonB
          wrote on last edited by
          #4

          @v-n-lee said in Pyside2 toHtml()?:

          But I don't know how to transmit the code to python

          That is because you cannot do so, until QWebEngineCallback is dealt with. Did you understand what I wrote and the Qt bug posts it references?

            v.n.lee
            wrote on last edited by
            #5

            All right

              JonB
              wrote on last edited by
              #6

              @v-n-lee
              It looks like the current, open but unresolved, issue for this is https://bugreports.qt.io/browse/PYSIDE-946. I see no current workaround, though I have asked there if there is supposed to be one.

                eyllanesc
                wrote on last edited by eyllanesc
                #7

                @v-n-lee and @JonB A possible workaround is to use QWebChannel

                import sys
                
                from PySide2 import QtCore, QtWidgets, QtWebEngineWidgets, QtWebChannel
                
                
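                class Backend(QtCore.QObject):
                    def __init__(self, parent=None):
                        super(Backend, self).__init__(parent)
                        # Stays empty if the page never reports its HTML.
                        self._html = ""

                    # Called from JavaScript through the web channel.
                    @QtCore.Slot(str)
                    def toHtml(self, html):
                        self._html = html
                        QtCore.QCoreApplication.quit()

                    @property
                    def html(self):
                        return self._html


                class WebEnginePage(QtWebEngineWidgets.QWebEnginePage):
                    def __init__(self, url):
                        app = QtWidgets.QApplication([])
                        super(WebEnginePage, self).__init__()
                        self._backend = Backend()
                        self.loadFinished.connect(self.onLoadFinished)
                        self.load(url)
                        # Blocks until Backend.toHtml() or a failed load quits the loop.
                        app.exec_()

                    @property
                    def backend(self):
                        return self._backend

                    @QtCore.Slot(bool)
                    def onLoadFinished(self, ok):
                        if ok:
                            self.load_qwebchannel()
                            self.load_object()
                        else:
                            # Load failed: leave the event loop instead of hanging.
                            QtCore.QCoreApplication.quit()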
                
                    def load_qwebchannel(self):
                        file = QtCore.QFile(":/qtwebchannel/qwebchannel.js")
                        if file.open(QtCore.QIODevice.ReadOnly):
                            content = file.readAll()
                            file.close()
                            self.runJavaScript(content.data().decode())
                        if self.webChannel() is None:
                            channel = QtWebChannel.QWebChannel(self)
                            self.setWebChannel(channel)
                
                    def load_object(self):
                        if self.webChannel() is not None:
                            self.webChannel().registerObject("backend", self.backend)
                            script = r"""
                            new QWebChannel(qt.webChannelTransport, function (channel) {
                                var backend = channel.objects.backend;
                                var html = document.getElementsByTagName('html')[0].innerHTML;
                                backend.toHtml(html);
                            });"""
                            self.runJavaScript(script)
                
                
                if __name__ == "__main__":
                    url = QtCore.QUrl("https://forum.qt.io/topic/110775/pyside2-tohtml")
                    page = WebEnginePage(url)
                    print(page.backend.html)
                

                  JonB
                  wrote on last edited by
                  #8

                  @eyllanesc
                  Thank you for this. I assume it works! So QWebChannel is a component loadable from JS which can be used to communicate back to the Qt app host?

                    JonB
                    wrote on last edited by
                    #9

                    UPDATE: I see https://bugreports.qt.io/browse/PYSIDE-946 has just moved to In Progress, which is good news.

                      lokinou
                      wrote on last edited by lokinou
                      #10

                      Do not try to save to text or download the page using JS; this is a lost cause (for security reasons).
                      One workaround is to put the content of the page into the GET parameters of a form:

                      document.getElementById('name_of_my_element').innerHTML = '<form action="#" method="get"><input type="hidden" name="param1" value="'+  encodeURIComponent(getResultsHTML()) +'">' +'<input type="submit" value="Submit"></form>';
                      

                      I made a JS function getResultsHTML() that calculates results, but it can be any text, even the document.getElementById('name_of_my_element').innerHTML of your whole page.

                      In my python code it's simple:

                      import os

                      from PySide2 import QtCore
                      from PySide2.QtWebEngineWidgets import QWebEngineView

                      def run(self):
                          # ...
                          self.myview = QWebEngineView()
                          path_html_file = os.path.abspath(path_html_file)
                          url_tlx = QtCore.QUrl.fromLocalFile(path_html_file)
                          # Connect before loading so that no URL change is missed.
                          self.myview.urlChanged.connect(self.callback_url_changed)  # this will trigger when url is changed
                          self.myview.load(url_tlx)
                          self.myview.show()

                      def callback_url_changed(self, url):
                          # urlChanged passes the new QUrl to the slot.
                          print(url.toString())
                      # > file:///index.html?param1=text%20forwarded%20in%20get#
                      # continue with re and urllib.parse to extract and decode from the string
                      

                      until a solution comes out...
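
For the last step mentioned in the comments above (extracting and decoding the value from the changed URL), here is a minimal sketch using only urllib.parse. The helper name extract_param and its double_encoded flag are assumptions made for illustration, not part of PySide2, and the sample URL simply mirrors the one printed above.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def extract_param(url, name, double_encoded=False):
    """Return the decoded value of query parameter `name`, or None if absent.

    `extract_param` and `double_encoded` are hypothetical names for this sketch.
    """
    query = urlsplit(url).query
    # parse_qs percent-decodes each value once and maps name -> list of values.
    values = parse_qs(query, keep_blank_values=True).get(name)
    if not values:
        return None
    value = values[0]
    if double_encoded:
        # Assumption: the value went through encodeURIComponent() and was then
        # encoded again by the form submission, so decode a second time.
        value = unquote(value)
    return value


if __name__ == "__main__":
    sample = "file:///index.html?param1=text%20forwarded%20in%20get#"
    print(extract_param(sample, "param1"))
```

In callback_url_changed you could pass url.toString() and "param1" to this helper. Whether double_encoded=True is needed depends on how the form and QUrl encode the value, so check the printed URL first.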
