Pyside2 toHtml()?
-
    def _loadFinished(self):
        self.page().toHtml(self.callable)

    def callable(self, data):
        self.html = data
        self.app.quit()
I can find this function in PyQt5, but it does not exist in PySide2!
How can I get the source code of the loaded web page?
I tried using the JS code "var innerText = document.getElementsByTagName('html')[0].innerHTML", but how can I get innerText into Python?
-
@v-n-lee
As far as I can see, for the Python side, let's start with:
- QWebEnginePage::loadFinished(bool ok), https://doc.qt.io/qt-5/qwebenginepage.html#loadFinished, takes an argument stating whether the load was successful. Your slot should accept and check that argument.
- Is your _loadFinished(self) indeed attached as a slot to the signal? Have you checked it is actually being called?
- It probably won't make any difference, but perhaps you should decorate your slot with @Slot?
EDIT: Hmm, I see what you mean now. https://doc.qt.io/qtforpython/PySide2/QtWebEngineWidgets/QWebEnginePage.html#qwebenginepage has setHtml() and mentions toHtml(), but unlike the C++ docs it does not show that the latter has been supplied at all. Did you try to see if it works, even though it is not documented? Similarly, the docs don't show the overloads of QWebEnginePage.runJavaScript() which can get at JS results. Everything which takes a QWebEngineCallback seems not to be implemented.
This is sad news for me, as I am moving from PyQt5 to PySide2, and I hoped PySide2 would by now have all the methods. You could raise this issue at the Qt bug board for PySide2 and see what the PySide2 folks have to say. They may be very helpful; it looks like there must be a reason these callbacks have not been implemented.
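For the "did you try to see if it works" part, a quick check along these lines might help. It is only a sketch: has_method is a helper name made up here (not part of any Qt API), and the import is guarded so the snippet still runs where PySide2 or QtWebEngine is not installed.

```python
def has_method(cls, name):
    """Return True if `cls` exposes a callable attribute called `name`."""
    return callable(getattr(cls, name, None))


try:
    # Requires PySide2 with the QtWebEngine modules installed.
    from PySide2.QtWebEngineWidgets import QWebEnginePage
except ImportError:
    QWebEnginePage = None  # PySide2 (or QtWebEngine) is not available here

if QWebEnginePage is not None:
    # Report which of the methods discussed above the binding exposes.
    for name in ("setHtml", "toHtml", "runJavaScript"):
        print(name, has_method(QWebEnginePage, name))
```

Note that a method being present does not prove that the overload taking a callback is usable; it only tells you whether the name is exposed at all.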
Bad news I'm afraid :( I found https://bugreports.qt.io/browse/PYSIDE-474?focusedCommentId=365367&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-365367, from July 2017, stating:
"I think we should just blacklist or skip the test. QWebEngineCallback is a poor man's std::function, and thus not supported by PySide yet."
You should perhaps lobby for an update on this. I have just made a post there at https://bugreports.qt.io/browse/PYSIDE-474?focusedCommentId=494236&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-494236, though I don't know whether devs will look at new comments on old issues.
-
    def show_web_code(self):
        t = '''var test=document.getElementsByTagName('html')[0].innerHTML;alert(test);'''
        self._tab_widget.currentWidget().page().runJavaScript(t)
The result:
But I don't know how to transmit the code to Python.
-
@v-n-lee said in Pyside2 toHtml()?:
"But I don't know how to transmit the code to python"
That is because you cannot do so, until QWebEngineCallback is dealt with. Did you understand what I wrote and the Qt bug posts it references?
-
@v-n-lee
It looks like the current, open but unresolved, issue for this is https://bugreports.qt.io/browse/PYSIDE-946. I see no current workaround, though I have asked there if there is supposed to be one. -
@v-n-lee and @JonB A possible workaround is to use QWebChannel
    import sys

    from PySide2 import QtCore, QtWidgets, QtWebEngineWidgets, QtWebChannel


    class Backend(QtCore.QObject):
        @QtCore.Slot(str)
        def toHtml(self, html):
            self._html = html
            QtCore.QCoreApplication.quit()

        @property
        def html(self):
            return self._html


    class WebEnginePage(QtWebEngineWidgets.QWebEnginePage):
        def __init__(self, url):
            app = QtWidgets.QApplication([])
            super(WebEnginePage, self).__init__()
            self.load(url)
            self.loadFinished.connect(self.onLoadFinished)
            self._backend = Backend()
            app.exec_()

        @property
        def backend(self):
            return self._backend

        @QtCore.Slot(bool)
        def onLoadFinished(self, ok):
            if ok:
                self.load_qwebchannel()
                self.load_object()

        def load_qwebchannel(self):
            file = QtCore.QFile(":/qtwebchannel/qwebchannel.js")
            if file.open(QtCore.QIODevice.ReadOnly):
                content = file.readAll()
                file.close()
                self.runJavaScript(content.data().decode())
            if self.webChannel() is None:
                channel = QtWebChannel.QWebChannel(self)
                self.setWebChannel(channel)

        def load_object(self):
            if self.webChannel() is not None:
                self.webChannel().registerObject("backend", self.backend)
                script = r"""
                new QWebChannel(qt.webChannelTransport, function (channel) {
                    var backend = channel.objects.backend;
                    var html = document.getElementsByTagName('html')[0].innerHTML;
                    backend.toHtml(html);
                });"""
                self.runJavaScript(script)


    if __name__ == "__main__":
        url = QtCore.QUrl("https://forum.qt.io/topic/110775/pyside2-tohtml")
        page = WebEnginePage(url)
        print(page.backend.html)
-
@eyllanesc
Thank you for this. I assume it works! So QWebChannel is a component loadable from JS which can be used to communicate back to the Qt app host?
-
UPDATE: I see https://bugreports.qt.io/browse/PYSIDE-946 has just moved to In Progress, which is good news.
-
Do not try to save to text or download the page using JS; this is a lost cause (for security reasons).
One workaround: put the content of the page within the GET parameters of a form:
    document.getElementById('name_of_my_element').innerHTML =
        '<form action="#" method="get">' +
        '<input type="hidden" name="param1" value="' + encodeURIComponent(getResultsHTML()) + '">' +
        '<input type="submit" value="Submit"></form>';
I made a JS function getResultsHTML() that calculates results, but it can be any text, even the document.getElementById('name_of_my_element').innerHTML of your whole page.
In my Python code it's simple:
    import os

    from PySide2 import QtCore
    from PySide2.QtWebEngineWidgets import QWebEngineView

    def run(self):
        # ...
        self.myview = QWebEngineView()
        path_html_file = os.path.abspath(path_html_file)
        url_tlx = QtCore.QUrl.fromLocalFile(path_html_file)
        self.myview.load(url_tlx)
        self.myview.show()
        # this will trigger when the URL changes
        self.myview.urlChanged.connect(self.callback_url_changed)

    def callback_url_changed(self):
        print(self.myview.url().toString())
        # > file:///index.html?param1=text%20forwared%20in%20get#
        # continue with re and urllib.parse to extract and decode from the string
until a solution comes out...
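The last step mentioned in the comments above (extracting and decoding the parameter) could look roughly like this. It is a minimal sketch using only the standard library; extract_param and its double_encoded flag are names made up for illustration, and the flag is an assumption for the case where the value was passed through encodeURIComponent() and then percent-encoded again when the form was submitted.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def extract_param(url, name="param1", double_encoded=False):
    """Return the decoded value of query parameter `name`, or None if absent."""
    query = urlsplit(url).query
    # parse_qs already undoes one level of percent-encoding.
    values = parse_qs(query, keep_blank_values=True).get(name)
    if not values:
        return None
    value = values[0]
    if double_encoded:
        # Assumption: the value was encoded twice, so decode once more.
        value = unquote(value)
    return value


if __name__ == "__main__":
    url = "file:///index.html?param1=text%20forwared%20in%20get#"
    print(extract_param(url))
```

In callback_url_changed() you would pass self.myview.url().toString() as the url argument.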