Peculiar Qt WebKit JavaScript behavior



  • I've run into a peculiar Qt WebKit JavaScript problem. It took me a while to find a simple reproducible example because the problem occurs intermittently. Because Qt's implementation of WebKit doesn't allow direct access to the nodes of an element, I've been forced to use JavaScript to manipulate the HTML5 DOM. This has resulted in a problem that occurs only in the Qt WebKit implementation (the same code works in IE, Firefox, Chrome and Safari). The code that consistently reproduces the problem is as follows:

    main.cpp
    @#include <QtGui/QApplication>
    #include "mainwindow.h"

    int main(int argc, char *argv[])
    {
        QApplication a(argc, argv);
        MainWindow w;
        w.show();
        return a.exec();
    }
    @

    mainwindow.h
    @#ifndef MAINWINDOW_H
    #define MAINWINDOW_H

    #include <QMainWindow>

    namespace Ui {
        class MainWindow;
    }

    class MainWindow : public QMainWindow
    {
        Q_OBJECT

    public:
        explicit MainWindow(QWidget *parent = 0);
        ~MainWindow();

    private:
        Ui::MainWindow *ui;
    };

    #endif // MAINWINDOW_H@

    mainwindow.cpp (... indicates the path to augh.html)
    @#include "mainwindow.h"
    #include "ui_mainwindow.h"
    #include <QUrl>

    MainWindow::MainWindow(QWidget *parent) :
        QMainWindow(parent),
        ui(new Ui::MainWindow)
    {
        ui->setupUi(this);
        // Load the test page into the QWebView defined in the .ui file.
        QUrl url("...augh.html");
        ui->webView->load(url);
    }

    MainWindow::~MainWindow()
    {
        delete ui;
    }@

    augh.html
    @<!DOCTYPE html5>
    <html>
    <head>
    <title>fragment test</title>
    [removed]
    var ca;
    window.onload=function(){
        var range = document.createRange();
        var mybody = document.body;
        var children=mybody.childNodes;
        var startOffset;
        // Find the index of the first element node in the body.
        for(var i=0;i<children.length;i++){
            if(children[i].nodeType===1){
                startOffset=i;
                break;
            }
        }
        // Select exactly that one element with the range.
        range.setStart(mybody,startOffset);
        range.setEnd(mybody,startOffset+1);
        ca = range.commonAncestorContainer;
        alert("direct:\n"+mybody[removed]);
        alert("common ancestor innerHTML:"+ca[removed]);
        alert("ancestor nodeName:"+ca.nodeName);
        alert("common ancestor type = "+ca.nodeType);
    }
    [removed]
    </head>
    <body>
    <img id='myid' src="pict.png" />
    </body>
    </html>@

    (Note: the script tags have been removed and the image tag <img id="myid" src="pict.png" /> has been replaced by pict.png in the above HTML.)
    The problem is that innerHTML sometimes replaces markup characters (for example <, > and ") with their entity representations (&lt;, &gt; and &quot;), as can be seen from the alert output in the above example. Elsewhere I've enclosed the img tag in a div in order to prevent this, but it still occurs intermittently. As noted earlier, the above code works as expected in Chrome, Safari, Firefox and IE.
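
    As a stopgap while the cause is unknown, the entity substitutions described above can be reversed before the string is used. The following is only a minimal sketch: decodeBasicEntities is a hypothetical helper that is not part of the test page, and it handles only the characters mentioned in this thread.

```javascript
// Hypothetical helper (not part of the original test page): reverses the
// entity substitutions for <, >, " and & in a string.
function decodeBasicEntities(text) {
    var map = { "&lt;": "<", "&gt;": ">", "&quot;": "\"", "&amp;": "&" };
    // A single pass is used on purpose, so that "&amp;lt;" becomes "&lt;"
    // and is not decoded a second time into "<".
    return String(text).replace(/&(?:lt|gt|quot|amp);/g, function (entity) {
        return map[entity];
    });
}

var decoded = decodeBasicEntities("&lt;img id=&quot;myid&quot; src=&quot;pict.png&quot;&gt;");
// decoded is now: <img id="myid" src="pict.png">
```

    Decoding unconditionally also rewrites entities that legitimately belong in the markup (for example an &amp; inside an attribute value), so this hides the symptom rather than fixing it.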

    My question about this situation is twofold:

    • Is there something I don't understand about how to use JavaScript? That is, is there some procedure I can use that will ensure consistent results?
    • Is it just a problem with innerHTML, or do the results reflect a real problem in the JavaScript DOM node manipulation?

    Any help will be appreciated.



  • Just noticed that the innerHTML part of the alert calls has been removed in the above code. Just replace [removed] with a period followed by innerHTML to get the correct alert JavaScript.



  • I've done some more investigating and the results just get weirder and weirder. It appears that the problem is inherent in Qt's JavaScript engine as the following html illustrates:

    @<!DOCTYPE html5>
    <html>
    <head>
    <title>Alert test</title>
    [removed]
    var ca;
    window.onload=function(){
        var tst="x&";
        alert("tst0="+tst);
        tst="<x&";
        alert("tst1="+tst);
        tst="<x&";
        alert("tst2=\n"+tst);
        tst="\n<&";
        alert("tst3="+tst);
    }
    [removed]
    </head>
    <body>
    </body>
    </html>@

    (Note: the code block has inserted an extra backslash in tst3 which is not present in the actual code, and has replaced the script tags with [removed].)
    All of these alert boxes should just contain the strings defined in tst. That is, the four alert boxes should display:

    first alert box display:
    tst0=x&

    second alert box display:
    tst1=<x&

    third alert box display:
    tst2=
    <x&

    fourth alert box display:
    tst3=
    <&

    All the major browsers (IE, Firefox, Chrome, Safari) produce these results, but the Qt script engine produces the following anomalous results:
    first alert box display:
    tst0=x&amp;

    second alert box display:
    tst1=<x&

    third alert box display:
    tst2=
    &lt;x&amp;

    fourth alert box display:
    tst3=
    &lt;&amp;

    Only the tst1 result is identical to the result produced by the major browsers. It appears that the text string is somehow parsed by the alert method, but I haven't been able to figure out the parsing algorithm. As these results show, preceding a string of characters with a newline seems to cause the greater-than (>), less-than (<) and ampersand (&) characters to be replaced by their entity representations. Unfortunately it seems to be somewhat more complicated than this, as can be seen from the tst0 and tst1 results: tst0 contains no newline yet its ampersand is replaced, while tst1 is left untouched.
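
    One way to narrow this down is to check whether the string itself changes or only what the alert box displays. The sketch below is a suggestion, not something from the original test page: charCodes is a hypothetical helper that reports the stored characters as numbers, which contain none of the affected characters and so should be displayed unchanged.

```javascript
// Hypothetical diagnostic helper: lists the character codes of a string so
// the stored value can be checked independently of how alert() renders it.
function charCodes(text) {
    var codes = [];
    for (var i = 0; i < text.length; i++) {
        codes.push(text.charCodeAt(i));
    }
    return codes.join(",");
}

var tst = "\n<&";
var report = "length=" + tst.length + " codes=" + charCodes(tst);
// report is "length=3 codes=10,60,38" when the string is unaltered
```

    If alert(report) still shows a length of 3 and the codes 10,60,38 inside Qt WebKit, the entities are being introduced when the dialog text is displayed rather than by the script engine.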

    If anyone knows the algorithm used by the script engine, or knows how to prevent the alert method from inserting entities, please let me know as, for me, this is a showstopper.

    Since this relates to the script engine used by Qt, not just to WebKit, I'm also going to open a thread in the general forum.



  • In case anyone else runs into these problems, I've found some workarounds. The alert method can be reliably emulated with a QMessageBox: create a QObject with a Q_INVOKABLE alert method that simply shows a QMessageBox, expose it to the page through the QtWebKit Bridge, and then replace the default window alert method with
    @myQObject.alert(text)@
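
    The JavaScript side of that replacement could look roughly like the sketch below. It assumes the object has already been registered with the page under the name myQObject (for example with QWebFrame::addToJavaScriptWindowObject); installAlertReplacement and its two parameters are hypothetical names used only for illustration.

```javascript
// Sketch of the JavaScript side of the workaround. `bridge` stands for the
// object exposed through the QtWebKit Bridge (myQObject above) and `target`
// is normally `window`. Both are parameters so the function can be tried
// outside a browser.
function installAlertReplacement(target, bridge) {
    var original = target.alert;
    target.alert = function (text) {
        // Convert to a string explicitly, as the built-in alert() does.
        bridge.alert(String(text));
    };
    return original; // returned so the default alert can be restored later
}
```

    In the page this would be called once, early in the script, as installAlertReplacement(window, myQObject); every later alert(...) call then goes to the QMessageBox-based method.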

    The innerHTML problem was more difficult to solve, but by directly manipulating DOM nodes I was able to create the desired functionality.
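
    The code for this is not shown in the thread. Purely as an illustration of the idea, and not necessarily what was actually used, markup can be built by walking the nodes instead of reading innerHTML. serializeNode and VOID_TAGS are hypothetical names, and only element and text nodes are handled.

```javascript
// Hypothetical helper: builds markup for a node by walking the DOM instead
// of reading innerHTML. Only element (1) and text (3) nodes are handled.
var VOID_TAGS = { img: true, br: true, hr: true, input: true, meta: true, link: true };

function serializeNode(node) {
    if (node.nodeType === 3) {
        return node.nodeValue;           // text is emitted verbatim, unescaped
    }
    if (node.nodeType !== 1) {
        return "";                       // comments and other nodes are skipped
    }
    var tag = node.nodeName.toLowerCase();
    var html = "<" + tag;
    for (var i = 0; i < node.attributes.length; i++) {
        var attr = node.attributes[i];
        // Attribute values are copied as-is; embedded quotes are not escaped.
        html += " " + attr.name + "=\"" + attr.value + "\"";
    }
    html += ">";
    if (VOID_TAGS[tag]) {
        return html;                     // void elements have no end tag
    }
    for (var j = 0; j < node.childNodes.length; j++) {
        html += serializeNode(node.childNodes[j]);
    }
    return html + "</" + tag + ">";
}
```

    For the test page above, serializeNode(document.getElementById("myid")) would build the img markup from the node's name and attributes, so the result does not depend on how innerHTML serializes the characters.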

