Gumbo Parser



  • Hi, I'm trying out the Gumbo Parser library and
    have run into a problem.
    Ubuntu, Qt 5.4.2
    My code:

    QString html("<html>"
                     "<head>"
                     " <title>Title</title>"
                     "</head>"
                     "<body>"
                     "<p>some string</p>"
                     "<div class = \'id\'><p>some dive ID</p></div>"
                     "</body>"
                     "</html>");
        GumboOutput* out = gumbo_parse(html.toUtf8().constData());
        GumboNode* node = out->root;
        qDebug() << node->v.element.children.length;
    

    Output: 2, which is correct, so the library is working.
    But this:

    GumboVector* vec = &node->v.element.children;
        for (int x = 0; x < vec->length; x++){
           GumboNode* nodeX = &vec->data[x];
            qDebug() << nodeX->v.text.text;
        }
    

    Compiler error: cannot convert 'void**' to 'GumboNode* {aka GumboInternalNode*}' in initialization
    GumboNode* nodeX = &vec->data[x];
    ^
    Please tell me how to convert. Thanks



  • @Evge said:

    Please tell me how to convert. Thanks

    Well, this is not exactly related to Qt... You should try asking the Gumbo developers.

    Edit:

    I read your question a bit too quickly; your problem is purely C++. Apparently vec->data[x] already gives you back a pointer, of which you are taking the address.
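
To illustrate the point about `void**`, here is a minimal sketch that does not use Gumbo itself. `Node`, `PtrVector` and `nodeNames` are hypothetical stand-ins that only mirror the shape visible in the question (a `data` array of untyped pointers plus a `length`); the sketch assumes the real vector is laid out the same way, which is what the compiler error suggests.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed node (not a Gumbo type).
struct Node {
    std::string name;
};

// Hypothetical stand-in for a vector that stores untyped pointers.
struct PtrVector {
    void** data;          // array of void*, one entry per element
    unsigned int length;  // number of entries in data
};

// Collects the name of every node stored in the vector.
std::vector<std::string> nodeNames(const PtrVector* vec) {
    assert(vec != nullptr);
    std::vector<std::string> names;
    for (unsigned int i = 0; i < vec->length; ++i) {
        // vec->data[i] is already a void*; &vec->data[i] would be a void**.
        // C++ requires an explicit cast from void* to the real pointer type.
        Node* node = static_cast<Node*>(vec->data[i]);
        names.push_back(node->name);
    }
    return names;
}
```

Using an unsigned loop index also avoids the signed/unsigned comparison warning that `int x` against an unsigned `length` tends to trigger.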



  • Thanks @JohanSolo, you were right:

    your problem is purely C++. Apparently vec->data[x] already gives you back a pointer, of which you are taking the address.

    I tried this:

    QString html("<html>"
    
                     "<head>"
                            "<title id = \'152\'>BIG title<title>"
                            "<script></script>"
                     "</head>"
                     "<body>"
                            "<p class=\'ID\'>15444552</p>"
                            "<div></div>"
                            "<p>after div</p>"
                            "<table></table>"
                     "</body>"
                     "</html>");
        GumboOutput* out = gumbo_parse(html.toUtf8().constData());
        GumboNode* node = out->root;
        QString htmlTag = QString::fromUtf8(gumbo_normalized_tagname(node->v.element.tag));
        GumboVector* vec = &node->v.element.children;
        qDebug() << htmlTag << " | children - " << vec->length;
    
        for (int x = 0; x < vec->length; x++){
            GumboNode* nodeX = static_cast<GumboNode*>(vec->data[x]);
            GumboVector *vecX = &nodeX->v.element.children;
            for (int y = 0; y < vecX->length; y++){
                GumboNode* nodeY = static_cast<GumboNode*>(vecX->data[y]);
                QString childTag = QString::fromUtf8(gumbo_normalized_tagname(nodeY->v.element.tag));
                qDebug() << childTag << " | children - "<< nodeX->v.element.children.length;
            }
        }
    

    and I got an unexpected result :/
    "html" | children - 2
    "title" | children - 1

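A possible explanation, judging only from the code posted above: the title element is closed with `<title>` rather than `</title>`, so an HTML5 parser would probably treat the rest of the input as title text, leaving head with a single child and body empty. The inner loop also prints the child count of `nodeX` (the parent) instead of `nodeY`, and it reads `v.element` without first checking that the node is an element (in Gumbo that check is `node->type == GUMBO_NODE_ELEMENT`). The sketch below shows the intended traversal on hypothetical stand-in types (`NodeType`, `Node`, `describe`), not on Gumbo's own structures.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed node: either an element (tag plus
// children) or a text node (content only). Not a Gumbo type.
enum class NodeType { Element, Text };

struct Node {
    NodeType type;
    std::string value;           // tag name for elements, content for text
    std::vector<Node> children;  // only meaningful for elements
};

// Appends one line per element, indented by depth, reporting the
// element's OWN child count (not its parent's).
void describe(const Node& node, int depth, std::vector<std::string>& out) {
    assert(depth >= 0);
    if (node.type != NodeType::Element) {
        return;  // text nodes have no tag and no children to report
    }
    const std::string indent(static_cast<std::size_t>(depth) * 2, ' ');
    out.push_back(indent + node.value + " | children - " +
                  std::to_string(node.children.size()));
    for (const Node& child : node.children) {
        describe(child, depth + 1, out);
    }
}
```

Recursing instead of nesting loops means the walk reaches every depth of the tree, and each line reports the node that is actually being named.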

  • Lifetime Qt Champion

    Gumbo parser is not related to Qt; you should ask its authors.

