Gumbo Parser



  • Hi, I am trying out the Gumbo Parser library and
    have run into a problem.
    Environment: Ubuntu, Qt 5.4.2.
    My code:

    QString html("<html>"
                     "<head>"
                     " <title>Title</title>"
                     "</head>"
                     "<body>"
                     "<p>some string</p>"
                     "<div class = \'id\'><p>some dive ID</p></div>"
                     "</body>"
                     "</html>");
        GumboOutput* out = gumbo_parse(html.toUtf8().constData());
        GumboNode* node = out->root;
        qDebug() << node->v.element.children.length;
    

    Output: 2, which is correct, so the library is working.
    But this:

    GumboVector* vec = &node->v.element.children;
        for (int x = 0; x < vec->length; x++){
           GumboNode* nodeX = &vec->data[x];
            qDebug() << nodeX->v.text.text;
        }
    

    Compiler error: cannot convert 'void**' to 'GumboNode* {aka GumboInternalNode*}' in initialization
    GumboNode* nodeX = &vec->data[x];
    ^
    Please tell me how to convert. Thanks



  • @Evge said:

    Please tell me how to convert. Thanks

    Well, this is not exactly related to Qt... You should try asking the Gumbo developers.

    Edit:

    I read your question a bit too quickly; your problem is purely C++. Apparently vec->data[x] already gives you back a pointer, of which you are taking the address.
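
    To illustrate the point, here is a minimal sketch that does not use Gumbo itself. `Node`, `Vector`, `node_at` and `print_names` are hypothetical stand-ins that only mirror the shape of the real types: `GumboVector` stores its elements as `void*` in a `void** data` array, so `data[x]` is a `void*` that can be cast directly, whereas `&data[x]` is a `void**`, which is the type named in the compiler error.

```cpp
#include <cstdio>

// Hypothetical stand-ins that only mirror the shape of Gumbo's types.
struct Node {
    const char* name;
};

struct Vector {
    void** data;          // array of untyped pointers, one per element
    unsigned int length;  // number of valid entries in data
};

// Return the element at index i as a typed pointer.
// data[i] is already a void*, so it is cast directly. Writing &data[i]
// would give a void** (the address of the slot), which cannot be
// converted to Node*.
Node* node_at(const Vector& vec, unsigned int i) {
    return static_cast<Node*>(vec.data[i]);
}

// Print the name of every node stored in the vector.
void print_names(const Vector& vec) {
    for (unsigned int i = 0; i < vec.length; ++i) {
        std::printf("%s\n", node_at(vec, i)->name);
    }
}
```

    With the real library the same idea would read `static_cast<GumboNode*>(vec->data[x])`.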



  • Thanks @JohanSolo, you were right:

    your problem is purely C++. Apparently vec->data[x] already gives you back a pointer, of which you are taking the address.

    I tried this:

    QString html("<html>"
                     "<head>"
                            "<title id = \'152\'>BIG title<title>"
                            "<script></script>"
                     "</head>"
                     "<body>"
                            "<p class=\'ID\'>15444552</p>"
                            "<div></div>"
                            "<p>after div</p>"
                            "<table></table>"
                     "</body>"
                     "</html>");
        GumboOutput* out = gumbo_parse(html.toUtf8().constData());
        GumboNode* node = out->root;
        QString htmlTag = QString::fromUtf8(gumbo_normalized_tagname(node->v.element.tag));
        GumboVector* vec = &node->v.element.children;
        qDebug() << htmlTag << " | children - " << vec->length;
    
        for (int x = 0; x < vec->length; x++){
            GumboNode* nodeX = static_cast<GumboNode*>(vec->data[x]);
            GumboVector *vecX = &nodeX->v.element.children;
            for (int y = 0; y < vecX->length; y++){
                GumboNode* nodeY = static_cast<GumboNode*>(vecX->data[y]);
                QString childTag = QString::fromUtf8(gumbo_normalized_tagname(nodeY->v.element.tag));
                qDebug() << childTag << " | children - "<< nodeX->v.element.children.length;
            }
        }
    

    and I got an unexpected result :/
    "html" | children - 2
    "title" | children - 1
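
    A note on this output: it is what an HTML5 parser would be expected to produce if the second `<title>` in the sample is read as text rather than as a closing `</title>`. The title element then swallows the rest of the input, leaving head with a single child and body with none. Separately, in Gumbo the `v.element` member is only meaningful when `node->type` is `GUMBO_NODE_ELEMENT`, so a tree walk should check the type before reading it. The sketch below shows that pattern on a simplified model; `NodeType`, `Node` and `describe` are hypothetical names, not part of Gumbo.

```cpp
#include <string>
#include <vector>

// Hypothetical simplified model of a parsed tree (not Gumbo's real types).
enum class NodeType { Element, Text };

struct Node {
    NodeType type;
    std::string name;             // tag name for elements, content for text
    std::vector<Node*> children;  // only meaningful for elements
};

// Depth-first walk that records "tag | children - N" for every element.
// Non-element nodes are skipped before any element-only data is read.
void describe(const Node& node, std::vector<std::string>& out) {
    if (node.type != NodeType::Element) {
        return;
    }
    out.push_back(node.name + " | children - " +
                  std::to_string(node.children.size()));
    for (const Node* child : node.children) {
        describe(*child, out);
    }
}
```

    Recursing this way reports every level of the tree, and each line shows the child count of the node whose tag is printed.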


  • Moderators

    Gumbo Parser is not related to Qt; you should ask its authors.

