
Match anything except one word with QRegularExpression



  • String I want to parse

    //there are many similar strings in the file, this is one of them
    <div class="rg_meta">{"id":"IuDOvhrPwcs2WM:","isu":"healthline.com","itg":0,"ity":"jpg","oh":728,"ou":"http://www.healthline.com/hlcmsresource/images/News/general-health/071916_unitsmoking_BODY.jpg","ow":1296,"pt":"Secondhand Smoke
    

    Expected result

    <div class="rg_meta">{"id":"IuDOvhrPwcs2WM:","isu":"healthline.com","itg":0,"ity":"jpg","oh":728,"ou"
    

    Actual result: no match

    My solution

    QRegularExpression re("<div class=\"rg_meta\">{^(?!\"ou\").*\"ou\"");
    auto iter = re.globalMatch(contents);
    if(!iter.hasNext()){
      qDebug()<<"no match";
    }
    while(iter.hasNext()){
      auto match = iter.next();
      qDebug()<<match.captured(0);
    }
    

    How should I fix this error? Thanks.



  • There is a workaround for my requirement; my ultimate goal is to get the image addresses (ou and tu).

    QRegularExpression reg("<div class=\"rg_meta\">{[^}]*}");
    // build the field patterns once, outside the loop
    QRegularExpression link_big("\"ou\":\"([^\"]*)");
    QRegularExpression link_small("\"tu\":\"([^\"]*)");
    auto iter = reg.globalMatch(contents);
    while(iter.hasNext()){
        QRegularExpressionMatch match = iter.next();
        auto const bm = link_big.match(match.captured(0));
        auto const sm = link_small.match(match.captured(0));
        // captured(1) is the URL alone; captured(0) would also include the key
        qDebug()<<bm.captured(1)<<","<<sm.captured(1);
    }
    

    This solution works, but I want to know why the lookahead solution fails. How can I do it correctly in Qt? Thanks.
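
    A note on the capture groups used in this kind of extraction: group 0 is the whole match, key included, while group 1 holds only the text inside the parentheses, which here is the URL. Below is a minimal sketch of that idea. It uses std::regex instead of QRegularExpression so that it runs without Qt; the helper name fieldValue is hypothetical, and it assumes the key contains only letters and the value contains no escaped quotes.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the thread): returns the value of a
// "key":"value" pair inside one rg_meta block, or an empty string when
// the key is not found. Assumes `key` has no regex metacharacters.
std::string fieldValue(const std::string& block, const std::string& key)
{
    // Builds the pattern  "key":"([^"]*)"  where group 1 is the value.
    const std::regex re("\"" + key + "\":\"([^\"]*)\"");
    std::smatch m;
    // m.str(0) would be the whole match, key included; m.str(1) is the value.
    return std::regex_search(block, m, re) ? m.str(1) : std::string();
}
```

    Calling fieldValue(block, "ou") and fieldValue(block, "tu") on one matched block would give the two addresses without the surrounding key text.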


  • Lifetime Qt Champion

    Hi,

    Shouldn't it rather be something like <div class=\"rg_meta\".+(?>"ou") or <div class=\"rg_meta\">[a-zA-Z0-9{":,.\/]+(?>"ou") ?



  • Thanks, you are right; I misunderstood how negative lookahead works in regex.

    It looks like there is no easy way to express

    match some string > match anything but a pattern

    with a regular expression. The workaround I use may be a good choice for this problem.



  • I am marking this post as solved. If anybody comes up with a better solution, please post it here. Thanks.
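
  • For reference, one possible way to get the expected result with a lookahead. In the original pattern, ^ asserts the start of the subject (unless the multiline option is set), so it can never succeed right after the {. Also, (?!"ou") is zero-width: it checks a single position and consumes nothing. Repeating the check before every character, as in (?:(?!"ou").)*, stops the match at the first "ou". A lazy .*?"ou" should give the same result here.

    The sketch below uses std::regex rather than QRegularExpression so that it runs without Qt; PCRE, which QRegularExpression is built on, supports the same lookahead syntax. The helper name prefixUpToOu is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the thread): returns the text from the
// opening rg_meta div up to and including the first "ou" key, or an
// empty string when the input does not match.
std::string prefixUpToOu(const std::string& contents)
{
    // (?:(?!"ou").)* consumes one character at a time, but only while the
    // text at the current position does not start with "ou" (quotes included).
    static const std::regex re(R"rx(<div class="rg_meta">(?:(?!"ou").)*"ou")rx");
    std::smatch m;
    return std::regex_search(contents, m, re) ? m.str(0) : std::string();
}
```

    With QRegularExpression the same pattern would go through globalMatch(contents) and captured(0), with the quotes escaped as in the earlier snippets.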

