
Match anything except one word with QRegularExpression

  • The string I want to parse

    // there are many similar strings in the file; this is one of them
    <div class="rg_meta">{"id":"IuDOvhrPwcs2WM:","isu":"healthline.com","itg":0,"ity":"jpg","oh":728,"ou":"http://www.healthline.com/hlcmsresource/images/News/general-health/071916_unitsmoking_BODY.jpg","ow":1296,"pt":"Secondhand Smoke

    Expected result

    <div class="rg_meta">{"id":"IuDOvhrPwcs2WM:","isu":"healthline.com","itg":0,"ity":"jpg","oh":728,"ou"

    Actual result: no match

    My attempt

    QRegularExpression re("<div class=\"rg_meta\">{^(?!\"ou\").*\"ou\"");
    auto iter = re.globalMatch(contents);
    if (iter.hasNext()) {
        auto match = iter.next();
        qDebug() << match.captured(0);
    } else {
        qDebug() << "no match"; // this is what I get
    }

    How should I fix this? Thanks.

  • There is a workaround for my requirement. My ultimate goal is to get the image addresses (the "ou" and "tu" fields).

    QRegularExpression reg("<div class=\"rg_meta\">{[^}]*}");
    QRegularExpression link_big("\"ou\":\"([^\"]*)");
    QRegularExpression link_small("\"tu\":\"([^\"]*)");
    auto iter = reg.globalMatch(contents);
    while (iter.hasNext()) {
        QRegularExpressionMatch match = iter.next();
        // captured(0) is the whole <div class="rg_meta">{...} block
        auto const bm = link_big.match(match.captured(0));
        auto const sm = link_small.match(match.captured(0));
        // bm.captured(1) and sm.captured(1) hold the two image addresses
    }

    This solution works, but I want to know why the lookahead solution fails. How can I do it correctly in Qt? Thanks.
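
    A possible explanation, offered as a sketch rather than a definitive answer: the ^ in the middle of the pattern asserts the start of the subject, so it cannot succeed right after the {, and (?!"ou") only tests the single position where it is written. If the goal is just "everything up to the first "ou"", a lazy quantifier may be enough. The snippet below uses std::regex (ECMAScript grammar) in place of QRegularExpression so that it compiles without Qt; firstUpToOu is a hypothetical helper name, not something from the thread.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the text from the opening div up to and
// including the first "ou" key, or an empty string when nothing matches.
std::string firstUpToOu(const std::string &contents)
{
    // \{ is escaped so the brace is always read as a literal character.
    // .*? is lazy, so it stops at the first "ou" instead of the last one.
    static const std::regex re(R"rx(<div class="rg_meta">\{.*?"ou")rx");
    std::smatch m;
    return std::regex_search(contents, m, re) ? m.str(0) : std::string();
}
```

    With QRegularExpression the same idea would presumably be written as "<div class=\"rg_meta\">\\{.*?\"ou\"", but that string is an untested assumption here.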

  • Lifetime Qt Champion


    Shouldn't it rather be something like <div class=\"rg_meta\".+(?>"ou") or <div class=\"rg_meta\">[a-zA-Z0-9{":,.\/]+(?>"ou") ?

  • Thanks, you are right. I misunderstood how negative lookahead works in regex.

    It looks like there is no easy way to express

    match some string, followed by anything but a pattern

    with a regular expression. The workaround I use may be a good choice for this problem.
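
    For anyone who still wants "match some string, followed by anything but a pattern" in a single expression, one commonly used form repeats a negative lookahead in front of every consumed character: (?:(?!"ou").)*. The sketch below is only an illustration under stated assumptions: it uses std::regex (ECMAScript grammar) rather than QRegularExpression so it builds without Qt, and prefixesUpToOu is a hypothetical helper name.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: for every rg_meta block, collects the text from the
// opening div up to and including that block's first "ou" key.
std::vector<std::string> prefixesUpToOu(const std::string &contents)
{
    // (?:(?!"ou").)* consumes one character at a time, but only while the
    // text at the current position does not start with "ou".
    static const std::regex re(
        R"rx(<div class="rg_meta">\{(?:(?!"ou").)*"ou")rx");

    std::vector<std::string> out;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(contents.begin(), contents.end(), re);
         it != end; ++it) {
        out.push_back(it->str(0));
    }
    return out;
}
```

    Splitting the job into two simpler expressions, as in the workaround, is arguably easier to read; this form is mainly useful when everything has to fit in one pattern.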

  • I am marking this post as solved. If anybody comes up with a better solution, please post it here. Thanks.
