[SOLVED] QRegExp is too greedy...
-
Hello
I have a QString, "data" that contains the source of a web page. Included in this is the following line:@<iframe width="640" height="360" src="http://www.mysite.com/embed/4Ieelwhw" frameborder="0" allowfullscreen></iframe>@
I'm trying to extract the value of the src attribute using a regexp. This is what I'm doing:
@QRegExp regx("<iframe .*src=\"(.*)(\".*></iframe>)"); //<b>[^<]*</b>
regx.setMinimal(true);
regx.indexIn(strType);
qDebug() << strType.mid(regx.pos(1), regx.pos(2) - regx.pos(1));@
My problem is that the regex is too "greedy" even though setMinimal is set to true. The debug output is:
"http://www.mysite.com/embed/4Ieelwhw" frameborder="0"
How can I make it stop at the first " and not include the frameborder? Thanks
-RS
-
I think a '?' character after the quantifier in your regex disables the greedy mode.
-
I have not tested this!
You could also try: "src=\"([^\"]*)\"".
That is, src=" followed by any number of characters that are not a ", and then the closing ".
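To show what the negated character class buys you, here is a minimal sketch. It uses std::regex from the standard library instead of QRegExp, so that it compiles without Qt; the helper names extractSrc and extractSrcGreedy are made up for illustration. The patterns are the simple src="..." ones, not the full iframe pattern from the question.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the value of the first src="..." attribute,
// or an empty string when no such attribute is found.
std::string extractSrc(const std::string& html)
{
    // [^"]* matches a run of characters that are not a double quote,
    // so the capture cannot run past the attribute's closing quote.
    static const std::regex srcAttr("src=\"([^\"]*)\"");
    std::smatch match;
    if (std::regex_search(html, match, srcAttr)) {
        return match[1].str();
    }
    return std::string();
}

// Same helper with a greedy ".*" capture, for comparison only.
std::string extractSrcGreedy(const std::string& html)
{
    // ".*" grabs as much as it can and then backtracks to the LAST
    // double quote on the line, not the first one after src=".
    static const std::regex srcAttr("src=\"(.*)\"");
    std::smatch match;
    if (std::regex_search(html, match, srcAttr)) {
        return match[1].str();
    }
    return std::string();
}
```

Called on the iframe line from the first post, extractSrc should stop at the first closing quote, while extractSrcGreedy should also swallow the frameborder attribute.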
-
Unfortunately neither suggestion helped...
I tried:
@QRegExp regx("<iframe .*src=\"([^\”]*)(\".*></iframe>)");
regx.setMinimal(true);
regx.indexIn(strType);
qDebug() << strType.mid(regx.pos(1), regx.pos(2) - regx.pos(1));@
But the output is the same:
"http://www.mysite.com/embed/4Ieelwhw" frameborder="0"
I also tried:
@QRegExp regx("<iframe .*src=\"(.?)(\".*></iframe>)");
QRegExp regx("<iframe .*src=\"(.?)(\".*></iframe>)");@
But neither matched anything... Any further suggestions? Thanks!
-RS
-
OK, I found the problem while posting my reply. Apparently there was a copy-paste issue: a typographic quote in [^\”]* instead of a straight one in [^\"]*.
It now works with:
@QRegExp regx("<iframe .*src=\"([^\"]*)(\".*></iframe>)");@
Thank you!
-RS
-
I would rather write (don't know whether it works though):
@QRegExp regx("<iframe .*src=\"(.+?)(\".*></iframe>)");@
Note the + instead of your *. + stands for at least one character and maybe more, whilst * stands for zero or more characters (it is ? that stands for zero or one).
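For reference, here is a quick way to check the three quantifiers: * (zero or more), + (one or more) and ? (zero or one). This is only a sketch using std::regex instead of QRegExp, so that it builds without Qt; matchesWhole is a made-up helper name.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when the whole of `text` matches `pattern`.
bool matchesWhole(const std::string& text, const std::string& pattern)
{
    // std::regex_match only succeeds if the entire string is matched.
    return std::regex_match(text, std::regex(pattern));
}
```

With this, matchesWhole("", "a*") and matchesWhole("aaa", "a*") should both be true, matchesWhole("", "a+") should be false, and matchesWhole("aa", "a?") should be false because ? allows at most one occurrence.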
[quote author="ThaRez" date="1340956318"]I also tried:
@QRegExp regx("<iframe .*src=\"(.?)(\".*></iframe>)");
QRegExp regx("<iframe .*src=\"(.?)(\".*></iframe>)");@
But neither matched anything... Any further suggestions? Thanks!
-RS[/quote]