Problem with regular expression
-
I need to parse a web page to find musical chords, and '#' is usually used to denote a sharp (or diesis) note.
For example, C# is a common symbol I can find. -
My expression has to match A# or C# only when they are "isolated" from the rest of the page content.
-
Yes, but that way the RegExp engine matches the spaces too, and in my application this is not good.
-
Sorry, it still matches the spaces at the boundaries of the expression.
-
You don't want the whole matched text, only the captured text. Have a look at "QRegExp::cap() ":http://doc.qt.nokia.com/latest/qregexp.html#cap and the sample usage in the "Capturing Text":http://doc.qt.nokia.com/latest/qregexp.html#capturing-text section of the docs:
@
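// Backslashes must be doubled inside a C++ string literal.
QRegExp exp("\\s([a-gA-G][#bd]?)\\s");
QString test("You like the chord C# very well!");
int pos = 0;
while ((pos = exp.indexIn(test, pos)) != -1) {
    // cap(1) is the chord alone, without the surrounding whitespace.
    qDebug() << "found '" + exp.cap(1) + "'";
    pos += exp.matchedLength();
}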
@ -
I'm using QRegExp with the QTextDocument::find function and QTextCursor.
-
I doubt that this will be possible with regular expressions alone. You can keep the trailing boundary out of the match with a "positive lookahead assertion":http://doc.qt.nokia.com/latest/qregexp.html#assertions, but unfortunately there is no similar "look back" assertion. There is a Jira issue open with a suggestion for this ("QTBUG-2371":http://bugreports.qt.nokia.com/browse/QTBUG-2371); you can vote for it.
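
To illustrate the lookahead half of this, here is a rough sketch in standard C++. It uses std::regex instead of QRegExp so that it compiles without Qt; that swap is my own assumption, not something from the Qt docs. In its default ECMAScript mode std::regex has lookahead but no lookbehind, so the leading boundary still has to be consumed and then skipped by reading a capture group. The names @ChordMatch@ and @findIsolatedChords@ are made up for this example.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical result type: one isolated chord and where it starts.
struct ChordMatch {
    std::size_t offset;  // index of the chord's first character in the input
    std::string text;    // the chord itself, e.g. "C#", without any whitespace
};

// Hypothetical helper: finds chords that are delimited by whitespace or by
// the start/end of the input.
std::vector<ChordMatch> findIsolatedChords(const std::string &input)
{
    // (^|\s)            leading boundary: consumed by the match (no lookbehind
    //                   available), but kept out of capture group 2
    // ([A-Ga-g][#bd]?)  the chord, same character classes as in the thread
    // (?=\s|$)          trailing boundary: lookahead, so it is NOT consumed
    static const std::regex chord(R"((^|\s)([A-Ga-g][#bd]?)(?=\s|$))");

    std::vector<ChordMatch> result;
    for (std::sregex_iterator it(input.begin(), input.end(), chord), end;
         it != end; ++it) {
        const std::smatch &m = *it;
        // Group 2 gives both the text and the position of the chord alone.
        result.push_back({static_cast<std::size_t>(m.position(2)), m.str(2)});
    }
    return result;
}
```

Because the trailing whitespace is only looked at and not consumed, a single space between two chords (as in "C# A# Bb") can still serve as the leading boundary of the next match. With QRegExp the same pattern should be usable through cap(2) and pos(2), but I have not tested that here. Note that the character class from the thread also accepts a lone lowercase letter such as the English word "a".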