More regular expression questions
-
I am making some progress, but for the life of me I cannot find the doc explaining the INDIVIDUAL options,
such as "\w" matches a word, "+" matches all words in the string, etc.
I am also struggling with putting spaces in the expression...
The attached debug output matches "Menu " (with its trailing space) and ":". What I need is to match "main:".
passed expression "(\w+ |:)"
passed to analyze "\u0001\u001B[1;39m\u0002Menu main:\u0001\u001B[0m\u0002"
Pattern function "(\w+ |:)"
Match (word) "Menu "
Global match (words) ("Menu ")
Match (word) ":"
Global match (words) ("Menu ", ":")
-
Hi,
Why do you have a | in your expression ?
As for the meaning of the various symbols, these are standard regexp items. QRegularExpression uses PCRE2.
As already suggested in another of your threads, there's the QRegularExpression example that provides an easy-to-use interface to build and test your regular expressions.
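To make the individual symbols concrete, here is a minimal sketch of what each piece of the patterns in this thread does on the plain text "Menu main:". It uses Python's re module purely as a stand-in for QRegularExpression (an assumption for illustration; these basic tokens behave the same way in PCRE2 for ASCII input), and the helper name matches is hypothetical.

```python
import re

SUBJECT = "Menu main:"

def matches(pattern, subject=SUBJECT):
    """Return every non-overlapping match of pattern in subject.

    re.ASCII keeps \\w and \\s limited to ASCII, which is an assumption
    made here to mirror PCRE2's default behaviour.
    """
    return [m.group(0) for m in re.finditer(pattern, subject, re.ASCII)]

print(matches(r"\w"))      # \w = ONE word character: ['M', 'e', 'n', 'u', 'm', 'a', 'i', 'n']
print(matches(r"\w+"))     # +  = one or more of the preceding item: ['Menu', 'main']
print(matches(r"\w+ "))    # a space in the pattern is a literal space: ['Menu ']
print(matches(r"\w+\s"))   # \s = any whitespace character: ['Menu ']
print(matches(r"\w+:?"))   # ?  = the preceding item is optional: ['Menu', 'main:']
print(matches(r"\w+ |:"))  # |  = left alternative OR right alternative: ['Menu ', ':']
```

The last line reproduces the debug output from the first post: the alternation splits the pattern into "word followed by a space" or "a lone colon", so "main:" is never matched as one piece.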
-
@SGaist I want to match "word space" and "word :", so I used "space|:", meaning "space or :".
-
Then you have an error in your expression.
You want the space in any case, so what you are after is:
(\w+ :?)
And if you want the white space to be generic:
(\w+\s:?)
-
@SGaist Either pattern is still missing the word with ":".
I think the problem is that I am actually missing the LAST valid word in the string.
There must be some other setting I am missing.
passed expression "(\\w+\\s:?)"
passed to analyze "\u0001\u001B[1;39m\u0002Menu main:\u0001\u001B[0m\u0002"
Pattern function "(\\w+\\s:?)"
Match (word) "Menu "
Global match (words) ("Menu ")
-
@AnneRanch said in More regular expression questions:
"(\\w+\\s:?)"
This matches one or more characters making a "word", then a mandatory whitespace character, then an optional colon character. Hence, for example, main: would not match, as it does not have a space character prior to the colon. Instead it will only match against the "Menu " (Menu with its following space character).
- If you changed it to, say, "(\\w+\\s*:?)" the whitespace after the "word" would be optional, and it should match the main: on the second call.
- If you only want the words and not the punctuation/spaces then, as said previously, "(\\w+)" should do that.
- If you don't care about "individual words" and only want to pick out the Menu main: part as one string, try "[ -z]+" or "[ -~]+". However, this will produce multiple matches including e.g. the [1;39m and [0m fragments, since they are also "printable characters".
But since we don't know what you are trying to achieve I don't know what exactly you want.
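As a rough check of the patterns discussed above, the sketch below runs each one as a global match over the exact string from the debug output. Python's re module is used as a stand-in for QRegularExpression (an assumption; the constructs involved are the same in PCRE2), and global_match is a hypothetical helper name.

```python
import re

# The string from the debug output: control characters, an ESC [ ... m
# colour sequence, the visible text, then a second colour sequence.
SUBJECT = "\x01\x1b[1;39m\x02Menu main:\x01\x1b[0m\x02"

def global_match(pattern, subject=SUBJECT):
    """Return all non-overlapping matches, like a global match iterator."""
    return [m.group(0) for m in re.finditer(pattern, subject, re.ASCII)]

# Mandatory whitespace: "main:" has none before the colon, so it is skipped.
print(global_match(r"(\w+\s:?)"))   # ['Menu ']

# Optional whitespace: "main:" is found, but so are the digits and letters
# inside the colour sequences, because they are word characters too.
print(global_match(r"(\w+\s*:?)"))  # ['1', '39m', 'Menu ', 'main:', '0m']

# Words only, no spaces or punctuation.
print(global_match(r"(\w+)"))       # ['1', '39m', 'Menu', 'main', '0m']

# Runs of printable ASCII: the visible text as one string, plus the
# printable tails of the two colour sequences.
print(global_match(r"[ -~]+"))      # ['[1;39m', 'Menu main:', '[0m']
```

With the optional-whitespace pattern, "main:" is the fourth match on this input rather than the second, since the fragments "1" and "39m" from the first colour sequence match before "Menu ".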
-
@JonB said in More regular expression questions:
If you changed it to, say, "(\w+\s*:?)" the whitespace after the "word" would be optional, and it should match the main: on the second call.
No go, I tried that before and it also matched "NON words", AKA control characters.
OK, I'll say it (to make sure): all I want is to match REAL words, NOT control characters.
I will pass on this for now; I am looking into a different, less convoluted solution.
This "regular expression" is too "try this and see", not very predictable.
I often wonder why "they" call it "regular"; that implies, to me, there must be an "irregular" somewhere.
-
@AnneRanch
I really would have thought "[ -~]+" would work for what you want; did you try that?
It should have picked out Menu main:.
I did say you will get "spurious" matches against e.g. [1;39m and [0m as well though, because they are "genuine" characters in the middle of the terminal control sequences.
Understand if you have had enough.
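One way to avoid the spurious matches, sketched here only as a possibility, is to remove the terminal control sequences before looking for words. The sketch assumes the sequences are of the "ESC [ ... final byte" kind seen in the debug output and that the remaining \u0001 and \u0002 characters can simply be dropped. It uses Python's re module as a stand-in for QRegularExpression, and the names visible_text and real_words are hypothetical.

```python
import re

SUBJECT = "\x01\x1b[1;39m\x02Menu main:\x01\x1b[0m\x02"

# ESC [ followed by parameter characters, optional intermediate
# characters, and exactly one final character (e.g. the "m" in "[1;39m").
CSI_SEQUENCE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")

# Any remaining control character such as \x01 or \x02, plus DEL.
# Note: this also removes tabs and newlines, which is fine for a
# one-line prompt but is an assumption of this sketch.
CONTROL_CHAR = re.compile(r"[\x00-\x1f\x7f]")

def visible_text(raw):
    """Strip escape sequences first, then stray control characters."""
    return CONTROL_CHAR.sub("", CSI_SEQUENCE.sub("", raw))

def real_words(raw):
    """Words in the visible text, keeping a directly attached colon."""
    return re.findall(r"\w+:?", visible_text(raw), re.ASCII)

print(visible_text(SUBJECT))  # Menu main:
print(real_words(SUBJECT))    # ['Menu', 'main:']
```

The order matters: the escape sequences are removed as whole units first, so their printable tails ("[1;39m", "[0m") never reach the word pattern. In Qt the same two-step idea should carry over to QString::remove() with a QRegularExpression argument, though that has not been tested here.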