Regular expression QT style
-
wrote on 14 Aug 2022, 21:01 last edited by
Finally found a resource which explains the symbols used in regular expressions.
As a result, here is what I know about using "\w" (lower case).
This code matches a single ASCII word character (a letter, digit, or underscore):
"(\w)"
This matches multiple such characters (plural), hence a word, and also the first word in a string:
"(\w+)"
Now HOW do I code to match the entire string? AKA multiple words?
Would placing "$" SOMEWHERE work?
And how do I match the LAST ASCII word in the string?
PS It appears that "\w+" will match the string terminated as a "line".
Hence a text file has "lines", and finding the last word in the file takes a few additional lines of code.
I have not been able to duplicate this.
-
wrote on 14 Aug 2022, 21:17 last edited by JonB
@AnneRanch said in Regular expression QT style:
now HOW do I code to match the entire string ?
.*
matches an entire string.
AKA multiple words ?
If you mean you want to capture words, something like
((\w+)\s*)*
or
((\w+)\W*)*
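To illustrate the difference between matching the whole string and picking out words, here is a minimal sketch. It uses std::regex from the C++ standard library instead of QRegularExpression so that it builds without Qt; that substitution and the helper names all_words and dot_star_matches_all are assumptions for illustration, not something from the posts above. For these simple patterns the syntax is the same. Two details are worth knowing: a repeated group such as ((\w+)\s*)* keeps only the text of its last repetition, so the sketch collects words by matching \w+ repeatedly instead, and by default `.` does not match a line break, so `.*` does not cover a string that contains "\n".

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every run of word characters (\w+) in text,
// in order of appearance.
std::vector<std::string> all_words(const std::string& text)
{
    static const std::regex word(R"(\w+)");
    std::vector<std::string> words;
    // sregex_iterator walks over all non-overlapping matches of the pattern.
    for (std::sregex_iterator it(text.begin(), text.end(), word), end; it != end; ++it)
        words.push_back(it->str());
    return words;
}

// Hypothetical helper: true if the pattern ".*" matches the whole of text.
bool dot_star_matches_all(const std::string& text)
{
    static const std::regex any(".*");
    return std::regex_match(text, any);
}
```

In Qt, QRegularExpression::globalMatch() plays the role of the iterator used here.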
-
wrote on 15 Aug 2022, 03:39 last edited by
@JonB Thanks, I got it sort of working. The problem is that the "source" is not formatted in nice lines...
I'll try to "split" it by "\n" or just count the characters.
This is all OK, but what I really need is to detect the position of the last character in the source and analyze / read the next characters.
I am using the "text changed" signal and it always reads the entire editText - I need to read only AFTER the end of the original text.
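One possible way to read only what was added after the original text, sketched in plain standard C++ so it builds without Qt: remember the text as it was, and on each change take whatever follows that prefix. The function name appended_text is hypothetical, and the sketch assumes the user only appends at the end; if the original part itself was edited, it returns an empty string. In Qt the same idea could be written with QString::startsWith() and QString::mid().

```cpp
#include <string>

// Hypothetical helper: returns the part of current that follows original,
// or an empty string if current does not start with original.
std::string appended_text(const std::string& original, const std::string& current)
{
    // compare() on the leading original.size() characters checks the prefix.
    if (current.size() >= original.size() &&
        current.compare(0, original.size(), original) == 0)
        return current.substr(original.size());
    return std::string();
}
```

The returned tail can then be searched for a word without the earlier "\n" characters getting in the way.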
Maybe there is another way to 'skin the cat'...
-
wrote on 15 Aug 2022, 15:50 last edited by
I am still not sure what expression to use to retrieve the LAST word from the QString.
The doc is helping, but it would help more if complex expressions had a better verbal description, or at least some.
-
wrote on 15 Aug 2022, 16:42 last edited by JonB
@AnneRanch
Depending, you might find
(\w+)\W*$
works for you (e.g. try it at https://regex101.com/).
-
wrote on 18 Aug 2022, 13:58 last edited by candy76041820
https://regex101.com/r/8zkvRF/1
QRegularExpression rx(R"(^\W*(\w+\W+)*(?<LastWord>\w+)\W*$)", QRegularExpression::DontCaptureOption);
QString str = "Capturing last word in a line.";
qDebug() << rx.match(str).captured("LastWord");
Or, just capture EVERY word and use the last one in match.capturedTexts(), if there isn't a reason to have regex do all the work.
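Both suggestions can be tried out with a small sketch. It uses std::regex rather than QRegularExpression, an assumption made so that the example builds without Qt. The default grammar of std::regex has no named groups, so a numbered group stands in for LastWord, and the shorter end-anchored pattern (\w+)\W*$ is used. The function names are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: last word via a pattern anchored at the end of the string.
// Group 1 is the last run of word characters; \W* skips trailing punctuation.
std::string last_word_anchored(const std::string& line)
{
    static const std::regex rx(R"((\w+)\W*$)");
    std::smatch m;
    return std::regex_search(line, m, rx) ? m[1].str() : std::string();
}

// Hypothetical helper: visit every word and keep the last one seen.
std::string last_word_by_iteration(const std::string& line)
{
    static const std::regex word(R"(\w+)");
    std::string last;
    for (std::sregex_iterator it(line.begin(), line.end(), word), end; it != end; ++it)
        last = it->str();
    return last;
}
```

With the string from the post, both helpers should give line. Strings containing line breaks are left out on purpose, because how $ treats them depends on the multiline setting of the regex engine.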
-
wrote on 18 Aug 2022, 16:34 last edited by JoeCFD
@AnneRanch Use QString::simplified() to get rid of all "\n" and to collapse each run of internal whitespace into a single space, if needed.
https://doc.qt.io/qt-5/qstring.html#simplified
-
@AnneRanch
Hi
Just as a side note.
I find you extra brave to try regular expressions, or we could call them regular Depression 😜, as
the syntax is not what I consider intuitive, but that's mostly because I have used them rarely and hence
kind of forget about flow and usage.
On that note, as an adventure, I tried out
https://github.com/VerbalExpressions/CppVerbalExpressions
which allows me to write like this (from the site):
verex expr = verex()
    .search_one_line()
    .start_of_line()
    .then( "http" )
    .maybe( "s" )
    .then( "://" )
    .maybe( "www." )
    .anything_but( " " )
    .end_of_line();
which enabled me to write some tool code with less fuss getting the syntax right.
In this case, the same would be
^(?:http)(?:s)?(?:://)(?:www.)?(?:[^ ]*)$
and call me stupid, but (for me) the first syntax is just in another league of expressions 🤷♂️
(in your use case, it would be silly to use an external lib just for this task, goes without saying ;) )
-
wrote on 19 Aug 2022, 09:12 last edited by
@mrjj Part of the problem is the lack of easy access to the symbols (QT has two regular expression classes and only one has such documentation) and practically no verbose explanation of HOW the expression is interpreted. When I started I received plenty of "do this ..." but no explanation of WHY "do this". Such a concept is OK if one does not care WHY it works and just blindly "follows the instructions". That is not how I learn. The main irony of all of this: all I wanted was to detect an addition to a string, AKA the last word entered by the user... Since the user is dynamically entering the "word", I can keep analyzing the partial addition or just wait for "Enter" ("\n"), but my string already contains a few "\n"...
But I am getting off the subject...
-
wrote on 19 Aug 2022, 12:45 last edited by
@AnneRanch For regex, the reason you don't find explanation is that they can be quite lengthy. Taking some compiler principle lessons to have a clearer understanding is very helpful.
As for the scenario here? If real-time responses aren't required, wait for Enter, call ui.textBox->text().split('\n'), then run the regex against each of the lines (without the multiline option). Again, no reason to have regex do all the work.
Or even, for each non-empty line, find the last index of 0~9/A~Z/a~z as Y (if not found, the line has no word), then the last index of whitespace before Y as X (-1 if not found); line.substr(X+1 to Y, inclusive) is what you need. No regex at all.
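The no-regex idea can be sketched as follows, again in plain standard C++ so it builds without Qt. The names last_word_no_regex and last_words_per_line are hypothetical. This variant looks for the last letter or digit first and then walks back to the start of that word, so trailing punctuation or spaces do not get in the way. It treats any character that is not a letter or digit as a word boundary, which is an assumption that goes slightly beyond looking for whitespace only.

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: last run of letters/digits in line (ASCII in the
// default "C" locale), or an empty string if the line contains none.
std::string last_word_no_regex(const std::string& line)
{
    auto is_alnum = [](char c) { return std::isalnum(static_cast<unsigned char>(c)) != 0; };

    // Skip trailing characters that are not letters or digits.
    std::size_t end = line.size();
    while (end > 0 && !is_alnum(line[end - 1]))
        --end;
    // Walk back to the first character of that word.
    std::size_t begin = end;
    while (begin > 0 && is_alnum(line[begin - 1]))
        --begin;
    return line.substr(begin, end - begin);
}

// Hypothetical helper: split text on '\n' and collect the last word of
// every line that contains one.
std::vector<std::string> last_words_per_line(const std::string& text)
{
    std::vector<std::string> result;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        std::string word = last_word_no_regex(line);
        if (!word.empty())
            result.push_back(word);
    }
    return result;
}
```

To get only the most recently entered word, take the last element of the vector returned by last_words_per_line, after checking that it is not empty.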