How to match each number within alphanumeric string.
-
Hi there,
I have strings like this:
"Abc 1234"
"Abc 234"
"Abc 0234"
and I want
a) to match/find the number 234 exactly
b) to also match 0234 (the 0 is a leading 0 here)
    QRegularExpression rx("[0-9]+"); // This gives me the number only
    QRegularExpressionMatch match = rx.match(string, QtMatchOption);
    if (match.hasMatch()) {
        ... go on
I tried as Qt::MatchOption
- Qt::MatchContains = gives me all three
- Qt::MatchExactly = gives me nothing
-
@ademmler
Please think about rephrasing. It is quite unclear what you want, e.g.
b) match to 0234 (= is a leading = here)
There isn't any = sign anywhere, no idea what this means. And you want to match against 0234 but not against 1234? Who knows?
If you want to match the number 234 anywhere then (untried) a reg ex of "\\b234\\b" will probably do it (using Qt::MatchContains). But doubtless you want something other than this....
-
@JonB The "=" was a typo - it should be "0".
It means that sometimes the 234 is written as 0234 but should be interpreted as 234, while 1234 should be treated as a different number.
Actually I use this two-step approach:
    QString number = "234";
    // Strings as above: "Abc 1234" "Abc 234" "Abc 0234"
    QRegularExpression rx(number);
    QRegularExpressionMatch matchno = rx.match(string, Qt::MatchContains);
    QRegularExpressionMatch match;
    match = rx.match(matchno.captured(0), Qt::MatchExactly);
    if ( match.hasMatch() ) {
        .. do something
-
@ademmler
I cannot see how this makes any effort to distinguish between 0234 and 1234 per your requirement. Indeed in general I cannot see how this would pick out 234 as a "complete number" rather than just a substring occurring anywhere in the input.
-
Hello,
To achieve your desired matches for the strings with exact numbers "234" and "0234" (with a leading zero), you can use a modified regular expression and the Qt::MatchFlag properly. The Qt::MatchFlag that you should use is Qt::MatchRegExp, which allows you to perform a regular expression match on the strings.
Here's how you can modify your code:
    #include <QRegularExpression>
    #include <QDebug>
    int main() {
        QString string1 = "Abc 1234";
        QString string2 = "Abc 234";
        QString string3 = "Abc 0234";
        QRegularExpression rx("\\b0?234\\b"); // Modified regular expression to match exactly "234" or "0234"
        Qt::MatchOptions options = Qt::MatchRegExp; // Use Qt::MatchRegExp for regular expression matching
        // Match string1
        QRegularExpressionMatch match1 = rx.match(string1, 0, options);
        if (match1.hasMatch()) {
            qDebug() << "Found:" << match1.captured();
        }
        // Match string2
        QRegularExpressionMatch match2 = rx.match(string2, 0, options);
        if (match2.hasMatch()) {
            qDebug() << "Found:" << match2.captured();
        }
        // Match string3
        QRegularExpressionMatch match3 = rx.match(string3, 0, options);
        if (match3.hasMatch()) {
            qDebug() << "Found:" << match3.captured();
        }
        return 0;
    }
With the above code, you should get the following output (string1, "Abc 1234", produces no match):
    Found: "234"
    Found: "0234"
The \\b in the regular expression ensures that the match is bounded by word boundaries, so it will only match "234" or "0234" as whole words, not parts of larger numbers. The 0? allows for an optional leading zero before "234".
-
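A minimal sketch of the word-boundary pattern described above, for anyone who wants to try it in isolation. It uses std::regex from the standard library as a stand-in so that it compiles without Qt; the helper name findNumber234 is made up for this example. The same pattern string can be passed to QRegularExpression, whose match() call needs only the subject string and no Qt::Match... flag.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the matched number ("234" or "0234"), or an
// empty string if the text contains neither as a whole word.
std::string findNumber234(const std::string &text)
{
    // \b = word boundary, 0? = one optional leading zero.
    static const std::regex pattern(R"(\b0?234\b)");
    std::smatch match;
    if (std::regex_search(text, match, pattern))
        return match.str();
    return {};
}
```

For the three sample strings this finds "234" and "0234" and rejects "Abc 1234", because there is no word boundary between the 1 and the 2.
-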
@Aliviya said in How to match each number within alphanumeric string.:
    QRegularExpression rx("\\b0?234\\b"); // Modified regular expression to match exactly "234" or "0234"
    Qt::MatchOptions options = Qt::MatchRegExp; // Use Qt::MatchRegExp for regular expression matching
    // Match string1
    QRegularExpressionMatch match1 = rx.match(string1, 0, options);
You gave me the right input on how to achieve what I need, but I slightly modified the code.
"Qt::MatchOptions" is not available, and 3 arguments to rx.match were not accepted. But this is how it worked for me:
    QRegularExpression rx("\\b0?" + string + "\\b"); // Variable "string" is to be used in a loop.
    QRegularExpressionMatch match = rx.match(cn, Qt::MatchRegExp);
    if (match.hasMatch()) { ....
-
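The variable-based pattern above can be sketched as a small function. This is only an illustration under assumptions: std::regex stands in for QRegularExpression so the code compiles without Qt, the name containsNumber is invented, and the number is assumed to consist of digits only. For arbitrary text, QRegularExpression::escape() exists to neutralise regex metacharacters before the string is spliced into a pattern.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `text` contains `number` as a whole word,
// optionally preceded by a single leading zero.
bool containsNumber(const std::string &text, const std::string &number)
{
    // Assumption: `number` holds digits only, so it needs no escaping.
    const std::regex pattern("\\b0?" + number + "\\b");
    return std::regex_search(text, pattern);
}
```

Called in a loop, containsNumber(cn, "234") would accept "Abc 234" and "Abc 0234" but not "Abc 1234" or "Abc 2345".
-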
@ademmler said in How to match each number within alphanumeric string.:
I tried as Qt::MatchOption
Qt::MatchContains = gives me all three
Qt::MatchExactly = gives me nothing
If you really have only a structure like [a-z, A-Z][Space][0-9] and the text before the number doesn't matter, you could use string split.
Split your string in two, with a space as the delimiter, and just check the number in the 2nd substring.
Then your initial matching process should give the right results.
Qt::MatchExactly = gives me nothing
It doesn't work because your string contained the abc before... and 234 isn't an "exact match" even if the number is correct, since there is more than just 234.
-
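The split idea above could look roughly like this. It is a sketch that assumes the strict "[letters][space][digits]" structure, uses only the standard library so it compiles without Qt, and introduces the made-up name numberFieldEquals. Leading zeros are stripped before the comparison so that 0234 counts as 234.

```cpp
#include <string>

// Hypothetical helper: treats everything after the first space as the number
// field, strips leading zeros and compares the rest with `wanted`.
bool numberFieldEquals(const std::string &text, const std::string &wanted)
{
    const std::size_t space = text.find(' ');
    if (space == std::string::npos)
        return false;                     // no space, so not the assumed structure
    std::string number = text.substr(space + 1);
    // Erase up to the first non-zero character (everything if all zeros).
    number.erase(0, number.find_first_not_of('0'));
    return number == wanted;
}
```

No regular expression is involved here, so there is nothing to anchor or escape.
-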
Another weird situation in the same project:
I am looping over a text file where each line contains a color and some color data.
I am looking for the line whose beginning matches a color name like this:
    QString searchString = "COLOR BLUE 072 C";
    while (!file.atEnd()) {
        QString line = file.readLine();
        QStringList list = line.split(QRegExp("(\\s+)|(\\t)"));
        QString color = list[0];
        color.replace(QRegularExpression("\\^"), " ");
        color.replace(QRegularExpression("\\t"), "");
        QRegularExpression rx(searchString);
        QRegularExpressionMatch match = rx.match(color, Qt::MatchRegExp);
        if (match.hasMatch()) {
            do some thing ...
match.hasMatch() is always "false" even if the searchString and color are identical (checked multiple times).
I also tried Qt::MatchExactly and Qt::MatchContains without success.
In contrast, "list[0].contains(color)" returns "true".
What am I missing here?
-
QRegularExpressionMatch match = rx.match(color, Qt::MatchRegExp);
Can you please do us and yourself a favor and print out what rx and color are when it fails?
QRegularExpression rx(searchString); // searchString = "COLOR BLUE 072 C"
does not look like much of a regular expression to me....
On a separate matter:
QStringList list = line.split(QRegExp("(\\s+)|(\\t)"));
What version of Qt are you using? Using QRegExp, and mixing it with QRegularExpression, is not a good idea. Additionally, what is the point/aim of using \s*|\t as a (splitting) regular expression? I don't get it.
QRegularExpressionMatch match = rx.match(color, Qt::MatchRegExp);
Again, I don't know what version of Qt you are using (Qt5??) but what overload of QRegularExpression::match() are you using which offers Qt::MatchRegExp as a value for what type?
-
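A possible explanation for the always-false hasMatch(), offered as a guess rather than a verified diagnosis: in Qt 5 the second parameter of QRegularExpression::match() is an integer start offset, and the unscoped Qt::MatchFlag values convert silently to int. If Qt::MatchRegExp has the value 4 (as far as I recall for Qt 5), the call would start searching at index 4, where the full search string can no longer be found. The sketch below mimics that with std::regex so it compiles without Qt; matchFromOffset is a made-up name.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical stand-in for a regex search that starts at `offset`
// instead of at index 0.
bool matchFromOffset(const std::string &subject, const std::string &pattern,
                     std::size_t offset)
{
    if (offset > subject.size())
        return false;
    const std::regex rx(pattern);
    const auto start = subject.begin() + static_cast<std::ptrdiff_t>(offset);
    return std::regex_search(start, subject.end(), rx);
}
```

With offset 0 the pattern "COLOR BLUE 072 C" is found in an identical subject; with offset 4 it is not. The earlier "\\b0?234\\b" case would still succeed with offset 4 on "Abc 0234", because the number happens to start at index 4, which would explain why that call appeared to work.
-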
@JonB said in How to match each number within alphanumeric string.:
Dear John - thx for helping me. It took some time to come back here.
Let me try to answer your questions.
On a separate matter:
QStringList list = line.split(QRegExp("(\\s+)|(\\t)"));
What version of Qt are you using?
Version 5.15.13
Using QRegExp, and mixing it with QRegularExpression, is not a good idea.
OK - how to do better?
Additionally, what is the point/aim of using \s*|\t as a (splitting) regular expression? I don't get it.
A line could look like this. Between the fields there can be one or more blanks or tabs. I want to get the field values only - without tabs and spaces.
BLACK^BL PAPER_100 25.9815 0.287622 -1.5913 0.036 0.04 0.0453 0.0507 0.0535 0.0521 0.0503 0.0501 0.0508 0.0507 0.0493 0.0484 0.0479 0.0478 0.0481 0.0475 0.0464 0.0456 0.0456 0.0463 0.047 0.0475 0.0482 0.0482 0.048 0.0484 0.0488 0.0488 0.0489 0.0497 0.0493
QRegularExpression rx(searchString); // searchString = "COLOR BLUE 072 C"
does not look like much of a regular expression to me....
Because this changes for every search, I want to pass the "search phrase/string" (from somewhere else) as a variable. In this case it should match the string 100%. I thought it is the same as writing: QRegularExpression rx("COLOR BLUE 072 C");
I am very open to learning and improving - please tell me how to do better.
-
@ademmler said in How to match each number within alphanumeric string.:
Using QRegExp, and mixing it with QRegularExpression, is not a good idea.
OK - how to do better?
Do not #include <QRegExp> and do not use QRegExp. Use only QRegularExpression.
I thought it is the same as writing: QRegularExpression rx("COLOR BLUE 072 C");
I am not clear what you think you are doing, and I am not sure you are either! If you have a regular expression of COLOR BLUE 072 C how can that possibly match input anything like BLACK^BL PAPER_100 25.9815 0.287622 ..., and how could it split on whitespace and/or capture anything? I wonder whether you mean this is the input string to be matched by the regular expression...??
If all you want to do is split a string on whitespace, and receive a list of the items in between, all you need is QStringList QString::split(const QRegularExpression &re, Qt::SplitBehavior behavior = Qt::KeepEmptyParts) const. The example there gives
    str = "Some text\n\twith strange whitespace.";
    list = str.split(QRegularExpression("\\s+"));
    // list: [ "Some", "text", "with", "strange", "whitespace." ]
-
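If the goal is only "take the first field, turn ^ into spaces, and see whether it equals the search string", a plain comparison may be enough and no regular expression is needed for the match itself. A minimal sketch under that assumption, using the standard library instead of QString so it compiles without Qt; colorName and lineIsColor are invented names.

```cpp
#include <algorithm>
#include <sstream>
#include <string>

// Hypothetical helper: returns the first whitespace-separated field of a
// line with every '^' replaced by a space.
std::string colorName(const std::string &line)
{
    std::istringstream fields(line);
    std::string first;
    fields >> first;                      // skips leading blanks and tabs
    std::replace(first.begin(), first.end(), '^', ' ');
    return first;
}

// True if the normalised first field is identical to the search string.
bool lineIsColor(const std::string &line, const std::string &searchString)
{
    return colorName(line) == searchString;
}
```

In Qt terms this would correspond to splitting with QRegularExpression("\\s+"), replacing '^' in list[0] and comparing with ==.
-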
@JonB said in How to match each number within alphanumeric string.:
I am not clear what you think you are doing, and I am not sure you are either! If you have a regular expression of COLOR BLUE 072 C how can that possibly match input anything like BLACK^BL PAPER_100 25.9815 0.287622 ..., and how could it split on whitespace and/or capture anything? I wonder whether you mean this is the input string to be matched by the regular expression...??
Dear John, sorry for confusing you. My file has 1000s of lines. Each represents a color. My example was of course not matching the search string.
It was just a sample of "how the lines are structured".
Somewhere in this file there should be a line like this - which is the one I am searching for.
COLOR^BLUE^072^C PAPER_100 25.9815 0.287622 ...
or it is
BLUE^072^C PAPER_100 25.9815 0.287622 ...
or
COLOR^072 PAPER_100 25.9815 0.287622 ...
And all I need to catch ...
-
@JonB said in How to match each number within alphanumeric string.:
Do not #include <QRegExp> and do not use QRegExp. Use only QRegularExpression.
Interesting. I have searched the whole project - and I do not #include <QRegExp> anywhere. How come "QRegExp" is working then ...
-
@ademmler
#include <QString> is probably enough to call line.split(QRegExp(...)). That does not even exist in Qt6. It may do at Qt5. But that still has QString::split(const QRegularExpression &re, ...), use that instead.
For the 3 lines you show I don't know what it is that you want to search for "commonality". I'm afraid your requirements/rules are not expressed clearly enough. Each of the 3 starts with something different. You need to stipulate precisely what you want to search for/exclude in order to design the right code or regular expression. Or you could just do the searching on the elements returned by str.split(QRegularExpression("\\s+")) without worrying about a complex regular expression to match the line.
-
@JonB said in How to match each number within alphanumeric string.:
For the 3 lines you show I don't know what it is that you want to search for "commonality". I'm afraid your requirements/rules are not expressed clearly enough. Each of the 3 starts with something different. You need to stipulate precisely what you want to search for/exclude in order to design the right code or regular expression. Or you could just do the searching on the elements returned by str.split(QRegularExpression("\\s+")) without worrying about a complex regular expression to match the line.
I had the same thought - I think it needs a loop going through the elements of the search string and a loop through the elements of the "first line element" ... if one matches, here we go.
I will try to explain the problem behind it - it comes from a printing workflow.
A user creates a PDF using his custom color names for color separations:
Example: A file with two colors "Monkey Blue" and "Turtle Red 132"
At the print plant they use other color names - for the same color:
Like "PT Blue" and "Color 132". In production those need to be matched:
PT Blue = Monkey Blue
Turtle Red 132 = Color 132
For the sake of illustration I used these simple color names.
Of course in real life those are more complex to match ...
-
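The two-loop idea ("if one element matches, here we go") can be sketched as follows. This is just one reading of that rule: it uses the standard library so it compiles without Qt, and the names splitOn and sharesToken are made up for the example.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Splits `text` on `separator`, dropping empty tokens.
std::vector<std::string> splitOn(const std::string &text, char separator)
{
    std::vector<std::string> tokens;
    std::istringstream stream(text);
    std::string token;
    while (std::getline(stream, token, separator))
        if (!token.empty())
            tokens.push_back(token);
    return tokens;
}

// True if the space-separated search phrase and the '^'-separated color
// field have at least one token in common.
bool sharesToken(const std::string &searchPhrase, const std::string &colorField)
{
    const std::vector<std::string> wanted = splitOn(searchPhrase, ' ');
    const std::vector<std::string> present = splitOn(colorField, '^');
    return std::any_of(wanted.begin(), wanted.end(), [&](const std::string &w) {
        return std::find(present.begin(), present.end(), w) != present.end();
    });
}
```

With the search phrase "COLOR BLUE 072 C" this accepts COLOR^BLUE^072^C, BLUE^072^C and COLOR^072, but it is a loose rule: every line whose first field contains COLOR would match as well.
-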
@ademmler
Sorry, but I don't think this explanation has anything to say about why the 3 lines you give are of interest and others are not. Your 3 lines seem to be identical to the right; to the left they start with any of COLOR^BLUE, BLUE and/or COLOR. Come back when you have stipulated whatever rules you want to impose.
If you have a lot of situations where you wish to "map" one string to another it might be easier not to try to do it with (lots of?) regular expression matches. Rather just read the strings and use QMap to set up mappings which you look up.
-
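For the mapping suggestion, a lookup table could be as small as this. The sketch uses std::map as a stand-in for QMap so it compiles without Qt; the function name plantName is invented, and the entries are the illustrative color names from this thread.

```cpp
#include <map>
#include <string>

// Hypothetical helper: returns the print plant's name for a customer color
// name, or an empty string if no mapping is known.
std::string plantName(const std::map<std::string, std::string> &mapping,
                      const std::string &customerName)
{
    const auto it = mapping.find(customerName);
    return it != mapping.end() ? it->second : std::string();
}
```

A table holding {"Monkey Blue", "PT Blue"} and {"Turtle Red 132", "Color 132"} would resolve both example names. Names that are not in the table come back empty and would need some other rule.
-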
@JonB Dear John,
Come back when you have stipulated whatever rules you want to impose.
There is no rule! That is the problem to solve. With a rule it would be easy.
Using QMap is not possible either - because in my case there are millions of unknown combinations. This means different target values at every location and different input values from every print job ... hence I am trying to develop a "guess it right" logic.
Don't worry - I will find a way. Thx for your valuable hints.
-
@ademmler said in How to match each number within alphanumeric string.:
There is no rule! That is the problem to solve. With a rule it would be easy.
Sorry, this makes no sense. If you have no "rule" for what you are looking for how can you possibly know what it is you want or what code to write?
If you show 3 lines you want to recognise, and not other lines, of course there is some rule for what you want to pick out.... It might be a "best guess" but you still have to have something you wish to implement for this.
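-
To make the "best guess is still a rule" point concrete, here is one possible rule, purely as an illustration and not something proposed in the thread: count how many words of the search phrase occur in each candidate's color field and pick the candidate with the highest count. The names tokens, score and bestGuess are made up, and the code uses only the standard library so it compiles without Qt.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Splits `text` on `separator`, dropping empty tokens.
std::vector<std::string> tokens(const std::string &text, char separator)
{
    std::vector<std::string> result;
    std::istringstream stream(text);
    std::string token;
    while (std::getline(stream, token, separator))
        if (!token.empty())
            result.push_back(token);
    return result;
}

// Hypothetical score: number of search-phrase words that also occur in the
// '^'-separated color field.
int score(const std::string &searchPhrase, const std::string &colorField)
{
    const std::vector<std::string> present = tokens(colorField, '^');
    int hits = 0;
    for (const std::string &word : tokens(searchPhrase, ' '))
        if (std::find(present.begin(), present.end(), word) != present.end())
            ++hits;
    return hits;
}

// Returns the index of the best-scoring field, or -1 if nothing scores
// above zero. On a tie the first candidate wins.
int bestGuess(const std::string &searchPhrase,
              const std::vector<std::string> &colorFields)
{
    int best = -1;
    int bestScore = 0;
    for (std::size_t i = 0; i < colorFields.size(); ++i) {
        const int s = score(searchPhrase, colorFields[i]);
        if (s > bestScore) {
            bestScore = s;
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

Whether word overlap is the right measure is exactly the decision that still has to be made; the scoring function is where such a decision would live.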