Regular expression to match "word"...

    Anonymous_Banned275
    wrote on last edited by
    #1

    This is embarrassing and frustrating at the same time.
    I cannot find the thread I posted a while back, and I am also missing the "Regular expression" test tool.
    Anyway, the attached code retrieves a capital "w", which happens to be the first ASCII character in the "result" source.
    I recall something about C requiring an "escape character '...
    I am trying to "match" all words in the source.

    result = SubRegularExpressionExt("[/\w/]", result,1);

      JonB
      wrote on last edited by JonB
      #2

      @AnneRanch
      \w in a regular expression matches a (single) "word" character.
      \w+ would match multiple word characters, e.g. a whole word.
      Of course if you want to put a literal \ into a C string you must double it, like "\\w+".
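A minimal sketch of the difference between the two patterns. It assumes `std::regex` from the C++ standard library as a stand-in, because the body of `SubRegularExpressionExt` is not shown in this thread; for a pattern this simple on ASCII text it should behave like Qt's engine, but treat that as an assumption. The helper name `firstMatch` and the sample text are made up for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the thread): returns the first substring of
// `text` matched by `pattern`, or an empty string if nothing matches.
std::string firstMatch(const std::string &text, const std::string &pattern)
{
    const std::regex re(pattern);
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str() : std::string();
}
```

With this helper, `firstMatch("Main window", "\\w")` returns just `M` (one word character), while `firstMatch("Main window", "\\w+")` returns `Main` (a run of word characters). Note the doubled backslash in both C++ string literals.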

        Anonymous_Banned275
        wrote on last edited by Anonymous_Banned275
        #3

        A picture is worth a thousand words

        @JonB 2226ac69-1153-4369-9588-c564b3691fb6-image.png

        and the result is that a single "M" was "matched".

        The expression is defined as "[\w+]" and passed as such to the function.

        8b9d8b9b-6c21-4a63-8729-fa61e230ae8e-image.png

        a2afd27b-1f9b-49de-9a93-74234a4670a5-image.png

        Where is the problem ?

          JohanSolo
          wrote on last edited by JohanSolo
          #4

          The problem lies in the definition of your pattern: no need for [ and ], and why do you have .+?
          If you want to match a single word, just use QString checkMatch( "\\w" ); or QString checkMatch( R"x(\w)x" );. What you wrote is actually matching each character of single words. You might want to validate your regex first, e.g. using an online validator like this one (by the way, the default regex engine in C++ is ECMAScript if I'm not mistaken).

          `They did not know it was impossible, so they did it.'
          -- Mark Twain

            JonB
            wrote on last edited by JonB
            #5

            @AnneRanch said in Regular expressio to match "word"...:

            Where is the problem ?

            I said to use \w+ but you chose to use [\w.+] or [(\w)+] or [\w+] (with appropriate doubling of \ in a C string).

            \w matches a single "word character". + requires one or more of these (consecutively), i.e. a whole word. That's my \w+.

            But as soon as you use [...] in a regular expression that means "any one of the characters inside the brackets". So your [\w.+] means: any single word character or a dot or a plus sign, just one of any of these.
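To make the bracket behaviour concrete, here is a small sketch that lists every match of a pattern. It assumes `std::regex` as a stand-in for the engine behind `SubRegularExpressionExt` (which is not shown in this thread); the helper name `allMatches` and the sample input are hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): collects every non-overlapping
// match of `pattern` in `text`, in order of appearance.
std::vector<std::string> allMatches(const std::string &text, const std::string &pattern)
{
    const std::regex re(pattern);
    std::vector<std::string> found;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it)
        found.push_back(it->str());
    return found;
}
```

On the input `ab+c`, the pattern `"[\\w+]"` yields four one-character matches (`a`, `b`, `+`, `c`), because inside brackets the `+` is a literal plus sign and the class matches exactly one character. The pattern `"\\w+"` yields two matches (`ab` and `c`), because there the `+` is a quantifier.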

            @JohanSolo said in Regular expressio to match "word"...:

            If you want to match a single word, just use QString checkMatch( "\\w" );

            I do not agree with this. That will match a single word-character. A single whole word will require QString checkMatch( "\\w+" );.

            You might want to validate your regex first, e.g. using an online validator like this one (by the way, the default regex engine in C++ is ECMAScript if I'm not mistaken).

            As @JohanSolo says, you might want to play with regexes at https://regex101.com/ while you develop them. Actually, Qt does not use the "ECMAScript" variant; it uses "PCRE". Just leave the FLAVOR shown on the left-hand side of that page at its default value, which is PCRE2 (PHP >= 7.3). When copying something that works there back to Qt as a C string, don't forget to double any \ characters to \\.
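A short sketch of the copy-back step, showing that the doubled-backslash form and a raw string literal (as mentioned earlier in the thread) produce the same pattern text. It uses `std::regex` only so that it is self-contained; that is an assumption, since Qt code would normally use its own regex class. The names `kEscaped`, `kRaw` and `countMatches` are hypothetical.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Escaped form: the compiler turns each "\\" into a single backslash.
const std::string kEscaped = "\\w+";
// Raw string literal: backslashes are kept as typed, so a pattern can be
// pasted from a regex tester unchanged.
const std::string kRaw = R"(\w+)";

// Hypothetical helper: counts the non-overlapping matches of `pattern` in `text`.
std::size_t countMatches(const std::string &text, const std::string &pattern)
{
    const std::regex re(pattern);
    const auto begin = std::sregex_iterator(text.begin(), text.end(), re);
    return static_cast<std::size_t>(std::distance(begin, std::sregex_iterator()));
}
```

Both constants hold the same three characters (`\`, `w`, `+`), so either can be passed to the regex engine. For example, `countMatches("match all words", kRaw)` counts three words.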

              JohanSolo
              wrote on last edited by
              #6

              @JohanSolo said in Regular expressio to match "word"...:

              If you want to match a single word, just use QString checkMatch( "\\w" );

              I do not agree with this. That will match a single word-character. A single whole word will require QString checkMatch( "\\w+" );.

              Yes, my bad. Thanks for pointing this out.

                Anonymous_Banned275
                wrote on last edited by
                #7

                OK, looks like I went "full circle".
                Not without going through a few "this is how you do it" replies that did NOT really explain WHY it is done that way, such as the initial "w+".
                I was under the erroneous belief that "w" means "word" and not a single character.

                The real "word" is "w+".

                I am not fool enough to expect this forum to teach me how to interpret "[" and "(" in an expression - I can RTFM, if I can find one which explains concepts and not just "this is how this is done ..."

                CASE SOLVED

                  JonB
                  wrote on last edited by
                  #8

                  @AnneRanch said in Regular expressio to match "word"...:

                  I was under the erroneous belief that "w" means "word" and not a single character.

                  @JonB said in Regular expressio to match "word"...:

                  @AnneRanch
                  \w in a regular expression matches a (single) "word" character.
                  \w+ would match multiple word characters, e.g. a whole word.
