
Regular expression for *not* a *sequence* of characters

21 Posts 3 Posters 12.7k Views 2 Watching
  • VRoninV VRonin

    Something like this? https://regex101.com/r/gBqL4Y/3 or did I misunderstand the question?

    JonB
    wrote on last edited by JonB
    #3

    @VRonin
    Hmmm, I don't get how it appears to work on my case #3. How greedy is that .+? in the middle? I don't want it to match a **, so why doesn't it take the longest match, eating up the **s in the middle of a line all the way until it matches the final ** against the end of the reg exp, and giving only one match group?

    OOhhh. There's an explanation on the right!

    +? Quantifier — Matches between one and unlimited times, as few times as possible, expanding as needed

    OK, why as few times as possible?? What's happened to regular expressions, since when...? :(

      VRonin
      wrote on last edited by
      #4

      @JonB said in Regular expression for *not* a *sequence* of characters:

      OK, why as few times as possible?

      That's the effect of ? after +. It makes the match non-greedy. If you remove the question mark it will behave as you are expecting.
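
To make the greedy versus non-greedy difference concrete, here is a minimal sketch in Python. The sample line is invented for illustration, and \*\*(.+?)\*\* is only assumed to resemble the linked pattern, which is not reproduced in this thread.

```python
import re

# Invented sample line with two **...** pairs.
line = "abc **first** middle **second** def"

# Lazy quantifier: .+? stops at the nearest closing **.
lazy = re.findall(r"\*\*(.+?)\*\*", line)

# Greedy quantifier: .+ runs on to the last ** on the line.
greedy = re.findall(r"\*\*(.+)\*\*", line)

print(lazy)
print(greedy)
```

With the lazy form each pair is captured separately; with the greedy form everything between the first and the last ** ends up in a single capture.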

      "La mort n'est rien, mais vivre vaincu et sans gloire, c'est mourir tous les jours"
      ~Napoleon Bonaparte

      On a crusade to banish setIndexWidget() from the holy land of Qt

        JonB
        wrote on last edited by
        #5

        @VRonin
        I don't think sed accepts that construct --- regular expressions have got out of hand :)
        Thank you very much, that's a very useful one to know.

          VRonin
          wrote on last edited by
          #6

          Python uses:

          regular expression matching operations similar to those found in Perl

          just as QRegularExpression does.

          https://regex101.com actually has an explicit Python simulator
            JonB
            wrote on last edited by
            #7

            @VRonin
            Yes, I do realize Python/Perl & others now use more advanced regular expressions than sed did. In my day we didn't even yet have the + operator, not sure about ?, but certainly not +? being something special. So I simply did not know about it. Being able to match fewest is really useful, of course.

              JonB
              wrote on last edited by JonB
              #8

              @VRonin
              As an exercise, in terms of what I had in mind before I knew about +?, how would you write, say, a matcher for "2 asterisks followed by anything up to the end of the line which does not contain another 2 asterisks"? That's what I thought we would need. So something like:

              abc ** this is a * match
              abc ** this does not match ** but I guess this * bit does
              

              ?

                VRonin
                wrote on last edited by
                #9

                Something like https://regex101.com/r/VF5zir/1 ?
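
The linked pattern is not reproduced in this thread, so the following is an independent sketch rather than the one behind the link. It assumes the goal is "a ** followed by text that contains no further **, up to the end of the line"; the names TAIL and tail_after_last_pair are hypothetical.

```python
import re

# Hypothetical pattern: "**", then characters none of which starts another
# "**", then the end of the line.
TAIL = re.compile(r"\*\*((?:(?!\*\*).)*)$")

def tail_after_last_pair(line):
    """Return the text after the final ** on the line, or None if there is no **."""
    m = TAIL.search(line)
    return m.group(1) if m else None

print(tail_after_last_pair("abc ** this is a * match"))
print(tail_after_last_pair("abc ** this does not match ** but I guess this * bit does"))
```

On the second sample only the part after the second ** is returned, which corresponds to the "but I guess this * bit does" remark in the question.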
                  JonB
                  wrote on last edited by JonB
                  #10

                  @VRonin
                  Yep. I see how you've done that one, again I didn't think of doing it that way.

                  Let me try one more time: what I really want to know is just how you write "whole line [say] must not include a multi-char sequence"?

                  I know how to do "not a single char": [^abc]. How do you do "not a sequence of chars"? Sort of like ^(this sequence), which I know does not work. Hence the original title of this thread.

                    VRonin
                    wrote on last edited by
                    #11

                    @JonB said in Regular expression for *not* a *sequence* of characters:

                    How do you do "not a sequence of chars"?

                    RegExp does not have (and probably never will have) this construct. The argument is that it can easily be inverted from the calling code, i.e. write the regex that matches the sequence and then instead of if(regexp.match(str).hasMatch()) you'd use if(!regexp.match(str).hasMatch()).
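
The "invert it in the calling code" idea can be sketched as follows. This is a Python illustration rather than Qt code, and the names DOUBLE_STAR and is_acceptable are hypothetical.

```python
import re

# The pattern matches the forbidden sequence itself.
DOUBLE_STAR = re.compile(r"\*\*")

def is_acceptable(line):
    """Accept a line only when the forbidden sequence is NOT found in it."""
    # The negation lives in the caller, not in the regular expression.
    return DOUBLE_STAR.search(line) is None

print(is_acceptable("a * b"))
print(is_acceptable("a ** b"))
```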
                      JonB
                      wrote on last edited by JonB
                      #12

                      @VRonin
                      Ah, now we're getting somewhere --- that might explain why I don't know how to do it! I thought it could be done using one of these new-fangled "negative lookahead/behind" constructs, but no? You've set me a challenge now... :)

                      It seems strange to me that reg exs can cope with "not one character" but not with "not multiple characters".

                      I know I can do it "in code" as you have shown. But Qt has various places which allow a reg ex filter/matcher, e.g. a QLineEdit validator, where I think the reg ex has to match for the validation to succeed. I could use ^[^*]*$ to reject any line with * in it. But to reject lines which have ** in them, you're saying I cannot use a plain reg ex validator string and have to go and write some kind of code (I think the Qt validators allow for that, but that's not my point)?

                      EDIT

                      (?<!foo) Negative Lookbehind Asserts that what immediately precedes the current position in the string is not foo

                      This is probably what I was thinking about. So, for example, I presume:

                      ^.*(?<!\*\*)$
                      

                      rejects lines which end with **, which is "rejecting by a sequence of characters"? [Yep, tested.] Can we expand on this to implement the "not" in-line instead?
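
A quick way to check the lookbehind pattern above, sketched in Python (which accepts the same fixed-width lookbehind syntax). The helper name accepts and the sample lines are made up for the test.

```python
import re

# Same pattern as above: the position at the end of the line must not be
# immediately preceded by "**".
NOT_ENDING_IN_STARS = re.compile(r"^.*(?<!\*\*)$")

def accepts(line):
    """True when the line does not end with **."""
    return NOT_ENDING_IN_STARS.match(line) is not None

for sample in ["plain text", "ends with one *", "ends with two **", "** only at the start"]:
    print(repr(sample), accepts(sample))
```

Note that a ** elsewhere in the line is still accepted; only the very end of the line is checked.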

                        kshegunov
                        Moderators
                        wrote on last edited by
                        #13

                        Is this the thing you're after?

                        Read and abide by the Qt Code of Conduct

                          VRonin
                          wrote on last edited by
                          #14

                          @kshegunov That works because of ^/$. You can't match "abc ** this is matching ** but not this ** and this is a new one ** def", where the sequence to exclude is **.
                            JonB
                            wrote on last edited by JonB
                            #15

                            @kshegunov , @VRonin
                            The following is probably what you're both saying. But it is possible to "only match a complete line which does not contain ** anywhere in it" (e.g. for a QLineEdit validator) by (https://stackoverflow.com/a/406408/489865, also an example at https://www.regextester.com/15, they call it "Match string not containing string"):

                            ^((?!\*\*).)*$
                            

                            Which I certainly never knew!
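
For anyone wanting to try the pattern above, here is a minimal sketch in Python. The helper name contains_no_double_star is hypothetical; the pattern string is the one quoted in this post.

```python
import re

# At every position, assert that "**" does not start here, then consume one
# character; repeat until the end of the line.
NO_DOUBLE_STAR = re.compile(r"^((?!\*\*).)*$")

def contains_no_double_star(line):
    """True when the whole line matches, i.e. it contains no ** anywhere."""
    return NO_DOUBLE_STAR.match(line) is not None

for sample in ["abc * def * ghi", "abc ** def", "a*", "**"]:
    print(repr(sample), contains_no_double_star(sample))
```

Single asterisks are accepted, including one at the very end of the line, because the lookahead only fails where two asterisks sit next to each other.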

                            @VRonin
                            I don't know what you mean by your last post (yes, the reg ex does include ^/$), would you care to clarify? I suspect it's to do with "group capturing as opposed to whole match", but not at all sure.

                              kshegunov
                              Moderators
                              wrote on last edited by
                              #16

                              I haven't tried to. As far as I understood the question, it was to match lines that do not contain the sequence.

                              @JonB
                              Pretty much the same idea as what I used.

                                JonB
                                wrote on last edited by
                                #17

                                @kshegunov
                                Yes it is what you used (though your example really confused me with its [^t]|t in it, did you just complicate it to test me out? ;-) )

                                There is something in @VRonin 's final statement where he accepts use of ^/$ but then says "you can't match..." where I do not know what he is trying to convey...

                                  kshegunov
                                  Moderators
                                  wrote on last edited by kshegunov
                                  #18

                                  @JonB said in Regular expression for *not* a *sequence* of characters:

                                  Yes it is what you used (though your example really confused me with its [^t]|t in it, did you just complicate it to test me out? ;-) )

                                  Surely not. It just seemed more natural to me - match anything but t OR t that's not followed by "[t]his thing" ... seemed like kind of the human way of doing it ;P
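
The original example is not reproduced in this thread, so what follows is a hypothetical reconstruction of the "[^t]|t" style, with "this thing" assumed as the forbidden text. It is shown next to the shorter ((?!...).)* form to suggest that, on single lines without newlines, both accept and reject the same inputs.

```python
import re

# Hypothetical reconstruction: each character is either not a "t", or a "t"
# that is not followed by "his thing".
ALTERNATION = re.compile(r"^(?:[^t]|t(?!his thing))*$")

# The shorter form discussed in this thread, with the same forbidden text.
TEMPERED_DOT = re.compile(r"^(?:(?!this thing).)*$")

def agree(line):
    """Return the accept/reject verdict of both patterns for one line."""
    a = ALTERNATION.match(line) is not None
    b = TEMPERED_DOT.match(line) is not None
    return a, b

for sample in ["nothing to see", "avoid this thing please", "this thin"]:
    print(repr(sample), agree(sample))
```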

                                  There is something in @VRonin 's final statement where he accepts use of ^/$ but then says "you can't match..." where I do not know what he is trying to convey...

                                  I think he just misunderstood the question and wants to match stuff that's between ** pairs ...

                                    JonB
                                    wrote on last edited by JonB
                                    #19

                                    @kshegunov
                                    Surely. Have you heard of "KISS"? :-; When trying to illustrate your use of ((?!.....).)*, which is what I needed to learn as the solution, do you think adding the extra stuff would make it easy for me to understand which bit was the principle? :)

                                    I always respect what @VRonin writes. But when he said:

                                    RegExp does not have (and probably never will) this construct.

                                    it now seems to me that it does have such a construct, unless he explains just what he meant...

                                      VRonin
                                      wrote on last edited by VRonin
                                      #20

                                      @JonB said in Regular expression for *not* a *sequence* of characters:

                                      it now seems to me that it does have such a construct

                                      It does not have a generic way. It has a "line does not contain" or "document does not contain". Say you want to capture stuff inside ** (so \*\*(.+?)\*\*) but exclude the capture if .+? matches foo. I don't think that is possible.

                                      Forget what I said.
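
For the capture-and-exclude scenario described in this post, one hedged workaround is to keep the pattern simple and do the exclusion in the calling code. The sketch below is in Python, assumes "matches foo" means "contains foo", and uses the hypothetical names PAIR and captures_without.

```python
import re

# The capture pattern from the post: lazily grab whatever sits between ** pairs.
PAIR = re.compile(r"\*\*(.+?)\*\*")

def captures_without(line, unwanted="foo"):
    """Return every **...** capture on the line that does not contain `unwanted`."""
    # The exclusion is done by the caller rather than inside the pattern.
    return [c for c in PAIR.findall(line) if unwanted not in c]

print(captures_without("**keep** x **has foo here** y **also keep**"))
```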
                                        JonB
                                        wrote on last edited by
                                        #21

                                        @VRonin , @kshegunov
                                        Thank you both very much for your time & input. I have learnt a lot about these "advanced" regular expressions now. I will not close this thread.
