[SOLVED] Regular Expression and national letters

11 Posts 5 Posters 5.1k Views 1 Watching
    BlackDante wrote (#1):

    Hi, I have another problem :)
    In my application I want to capture everything between quotation marks, and I have a problem with national letters like "ą ł ó ę ś ć ż ź", so I wrote this regular expression:
    @
    QRegExp("\"[A-Za-z0-9_+=,.:;'<>-/*+() ąśćźżęłóń]+\"");
    @
    but it doesn't work. More precisely, it only matches up to the first space and doesn't catch the national letters.
    Can anybody help?

    sorry for my broken english :)

      dangelog wrote (#2):

      Which encoding is your source file saved in? Are you using a proper encoding for it and a proper QString decoding method, such as QString::fromUtf8, to build the string you pass to the QRegExp constructor? For instance, you can save the source file as UTF-8, or save it as ASCII and write each character's UTF-8 byte sequence as escapes, like "\xc4\x85" for the literal "ą".
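
To see why "\xc4\x85" stands for "ą", here is a minimal sketch in plain standard C++ (no Qt, so it compiles anywhere). It decodes the first UTF-8 sequence of a byte string into its Unicode code point. The helper name `firstCodePoint` is made up for this illustration, and the decoder is deliberately simple: it does not reject overlong forms or surrogates.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: decode the first UTF-8 sequence in s.
// Returns U+FFFD (the replacement character) for empty or malformed input.
// Simplified: overlong encodings and surrogates are not rejected.
std::uint32_t firstCodePoint(const std::string& s) {
    if (s.empty()) return 0xFFFD;
    const unsigned char b0 = static_cast<unsigned char>(s[0]);
    std::uint32_t cp = 0;
    std::size_t extra = 0;  // number of continuation bytes expected
    if (b0 < 0x80)                { cp = b0;        extra = 0; }
    else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; extra = 1; }
    else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; }
    else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; extra = 3; }
    else return 0xFFFD;           // a continuation byte cannot start a sequence
    if (s.size() < extra + 1) return 0xFFFD;  // truncated sequence
    for (std::size_t i = 1; i <= extra; ++i) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        if ((b & 0xC0) != 0x80) return 0xFFFD;  // not of the form 10xxxxxx
        cp = (cp << 6) | (b & 0x3F);
    }
    return cp;
}
```

Calling `firstCodePoint("\xc4\x85")` yields 0x0105, which is the code point of "ą". In Qt the same two bytes would be handed to QString::fromUtf8, as suggested above.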

      Software Engineer
      KDAB (UK) Ltd., a KDAB Group company

        goetz wrote (#3):

        If you want everything between two quotation marks, you can use a simpler regex:

        @
        QRegExp re("\".+\"");
        re.setMinimal(true);
        @

        This matches if there is at least one character between the quotation marks.

        Also, what's the single backslash before your quotation mark in the regex for?

        @

        "\"[A" in C/C++

        is actually

        "[A

        @
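
Both points can be checked with a small sketch in standard C++. Note the swap: it uses std::regex as a stand-in for QRegExp so that it builds without Qt, and the lazy quantifier `+?` plays the role that setMinimal(true) plays above. The function names are invented for this example. std::regex on a std::string works byte by byte, so a UTF-8 national letter is simply matched as two bytes by `.`, which is fine when all you want is "everything between the quotes".

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: returns the first quoted run in text, quotes included,
// or an empty string when there is none.
std::string firstQuoted(const std::string& text) {
    // The C++ compiler turns each backslash-quote pair into a plain quote,
    // so the regex engine receives the five characters: quote . + ? quote
    // +? is the lazy form of +, so the match ends at the nearest closing quote.
    static const std::regex re("\".+?\"");
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str() : std::string();
}

// The literal from the post: after the compiler handles the escape,
// only three characters remain (quote, bracket, A).
std::size_t escapedLiteralLength() {
    return std::string("\"[A").size();
}
```

`escapedLiteralLength()` returns 3, which confirms that the backslash never reaches the regex engine. A backslash that the regex itself should see has to be written as `\\` in the C++ source.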

        http://www.catb.org/~esr/faqs/smart-questions.html

          BlackDante wrote (#4):

          Thank you again, Volker, your solution is perfect :)
          [quote author="Volker" date="1294597497"]

          Also, what's the single backslash before your quotation mark in the regex for?

          @

          "\"[A" in C/C++

          is actually

          "[A

          @
          [/quote]

          When I looked at the QRegExp examples, most of them started with "\", so I thought it had to be there in my case too, and it works, except for national letters ;)

          Peppe, I had almost forgotten about QString encoding, and that was the problem ;) I'm still an amateur. Thanks for the answer; next time I will remember to take care of the QString encoding ;)

            andre wrote (#5):

            Note that you can get around most encoding issues by using hex codes for symbols outside the standard character range. That is less readable, but probably more reliable. The problem with text files (including source files) is that they carry no information about the encoding they are in. That means trouble can arise as soon as somebody else, unaware of your encoding settings, starts editing your file.

              BlackDante wrote (#6):

              Thanks for the advice, Andre :) But if text files don't carry information about their encoding, how can I get it? Suppose that in my application the user can open any text file and its content is displayed in a QPlainTextEdit. Do I have no chance of finding out the encoding?

                andre wrote (#7):

                Nope, there is no reliable way. You can use some complicated routines that rely on statistics or other heuristics to determine the likely encoding, but that's not all that reliable. Just hope that UTF-8 will soon replace all the other local encodings that are in use...
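
One of the simplest such heuristics, sketched here in standard C++ as an illustration only: check whether the bytes form structurally valid UTF-8. Text in a single-byte encoding such as ISO-8859-2 that contains national letters usually fails this check, while pure ASCII passes under any ASCII-compatible encoding, so a "true" result means "plausibly UTF-8", not proof. The function name `looksLikeUtf8` is made up for this example, and the check is deliberately loose (it does not reject every overlong form or surrogate).

```cpp
#include <cstddef>
#include <string>

// Hypothetical heuristic: true if bytes is structurally valid UTF-8.
// Simplified: only lead-byte ranges and continuation-byte patterns are checked.
bool looksLikeUtf8(const std::string& bytes) {
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;  // continuation bytes that must follow b0
        if (b0 < 0x80)                     extra = 0;  // ASCII
        else if (b0 >= 0xC2 && b0 <= 0xDF) extra = 1;  // 2-byte sequence
        else if ((b0 & 0xF0) == 0xE0)      extra = 2;  // 3-byte sequence
        else if (b0 >= 0xF0 && b0 <= 0xF4) extra = 3;  // 4-byte sequence
        else return false;  // stray continuation byte or invalid lead byte
        if (bytes.size() - i <= extra) return false;   // sequence is cut off
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) return false;      // must be 10xxxxxx
        }
        i += extra + 1;
    }
    return true;
}
```

For instance, the UTF-8 bytes "\xc4\x85" (ą) pass, while the single byte "\xb1" (ą in ISO-8859-2) does not. Telling two single-byte encodings apart still needs the statistical guesswork mentioned above.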

                  BlackDante wrote (#8):

                  Oh, that's not good, but thanks for the answer :)
                  [quote author="Andre" date="1294655465"]Just hope that UTF-8 will soon replace all the other local encodings that are in use...
                  [/quote]
                  Yes, I will pray for this :)

                    goetz wrote (#9):

                    If we write hex codes in our sources, no vendor will care about proper UTF-8 support in their products. Hex codes are not the solution, they are the source of all that evil.

                    If you are thorough, you can switch your entire code base to UTF-8 without problems in MS Visual Studio, Qt Creator and Xcode.

                    Add to your .pro file
                    @
                    CODECFORTR = UTF-8
                    CODECFORSRC = UTF-8
                    @

                    and to your main.cpp
                    @
                    QTextCodec::setCodecForCStrings( QTextCodec::codecForName( "UTF-8" ) );
                    QTextCodec::setCodecForTr( QTextCodec::codecForName( "UTF-8" ) );
                    @

                    This way you can simply tell your code editors to open files in UTF-8 mode unless stated otherwise. It works like a charm here in our team, which involves different operating systems, programming languages and IDEs.

                    We are in the year 2k11, in times of mega-supercomputing and who knows what else, and I flatly refuse to type hex codes into a file to get an 'ä' or 'ç'.

                      BlackDante wrote (#10):

                      I am very grateful for this answer :) It will be very helpful in my little project :)

                        ixSci wrote (#11):

                        [quote]If we write hex codes in sources no vendor will care for proper UTF-8 support in their products. Hexcodes are not the solution, they are the source of all that evil.[/quote]
                        While that is correct in general and I agree with you, it is not quite right with regard to regexps. The regexp notation \uXXXX is a standard way to represent a character by its exact Unicode code point. You have full control over what you are writing, so you won't get any unexpected results if you use hex notation in regexps, and encoding issues will never bother you. BTW, there is also \p{L} in regexps, which is enough in most cases.
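
As a rough illustration of the \uXXXX idea, here is a sketch using std::wregex from the C++ standard library rather than a Qt class (an assumption made so it builds without Qt; check the documentation of whichever regex class you use for the escapes it accepts). The pattern source is pure ASCII, so the file's encoding cannot affect it. std::regex has no \p{L}, which is why the Polish letters are listed one by one. The function name is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: first quoted run made of ASCII letters, digits, spaces
// and the Polish lowercase letters, each written as a \uXXXX code point.
std::wstring firstQuotedPolish(const std::wstring& text) {
    // In the C++ source, "\\u0105" reaches the regex engine as \u0105,
    // which the ECMAScript grammar reads as the character U+0105.
    static const std::wregex re(
        L"\"[A-Za-z0-9 "
        L"\\u0105\\u0107\\u0119\\u0142\\u0144"  // a-ogonek, c-acute, e-ogonek, l-stroke, n-acute
        L"\\u00f3\\u015b\\u017a\\u017c"         // o-acute, s-acute, z-acute, z-dot
        L"]+\"");
    std::wsmatch m;
    return std::regex_search(text, m, re) ? m.str() : std::wstring();
}
```

The input text still has to be decoded correctly into wide characters before matching; the escapes only protect the pattern, not the data.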
