
Regular expression help needed

    aurora wrote (#1):

    I have a collection of files, and the contents of all of them have the following format:

    @
    -- File name

    -- listOne (L1)
    -- listTwo (L2)
    -- listThree (L3)
    -- HeaderLine (HE)
    -- listFour (L6)
    -- listFive (L2)
    -- listSix (L9)
    -- listSeven (L0)
    -- someline (SL)
    -- listeight (LL)

    --
    REMAINING CONTENTS OF THE LINE

    some more contents

    @
    Here I want to store only L1, L2, L3, etc. in a list, skipping HE, SL and the remaining lines of the file.
    How can I do that?
    I went through the QRegExp class documentation and wrote the code below, but it is very long and it inserts some blank strings into the stored list.

    @
    while (!f.atEnd() && !line.contains("------------------------------------------"))
    {
        if (!line.contains("-- "))
        {
            flag = 1;
            QRegExp rx("[(]([a-z]|[0-9]|[_]|[A-Z])+[)]");
            rx.indexIn(line);
            QRegExp rx1("([a-z]|[0-9]|[_]|[A-Z])+");
            rx1.indexIn(rx.cap(0));
            captured.append(rx1.cap(0));
            line = f.readLine();
        }
        else if (flag == 1)
        {
            flag++;
            captured.pop_back();
            QRegExp rx("[(]([a-z]|[0-9]|[_]|[A-Z])+[)]");
            rx.indexIn(line);
            QRegExp rx1("([a-z]|[0-9]|[_]|[A-Z])+");
            rx1.indexIn(rx.cap(0));
            captured.append(rx1.cap(0));
            line = f.readLine();
        }
        else if (flag > 0)
        {
            flag++;
            QRegExp rx("[(]([a-z]|[0-9]|[_]|[A-Z])+[)]");
            rx.indexIn(line);
            QRegExp rx1("([a-z]|[0-9]|[_]|[A-Z])+");
            rx1.indexIn(rx.cap(0));
            captured.append(rx1.cap(0));
            line = f.readLine();
        }
    }
    

    @

    Please help me solve this problem.

      sierdzio (Moderators) wrote (#2):

      All the regexps seem to be the same, so you can move that part of the code into a function. It would save you lines of code and make maintenance easier.

      Also, if I get it right, all you need to do is store all whole lines containing "(XY)", except those with "HE" and "SL"? Then why not do it like this:
      @
      if (line.contains(QRegExp("[(]\\w\\w[)]"))) { // Get all lines with "(XY)"
          if (line.contains("HE") || line.contains("SL")) { // Throw away those with "HE" or "SL"
              continue;
          }
          // do your code here
      }
      @
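
The helper-function idea above can be sketched roughly as follows. This is only an illustration under assumptions: it uses standard C++ `std::regex` rather than `QRegExp` so that it compiles without Qt, and the names `parenToken` and `isWanted` are hypothetical, not part of the original code. A single capture group pulls out the text between the parentheses, which replaces the two-step `rx`/`rx1` lookup.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the text inside the first "(...)" on the line,
// or an empty string if the line has no parenthesised token.
// The capture group (index 1) holds the token without the parentheses.
std::string parenToken(const std::string& line)
{
    static const std::regex re(R"(\(([A-Za-z0-9_]+)\))");
    std::smatch m;
    return std::regex_search(line, m, re) ? m.str(1) : std::string();
}

// Hypothetical filter: keep a token unless it is empty or one of the
// markers the question wants to skip ("HE" and "SL").
bool isWanted(const std::string& token)
{
    return !token.empty() && token != "HE" && token != "SL";
}
```

With helpers like these, the loop body could shrink to reading a line, calling `parenToken`, and appending the result only when `isWanted` returns true. Checking for an empty token is also what would keep blank strings out of the list.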

      (Z(:^

        sierdzio (Moderators) wrote (#3):

        Regexp might be wrong, but I'm in a hurry now and don't have time to think it through. But you'll probably get the idea.

          aurora wrote (#4):

          Thank you, but you misunderstood. Maybe I explained it badly. What I showed is just a format; the actual words are not the same.
          I don't want to store the lines that have sublines.
          For example:
          @
          -- someline(kk)
          -- main line(mm)
          -- this is subline(ab)
          -- this is another subline(hh)
          @
          In such a case I want only the sublines.

            goetz wrote (#5):

            The best way to describe your goal would be to show the input list and the result that you expect.

            http://www.catb.org/~esr/faqs/smart-questions.html

              aurora wrote (#6):

              OK, my input is the file shown above, and the regular expression must capture only L1, L2, L3, L6, L2, L9, L0, LL.

              It should not capture a line that has sublines, that's all.

                goetz wrote (#7):

                The following snippet should show you the basic principle:

                @
                QStringList l;
                l << "listOne (L1)";
                l << "listTwo (L2)";
                l << "listThree (L3)";
                l << "HeaderLine (HE)";
                l << "listFour (L6)";
                l << "listFive (L2)";
                l << "listSix (L9)";
                l << "listSeven (L0)";
                l << "someline (SL)";
                l << "listeight (LL)";

                // Backslashes are doubled because this is a C++ string literal.
                QRegExp re("^.+\\s+\\((L[0-9L])\\)$");
                foreach (const QString &s, l) {
                    qDebug() << "check string" << s;
                    if (re.exactMatch(s)) {
                        QString code = re.cap(1); // text of the first capture group
                        qDebug() << " found match" << code;
                    } else {
                        qDebug() << " no match";
                    }
                }
                @

                Short explanation of the regex:

                • ^.+
                  matches one or more arbitrary characters from the start of the string
                • \s+
                  followed by one or more whitespace characters (space, tab, newline)
                • \(
                  followed by a literal opening parenthesis. In the regex it is \(, but in the source it is written \\( because the backslash has to be escaped in a C++ string literal
                • (
                  start a capture group
                • L[0-9L]
                  followed by a literal L and exactly one of 0, 1, 2... 9 or L
                • )
                  end the capture group
                • \)
                  followed by a literal closing parenthesis
                • $
                  at the end of the string

                The capture group contains what has been matched in between, which will be one of L0, L1, L2... L9, LL.
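
As a rough, Qt-free illustration of the same pattern, the sketch below uses `std::regex`. That is an assumption made so the example compiles with only the standard library, and `listCode` is a hypothetical name. A raw string literal is used so that the backslashes do not have to be doubled.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the code ("L0".."L9" or "LL") if the whole
// line matches the pattern explained above, otherwise an empty string.
std::string listCode(const std::string& line)
{
    // Raw string literal: the regex is written exactly as in the explanation.
    static const std::regex re(R"(^.+\s+\((L[0-9L])\)$)");
    std::smatch m;
    // regex_match requires the entire line to match, like QRegExp::exactMatch.
    return std::regex_match(line, m, re) ? m.str(1) : std::string();
}
```

For example, `listCode("listOne (L1)")` gives `"L1"`, while `listCode("HeaderLine (HE)")` gives an empty string because `HE` does not fit `L[0-9L]`.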

                  aurora wrote (#8):

                  Sorry Volker, that is not what I meant.

                  I want all the text inside the round brackets at the end of each line, but the regular expression should not capture a line that has sublines.
                  Example input:
                  @
                  -- afgh hkjhkh(gk_6)
                  -- its main line (aa) <<-- capture everything except this line, as it has sublines
                  -- its sub line(bb) <<-- subline
                  -- its another subline(cc) <<-- subline
                  -- something(dd09)
                  -- this is also(tr_8787)
                  @

                  And the output should be: gk_6, bb, cc, dd09, tr_8787

                    aureshinite wrote (#9):

                    Learn about regular expressions. Period.

                      goetz wrote (#10):

                      It is up to you to detect what a "subline" is and skip the regex on that altogether.

                      I recommend studying the QString documentation. It has various helpful methods. Read through the method list and descriptions.
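
One possible way to do that detection is sketched below. It rests on an assumption the thread never confirms: that a subline has more whitespace between `--` and its text than its parent line does (the forum formatting above does not preserve indentation). The sketch again uses standard C++ `std::regex` instead of Qt classes, and `leafCodes` is a hypothetical name. A line counts as having sublines when the next matching line is indented further, and such lines are skipped.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the parenthesised codes of all lines that
// have no sublines.
// ASSUMPTION: a subline has more whitespace after "--" than its parent line.
std::vector<std::string> leafCodes(const std::vector<std::string>& lines)
{
    // Group 1: whitespace after "--" (the indent).
    // Group 2: the code inside the trailing parentheses.
    static const std::regex re(R"(^--(\s*).*\((\w+)\)\s*$)");

    struct Entry { std::size_t indent; std::string code; };
    std::vector<Entry> entries;
    for (const std::string& line : lines) {
        std::smatch m;
        if (std::regex_match(line, m, re))  // lines without a code are ignored
            entries.push_back({static_cast<std::size_t>(m.length(1)), m.str(2)});
    }

    std::vector<std::string> result;
    for (std::size_t i = 0; i < entries.size(); ++i) {
        // A line has sublines if the following line is indented further.
        const bool hasSubline =
            i + 1 < entries.size() && entries[i + 1].indent > entries[i].indent;
        if (!hasSubline)
            result.push_back(entries[i].code);
    }
    return result;
}
```

If the real files mark sublines differently (for example with a different prefix), only the computation of `indent` would need to change; the "skip a line when the next one is nested deeper" step stays the same.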
