add Space character every 3 letters

15 Posts 7 Posters 6.7k Views 3 Watching
  • D dridk2

    Hi,

    I am making a DNA viewer that displays a sequence the way a hexadecimal viewer does.
    So I would like to add a space after every 3 letters in my QByteArray.

        QByteArray seq = "ACGTATAGTACGTACG";
        seq = transform(seq, 3);
        // seq is now "ACG TAT AGT ACG TAC G"
    

    What is the most efficient way to do that? QString and QByteArray have many methods.

    Taz742
    wrote on last edited by Taz742
    #2

    @dridk2

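        QByteArray seq = "ACGTATAGTACGTACG";
    
        // seq.size() grows with every insert, so loop against the original length.
        const int originalSize = seq.size();
        int cnt = 0; // number of spaces inserted so far
    
        for (int i = 3; i < originalSize; i += 3) {
            seq.insert(i + cnt++, ' ');
        }
    
        qDebug() << seq; // "ACG TAT AGT ACG TAC G"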
    

    Do what you want.

      kshegunov
      Moderators
      wrote on last edited by
      #3

      I don't know if it's efficient enough, but it's certainly a one-liner:

      QString split = QString("ACGTATAGTACGTACG").replace(QRegularExpression("(.{3})"), "\\1 ");
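
One thing to watch with the one-liner: it puts a space after every complete group of three, so an input whose length is a multiple of three ends with a trailing space. A lookahead avoids that. Below is a minimal sketch of the idea using std::regex and std::string instead of QRegularExpression and QString, purely so it compiles without Qt; the helper name `spaceGroupsRegex` is made up for illustration and is not from this thread.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the thread): puts a space after every
// complete group of `span` characters, but only when more input follows.
std::string spaceGroupsRegex(const std::string& seq, int span) {
    if (span <= 0) return seq;  // nothing sensible to do, return the input unchanged
    // (.{span}) captures one group; (?=.) requires at least one more character,
    // so the last group never gets a trailing space.
    const std::regex groupRe("(.{" + std::to_string(span) + "})(?=.)");
    return std::regex_replace(seq, groupRe, "$1 ");
}
```

The same pattern, `(.{3})(?=.)`, should carry over to QRegularExpression with `"\\1 "` as the replacement, but I have not measured whether the lookahead changes the timing.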
      

      Read and abide by the Qt Code of Conduct

        Taz742
        wrote on last edited by
        #4

        @kshegunov
        Your code is small and effective.
        Do my variant and yours probably take about the same time?

          VRonin
          wrote on last edited by VRonin
          #5

          Since you are using QByteArray (i.e. 1 character is 1 byte), you can probably optimise it by using std::memcpy on the data() pointer.

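          #include <QByteArray>
          #include <cstring> // std::memcpy

          // Returns a copy of seq with a space after every `span` bytes (no trailing space).
          QByteArray transform(const QByteArray& seq, int span){
              if(seq.isEmpty() || span<=0) return QByteArray();
              const int oldArrSize = seq.size();
              // One allocation, pre-filled with spaces: one separator per full group,
              // minus one when the last group is complete.
              QByteArray result(oldArrSize + (oldArrSize/span) - (oldArrSize%span==0), ' ');
              const char* source = seq.constData();
              char* dest = result.data();
              for(int done = 0; done < oldArrSize; done += span){
                  // Group number done/span starts that many separators further right in dest.
                  std::memcpy(dest + done + done/span, source + done, qMin(oldArrSize - done, span));
              }
              return result;
          }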
          

          EDIT:

          The code I had before corrupted memory when seq.size() % span != 0.

          "La mort n'est rien, mais vivre vaincu et sans gloire, c'est mourir tous les jours"
          ~Napoleon Bonaparte

          On a crusade to banish setIndexWidget() from the holy land of Qt

            kshegunov
            Moderators
            wrote on last edited by kshegunov
            #6

            @Taz742 said in add Space character every 3 letters:

            Do my variant and yours probably take about the same time?

            I'd even speculate mine may be faster, even though it uses a regular expression. The problem with your piece of code is that at each insert of a new space you're copying the data after that position - the data has to be shifted, which might be rather heavy. The regular expression code (assuming it can optimize the expression well internally) can do it with a single memory allocation. In fact your code can be modified so it uses one allocation, by just using a resulting byte array and copying the data in chunks of 3 bytes, then setting a space, and then repeating.

            Edit: My view hadn't updated, basically what @VRonin wrote is what I was talking about.
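
To make the "copy chunks of 3 bytes, set a space, repeat" idea concrete, here is a minimal sketch. It assumes std::string rather than QByteArray so that it compiles without Qt, and the name `spaceGroups` is only illustrative. The output buffer is reserved once, so no data is shifted after it has been written.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the thread): builds the spaced copy in one pass.
std::string spaceGroups(const std::string& seq, std::size_t span) {
    if (seq.empty() || span == 0) return seq;
    const std::size_t separators = (seq.size() - 1) / span;  // gaps between groups
    std::string result;
    result.reserve(seq.size() + separators);  // at most one allocation, done up front
    for (std::size_t pos = 0; pos < seq.size(); pos += span) {
        if (pos != 0) result.push_back(' ');  // separator before every group but the first
        const std::size_t len = std::min(span, seq.size() - pos);
        result.append(seq, pos, len);         // copy one chunk of up to `span` characters
    }
    return result;
}
```

Each input character is copied exactly once here, whereas repeated insert() calls move the tail of the array again for every space.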

              mrjj
              Lifetime Qt Champion
              wrote on last edited by
              #7

              Hi
              Quick test. It might have logical issues; it's just for fun.

              using namespace std::chrono;
              void MainWindow::on_pushButton_clicked() {
                high_resolution_clock::time_point t1 = high_resolution_clock::now();
              
                for (int var = 0; var < 10000; ++var) {
                  int cnt = 0;
                  QByteArray seq = "ACGTATAGTACGTACG";
                  for(int i = 3; i < seq.size() - 3; i++) {
                    if(i % 3 == 0) {
                      seq.insert(i + cnt++, ' ');
                    }
                  }
                }
                high_resolution_clock::time_point t2 = high_resolution_clock::now();
              
                auto duration = std::chrono::duration_cast<std::chrono::microseconds>( t2 - t1 ).count();
              
                qDebug() << "time: " << duration ;
              }
              
              void MainWindow::on_pushButton_2_clicked() {
              
                high_resolution_clock::time_point t1 = high_resolution_clock::now();
              
                for (int var = 0; var < 10000; ++var) {
                  QString split = QString("ACGTATAGTACGTACG").replace(QRegularExpression("(.{3})"), "\\1 ");
                }
                high_resolution_clock::time_point t2 = high_resolution_clock::now();
              
                auto duration = std::chrono::duration_cast<std::chrono::microseconds>( t2 - t1 ).count();
              
                qDebug() << "time QRegularExpression: " << duration ;
              }
              

              Result (in microseconds, for 10000 iterations per button click):
              time: 9001
              time: 8002
              time: 8001
              time: 8004
              time: 8001
              time: 8001
              time: 7995
              time: 8001
              time: 8001
              time: 8001
              time QRegularExpression: 161033
              time QRegularExpression: 162033
              time QRegularExpression: 161032
              time QRegularExpression: 161032
              time QRegularExpression: 162032
              time QRegularExpression: 162032
              time QRegularExpression: 162033

                VRonin
                wrote on last edited by
                #8

                @mrjj Was that debug or release mode?

                  mrjj
                  Lifetime Qt Champion
                  wrote on last edited by mrjj
                  #9

                  @VRonin
                  Both were debug builds (but I just ran them, not under the debugger).
                  Do you think it affects the results unevenly?
                  I will try in release just to be sure.

                    Eeli K
                    wrote on last edited by
                    #10

                    @mrjj I think it's fair to optimize a bit:

                    QRegularExpression re{"(.{3})"};
                    high_resolution_clock::time_point t1 = high_resolution_clock::now();
                    ...
                    ...QString("ACGTATAGTACGTACG").replace(re, "\\1 ");
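
For what it's worth, the same "build the pattern once, reuse it" idea can be sketched in standard C++ as well. This is only an illustration: it uses std::regex instead of QRegularExpression so that it compiles without Qt, and the helper name `spaceAll` is made up.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: applies one pre-built regex to many inputs,
// so the pattern is compiled once rather than once per input.
std::vector<std::string> spaceAll(const std::vector<std::string>& inputs) {
    static const std::regex groupRe("(.{3})");  // constructed on the first call only
    std::vector<std::string> out;
    out.reserve(inputs.size());
    for (const std::string& s : inputs) {
        // Same replacement as in the thread: every complete group of 3 gets a space after it.
        out.push_back(std::regex_replace(s, groupRe, "$1 "));
    }
    return out;
}
```

With the construction outside the loop, the timing covers only the matching and replacing, which is the part being compared against the insert() loop.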
                    
                      mrjj
                      Lifetime Qt Champion
                      wrote on last edited by
                      #11

                      @Eeli-K
                      Yes, it is fairer to take the construction of "re" out of the timed loop.
                      I will try that as well.

                        kshegunov
                        Moderators
                        wrote on last edited by
                        #12

                        Designing benchmarks isn't exactly trivial, but I'd suggest something too (the raw insert will probably outperform the regular expression, but still, for the sake of argument):

                        Don't use the same fixed-size input string; use input that ranges from very short to very long. And do the benchmarking in batches, e.g. run the same benchmark at least 30-40 times and record the time for each run; then you'd get data that can be put into a histogram and analysed statistically.
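
A rough sketch of such a batch harness, assuming plain std::chrono and std::string so that it compiles without Qt. The name `timeRuns`, the synthetic all-'A' input and the choice of microseconds are illustrative assumptions, not something taken from the tests posted in this thread.

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical harness: times `fn` on an input of `length` characters,
// `runs` times, and returns one duration per run in microseconds.
template <typename Fn>
std::vector<long long> timeRuns(Fn fn, std::size_t length, int runs) {
    const std::string input(length, 'A');  // synthetic sequence of the requested length
    std::vector<long long> samples;
    samples.reserve(static_cast<std::size_t>(runs > 0 ? runs : 0));
    for (int r = 0; r < runs; ++r) {
        const auto t1 = std::chrono::steady_clock::now();
        volatile std::size_t sink = fn(input).size();  // keep the result observable
        (void)sink;
        const auto t2 = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count());
    }
    return samples;
}
```

Calling it for several lengths (say 16, 1024 and a few million characters) with 30-40 runs each gives one sample vector per length and per candidate function, which can then go into a histogram.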

                          mrjj
                          Lifetime Qt Champion
                          wrote on last edited by
                          #13

                          @kshegunov
                          Yep, varying the input length might alter the result significantly, so I will try that too.

                            dridk2
                            wrote on last edited by
                            #14

                            Oh, I was not notified by email of all your answers! Thanks a lot! I will try it.
                            By the way, you can join the team for this small project!
                            https://github.com/labsquare/cuteFasta
                            Preview on Twitter: https://twitter.com/labsquare/status/884146483406266368

                              reena jaus
                              wrote on last edited by
                              #15

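                              Try this: just modify your for loop so that it steps by 3 and compares i
                              against the size the array had before any insert (seq.size() grows with every inserted space).
                              That's it, enjoy:
                              QByteArray seq = "ACGTATAGTACGTACG";

                              const int originalSize = seq.size();
                              int cnt = 0;
                              
                              for(int i = 3; i < originalSize; i += 3){
                                  seq.insert(i + cnt++, ' ');
                              }
                              
                              qDebug() << seq; // "ACG TAT AGT ACG TAC G"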
                              