
add Space character every 3 letters

15 Posts 7 Posters 6.6k Views 3 Watching
  • dridk2 wrote:

    Hi,

    I am making a DNA viewer which displays a sequence the way a hexadecimal viewer does.
    So, I would like to add a space every 3 letters in my QByteArray.

        QByteArray seq = "ACGTATAGTACGTACG"
        seq = transform(seq,3)
        seq = "ACG TAT AGT ACG TAC"
    

    What is the most efficient way to do that? QString and QByteArray have many methods.

    kshegunov (Moderators) wrote (#3):

    I don't know if it's efficient enough, but it's certainly a one-liner:

    QString split = QString("ACGTATAGTACGTACG").replace(QRegularExpression("(.{3})"), "\\1 ");
    

    Read and abide by the Qt Code of Conduct

      Taz742 wrote (#4):

      @kshegunov
      Your code is small and effective.
      Are my variant and yours probably about the same in running time?

      Do what you want.

      • VRonin wrote (#5, last edited by VRonin):

        Since you are using QByteArray (i.e. 1 character is 1 byte), you can probably optimise it using std::memcpy on the data() pointer.

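        // Requires <cstring> (std::memcpy) and <iterator> (std::distance).
        QByteArray transform(const QByteArray& seq, int span){
            if(seq.isEmpty() || span<=0) return QByteArray();
            const int oldArrSize = seq.size();
            // Pre-fill the result with spaces: one separator per full group,
            // minus the trailing one when the size is an exact multiple of span.
            QByteArray result(oldArrSize + (oldArrSize/span) - (oldArrSize%span==0), ' ');
            auto sourceIter = seq.cbegin();
            auto destIter = result.data();
            const auto srcEnd = seq.cend();
            // Copy one group at a time; the pre-filled space after each group is skipped.
            for(int dstnc = std::distance(sourceIter,srcEnd); dstnc>0; dstnc-=span){
                std::memcpy(destIter, sourceIter, qMin(dstnc,span));
                destIter += span+1;
                sourceIter += span;
            }
            return result;
        }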
        

        EDIT:

        The code I had before broke memory if seq.size()%span!=0
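
        The size expression in the function above is easy to get wrong, so here is a small compile-time check of it. This is only a sketch: spacedSize is a hypothetical helper that mirrors the expression, not something from the posted code.

```cpp
// Hypothetical helper mirroring the result-size expression used above:
// every input byte, plus one space per full group, minus the trailing
// space when the input length is an exact multiple of span.
constexpr int spacedSize(int size, int span) {
    return size + size / span - (size % span == 0 ? 1 : 0);
}

// 16 letters in groups of 3: "ACG TAT AGT ACG TAC G" has 16 letters + 5 spaces.
static_assert(spacedSize(16, 3) == 21, "partial last group keeps its leading space");
// 15 letters in groups of 3: "ACG TAT AGT ACG TAC" has 15 letters + 4 spaces.
static_assert(spacedSize(15, 3) == 19, "exact multiple has no trailing space");
```

        The case seq.size()%span!=0 mentioned in the edit is the first assertion: the last, shorter group still needs the space in front of it.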

        "La mort n'est rien, mais vivre vaincu et sans gloire, c'est mourir tous les jours"
        ~Napoleon Bonaparte

        On a crusade to banish setIndexWidget() from the holy land of Qt

          kshegunov (Moderators) wrote (#6, last edited by kshegunov):

          @Taz742 said in add Space character every 3 letters:

          Are my variant and yours probably about the same in running time?

          I'd even speculate mine may be faster, even though it uses a regular expression. The problem with your piece of code is that at each insert of a new space you're copying the data after that position - the data has to be shifted, which might be rather heavy. The regular expression code (assuming it can optimize the expression well internally) can do it with a single memory allocation. In fact your code can be modified so it uses one allocation, by just using a resulting byte array and copying the data in chunks of 3 bytes, then setting a space, and then repeating.

          Edit: My view hadn't updated, basically what @VRonin wrote is what I was talking about.
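
          The "one allocation, copy in chunks, then set a space" idea described in this post can be sketched roughly as follows. To keep it compilable without Qt it uses std::string instead of QByteArray, which is an assumption of this sketch, and spaceEvery is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the thread): puts a space between groups of
// `span` characters, reserving the final size once instead of shifting the
// tail of the buffer on every insert.
std::string spaceEvery(const std::string& seq, std::size_t span) {
    if (seq.empty() || span == 0) return std::string();
    std::string result;
    // All input bytes plus one space between consecutive groups.
    result.reserve(seq.size() + (seq.size() - 1) / span);
    for (std::size_t pos = 0; pos < seq.size(); pos += span) {
        if (pos != 0) result.push_back(' ');                        // separator before every group but the first
        result.append(seq, pos, std::min(span, seq.size() - pos));  // copy one chunk
    }
    return result;
}
```

          Calling spaceEvery("ACGTATAGTACGTACG", 3) gives "ACG TAT AGT ACG TAC G". Each input byte is copied once, whereas repeated insert() calls move everything after the insertion point each time.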

          • mrjj (Lifetime Qt Champion) wrote (#7):

            Hi,
            A quick test. It might have logical issues; it is just for fun.

            using namespace std::chrono;
            void MainWindow::on_pushButton_clicked() {
              high_resolution_clock::time_point t1 = high_resolution_clock::now();
            
              for (int var = 0; var < 10000; ++var) {
                int cnt = 0;
                QByteArray seq = "ACGTATAGTACGTACG";
                for(int i = 3; i < seq.size() - 3; i++) {
                  if(i % 3 == 0) {
                    seq.insert(i + cnt++, ' ');
                  }
                }
              }
              high_resolution_clock::time_point t2 = high_resolution_clock::now();
            
              auto duration = std::chrono::duration_cast<std::chrono::microseconds>( t2 - t1 ).count();
            
              qDebug() << "time: " << duration ;
            }
            
            void MainWindow::on_pushButton_2_clicked() {
            
              high_resolution_clock::time_point t1 = high_resolution_clock::now();
            
              for (int var = 0; var < 10000; ++var) {
                QString split = QString("ACGTATAGTACGTACG").replace(QRegularExpression("(.{3})"), "\\1 ");
              }
              high_resolution_clock::time_point t2 = high_resolution_clock::now();
            
              auto duration = std::chrono::duration_cast<std::chrono::microseconds>( t2 - t1 ).count();
            
              qDebug() << "time QRegularExpression: " << duration ;
            }
            

            Result (microseconds per 10000 iterations):
            time: 9001
            time: 8002
            time: 8001
            time: 8004
            time: 8001
            time: 8001
            time: 7995
            time: 8001
            time: 8001
            time: 8001
            time QRegularExpression: 161033
            time QRegularExpression: 162033
            time QRegularExpression: 161032
            time QRegularExpression: 161032
            time QRegularExpression: 162032
            time QRegularExpression: 162032
            time QRegularExpression: 162033

              VRonin wrote (#8):

              @mrjj Was that debug or release mode?

                mrjj (Lifetime Qt Champion) wrote (#9, last edited by mrjj):

                @VRonin
                Both were debug builds (just run normally, not under the debugger).
                Do you think it affects the results unevenly?
                I will try in release just to be sure.

                  Eeli K wrote (#10):

                  @mrjj I think it's fair to optimize a bit:

                  QRegularExpression re{"(.{3})"};
                  high_resolution_clock::time_point t1 = high_resolution_clock::now();
                  ...
                  ...QString("ACGTATAGTACGTACG").replace(re, "\\1 ");
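
                  The same "build the expression once, outside the timed section" idea can be tried without Qt. The sketch below uses std::regex as a stand-in for QRegularExpression, which is an assumption of this example; spaceEveryThree is a hypothetical helper, and "$1 " is the std::regex counterpart of the "\\1 " replacement used above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: the compiled regex is passed in, so the caller
// constructs it once and reuses it for every call inside the timed loop.
std::string spaceEveryThree(const std::string& seq, const std::regex& re) {
    // "$1 " re-emits the captured three characters followed by a space.
    return std::regex_replace(seq, re, "$1 ");
}
```

                  A caller would create const std::regex re("(.{3})"); before taking t1 and pass it to every call. Note that, like the one-liner, this appends a trailing space when the length is an exact multiple of 3.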
                  
                    mrjj (Lifetime Qt Champion) wrote (#11):

                    @Eeli-K
                    Yes, it is fairer to move the construction of "re" out of the timed section.
                    I will try that also.

                      kshegunov (Moderators) wrote (#12):

                      Designing benchmarking tests isn't exactly trivial, but I'd suggest something too (probably the raw insert will outperform the rx, but still for the sake of argument):

                      Don't use the same fixed-size input string; use input that ranges from very short to very long. And do the benchmarking in batches, e.g. run the same benchmark at least 30-40 times and record the time for each run. Then you get data that can be put into a histogram and analysed statistically.
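
                      A minimal sketch of such a batch harness, under the assumption that the function under test takes and returns a std::string. The names timeRuns and makeSequence are hypothetical and not from the thread.

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical harness: calls fn(input) `runs` times and returns the duration
// of each run in microseconds, so the samples can be analysed afterwards.
template <typename Fn>
std::vector<long long> timeRuns(Fn fn, const std::string& input, int runs) {
    std::vector<long long> samples;
    if (runs > 0) samples.reserve(static_cast<std::size_t>(runs));
    for (int r = 0; r < runs; ++r) {
        const auto t1 = std::chrono::steady_clock::now();
        // Store the result size in a volatile so the call is not optimised away.
        volatile std::size_t sink = fn(input).size();
        (void)sink;
        const auto t2 = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count());
    }
    return samples;
}

// Builds a test sequence of the requested length by cycling through "ACGT".
std::string makeSequence(std::size_t length) {
    static const char bases[] = "ACGT";
    std::string s(length, 'A');
    for (std::size_t i = 0; i < length; ++i) s[i] = bases[i % 4];
    return s;
}
```

                      One would call timeRuns for each candidate with, say, makeSequence(16), makeSequence(1000) and makeSequence(1000000), using 30-40 runs each, and compare the distributions rather than single numbers.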

                        mrjj (Lifetime Qt Champion) wrote (#13):

                        @kshegunov
                        Yep, varying input lengths might alter the result significantly, so I will try that too.

                        • dridk2 wrote (#14):

                          Oh, I was not notified by email of all your answers! Thanks a lot! I will try it.
                          By the way, you can join the team for this small project !
                          https://github.com/labsquare/cuteFasta
                          Preview on twitter : https://twitter.com/labsquare/status/884146483406266368

                          • reena jaus wrote (#15):

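                            Try this: just modify the bound of your for loop. Note that seq.size() grows with every insert, so the bound has to be the original length, taken before the loop; comparing against the live seq.size() keeps inserting past the end of the data.
                            That's it, enjoy.
                            QByteArray seq = "ACGTATAGTACGTACG";

                            int cnt = 0;
                            const auto len = seq.size(); // original length, fixed before any insert

                            for(int i = 3; i < len; i += 3){
                                seq.insert(i + cnt++, ' '); // cnt accounts for the spaces already inserted
                            }

                            qDebug() << seq; // "ACG TAT AGT ACG TAC G"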
                            