Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • Users
  • Groups
  • Search
  • Get Qt Extensions
  • Unsolved
Collapse
Brand Logo
  1. Home
  2. Special Interest Groups
  3. C++ Gurus
  4. casts, if's and array-access.
Forum Updated to NodeBB v4.3 + New Features

casts, if's and array-access.

Scheduled Pinned Locked Moved Unsolved C++ Gurus
16 Posts 5 Posters 4.0k Views 4 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • sierdzioS Offline
    sierdzioS Offline
    sierdzio
    Moderators
    wrote on last edited by
    #2

    Assuming your variables are unsigned, this should be enough:

    if (x < 16 && y < 16 && z < 256)
    

    All in a single statement. To gain a little bit more time, you can pass x, y, z as const refs or even do perfect forwarding. And of course, you can compile with -O3 to get more compiler optimisations.

    However, what I recommend you do first (before even starting any implementation) is to write and run a benchmark on your existing code. Only when you have statistically significant data of your current solution, you will be able to test and see if other solutions really win you anything.

    (Z(:^

    J.HilkJ 1 Reply Last reply
    2
    • sierdzioS sierdzio

      Assuming your variables are unsigned, this should be enough:

      if (x < 16 && y < 16 && z < 256)
      

      All in a single statement. To gain a little bit more time, you can pass x, y, z as const refs or even do perfect forwarding. And of course, you can compile with -O3 to get more compiler optimisations.

      However, what I recommend you do first (before even starting any implementation) is to write and run a benchmark on your existing code. Only when you have statistically significant data of your current solution, you will be able to test and see if other solutions really win you anything.

      J.HilkJ Offline
      J.HilkJ Offline
      J.Hilk
      Moderators
      wrote on last edited by J.Hilk
      #3

      @sierdzio

      HI, thanks for the answer.

      Are

      if (x < 16 && y < 16 && z < 256)
      

      and

      if(x <16){
          if(y<16){
              if(z<256){
                   ....
              }
          }
      }
      

      handled differntly? I assumed the compiler would reduce both down to the same process steps.

      I still have in the back of my mind, from one of the earliest informatics lessions, that comparing is one of the heaviest operations, as in it takes the most steps.
      Where as casting - e.g int to short - is one step, simply dropping part of the allocated memory.

      But that might also be remnants of my assambler days...

      You're right, I should do some testing before going forward.


      Be aware of the Qt Code of Conduct, when posting : https://forum.qt.io/topic/113070/qt-code-of-conduct


      Q: What's that?
      A: It's blue light.
      Q: What does it do?
      A: It turns blue.

      sierdzioS jsulmJ 2 Replies Last reply
      0
      • J.HilkJ J.Hilk

        @sierdzio

        HI, thanks for the answer.

        Are

        if (x < 16 && y < 16 && z < 256)
        

        and

        if(x <16){
            if(y<16){
                if(z<256){
                     ....
                }
            }
        }
        

        handled differntly? I assumed the compiler would reduce both down to the same process steps.

        I still have in the back of my mind, from one of the earliest informatics lessions, that comparing is one of the heaviest operations, as in it takes the most steps.
        Where as casting - e.g int to short - is one step, simply dropping part of the allocated memory.

        But that might also be remnants of my assambler days...

        You're right, I should do some testing before going forward.

        sierdzioS Offline
        sierdzioS Offline
        sierdzio
        Moderators
        wrote on last edited by
        #4

        @J.Hilk said in casts, if's and array-access.:

        Are [...] handled differntly? I assumed the compiler would reduce both down to the same process steps.

        Check assembly output to verify, I'm not sure. If you compress them into single statement you definitely get the result.

        I still have in the back of my mind, from one of the earliest informatics lessions, that comparing is one of the heaviest operations, as in it takes the most steps.
        Where as casting - e.g int to short - is one step, simply dropping part of the allocated memory.

        I think modern CPUs are heavily optimised in that area. Maybe you are right, I have not checked. That's why I recommend benchmarking, both the execution time and CPU ticks.

        Another, perhaps a bit crazy, idea: use SIMD instructions, namely vec_cmpgt(a, b) to compare all 3 values in a single CPU operation. More info here.

        (Z(:^

        1 Reply Last reply
        3
        • J.HilkJ J.Hilk

          @sierdzio

          HI, thanks for the answer.

          Are

          if (x < 16 && y < 16 && z < 256)
          

          and

          if(x <16){
              if(y<16){
                  if(z<256){
                       ....
                  }
              }
          }
          

          handled differntly? I assumed the compiler would reduce both down to the same process steps.

          I still have in the back of my mind, from one of the earliest informatics lessions, that comparing is one of the heaviest operations, as in it takes the most steps.
          Where as casting - e.g int to short - is one step, simply dropping part of the allocated memory.

          But that might also be remnants of my assambler days...

          You're right, I should do some testing before going forward.

          jsulmJ Offline
          jsulmJ Offline
          jsulm
          Lifetime Qt Champion
          wrote on last edited by
          #5

          @J.Hilk "maybe casting the values to something in range would do the trick" - in this case if the caller passes an invalid index it will still get something, but is this intended behaviour? What would you do with this three if statements if one of the indexes is out of range? Throw an exception? Return default value? ...?

          https://forum.qt.io/topic/113070/qt-code-of-conduct

          J.HilkJ 1 Reply Last reply
          3
          • jsulmJ jsulm

            @J.Hilk "maybe casting the values to something in range would do the trick" - in this case if the caller passes an invalid index it will still get something, but is this intended behaviour? What would you do with this three if statements if one of the indexes is out of range? Throw an exception? Return default value? ...?

            J.HilkJ Offline
            J.HilkJ Offline
            J.Hilk
            Moderators
            wrote on last edited by
            #6

            @jsulm it returns a default value, 0 is this case.

            @sierdzio I'll look into SIMD instructions, sounds at the very least, interesting;-)


            Be aware of the Qt Code of Conduct, when posting : https://forum.qt.io/topic/113070/qt-code-of-conduct


            Q: What's that?
            A: It's blue light.
            Q: What does it do?
            A: It turns blue.

            jsulmJ kshegunovK 2 Replies Last reply
            0
            • J.HilkJ J.Hilk

              @jsulm it returns a default value, 0 is this case.

              @sierdzio I'll look into SIMD instructions, sounds at the very least, interesting;-)

              jsulmJ Offline
              jsulmJ Offline
              jsulm
              Lifetime Qt Champion
              wrote on last edited by
              #7

              @J.Hilk I don't think SIMD will improve anything in this case. SIMD helps if you have many iterations but in this case it is only one. It will probably take longer to load the data into appropriate SIMD registers then do those 3 if's.

              https://forum.qt.io/topic/113070/qt-code-of-conduct

              1 Reply Last reply
              3
              • J.HilkJ J.Hilk

                @jsulm it returns a default value, 0 is this case.

                @sierdzio I'll look into SIMD instructions, sounds at the very least, interesting;-)

                kshegunovK Offline
                kshegunovK Offline
                kshegunov
                Moderators
                wrote on last edited by kshegunov
                #8

                You could in principle go with one if, but as @sierdzio said: benchmark your code first and make sure this is a bottle neck. Here's one suggestion how you could "simplify" the checking:

                T m_data[15][15][255]
                static const T * const end = &m_data[14][14][254];
                
                // Then in the function that makes the check:
                T * item = &m_data[x][y][z]; //< You may need to modify this, so there's no dereferencing, but you get the point
                return item > end ? defaultValue : *item;
                

                Read and abide by the Qt Code of Conduct

                1 Reply Last reply
                3
                • J.HilkJ Offline
                  J.HilkJ Offline
                  J.Hilk
                  Moderators
                  wrote on last edited by
                  #9

                  Thanks to everyone for the contribution.

                  as suggested I did some testing, with the 3 methodes discuss in this thread.

                  I'll post my code, and my result for further discussions:
                  Its my first test in this direction so I'm not sure if I've done everything like I should have.

                  I forced the random numbers to be valid, to guarantee the max time for each validation check in each method. I also fell back to Qt for this...

                  //storage  Class 
                  //-----------------------------------------headers-------------------------
                  #ifndef DATASTORAGE_H
                  #define DATASTORAGE_H
                  
                  #include <QObject>
                  
                  class dataStorage : public QObject
                  {
                      Q_OBJECT
                  public:
                      explicit dataStorage(QObject *parent = nullptr);
                  
                      uint64_t &getvalueMethod_1(const uint &x, const uint &y, const uint &z);
                      uint64_t &getvalueMethod_2(const uint &x, const uint &y, const uint &z);
                      uint64_t &getvalueMethod_3(const uint &x, const uint &y, const uint &z);
                  signals:
                  
                  public slots:
                  
                  protected:
                      uint64_t m_data[16][16][256];
                      const uint64_t *m_last;
                  };
                  
                  #endif // DATASTORAGE_H
                  
                  //------------------------------Sources------------------------
                  
                  #include "datastorage.h"
                  
                  dataStorage::dataStorage(QObject *parent) : QObject(parent)
                  {
                      for(int x(0); x < 16;x++)
                          for(int y(0); y < 16; y++)
                              for(int z(0); z<256;z++)
                                  m_data[x][y][z] = qrand();
                      m_last = &m_data[15][15][255];
                  }
                  
                  uint64_t &dataStorage::getvalueMethod_1(const uint &x, const uint &y, const uint &z)
                  {
                      if(x < 16)
                          if(y <16)
                              if(z < 256)
                                  return m_data[x][y][z];
                      return m_data[0][0][0];
                  }
                  
                  uint64_t &dataStorage::getvalueMethod_2(const uint &x, const uint &y, const uint &z)
                  {
                      if(x <16 && y < 16 && z < 256)
                          return m_data[x][y][z];
                      return m_data[0][0][0];
                  }
                  
                  uint64_t &dataStorage::getvalueMethod_3(const uint &x, const uint &y, const uint &z)
                  {
                      uint64_t *item = &m_data[x][y][z];
                      return item > m_last ? m_data[0][0][0] : *item;
                  }
                  
                  // -------------------------------------Main---------------------------------
                  #include "datastorage.h"
                  
                  #include <QDebug>
                  #include <QVector>
                  #include <QElapsedTimer>
                  
                  uint getRand(int max, int min){
                      return rand()%(max-min+1)+min;
                  }
                  
                  int main(int argc, char *argv[])
                  {
                      QCoreApplication a(argc, argv);
                  
                      dataStorage dStorage;
                  
                      QVector<QVector<uint> > vindex;
                  
                      for(uint32_t i = 0; i < 10000000; i++ )
                          vindex.append(QVector<uint>({getRand(15,0),getRand(15,0),getRand(255,0)}));
                  
                      QElapsedTimer eTime;
                  
                      uint64_t value;
                      qint64 average1(0),average2(0),average3(0);
                  
                      for(int k(0); k <20; k++){
                          qDebug() << "Test-cycle:" << k+1;
                          eTime.start();
                          for(int j(0); j <10; j++)
                              for(uint32_t i = 0; i < 10000000;i++){
                                  value = dStorage.getvalueMethod_1(vindex.at(i).at(0),vindex.at(i).at(1), vindex.at(i).at(2));
                              }
                          qDebug() << "Time for Method 1: "<< eTime.elapsed();
                          average1 += eTime.elapsed();
                  
                          eTime.start();
                          for(int j(0); j <10; j++)
                              for(uint32_t i = 0; i < 10000000;i++){
                                  value = dStorage.getvalueMethod_2(vindex.at(i).at(0),vindex.at(i).at(1), vindex.at(i).at(2));
                              }
                          qDebug() << "Time for Method 2: "<< eTime.elapsed();
                          average2 += eTime.elapsed();
                  
                          eTime.start();
                          for(int j(0); j <10; j++)
                              for(uint32_t i = 0; i < 10000000;i++){
                                  value = dStorage.getvalueMethod_3(vindex.at(i).at(0),vindex.at(i).at(1), vindex.at(i).at(2));
                              }
                          qDebug() << "Time for Method 3: "<< eTime.elapsed();
                          average3 += eTime.elapsed();
                      }
                  
                      average1 = average1/20; average2 = average2/20; average3 = average3/20;
                      qDebug() << "Average Time for Method 1:" << average1;
                      qDebug() << "Average Time for Method 2:" << average2;
                      qDebug() << "Average Time for Method 3:" << average3;
                  
                      return a.exec();
                  }
                  
                  

                  the result was:

                  • Average Time for Method 1: 11757ms
                  • Average Time for Method 2: 11587ms
                  • Average Time for Method 3: 12012ms

                  My conclusion of this would be:
                  Overall the 3 methods are equally fast!
                  With method 3, the pointer check, being slightly slower, and the inline if-check beeing the fastest


                  Be aware of the Qt Code of Conduct, when posting : https://forum.qt.io/topic/113070/qt-code-of-conduct


                  Q: What's that?
                  A: It's blue light.
                  Q: What does it do?
                  A: It turns blue.

                  kshegunovK 1 Reply Last reply
                  1
                  • J.HilkJ J.Hilk

                    Thanks to everyone for the contribution.

                    as suggested I did some testing, with the 3 methodes discuss in this thread.

                    I'll post my code, and my result for further discussions:
                    Its my first test in this direction so I'm not sure if I've done everything like I should have.

                    I forced the random numbers to be valid, to guarantee the max time for each validation check in each method. I also fell back to Qt for this...

                    //storage  Class 
                    //-----------------------------------------headers-------------------------
                    #ifndef DATASTORAGE_H
                    #define DATASTORAGE_H
                    
                    #include <QObject>
                    
                    class dataStorage : public QObject
                    {
                        Q_OBJECT
                    public:
                        explicit dataStorage(QObject *parent = nullptr);
                    
                        uint64_t &getvalueMethod_1(const uint &x, const uint &y, const uint &z);
                        uint64_t &getvalueMethod_2(const uint &x, const uint &y, const uint &z);
                        uint64_t &getvalueMethod_3(const uint &x, const uint &y, const uint &z);
                    signals:
                    
                    public slots:
                    
                    protected:
                        uint64_t m_data[16][16][256];
                        const uint64_t *m_last;
                    };
                    
                    #endif // DATASTORAGE_H
                    
                    //------------------------------Sources------------------------
                    
                    #include "datastorage.h"
                    
                    dataStorage::dataStorage(QObject *parent) : QObject(parent)
                    {
                        for(int x(0); x < 16;x++)
                            for(int y(0); y < 16; y++)
                                for(int z(0); z<256;z++)
                                    m_data[x][y][z] = qrand();
                        m_last = &m_data[15][15][255];
                    }
                    
                    uint64_t &dataStorage::getvalueMethod_1(const uint &x, const uint &y, const uint &z)
                    {
                        if(x < 16)
                            if(y <16)
                                if(z < 256)
                                    return m_data[x][y][z];
                        return m_data[0][0][0];
                    }
                    
                    uint64_t &dataStorage::getvalueMethod_2(const uint &x, const uint &y, const uint &z)
                    {
                        if(x <16 && y < 16 && z < 256)
                            return m_data[x][y][z];
                        return m_data[0][0][0];
                    }
                    
                    uint64_t &dataStorage::getvalueMethod_3(const uint &x, const uint &y, const uint &z)
                    {
                        uint64_t *item = &m_data[x][y][z];
                        return item > m_last ? m_data[0][0][0] : *item;
                    }
                    
                    // -------------------------------------Main---------------------------------
                    #include "datastorage.h"
                    
                    #include <QDebug>
                    #include <QVector>
                    #include <QElapsedTimer>
                    
                    uint getRand(int max, int min){
                        return rand()%(max-min+1)+min;
                    }
                    
                    int main(int argc, char *argv[])
                    {
                        QCoreApplication a(argc, argv);
                    
                        dataStorage dStorage;
                    
                        QVector<QVector<uint> > vindex;
                    
                        for(uint32_t i = 0; i < 10000000; i++ )
                            vindex.append(QVector<uint>({getRand(15,0),getRand(15,0),getRand(255,0)}));
                    
                        QElapsedTimer eTime;
                    
                        uint64_t value;
                        qint64 average1(0),average2(0),average3(0);
                    
                        for(int k(0); k <20; k++){
                            qDebug() << "Test-cycle:" << k+1;
                            eTime.start();
                            for(int j(0); j <10; j++)
                                for(uint32_t i = 0; i < 10000000;i++){
                                    value = dStorage.getvalueMethod_1(vindex.at(i).at(0),vindex.at(i).at(1), vindex.at(i).at(2));
                                }
                            qDebug() << "Time for Method 1: "<< eTime.elapsed();
                            average1 += eTime.elapsed();
                    
                            eTime.start();
                            for(int j(0); j <10; j++)
                                for(uint32_t i = 0; i < 10000000;i++){
                                    value = dStorage.getvalueMethod_2(vindex.at(i).at(0),vindex.at(i).at(1), vindex.at(i).at(2));
                                }
                            qDebug() << "Time for Method 2: "<< eTime.elapsed();
                            average2 += eTime.elapsed();
                    
                            eTime.start();
                            for(int j(0); j <10; j++)
                                for(uint32_t i = 0; i < 10000000;i++){
                                    value = dStorage.getvalueMethod_3(vindex.at(i).at(0),vindex.at(i).at(1), vindex.at(i).at(2));
                                }
                            qDebug() << "Time for Method 3: "<< eTime.elapsed();
                            average3 += eTime.elapsed();
                        }
                    
                        average1 = average1/20; average2 = average2/20; average3 = average3/20;
                        qDebug() << "Average Time for Method 1:" << average1;
                        qDebug() << "Average Time for Method 2:" << average2;
                        qDebug() << "Average Time for Method 3:" << average3;
                    
                        return a.exec();
                    }
                    
                    

                    the result was:

                    • Average Time for Method 1: 11757ms
                    • Average Time for Method 2: 11587ms
                    • Average Time for Method 3: 12012ms

                    My conclusion of this would be:
                    Overall the 3 methods are equally fast!
                    With method 3, the pointer check, being slightly slower, and the inline if-check beeing the fastest

                    kshegunovK Offline
                    kshegunovK Offline
                    kshegunov
                    Moderators
                    wrote on last edited by kshegunov
                    #10

                    Can you pull the assembly for those three.
                    Additional notes:

                    1. You will want to make them methods inline, otherwise I feel the call instruction cost is pretty significant compared to the function's body (and also the QVector::at where the arguments are taken from the precomputed array).
                      Edit: Have also a brief look at this table

                    2. I couldn't see getRand defined, but if it does what I think it does, then you have a systematic error - you're accessing only real elements of the array (not out-of-bound ones). Which may put test 3 at a disadvantage (or it may not, I'm not sure either way).

                    3. There's no difference and can never be any between:

                    if(x < 16)
                            if(y <16)
                                if(z < 256)
                                    return m_data[x][y][z];
                    
                    if(x <16 && y < 16 && z < 256)
                    

                    which is due to the short-circuit evaluation of && and ||. That's also the most probable reason you get virtually identical results for test 1 and 2.

                    Read and abide by the Qt Code of Conduct

                    J.HilkJ 1 Reply Last reply
                    1
                    • kshegunovK kshegunov

                      Can you pull the assembly for those three.
                      Additional notes:

                      1. You will want to make them methods inline, otherwise I feel the call instruction cost is pretty significant compared to the function's body (and also the QVector::at where the arguments are taken from the precomputed array).
                        Edit: Have also a brief look at this table

                      2. I couldn't see getRand defined, but if it does what I think it does, then you have a systematic error - you're accessing only real elements of the array (not out-of-bound ones). Which may put test 3 at a disadvantage (or it may not, I'm not sure either way).

                      3. There's no difference and can never be any between:

                      if(x < 16)
                              if(y <16)
                                  if(z < 256)
                                      return m_data[x][y][z];
                      
                      if(x <16 && y < 16 && z < 256)
                      

                      which is due to the short-circuit evaluation of && and ||. That's also the most probable reason you get virtually identical results for test 1 and 2.

                      J.HilkJ Offline
                      J.HilkJ Offline
                      J.Hilk
                      Moderators
                      wrote on last edited by
                      #11

                      hi, @kshegunov

                      like you suggested, I wrapped the getvalueMethod_X functions in an inline statement.

                      getRand is defined right above main:

                      uint getRand(int max, int min){
                          return rand()%(max-min+1)+min;
                      }
                      

                      But you're right it would be better to also have values that are outside the intended scope:
                      so I made the following changes:

                      for(uint32_t i = 0; i < 3333333; i++ )
                              vindex.append(QVector<uint>({getRand(15,0),getRand(15,0),getRand(255,0)}));
                          for(uint32_t i = 3333333; i < 6666666; i++ )
                              vindex.append(QVector<uint>({getRand(30,15),getRand(15,0),getRand(255,0)}));
                          for(uint32_t i = 6666666; i < 10000000; i++ )
                              vindex.append(QVector<uint>({getRand(15,0),getRand(15,0),getRand(400,254)}));
                      

                      and pushed the number of cycles to 60.

                      I'm however not sure, what you mean with:

                      and also the QVector::at where the arguments are taken from the precomputed array
                      

                      AFAIK uint64_t m_data[16][16][256]; doesn't have an at opperator. I changed the initializaion however, for the CPU to have one less step:

                      inline uint64_t &getvalueMethod_3(const uint &x, const uint &y, const uint &z)
                          {
                              uint64_t *item(&m_data[x][y][z]);
                              return item > m_last ? m_data[0][0][0] : *item;
                          }
                      

                      The result with those changes:

                      • Average Time for Method 1: 10813
                      • Average Time for Method 2: 10819
                      • Average Time for Method 3: 11313

                      Overall 10% faster than previously, so the call instruction did indeed take up a significant amount of time!
                      Method 1 and 2 are even more equal, like expected.
                      But no change for method 3.


                      Be aware of the Qt Code of Conduct, when posting : https://forum.qt.io/topic/113070/qt-code-of-conduct


                      Q: What's that?
                      A: It's blue light.
                      Q: What does it do?
                      A: It turns blue.

                      kshegunovK 1 Reply Last reply
                      1
                      • J.HilkJ J.Hilk

                        hi, @kshegunov

                        like you suggested, I wrapped the getvalueMethod_X functions in an inline statement.

                        getRand is defined right above main:

                        uint getRand(int max, int min){
                            return rand()%(max-min+1)+min;
                        }
                        

                        But you're right it would be better to also have values that are outside the intended scope:
                        so I made the following changes:

                        for(uint32_t i = 0; i < 3333333; i++ )
                                vindex.append(QVector<uint>({getRand(15,0),getRand(15,0),getRand(255,0)}));
                            for(uint32_t i = 3333333; i < 6666666; i++ )
                                vindex.append(QVector<uint>({getRand(30,15),getRand(15,0),getRand(255,0)}));
                            for(uint32_t i = 6666666; i < 10000000; i++ )
                                vindex.append(QVector<uint>({getRand(15,0),getRand(15,0),getRand(400,254)}));
                        

                        and pushed the number of cycles to 60.

                        I'm however not sure, what you mean with:

                        and also the QVector::at where the arguments are taken from the precomputed array
                        

                        AFAIK uint64_t m_data[16][16][256]; doesn't have an at opperator. I changed the initializaion however, for the CPU to have one less step:

                        inline uint64_t &getvalueMethod_3(const uint &x, const uint &y, const uint &z)
                            {
                                uint64_t *item(&m_data[x][y][z]);
                                return item > m_last ? m_data[0][0][0] : *item;
                            }
                        

                        The result with those changes:

                        • Average Time for Method 1: 10813
                        • Average Time for Method 2: 10819
                        • Average Time for Method 3: 11313

                        Overall 10% faster than previously, so the call instruction did indeed take up a significant amount of time!
                        Method 1 and 2 are even more equal, like expected.
                        But no change for method 3.

                        kshegunovK Offline
                        kshegunovK Offline
                        kshegunov
                        Moderators
                        wrote on last edited by kshegunov
                        #12

                        @J.Hilk said in casts, if's and array-access.:

                        I'm however not sure, what you mean with

                        I meant here:

                        value = dStorage.getvalueMethod_1(vindex.at(i).at(0),vindex.at(i).at(1), vindex.at(i).at(2));
                        

                        but I just realized QVector::at returns a const reference so nevermind that. However you may want to consider using a statically sized c-style array, as you may be getting a lot of cache misses due to the implicit sharing pointer indirection from the vector -
                        another thing to try.

                        But no change for method 3.

                        I never gave any guarantees it'd be faster; that's why you're profiling to begin with ;)
                        And while on the subject, did you get that assembly? And do say how do you compile your tests, with -g -O2?

                        Read and abide by the Qt Code of Conduct

                        J.HilkJ 1 Reply Last reply
                        1
                        • kshegunovK kshegunov

                          @J.Hilk said in casts, if's and array-access.:

                          I'm however not sure, what you mean with

                          I meant here:

                          value = dStorage.getvalueMethod_1(vindex.at(i).at(0),vindex.at(i).at(1), vindex.at(i).at(2));
                          

                          but I just realized QVector::at returns a const reference so nevermind that. However you may want to consider using a statically sized c-style array, as you may be getting a lot of cache misses due to the implicit sharing pointer indirection from the vector -
                          another thing to try.

                          But no change for method 3.

                          I never gave any guarantees it'd be faster; that's why you're profiling to begin with ;)
                          And while on the subject, did you get that assembly? And do say how do you compile your tests, with -g -O2?

                          J.HilkJ Offline
                          J.HilkJ Offline
                          J.Hilk
                          Moderators
                          wrote on last edited by
                          #13

                          @kshegunov

                          I actually tested it with '[]' calls and it doubled the execution time.

                          may be getting a lot of cache misses due to the implicit sharing pointer indirection from the vector

                          the variables are passed as:
                          inline uint64_t &getvalueMethod_2(const uint &x, const uint &y, const uint &z) with QVector::at returning a const reference I'm directly passing the memory, am I not?

                          And while on the subject, did you get that assembly?

                          I'm sorry, but I'm not familiar with that term! Its something I would do to start my lawn mower. But thats probably just me as a non native speaker x)

                          And do say how do you compile your tests, with -g -O2?

                          Haven't touch the default compiler settings. My project, that is potentially using this class, would use the default settings as well, so I thought better not to touch it.

                          I did however make an embarrassing mistake, that I shall not comment further on...

                          but I'm now down to 300-1000 nsec per function call. The 3 functions are virtually the same in access time.

                          Save to say: "Use whatever is better for your OCD-Code-structure!"


                          Be aware of the Qt Code of Conduct, when posting : https://forum.qt.io/topic/113070/qt-code-of-conduct


                          Q: What's that?
                          A: It's blue light.
                          Q: What does it do?
                          A: It turns blue.

                          kshegunovK 1 Reply Last reply
                          1
                          • sierdzioS Offline
                            sierdzioS Offline
                            sierdzio
                            Moderators
                            wrote on last edited by
                            #14

                            By assembly, @kshegunov meant this: https://stackoverflow.com/questions/137038/how-do-you-get-assembler-output-from-c-c-source-in-gcc

                            By comparing assembly code that the compiler produces we can really see how well optimized the code is after the compiler had done it's magic.

                            I did however make an embarrassing mistake, that I shall not comment further on...
                            but I'm now down to 300-1000 nsec per function call. The 3 functions are virtually the same in access time.

                            Oh now you've made me super curious! :-)

                            All in all, good to know that all these solutions are equally valid. We've all learned a bit here.

                            (Z(:^

                            1 Reply Last reply
                            1
                            • J.HilkJ J.Hilk

                              @kshegunov

                              I actually tested it with '[]' calls and it doubled the execution time.

                              may be getting a lot of cache misses due to the implicit sharing pointer indirection from the vector

                              the variables are passed as:
                              inline uint64_t &getvalueMethod_2(const uint &x, const uint &y, const uint &z) with QVector::at returning a const reference I'm directly passing the memory, am I not?

                              And while on the subject, did you get that assembly?

                              I'm sorry, but I'm not familiar with that term! Its something I would do to start my lawn mower. But thats probably just me as a non native speaker x)

                              And do say how do you compile your tests, with -g -O2?

                              Haven't touch the default compiler settings. My project, that is potentially using this class, would use the default settings as well, so I thought better not to touch it.

                              I did however make an embarrassing mistake, that I shall not comment further on...

                              but I'm now down to 300-1000 nsec per function call. The 3 functions are virtually the same in access time.

                              Save to say: "Use whatever is better for your OCD-Code-structure!"

                              kshegunovK Offline
                              kshegunovK Offline
                              kshegunov
                              Moderators
                              wrote on last edited by
                              #15

                              @J.Hilk said in casts, if's and array-access.:

                              I actually tested it with '[]' calls and it doubled the execution time.

                              This is very odd, are you sure you didn't do anything strange. I'd expect it to be a bit faster, but noticeable.

                              I'm directly passing the memory, am I not?

                              Yes, but the QVector has a data pointer inside which is there to implement implicit sharing. So you're actually getting something like:

                              QVectorData * d; //< Already in the vector implementation
                              return d->elements[index]; //< Indirection through the data pointer
                              

                              albeit most of it is probably inlined by the compiler. Still the memory fetch is there.

                              I'm sorry, but I'm not familiar with that term! Its something I would do to start my lawn mower. But thats probably just me as a non native speaker x)

                              Okay, I mean the compiled binary code - the assembly. You can get that from Creator by selecting Debug > Operate by instruction from the menu when you have an active debug session.

                              Haven't touch the default compiler settings.

                              Well, perhaps you should or rather create a profile target for Creator. It'd include -g -O2 switches if you work with gcc, meaning the compiler will generate debug info, but will also perform the standard release optimizations. The idea is to have the optimized code (what you'd have in release) without stripping all the debug info - allowing for a reasonable profiling.

                              but I'm now down to 300-1000 nsec per function call.

                              That's in agreement with the table I posted earlier - a couple or so (short) instructions; a few clock ticks.

                              The 3 functions are virtually the same in access time.

                              Don't rush it, we would still love to do some more dissecting ;P
                              I share @sierdzio's curiosity.

                              Read and abide by the Qt Code of Conduct

                              1 Reply Last reply
                              0
                              • SGaistS Offline
                                SGaistS Offline
                                SGaist
                                Lifetime Qt Champion
                                wrote on last edited by
                                #16

                                Hi,

                                Shouldn't it rather be:

                                QVector<int> paramVector = vindex.at(i);
                                value = dStorage.getvalueMethod_1(paramVector.at(0), paramVector.at(1), paramVector.at(2));
                                

                                to ensure the compiler can optimise access a bit ?

                                Interested in AI ? www.idiap.ch
                                Please read the Qt Code of Conduct - https://forum.qt.io/topic/113070/qt-code-of-conduct

                                1 Reply Last reply
                                1

                                • Login

                                • Login or register to search.
                                • First post
                                  Last post
                                0
                                • Categories
                                • Recent
                                • Tags
                                • Popular
                                • Users
                                • Groups
                                • Search
                                • Get Qt Extensions
                                • Unsolved