Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • Users
  • Groups
  • Search
  • Get Qt Extensions
  • Unsolved
Collapse
Brand Logo
  1. Home
  2. Qt Development
  3. General and Desktop
  4. How to update a large model efficiently

How to update a large model efficiently

Scheduled Pinned Locked Moved Solved General and Desktop
24 Posts 3 Posters 4.4k Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • ozcanayO Offline
    ozcanayO Offline
    ozcanay
    wrote on last edited by ozcanay
    #1

    I am storing hundreds of thousands of rows in a table model, and they are constantly updated frequently. To be able to update them, I provided a public API to the client code that basically iterates over QVector<MyObject> vector and updates the relevant object identifying it with "ID" field when the update arrives. So, this is an O(n) operation I believe so. I thought about having an additional helper data structure that is map of <ID, MyObject> so that I would be able to access the object associated with the given ID in O(1), however then I will not be able to know its row index that I will need to emit dataChanged signal. What is the best approach here so that my application does not slow down with the updates? Is it feasible or even possible to update the model in a separate thread, maybe?

    JonBJ 1 Reply Last reply
    0
    • ozcanayO ozcanay

      I am storing hundreds of thousands of rows in a table model, and they are constantly updated frequently. To be able to update them, I provided a public API to the client code that basically iterates over QVector<MyObject> vector and updates the relevant object identifying it with "ID" field when the update arrives. So, this is an O(n) operation I believe so. I thought about having an additional helper data structure that is map of <ID, MyObject> so that I would be able to access the object associated with the given ID in O(1), however then I will not be able to know its row index that I will need to emit dataChanged signal. What is the best approach here so that my application does not slow down with the updates? Is it feasible or even possible to update the model in a separate thread, maybe?

      JonBJ Offline
      JonBJ Offline
      JonB
      wrote on last edited by
      #2

      @ozcanay
      Not sure I have any answers, but a couple of comments.

      I am storing hundreds of thousands of rows in a table model

      All in memory? That's quite a lot! Just saying.... Do you have any database/backing store for the model, or is it temporary, in-memory only?

      So, this is an O(n) operation I believe so

      Yes.

      map of <ID, MyObject> so that I would be able to access the object associated with the given ID in O(1)

      For the record, while the final object access once located is O(1) the map lookup imposes O(log(n)) (binary search).

      however then I will not be able to know its row index that I will need to emit dataChanged signal

      I assume you do insert/delete rows, so that normal QModelIndexes may change? I have not used it, but is this a case for QPersistentModelIndex Class:

      This class is used as an index into item models derived from QAbstractItemModel. The index is used by item views, delegates, and selection models to locate an item in the model.

      ? If this means that once an item is assigned a QPersistentModelIndex that never changes and remains valid, you might maintain a map of <ID, {MyObject, QPersistentModelIndex}> (or possibly store the QPersistentModelIndex in the MyObject) so that you can fetch the index immediately. From that you should be able to construct a QModelIndex for dataChanged() from QPersistentModelIndex::row/column(). That is all I know, you would have to verify/read up to see if it works for your situation.

      Another possibility would be to maintain a QSortFilterProxyModel wrapper around your underlying model. If you sort that by ID you should be able to look up in O(log(n)), hopefully. I do not see a "find by sorted key value" method on QSortFilterProxyModel, but presumably you can do a binary search on its row numbers to locate a key, and then mapToSource() for the index into the underlying model.

      Is it feasible or even possible to update the model in a separate thread, maybe?

      You can access/update the model in a different thread from the UI thread. However, then I think the model needs to live in that thread and all operations on it should be performed in that thread. I am then hazy on the consequences of having e.g. a view in the UI thread which wants to access model->data() from that other thread. Someone may be more knowledgeable than I on this.

      1 Reply Last reply
      2
      • ozcanayO Offline
        ozcanayO Offline
        ozcanay
        wrote on last edited by
        #3

        The model is only in-memory. Model basically represents orders from the exchange and displays fields like quantity, price, status, id, stock name etc. As an order is filled or canceled it is obvious that it will not be updated anymore, and maybe due to this, this might be an opportunity to save those orders to a backing store maybe somehow (I do not know how this would work out). However, if I were to have a persistent store, how would I be able to display them on demand? Users of the application basically have a large list of all the orders and I cannot think any other way than to have all the data embedded into the model in memory.

        JonBJ 2 Replies Last reply
        0
        • ozcanayO ozcanay

          The model is only in-memory. Model basically represents orders from the exchange and displays fields like quantity, price, status, id, stock name etc. As an order is filled or canceled it is obvious that it will not be updated anymore, and maybe due to this, this might be an opportunity to save those orders to a backing store maybe somehow (I do not know how this would work out). However, if I were to have a persistent store, how would I be able to display them on demand? Users of the application basically have a large list of all the orders and I cannot think any other way than to have all the data embedded into the model in memory.

          JonBJ Offline
          JonBJ Offline
          JonB
          wrote on last edited by JonB
          #4

          @ozcanay said in How to update a large model efficiently:

          However, if I were to have a persistent store, how would I be able to display them on demand?

          Umm, by fetching them from persistent store? Like a file/database, e.g. Qt has QSqlDatabase class for this.

          Anyway, that was only an observation. It would likely only be slower with a database versus in-memory, and you seem to be looking for speed.

          I hope I gave at least a couple of approaches to investigate for your speed-lookup issue.

          1 Reply Last reply
          1
          • ozcanayO ozcanay

            The model is only in-memory. Model basically represents orders from the exchange and displays fields like quantity, price, status, id, stock name etc. As an order is filled or canceled it is obvious that it will not be updated anymore, and maybe due to this, this might be an opportunity to save those orders to a backing store maybe somehow (I do not know how this would work out). However, if I were to have a persistent store, how would I be able to display them on demand? Users of the application basically have a large list of all the orders and I cannot think any other way than to have all the data embedded into the model in memory.

            JonBJ Offline
            JonBJ Offline
            JonB
            wrote on last edited by
            #5

            @ozcanay
            I see you have marked my previous post as the solution to your issue. Which is fine :) But for my own/others' interest, did you go down the QPersistentModelIndex or the QSortFilterProxyModel route, I'd like to know?

            1 Reply Last reply
            0
            • ozcanayO Offline
              ozcanayO Offline
              ozcanay
              wrote on last edited by
              #6

              I actually do not have a solution right now, so maybe I should not have marked this as a solution, but I liked your approach. I had a QSortFilterProxyModel already that sorts the model by a column and does some filtering stuff according to user's needs. I was trying out QPersistentModelndex option, that seemed interesting. However, I am still trying to make sense of it. I will probably update the post when I come up (if I can) with a working solution.

              1 Reply Last reply
              1
              • ozcanayO Offline
                ozcanayO Offline
                ozcanay
                wrote on last edited by ozcanay
                #7

                QSortFilterProxyModel might be easier to implement since I am already familiar with it and have a working proxy model at hand. So, I will first try to get it work. I had a public API on my model class named edit() that basically modifies the model data, maybe I should move editing functionality to the proxy model and update the model via the proxy. For now, even though I have a proxy model, widget edit()s the wrapped model directly, not via proxy model. I feel like what I am currently doing defeats the purpose of the proxy.

                I am still thinking about how I can acquire QModelIndex after modifying the model via the proxy, though.

                Model stores data as QVector<MyObject>. Suppose that proxy got wrapped model via sourceModel() and iterated over the data wrapped by the model and found out that the object we want to modify sits at index 300 (out of 1000 let's say). How does this vector index translate to QModelIndex? I am going to need that to emit dataChanged()

                JonBJ 1 Reply Last reply
                0
                • ozcanayO ozcanay

                  QSortFilterProxyModel might be easier to implement since I am already familiar with it and have a working proxy model at hand. So, I will first try to get it work. I had a public API on my model class named edit() that basically modifies the model data, maybe I should move editing functionality to the proxy model and update the model via the proxy. For now, even though I have a proxy model, widget edit()s the wrapped model directly, not via proxy model. I feel like what I am currently doing defeats the purpose of the proxy.

                  I am still thinking about how I can acquire QModelIndex after modifying the model via the proxy, though.

                  Model stores data as QVector<MyObject>. Suppose that proxy got wrapped model via sourceModel() and iterated over the data wrapped by the model and found out that the object we want to modify sits at index 300 (out of 1000 let's say). How does this vector index translate to QModelIndex? I am going to need that to emit dataChanged()

                  JonBJ Offline
                  JonBJ Offline
                  JonB
                  wrote on last edited by
                  #8

                  @ozcanay
                  My theory is:

                  • When you edit() directly it is passed an ID to find the row to edit.
                  • That requires a linear search of the base model, which you say is (too) slow.
                  • The QSortFilterProxyModel, assuming it is sorted by row ID, can service a binary search lookup via its row numbers (not those of the source model), which should be "fast".
                  • Once you have found the row index in the proxy model, you can update the source model/make it emit dataChanged() either by using mapToSource() which maps proxy QModelIndexes to source QModelIndexes or by calling the proxy's setData() directly which will use that internally.
                  1 Reply Last reply
                  0
                  • ozcanayO Offline
                    ozcanayO Offline
                    ozcanay
                    wrote on last edited by ozcanay
                    #9

                    As I mentioned before, I was storing "Order" objects in my model. Instead of QSortFilterProxyModel approach, I went for QPersistentModelIndex approach to keep proxy model as thin as possible. I added following data structure to the model (not the proxy):

                    std::unordered_map<QString, QPair<Order, QPersistentModelIndex>> order_index_map_;

                    This is how data is stored in the model:

                    QVector<Order> orders_;

                    void OrdersModel::process(const Order& order) {
                        const auto& order_id = order.order_id_;
                    
                        if(order_index_map_.find(order_id) == order_index_map_.end()) {
                            beginInsertRows({}, orders_.count(), orders_.count());
                            orders_.push_back(order);
                            endInsertRows();
                    
                            order_index_map_[order_id] = {order, std::move(QPersistentModelIndex(index(orders_.size() - 1, 0)))};
                            emit orderEntryAdded(&order);
                        } else {
                            if(auto it = order_index_map_.find(order_id); it != order_index_map_.end()) {
                                const auto p_model_index = it->second.second;
                                const int row_index = p_model_index.row();
                    
                                orders_[row_index] = order;
                                emit dataChanged(index(row_index, 0), index(row_index, column_count_));
                            } else {
                                qWarning() << "This should not have happened.";
                            }
                        }
                    }
                    

                    This seems to be working so far. However, I am not sure how much faster this version is or even faster at all.

                    JonBJ 1 Reply Last reply
                    0
                    • ozcanayO ozcanay

                      As I mentioned before, I was storing "Order" objects in my model. Instead of QSortFilterProxyModel approach, I went for QPersistentModelIndex approach to keep proxy model as thin as possible. I added following data structure to the model (not the proxy):

                      std::unordered_map<QString, QPair<Order, QPersistentModelIndex>> order_index_map_;

                      This is how data is stored in the model:

                      QVector<Order> orders_;

                      void OrdersModel::process(const Order& order) {
                          const auto& order_id = order.order_id_;
                      
                          if(order_index_map_.find(order_id) == order_index_map_.end()) {
                              beginInsertRows({}, orders_.count(), orders_.count());
                              orders_.push_back(order);
                              endInsertRows();
                      
                              order_index_map_[order_id] = {order, std::move(QPersistentModelIndex(index(orders_.size() - 1, 0)))};
                              emit orderEntryAdded(&order);
                          } else {
                              if(auto it = order_index_map_.find(order_id); it != order_index_map_.end()) {
                                  const auto p_model_index = it->second.second;
                                  const int row_index = p_model_index.row();
                      
                                  orders_[row_index] = order;
                                  emit dataChanged(index(row_index, 0), index(row_index, column_count_));
                              } else {
                                  qWarning() << "This should not have happened.";
                              }
                          }
                      }
                      

                      This seems to be working so far. However, I am not sure how much faster this version is or even faster at all.

                      JonBJ Offline
                      JonBJ Offline
                      JonB
                      wrote on last edited by
                      #10

                      @ozcanay said in How to update a large model efficiently:

                      However, I am not sure how much faster this version is or even faster at all.

                      :) :(

                      if(order_index_map_.find(order_id) == order_index_map_.end())

                      } else {
                      if(auto it = order_index_map_.find(order_id); it != order_index_map_.end()) {

                      Save one lookup here, assign auto it = order_index_map_.find(order_id) right at the start. Not that I think lookups will be slow.

                      Go back to basics and find out just what took long in your original code. Did you even test anything? Might be a case of "premature optimization"....

                      1 Reply Last reply
                      0
                      • ozcanayO Offline
                        ozcanayO Offline
                        ozcanay
                        wrote on last edited by
                        #11

                        Yes, I realized redundant lookups that you pointed. Another approach I tried:

                        void OrdersModel::process(const Order& order) {
                            const auto& order_id = order.order_id_;
                        
                            if(auto it = order_index_map_.find(order_id); it == order_index_map_.end()) {
                                auto order_ptr = std::make_shared<Order>(order);
                                beginInsertRows({}, orders_.count(), orders_.count());
                                orders_.push_back(order_ptr);
                                endInsertRows();
                        
                                order_index_map_[order_id] = {order_ptr, std::move(QPersistentModelIndex(index(orders_.size() - 1, 0)))};
                        
                                emit orderEntryAdded(&order);
                            } else {
                                *((*it).second.first) = order; 
                                int row_index = (*it).second.second.row();
                                emit dataChanged(index(row_index, 0), index(row_index, column_count_));
                            }
                        }
                        

                        with data structures changed to:

                        QVector<std::shared_ptr<Order>> orders_;
                        std::unordered_map<QString, QPair<std::shared_ptr<Order>, QPersistentModelIndex>> order_index_map_;
                        

                        However, this is not fast as well. Data structures have pointers so there is probably performance hit from indirection and also cache locality is probably suffering. But I wanted to give it a try anyways.

                        I will probably profile the code with linux perf. Also instead of std::unordered_map, I will go for absl::flat_hash_map to see if there are any performance improvements.

                        JonBJ 1 Reply Last reply
                        0
                        • ozcanayO ozcanay

                          Yes, I realized redundant lookups that you pointed. Another approach I tried:

                          void OrdersModel::process(const Order& order) {
                              const auto& order_id = order.order_id_;
                          
                              if(auto it = order_index_map_.find(order_id); it == order_index_map_.end()) {
                                  auto order_ptr = std::make_shared<Order>(order);
                                  beginInsertRows({}, orders_.count(), orders_.count());
                                  orders_.push_back(order_ptr);
                                  endInsertRows();
                          
                                  order_index_map_[order_id] = {order_ptr, std::move(QPersistentModelIndex(index(orders_.size() - 1, 0)))};
                          
                                  emit orderEntryAdded(&order);
                              } else {
                                  *((*it).second.first) = order; 
                                  int row_index = (*it).second.second.row();
                                  emit dataChanged(index(row_index, 0), index(row_index, column_count_));
                              }
                          }
                          

                          with data structures changed to:

                          QVector<std::shared_ptr<Order>> orders_;
                          std::unordered_map<QString, QPair<std::shared_ptr<Order>, QPersistentModelIndex>> order_index_map_;
                          

                          However, this is not fast as well. Data structures have pointers so there is probably performance hit from indirection and also cache locality is probably suffering. But I wanted to give it a try anyways.

                          I will probably profile the code with linux perf. Also instead of std::unordered_map, I will go for absl::flat_hash_map to see if there are any performance improvements.

                          JonBJ Offline
                          JonBJ Offline
                          JonB
                          wrote on last edited by
                          #12

                          @ozcanay
                          In outline, I still wonder what evidence you have any of this lookup is a performance bottleneck? Just how much looking up are you doing anyway, thousands/hundreds of thousands per second? What else does you code do which might take time rather than assuming/blaming it on lookup?

                          I will probably profile the code with linux perf.

                          Good luck! That (perf) would actually be something you could teach me! :) I tried recently and could not make head nor tail of its output/statistics. In the past I used gprof and was very happy with that, could understand its output fine; perf just gave me headache, with lots of "graphs" and "hotspots" I couldn't fathom... :(

                          1 Reply Last reply
                          0
                          • Christian EhrlicherC Offline
                            Christian EhrlicherC Offline
                            Christian Ehrlicher
                            Lifetime Qt Champion
                            wrote on last edited by
                            #13

                            Simply holding a QVector<data> and a QHash<ID, int> in the model where the value of the QHash<ID, int> is the index in the vector would have been much easier... but good when only the programmer who did the work understand it - then noone else can replace him :D

                            Qt Online Installer direct download: https://download.qt.io/official_releases/online_installers/
                            Visit the Qt Academy at https://academy.qt.io/catalog

                            1 Reply Last reply
                            0
                            • ozcanayO Offline
                              ozcanayO Offline
                              ozcanay
                              wrote on last edited by
                              #14

                              "QHash<ID, int> is the index in the vector would have been much easier" -> would that even work? Then why is QPersistentModelIndex needed?

                              Christian EhrlicherC 1 Reply Last reply
                              0
                              • ozcanayO ozcanay

                                "QHash<ID, int> is the index in the vector would have been much easier" -> would that even work? Then why is QPersistentModelIndex needed?

                                Christian EhrlicherC Offline
                                Christian EhrlicherC Offline
                                Christian Ehrlicher
                                Lifetime Qt Champion
                                wrote on last edited by Christian Ehrlicher
                                #15

                                @ozcanay said in How to update a large model efficiently:

                                would that even work?

                                Why not? You're the owner of the model so you know when you modify the order and can update the hash. Don't see why you need persistent indexes here at all. They're needed for QSFPM to make sure that they're properly mapped to the new index after a sorting/filtering

                                Qt Online Installer direct download: https://download.qt.io/official_releases/online_installers/
                                Visit the Qt Academy at https://academy.qt.io/catalog

                                1 Reply Last reply
                                0
                                • ozcanayO Offline
                                  ozcanayO Offline
                                  ozcanay
                                  wrote on last edited by
                                  #16

                                  My worry was that Qt itself under the hood could be messing up with the model indices, changing them from time to time maybe. Especially considering that I wrap my model via QSortFilterProxyModel to filter and sort, I thought that I would need QPersistentModelIndex.

                                  Christian EhrlicherC 1 Reply Last reply
                                  0
                                  • ozcanayO ozcanay

                                    My worry was that Qt itself under the hood could be messing up with the model indices, changing them from time to time maybe. Especially considering that I wrap my model via QSortFilterProxyModel to filter and sort, I thought that I would need QPersistentModelIndex.

                                    Christian EhrlicherC Offline
                                    Christian EhrlicherC Offline
                                    Christian Ehrlicher
                                    Lifetime Qt Champion
                                    wrote on last edited by Christian Ehrlicher
                                    #17

                                    @ozcanay said in How to update a large model efficiently:

                                    My worry was that Qt itself under the hood could be messing up with the model indices, changing them from time to time maybe.

                                    Again: you are the owner, you create the indexes, you modify the data - how should Qt be able to modify something in your structures without your knowing?

                                    And QSortFilterProxyModel is there to NOT have the need to modify the source model.

                                    Qt Online Installer direct download: https://download.qt.io/official_releases/online_installers/
                                    Visit the Qt Academy at https://academy.qt.io/catalog

                                    1 Reply Last reply
                                    0
                                    • ozcanayO Offline
                                      ozcanayO Offline
                                      ozcanay
                                      wrote on last edited by ozcanay
                                      #18

                                      Yes, you are right.

                                      I made the following changes and it works:

                                      QVector<Order> orders_;
                                      std::unordered_map<QString, QPair<Order, std::size_t>> order_index_map_;
                                      
                                      void OrdersModel::process(const Order& order) {
                                          const auto& order_id = order.order_id_;
                                      
                                          if(auto it = order_index_map_.find(order_id); it == order_index_map_.end()) {
                                              beginInsertRows({}, orders_.count(), orders_.count());
                                              orders_.push_back(order);
                                              endInsertRows();
                                      
                                              order_index_map_[order_id] = {order, orders_.size() - 1};
                                      
                                              emit orderEntryAdded(&order);
                                          } else {
                                              const int row_index = (*it).second.second;
                                              orders_[row_index] = order;
                                              emit dataChanged(index(row_index, 0), index(row_index, column_count_));
                                          }
                                      }
                                      

                                      However, I aim to do this processing in a different thread other than UI thread to not to freeze UI, when there are lots of updates. I want to use QtConcurrent::run for this purpose.

                                      Christian EhrlicherC 1 Reply Last reply
                                      0
                                      • ozcanayO ozcanay

                                        Yes, you are right.

                                        I made the following changes and it works:

                                        QVector<Order> orders_;
                                        std::unordered_map<QString, QPair<Order, std::size_t>> order_index_map_;
                                        
                                        void OrdersModel::process(const Order& order) {
                                            const auto& order_id = order.order_id_;
                                        
                                            if(auto it = order_index_map_.find(order_id); it == order_index_map_.end()) {
                                                beginInsertRows({}, orders_.count(), orders_.count());
                                                orders_.push_back(order);
                                                endInsertRows();
                                        
                                                order_index_map_[order_id] = {order, orders_.size() - 1};
                                        
                                                emit orderEntryAdded(&order);
                                            } else {
                                                const int row_index = (*it).second.second;
                                                orders_[row_index] = order;
                                                emit dataChanged(index(row_index, 0), index(row_index, column_count_));
                                            }
                                        }
                                        

                                        However, I aim to do this processing in a different thread other than UI thread to not to freeze UI, when there are lots of updates. I want to use QtConcurrent::run for this purpose.

                                        Christian EhrlicherC Offline
                                        Christian EhrlicherC Offline
                                        Christian Ehrlicher
                                        Lifetime Qt Champion
                                        wrote on last edited by
                                        #19

                                        @ozcanay said in How to update a large model efficiently:

                                        However, I aim to do this processing in a different thread other than UI

                                        This will not work.

                                        Qt Online Installer direct download: https://download.qt.io/official_releases/online_installers/
                                        Visit the Qt Academy at https://academy.qt.io/catalog

                                        1 Reply Last reply
                                        0
                                        • ozcanayO Offline
                                          ozcanayO Offline
                                          ozcanay
                                          wrote on last edited by
                                          #20

                                          Can you elaborate on why that won't work? If there is heavy processing to do on UI thread what should I do then?

                                          Christian EhrlicherC 1 Reply Last reply
                                          0

                                          • Login

                                          • Login or register to search.
                                          • First post
                                            Last post
                                          0
                                          • Categories
                                          • Recent
                                          • Tags
                                          • Popular
                                          • Users
                                          • Groups
                                          • Search
                                          • Get Qt Extensions
                                          • Unsolved