Suspected excessive calls to QAbstractTableModel.data & related questions
-
Good morning.
We are developing a QAbstractTableModel to represent log messages. Log messages are appended at the end (bottom of the view), with the eldest being removed if the item count is too large.
During development, we noted that larger data sets result in very slow views. We have reduced our problem to the simple example below. The example adds a log message twice per second to the model through entry_added. The model is shown in a QTableView.
Our key problem is related to the number of data() calls:
- On my screen, the QTableView has 6 visible rows. Once the first 6 rows are filled, and before data is being removed, we expected no calls to LogModel.data(), as the QTableView has all the information it needs.
In practice, we see a call to data() for every visible cell. We also see a spectacular number of calls to rowCount (O(n²) with the number of visible rows!).
In the simple example, this is obviously not problematic, but in our real code the computation of the UI text (through data()) is fairly complex. We had hoped that with QTableView we would not need to implement caching.
Can anyone point to errors in the model that would fix this problem?
The question at https://forum.qt.io/topic/69225/how-do-i-debug-excessive-qabstractitemmodel-data-callbacks-and-general-extreme-inefficiency-pyqt5/6 is very similar, but it has no solution.
Secondary questions:
-
What is the canonical method to construct the root index? I see examples simply calling QModelIndex(), but then the index is not linked to the model. Is this ok, or do we need to call createIndex()? What is the difference?
-
What is the canonical method to check that an index is the root index? Should we use index.isValid(), index.parent() == index, or index.row() == -1, or anything else?
Thanks in advance for any hints.
We are working with PySide6, but I assume the same would show up in C++, so I posted it here.
"""Reduced test case to illustrate excessive calls to QAbstractTableModel.data(). The model stores logged messages in a read-only bounded queue. When the number of messages exceeds the maximum, the oldest messages (appearing at the top of the QTableView) are removed. """ from PySide6.QtCore import QAbstractTableModel, QTimer from PySide6.QtCore import Qt, QModelIndex from PySide6.QtWidgets import QHeaderView, QMainWindow, QApplication, QTableView def fmt_index(index): """Convert a QIndex to a string for debugging. """ parent = "root" if index.parent() == index else "child" return "<" + ("{},{} {}".format(index.row(), index.column(), parent)) + ">" class LogModel(QAbstractTableModel): """Model interface for the log records. """ def __init__(self, parent=None): super().__init__(parent) self.logs = [] self.max_items = 20 def rowCount(self, parentModelIndex=QModelIndex()): """Return the number of logging records stored in the model. """ assert self.checkIndex(parentModelIndex), parentModelIndex if parentModelIndex.row() != -1: result = 0 else: result = len(self.logs) # Disabled because the number of calls is very high: ##print("rowCount", fmt_index(parentModelIndex), result) return result def columnCount(self, _parentModelIndex=QModelIndex()): return 1 def headerData(self, section, orientation, role): if role == Qt.DisplayRole and orientation == Qt.Horizontal: return "Count" return None def data(self, index, role=Qt.DisplayRole): assert self.checkIndex(index), index if index.row() != -1 and index.column() == 0 and role == Qt.DisplayRole: print("data(Qt.DisplayRole)", index.row(), role) log_msg = self.logs[index.row()] return log_msg def index(self, row, column, parent=None): if 0 <= row <= len(self.logs) and column == 0: result = self.createIndex(row, column, self.logs[row]) else: result = QModelIndex() print("index({},{}):".format(row, column), fmt_index(result)) return result def entry_added(self, message): parent = self.createIndex(-1, -1, None) if True: print("=" * 60, "Adding entry.") N = len(self.logs) parent = QModelIndex() self.beginInsertRows(parent, N, N) self.logs.append(message) print("== insertRow completing.") self.endInsertRows() print("== insertRow complete.") if True: to_be_removed = len(self.logs) - self.max_items if to_be_removed > 0: print("=" * 60, "Removing", to_be_removed) # Rows are always removed at the start of the list. self.beginRemoveRows(parent, 0, to_be_removed - 1) del self.logs[:to_be_removed] print("== removeRow completing.") self.endRemoveRows() print("== removeRow complete.") class LogViewerWindow(QMainWindow): def __init__(self, parent=None, model=None): super().__init__(parent) self.tvLogs = QTableView(self) # Use fixed column-width sizes to speed up rendering. self.tvLogs.horizontalHeader().setSectionResizeMode(QHeaderView.Fixed) self.tvLogs.verticalHeader().setSectionResizeMode(QHeaderView.Fixed) self.tvLogs.setModel(model) self.setCentralWidget(self.tvLogs) if __name__ == "__main__": application = QApplication() model = LogModel() n = 0 def generate_logs(): """Add a log message. """ global n model.entry_added(str(n)) n += 1 timer = QTimer() timer.setInterval(400) timer.timeout.connect(generate_logs) timer.start() main_wnd = LogViewerWindow(None, model) main_wnd.show() application.exec()
- On my screen, the QTableView has 6 visible rows. Once the first 6 rows are filled, and before data is being removed, we expected no calls to LogModel.data(), as the QTableView has all the information it needs.
-
Hi and welcome to devnet,
The big amount of rowCount is indeed something unusual.
For the root index, unless you have a tree model, there should be none.
Why are you re-implementing the index method ? It's already done in QAbstractTableModel.
As for data and headerData, I usually return the base class implementation for the case not explicitly handled by the custom model.
When implementing a custom model, it's a good idea to run it through the QAbstractItemModelTester.
-
@wpietro
Mainly, please follow @SGaist's suggestions. Especially the QAbstractItemModelTester before you go any further.I don't know about Qt6/PySide6, but at least at 5 I am confident models & views work well without your reported behaviour.
One thing I noted in your code, probably not relevant, but:
if 0 <= row <= len(self.logs) and column == 0: result = self.createIndex(row, column, self.logs[row])
This seems to go too far. If it were called with
row == len(self.logs)
(maybe it never is) thenself.logs[row]
does not look good. -
Thanks for your comments.
"0 <= row <= len(self.logs)" is a stupid typo indeed — corrected. I think we would have seen the error if it had occurred, so I believe has no influence.
index() was overridden to use the internal data pointer, but that is no longer used. I removed index().
I changed headerData() as @SGaist recommended, that makes sense. I had not put a print() in headerData(), so I only notice now that that too gets called about 40 times per iteration (for 6 rows visible).
data() seems a pure virtual method, no base implementation.
We used the stand-alone ModelTest script floating around, but the built-in class is new to me, thanks. It seems to have similar functions. I added it to the reduced test case, no errors are reported. I see additional method calls, so the tester is active.
Thanks again for your time.
For completeness' sake, updated case below:
"""Reduced test case to illustrate excessive calls to QAbstractTableModel.data(). The model stores logged messages in a read-only bounded queue. When the number of messages exceeds the maximum, the oldest messages (appearing at the top of the QTableView) are removed. """ from PySide6.QtCore import QAbstractTableModel, QTimer from PySide6.QtCore import Qt, QModelIndex from PySide6.QtWidgets import QHeaderView, QMainWindow, QApplication, QTableView from PySide6.QtTest import QAbstractItemModelTester def fmt_index(index): """Convert a QIndex to a string for debugging. """ parent = "root" if index.parent() == index else "child" return "<" + ("{},{} {}".format(index.row(), index.column(), parent)) + ">" class LogModel(QAbstractTableModel): """Model interface for the log records. """ def __init__(self, parent=None): super().__init__(parent) self.logs = [] self.max_items = 20 def rowCount(self, parentModelIndex=QModelIndex()): """Return the number of logging records stored in the model. """ assert self.checkIndex(parentModelIndex), parentModelIndex if parentModelIndex.row() != -1: result = 0 else: result = len(self.logs) # Disabled because the number of calls is very high: ##print("rowCount", fmt_index(parentModelIndex), result) return result def columnCount(self, _parentModelIndex=QModelIndex()): return 1 def headerData(self, section, orientation, role): if role == Qt.DisplayRole and orientation == Qt.Horizontal: result = "Count" else: result = super().headerData(section, orientation, role) # Disabled because the number of calls is very high: ##print("headerData" ,section, orientation, role, repr(result)) return result def data(self, index, role=Qt.DisplayRole): assert self.checkIndex(index), index if index.row() != -1 and index.column() == 0 and role == Qt.DisplayRole: result = self.logs[index.row()] print("data({})".format(role), index.row()) else: result = None #super().data(index, role) print("data({})".format(role), index.row(), result) return result def entry_added(self, message): parent = self.createIndex(-1, -1, None) if True: print("=" * 60, "Adding entry.") N = len(self.logs) parent = QModelIndex() self.beginInsertRows(parent, N, N) self.logs.append(message) print("== insertRow completing.") self.endInsertRows() print("== insertRow complete.") if True: to_be_removed = len(self.logs) - self.max_items if to_be_removed > 0: print("=" * 60, "Removing", to_be_removed) # Rows are always removed at the start of the list. self.beginRemoveRows(parent, 0, to_be_removed - 1) del self.logs[:to_be_removed] print("== removeRow completing.") self.endRemoveRows() print("== removeRow complete.") class LogViewerWindow(QMainWindow): def __init__(self, parent=None, model=None): super().__init__(parent) self.tvLogs = QTableView(self) # Use fixed column-width sizes to speed up rendering. self.tvLogs.horizontalHeader().setSectionResizeMode(QHeaderView.Fixed) self.tvLogs.verticalHeader().setSectionResizeMode(QHeaderView.Fixed) self.tvLogs.setModel(model) self.setCentralWidget(self.tvLogs) if __name__ == "__main__": application = QApplication() model = LogModel() tester = QAbstractItemModelTester(model, QAbstractItemModelTester.FailureReportingMode.Fatal) n = 0 def generate_logs(): """Add a log message. """ global n model.entry_added(str(n)) n += 1 timer = QTimer() timer.setInterval(400) timer.timeout.connect(generate_logs) timer.start() main_wnd = LogViewerWindow(None, model) main_wnd.show() application.exec()
-
I just realized something, you are using a single column model, so why not use QAbstractListModel as base class ?
-
Sorry but are you sure using a pre-built model (like
QStandardItemModel
) does not meet your goals?It looks like you are using Qt6 so you can use
multiData
(https://doc.qt.io/qt-6/qabstractitemmodel.html#multiData) to limit the number of calls todata
-
Hello @VRonin,
We have so many rows that we prefer the dedicated model. Otherwise we'd need to convert every cell to a string, in many cases without the user ever seeing them.
Thanks for pointer to multiData, I'll give that a try. I do believe the original question remains valid. Perhaps I should file this as a bug report? I know that officially the number of calls to data() is not guaranteed, but surely this is sub-optimal.
-
Hi @VRonin , sorry I do not understand what you mean. As I understand it, with a QStandardItemModel we would need to translate every logging object to a set of cells in advance, regardless of how many rows are visible. If this understanding is correct, surely we would need our (say) 1000 active rows to text even though only 10 rows are visible?
-
Even though QStandardItemModel has a strange ctor which takes a QString instead a QVariant you can create a QStandardItem and set the Edit/DisplayRole via QStandardItem::setData()
-
Hi @Christian-Ehrlicher , thanks for the hint, but I am afraid it still will not help:
- When I noted "converting to string" I was referring to a complex object-to-string translation, not a single integer-to-string.
- I believe we would need to create every single row in our logging data, even if it never gets used. We easily have thousands of rows in our log.
-
@wpietro
A few comments/observations.The example adds a log message twice per second to the model through entry_added.
That of course is so negligible that it should be ignorable.
if to_be_removed > 0:
Try removing the removal of rows, temporarily. Any difference to the behaviour?
self.tvLogs = QTableView(self)
Subclass the
QTableView
. You could find out a lot about what is going on by puttingprint
/counters statements in its slots/overridable virtual methods.On my screen, the QTableView has 6 visible rows. Once the first 6 rows are filled, and before data is being removed, we expected no calls to LogModel.data(), as the QTableView has all the information it needs.
In practice, we see a call to data() for every visible cell
These seem to contradict each other. Anyway I would not expect the
QTableView
to only access the "visible" rows from the model, I would likely expect to access all of them. Unless someone says otherwise. Can you provide a reference (or an expert confirm) thatQTableView
stops looking at model rows/data as soon as it has "filled all visible rows on the widget?We also see a spectacular number of calls to rowCount (O(n²) with the number of visible rows!).
Since your
n
is so small,6
, this may not actually beO(n²)
. It might ben * m
, wherem
is small, like6
. IIRCQTreeView
s calldata
for each cell about 6 times with particular roles, there are about half-a-dozen it needs to display an item (don't forget in addition to data, there are font, color, alignment, ....).In itself your
rowCount()
implementation is "fast". So I wouldn't worry about these calls. Callingdata()
a large number of times maybe, but why worry aboutrowCount()
itself?The question at https://forum.qt.io/topic/69225/how-do-i-debug-excessive-qabstractitemmodel-data-callbacks-and-general-extreme-inefficiency-pyqt5/6 is very similar, but it has no solution.
That talks about
I'm still getting lots of data() calls against TextAlignmentRole.
That is terribly relevant. Are you experiencing a lot of calls to your
data()
for certain roles?We had hoped that with QTableView we would not need to implement caching.
How large do you intend
rowCount()
to get? If it is, say, 10,000, do you (really) wish your user to be able to scroll up and down all 10,000 in theQTableView
without filtering/paging?What exactly is your current issue, out of those you have mentioned? For example, latest you have been talking about "data conversion to string", which is very different from row counts. If you find performance fine at the moment and assume it will be terrible with 10,000 log rows, have you given that a 1 second test to see how it actually performs, before deciding whether/what problem you have?
-
Hi @JonB,
Thanks for your comments.
That of course is so negligible that it should be ignorable.
Sure, but this only the reduced example, where I limited the speed so the print's can be followed.
Try removing the removal of rows, temporarily. Any difference to the behaviour?
No. The behaviour already starts before maximum number of log items (here 20) is reached. Disabling removing is equivalent; I checked this.
Subclass the QTableView. You could find out a lot about what is going on by putting print/counters statements in its slots/overridable virtual methods.
We could indeed do this, but we're not Qt experts and I am afraid time is lacking. We were hoping to avoid having to dig into the underlying code.
These seem to contradict each other. Anyway I would not expect the QTableView to only access the "visible" rows from the model, I would likely expect to access all of them. Unless someone says otherwise. Can you provide a reference (or an expert confirm) that QTableView stops looking at model rows/data as soon as it has "filled all visible rows on the widget?
Sorry, I don't see the contradiction?
The QTableView documentation never guarantees that the view only makes the minimum number of data() calls necessary. However, we had expected this, since (as we understand it) the point of the dynamic QAbstractItemModel is to avoid calculations on data that will never be presented.
In this case, a row is being added that is not visible. It seems to us that the calls to data() are superfluous, since the QTableView will not be updated visually (neglecting the resizing of the scroll-bar handle). Since it is not guaranteed, it is not a bug, but we tend to believe that it is a sub-optimal implementation.
Since your n is so small, 6, this may not actually be O(n²). It might be n * m, where m is small, like 6. IIRC QTreeViews call data for each cell about 6 times with particular roles, there are about half-a-dozen it needs to display an item (don't forget in addition to data, there are font, color, alignment, ....).
It indeed seems to be somewhere in-between: for 20 visible rows and 1 column, the number of calls to rowCount() is 162 for a single add-remove pair. For 6 visible rows, the number of 49.
In itself your rowCount() implementation is "fast". So I wouldn't worry about these calls. Calling data() a large number of times maybe, but why worry about rowCount() itself?
I am not so much worried about the number of calls to rowCount, but rather surprised. To me this suggests that we either are doing something wrong, or that the QTableView is not as optimised as we had hoped.
That is terribly relevant. Are you experiencing a lot of calls to your data() for certain roles?
No, we see something different — exactly one call to data() for each role. There is no role that is queried more than others.
How large do you intend rowCount() to get? If it is, say, 10,000, do you (really) wish your user to be able to scroll up and down all 10,000 in the QTableView without filtering/paging?
1000 is the default setting, 100'000 would certainly be useful. It was our expectation that relatively smooth scrolling would be possible, because (as we see it) the QTableView has sufficient information to know that it does not need to re-query everything (or indeed, anything) if a row is added to the end where it is not visible. If it turns out that we are not doing anything wrong, than it seems to me QTableView is not as optimised as we had hoped. As noted, this is fine, then we need to implement caching.
What exactly is your current issue, out of those you have mentioned? For example, latest you have been talking about "data conversion to string", which is very different from row counts. If you find performance fine at the moment and assume it will be terrible with 10,000 log rows, have you given that a 1 second test to see how it actually performs, before deciding whether/what problem you have?
My original question was for the number of calls to data(), this is the key problem. It is here (for the DisplayRole) that we convert the log object to a string. The rowCount is simply mentioned as a side observation that (to us) seems to point a somewhat non-optimised implementation.
It is definitely a problem in the real application; as noted in the first question, we have create this reduced case exactly because we saw slow-downs even for moderate number of data.
Again, if it turns out that the QTableView implementation is somewhat sub-optimal (which I tend to conclude), we wouldn't consider this a bug, more an expectation mismatch. My question is mostly to understand if we did something wrong in our LogModel implementation.