QList vs QHash - faster "contains" with QHash?
-
Hi,
I want to store a List of integer QList<int>.
I'm not sure how fast is ".contains" when using a QList.I want it to be very fast (O(n)), because the list will grow big, is that the case with QList?
Or maybe I should look at QHash instead?example of use.
lst.insert(23);
lst.insert(35);if (lst.contains(35)) {
do something
} -
Hi,
QVector would be recommended for that use case. How big can it become ?
-
The List could be between 1 and 1000.
I'm more concerned about the time to execute if (lst.contains(x))
There will be a lot of check with that kind of if, so I would prefer to have a constant check, I know QHash .contains is constant, but haven't found info on QList lookupThanks
-
Algorithm complexity is only half of the story. The other, significantly more important on hardware today, is memory access pattern. As @SGaist said, a
QVector
and itsindexOf
might be a lot faster because of cache friendliness. QList is almost always a worse idea because of the reverse - it's extremely cache unfriendly.
The only way to be sure though is by actually measuring. Theoretical algorithm complexity is not a good indicator of anything these days, especially on a small data set like this (yes, i consider 1000 elements small). -
Thanks for the insight,
I think I may be overthinking this one. I will start with a QList<int> and change later if my data set get bigger.
An example of my data set:
Data is received from Sensors over the air, a sensor can send 4 messages per second. When receiving a message from a sensor, I need to check the Sensor ID to see if it's my list before doing some calculation. The Max I can see getting is around 100 sensors (4*100) messages per second. So that's basically 400 "if check" every second to check if the sensor ID is in my list. -
I would reconsider. QList should always be at the end of your choice list. It has very little benefits. I would (always?) start with QVector and then, only if needed, move to something else.
400 vector checks per second is not a big load for modern CPUs which get into thousands of MIPS these days. Cache unfriendly structures (like QList) is what kills performance most of the time. -
Perfect, QVector has the same method so nothing to do on my side, I like that. QVector it will be!
-
@Chris-Kawa said:
I would reconsider. QList should always be at the end of your choice list. It has very little benefits. I would (always?) start with QVector and then, only if needed, move to something else.
400 vector checks per second is not a big load for modern CPUs which get into thousands of MIPS these days. Cache unfriendly structures (like QList) is what kills performance most of the time.Just out of interest: Why not use QSet in this case? (Assuming the ID is unique)
-
@Asperamanca said:
Just out of interest: Why not use QSet in this case? (Assuming the ID is unique)
Same reason - QSet is not a contiguous memory chunk.
It is basically a QHash<T, DummyType>, which in turn is basically a list of buckets. Although a little better than a full blown linked list it potentially still has poor cache locality qualities.For the record - I'm not saying you should not use a QSet or any other container type or that they are always slower. What I am saying is you should measure, and if you're not interested enough to do that then QVector is a very good default, that will in many cases perform better than faster on paper specialized containers. But please do measure your particular cases.
-
Thanks for the added details.
Maybe the Qt docs could be a little more specific on the best-case and worst-case use on each containers.
I see diverging opinions, see answer here that favors QList.For my use case, I will probably use a QHash that can check directly (O(1)) if an item is present in the bucket or not since I do not know the algorithm complexity of ".contains" of QVector and QList.
-
For vector or a list
contains
is a linear search (simply calls std::find in the implementation), no way around it. If you can guarantee the vector is sorted you can use std::lower_bound to reduce that to binary search (logarithmic complexity).Note that the 1 in O(1) of QHash might be expensive for some types and it's possible that a vector lookup (for some sizes) is still faster (operator==). If in doubt - measure.