QList vs QHash - faster "contains" with QHash?



  • Hi,

    I want to store a List of integer QList<int>.
    I'm not sure how fast is ".contains" when using a QList.

    I want it to be very fast (O(n)), because the list will grow big, is that the case with QList?
    Or maybe I should look at QHash instead?

    example of use.
    lst.insert(23);
    lst.insert(35);

    if (lst.contains(35)) {
    do something
    }


  • Lifetime Qt Champion

    Hi,

    QVector would be recommended for that use case. How big can it become ?



  • @SGaist

    The List could be between 1 and 1000.

    I'm more concerned about the time to execute if (lst.contains(x))
    There will be a lot of check with that kind of if, so I would prefer to have a constant check, I know QHash .contains is constant, but haven't found info on QList lookup

    Thanks


  • Moderators

    Algorithm complexity is only half of the story. The other, significantly more important on hardware today, is memory access pattern. As @SGaist said, a QVector and its indexOf might be a lot faster because of cache friendliness. QList is almost always a worse idea because of the reverse - it's extremely cache unfriendly.
    The only way to be sure though is by actually measuring. Theoretical algorithm complexity is not a good indicator of anything these days, especially on a small data set like this (yes, i consider 1000 elements small).



  • Thanks for the insight,

    I think I may be overthinking this one. I will start with a QList<int> and change later if my data set get bigger.

    An example of my data set:
    Data is received from Sensors over the air, a sensor can send 4 messages per second. When receiving a message from a sensor, I need to check the Sensor ID to see if it's my list before doing some calculation. The Max I can see getting is around 100 sensors (4*100) messages per second. So that's basically 400 "if check" every second to check if the sensor ID is in my list.


  • Moderators

    I would reconsider. QList should always be at the end of your choice list. It has very little benefits. I would (always?) start with QVector and then, only if needed, move to something else.
    400 vector checks per second is not a big load for modern CPUs which get into thousands of MIPS these days. Cache unfriendly structures (like QList) is what kills performance most of the time.



  • Perfect, QVector has the same method so nothing to do on my side, I like that. QVector it will be!



  • @Chris-Kawa said:

    I would reconsider. QList should always be at the end of your choice list. It has very little benefits. I would (always?) start with QVector and then, only if needed, move to something else.
    400 vector checks per second is not a big load for modern CPUs which get into thousands of MIPS these days. Cache unfriendly structures (like QList) is what kills performance most of the time.

    Just out of interest: Why not use QSet in this case? (Assuming the ID is unique)


  • Moderators

    @Asperamanca said:

    Just out of interest: Why not use QSet in this case? (Assuming the ID is unique)

    Same reason - QSet is not a contiguous memory chunk.
    It is basically a QHash<T, DummyType>, which in turn is basically a list of buckets. Although a little better than a full blown linked list it potentially still has poor cache locality qualities.

    For the record - I'm not saying you should not use a QSet or any other container type or that they are always slower. What I am saying is you should measure, and if you're not interested enough to do that then QVector is a very good default, that will in many cases perform better than faster on paper specialized containers. But please do measure your particular cases.



  • Thanks for the added details.

    Maybe the Qt docs could be a little more specific on the best-case and worst-case use on each containers.
    I see diverging opinions, see answer here that favors QList.

    For my use case, I will probably use a QHash that can check directly (O(1)) if an item is present in the bucket or not since I do not know the algorithm complexity of ".contains" of QVector and QList.


  • Moderators

    For vector or a list contains is a linear search (simply calls std::find in the implementation), no way around it. If you can guarantee the vector is sorted you can use std::lower_bound to reduce that to binary search (logarithmic complexity).

    Note that the 1 in O(1) of QHash might be expensive for some types and it's possible that a vector lookup (for some sizes) is still faster (operator==). If in doubt - measure.


Log in to reply
 

Looks like your connection to Qt Forum was lost, please wait while we try to reconnect.