iterators in QSharedData



  • Hi everyone, small problem for you. Let's take this example:

    class TestData : public QSharedData{
    public:
    int m_num;
    QList<int> m_numList;
    QList<int>::iterator m_iter;
    ~TestData() = default;
    TestData(const TestData& other)= default;
    TestData()
    :QSharedData()
    ,m_num(0)
    {m_iter=m_numList.begin();}
    };
    

    At first I thought this would be a problem: when this object detaches, m_iter would still point into the previous list. A test run, however, yielded unexpected results: it works perfectly.

    So two questions:

    • Why?
    • Can I count on this to work going forward?
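
    The concern in the question can be reproduced without Qt. The following is a minimal sketch, not the poster's code: std::vector stands in for QList, and NaiveData and iterPointsIntoOwnValues are hypothetical names introduced only for illustration. It shows that a defaulted (member-wise) copy constructor copies the iterator as-is, so the copy's iterator still refers to the source object's storage.

```cpp
#include <vector>

// Hypothetical stand-in for TestData; std::vector replaces QList so the
// sketch builds without Qt.
struct NaiveData {
    std::vector<int> values{1, 2, 3, 4};
    std::vector<int>::iterator iter;

    NaiveData() : iter(values.begin()) {}
    // Member-wise copy: `iter` is copied unchanged, so it keeps referring
    // to the element storage of the object it was copied from.
    NaiveData(const NaiveData &other) = default;
};

// Returns true when `data.iter` refers to an element stored in `data.values`.
// Assumes the iterator is dereferenceable and its container is still alive.
bool iterPointsIntoOwnValues(const NaiveData &data)
{
    const int *target = &*data.iter;
    for (const int &value : data.values) {
        if (&value == target)
            return true;
    }
    return false;
}
```

    For a freshly constructed NaiveData the helper returns true; for a copy made with the defaulted constructor it returns false, because the copied iterator still addresses the original's first element.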

  • Qt Champions 2016

    @VRonin said in iterators in QSharedData:

    Why?

    Could you also share the test case?

    Can I count on this to work going forward?

    It's probably unsafe to assume that.



  • @kshegunov said in iterators in QSharedData:

    Could you also share the test case?

    By building it to be minimal I actually realised that, as expected, it does not work:

    #include <QExplicitlySharedDataPointer>
    #include <QSharedData>
    #include <QDebug>
    class TestData : public QSharedData{
        QList<int> m_numList;
        QList<int>::iterator m_iter;
    public:
        const QList<int>& numList() const
        {
        return m_numList;
        }
        void setFirstValue(int val){
            m_numList[0]=val;
        }
        void setNumList(const QList<int> &numList)
        {
        m_numList = numList;
        }
    
        const QList<int>::iterator& iter() const
        {
        return m_iter;
        }
    
        void setIter(const QList<int>::iterator &iter)
        {
        m_iter = iter;
        }
    
    
        ~TestData() = default;
        TestData(const TestData& other)= default;
        TestData()
            :QSharedData()
            , m_numList({1,2,3,4})
            {m_iter=m_numList.begin();}
    };
    
    class TestWrap{
    
        QExplicitlySharedDataPointer<TestData> m_d;
    public:
        TestWrap(const TestWrap& other)=default;
        TestWrap()
            :m_d(new TestData)
        {}
        void setFirstValue(int val){
            if(m_d->numList().at(0)==val)
                return;
            m_d.detach();
            m_d->setFirstValue(val);
        }
        int getFirstVal(){
            return *(m_d->iter());
    
        }
    
    };
    
    int main(/*int argc, char *argv[]*/)
    {
            TestWrap container1;
            auto container2 = container1;
            container1.setFirstValue(5);
            Q_ASSERT(container1.getFirstVal()==5);
            Q_ASSERT(container2.getFirstVal()==1);
    
            TestWrap container3;
            auto container4 = container3;
            container4.setFirstValue(5);
            Q_ASSERT(container4.getFirstVal()==5);
            Q_ASSERT(container3.getFirstVal()==1);
    
    
            return 0;
    }
    

    I need to specialise the copy constructor of the data this way to make it work:

    TestData(const TestData& other)
            :QSharedData(other)
            ,m_numList(other.m_numList)
        {
            m_iter=m_numList.begin();
            for(auto i=other.m_numList.constBegin();i!=other.m_iter;++i,++m_iter){}
        }
    

    Problem now is: what if instead of a QList I had a QSet or QHash. How would I implement that copy?


  • Qt Champions 2016

    @VRonin said in iterators in QSharedData:

    By building it to be minimal I actually realised that, as expected, it does not work

    That's why I asked for the test case, seemed suspicious. ;)

    for(auto i=other.m_numList.constBegin();i!=other.m_iter;++i,++m_iter){}
    

    Sweet Mary, holy mother of Jesus, why the hell are you doing this? Just do the usual pointer arithmetic (wrapped in tidy classes):

    m_iter = m_numList.begin() + (other.m_iter - other.m_numList.constBegin());
    if (m_iter > m_numList.end())
        m_iter = m_numList.end();
    

    Problem now is: what if instead of a QList I had a QSet or QHash. How would I implement that copy?

    m_iter = m_set.find(other.m_iter.key());
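
    For a hash with unique keys, the re-find-by-key idea above can be sketched as follows. This is only an illustration under stated assumptions: std::unordered_map stands in for QHash so it builds without Qt, and KeyedData is a hypothetical name. It also handles the case where the stored iterator is end(), which has no key to look up.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical data class; std::unordered_map stands in for QHash.
struct KeyedData {
    using Items = std::unordered_map<int, std::string>;
    Items items{{1, "one"}, {2, "two"}, {3, "three"}};
    Items::iterator iter;

    KeyedData() : iter(items.begin()) {}

    KeyedData(const KeyedData &other)
        : items(other.items)
    {
        // Convert to const_iterator so it can be compared with iterators
        // obtained from the const source object.
        const Items::const_iterator otherIter = other.iter;
        // Iteration order of a hash is unspecified, so look the element up
        // by key in the copy instead of counting steps from begin().
        if (otherIter == other.items.cend())
            iter = items.end();
        else
            iter = items.find(otherIter->first);
    }
};
```

    Writing through the copy's iterator then changes only the copy. This works only while each key maps to a single element.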
    


  • @kshegunov said in iterators in QSharedData:

    Sweet Mary, holy mother of Jesus, why the hell are you doing this? Just do the usual pointer arithmetic

    I normally use std::distance, but it complained about one argument being a const_iterator and the other a (non-const) iterator. I didn't even think about using + and -, but that doesn't seem to work either, hence my workaround.

    m_iter = m_set.find(other.m_iter.key());

    That doesn't work for QHash as multiple values can have the same key but different iterators


  • Qt Champions 2016

    @VRonin said in iterators in QSharedData:

    but it seems not to work anyway

    Well I have a typo: other.m_iter - other.m_numList.constBegin() should be other.m_iter - other.m_numList.begin(), as your iterator isn't a constant one. Besides that it should work just fine; operator- returns an integer.

    @VRonin said in iterators in QSharedData:

    That doesn't work for QHash as multiple values can have the same key but different iterators

    You use QHash::insertMulti?



  • @kshegunov said in iterators in QSharedData:

    Well I have a typo

    Nope, other is const TestData& so even begin would return a const_iterator. :-P

    You use QHash::insertMulti?

    I do not and I think it's evil but we are talking generally now.


  • Qt Champions 2016

    @VRonin said in iterators in QSharedData:

    Nope, other is const TestData& so even begin would return a const_iterator. :-P

    Yeah, I forgot that little detail, but convert your iterator to a const_iterator and you should be fine:

    QList<int>::const_iterator(other.m_iter) - other.m_numList.constBegin()
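
    Putting the offset idea together with the const_iterator conversion, a copy constructor for the list case might look like the sketch below. It is an assumption-laden illustration rather than the poster's final code: std::vector stands in for QList so it builds without Qt, and OffsetData is a hypothetical name.

```cpp
#include <iterator>
#include <vector>

// Hypothetical data class; std::vector stands in for QList.
struct OffsetData {
    std::vector<int> values{1, 2, 3, 4};
    std::vector<int>::iterator iter;

    OffsetData() : iter(values.begin()) {}

    OffsetData(const OffsetData &other)
        : values(other.values)
    {
        // `other` is const, so its begin iterator is a const_iterator.
        // Convert other.iter so both arguments have the same type.
        const std::vector<int>::const_iterator otherIter = other.iter;
        const auto offset = std::distance(other.values.cbegin(), otherIter);
        // Re-anchor the iterator at the same position in the copied storage.
        iter = values.begin() + offset;
    }
};
```

    The same arithmetic also maps an end() iterator to the copy's end(), since the offset then equals the size. It relies on random-access iterators over contiguous, ordered storage.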
    

    I do not and I think it's evil but we are talking generally now.

    In this case you are stuck with iterating from the first element stored under that key to the pointed-to element (similarly to your original approach with the list). Then take that offset and reapply it in the copy: find the first of the multi-valued iterators and increment it the same number of times. Something like this:

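    // other is const, so find() returns a const_iterator here
    int offset = 0;
    QHash<...>::const_iterator start = other.m_hash.find(other.m_iter.key());
    for (; start != QHash<...>::const_iterator(other.m_iter); ++offset, ++start)
        ;
    
    // Re-apply the same offset from the first match in the copied hash
    m_iter = m_hash.find(other.m_iter.key());
    while (offset > 0)  {
        ++m_iter;
        --offset;
    }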
    

    PS.
    I really, really hate the hungarian notation ...



  • @kshegunov said in iterators in QSharedData:

    Yeah, I forgot that little detail

    And I forgot that constructor. Good spot!

    Unfortunately QHash seems a lost cause; see the main below:

    #include <QExplicitlySharedDataPointer>
    #include <QSharedData>
    #include <QDebug>
    #include <iterator>
    class TestData : public QSharedData{
        QHash<int,int> m_numList;
        QHash<int,int>::iterator m_iter;
    public:
        const QHash<int,int>& numList() const
        {
        return m_numList;
        }
        void setFirstValue(int val){
            m_numList.begin().value()=val;
        }
        void setNumList(const QHash<int,int> &numList)
        {
        m_numList = numList;
        }
    
        const QHash<int,int>::iterator& iter() const
        {
        return m_iter;
        }
    
        void setIter(const QHash<int,int>::iterator &iter)
        {
        m_iter = iter;
        }
    
    
        ~TestData() = default;
        TestData(const TestData& other)
            :QSharedData(other)
            ,m_numList(other.m_numList)
        {
            m_iter=m_numList.begin() + std::distance(other.m_numList.constBegin(),QHash<int,int>::const_iterator(other.m_iter));
        }
        TestData()
            :QSharedData()
            , m_numList({std::make_pair(1,1),std::make_pair(2,2),std::make_pair(3,3),std::make_pair(4,4)})
            {m_iter=m_numList.begin();}
    };
    
    class TestWrap{
    
        QExplicitlySharedDataPointer<TestData> m_d;
    public:
        TestWrap(const TestWrap& other)=default;
        TestWrap()
            :m_d(new TestData)
        {}
        void setFirstValue(int val){
            if(m_d->numList().begin().value()==val)
                return;
            m_d.detach();
            m_d->setFirstValue(val);
        }
        int getFirstVal(){
            return *(m_d->iter());
    
        }
    
    };
    
    int main(/*int argc, char *argv[]*/)
    {
            TestWrap container1;
            auto container2 = container1;
            container1.setFirstValue(5);
            Q_ASSERT(container1.getFirstVal()==5); //ok
            Q_ASSERT(container2.getFirstVal()==1); //fail randomly
    
            TestWrap container3;
            auto container4 = container3;
            container4.setFirstValue(5);
            Q_ASSERT(container4.getFirstVal()==5);
            Q_ASSERT(container3.getFirstVal()==1);
    
    
            return 0;
    }
    

    It looks like the container that was detached from might end up rehashing itself anyway.

    P.S.
    This is not Hungarian. QExplicitlySharedDataPointer<TestData> esdptdm_d is proper Hungarian. I think that is dead, with modern IDEs easily telling you the type of a variable.


  • Qt Champions 2016

    @VRonin said in iterators in QSharedData:

    m_iter=m_numList.begin() + std::distance(other.m_numList.constBegin(),QHash<int,int>::const_iterator(other.m_iter));
    

    You can't do that! Not with hashes anyway, your iterators are not guaranteed to be contiguous nor ordered and that it sometimes may work is mere luck. You need to do something like what I wrote - moving to and fro around the iterators you get from QHash::find.
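
    The find-then-step approach described above can be sketched without Qt as follows. Assumptions to note: std::unordered_multimap stands in for a multi-valued QHash, remapByKeyOffset and MultiHash are hypothetical names, and the sketch assumes the copy keeps elements with equal keys in the same relative order as the source, which is not something this thread (or, to my knowledge, the C++ standard for copy construction) guarantees.

```cpp
#include <iterator>
#include <unordered_map>

using MultiHash = std::unordered_multimap<int, int>;

// Re-creates `sourceIter` (an iterator into `source`) inside `copy`.
// Assumes `copy` holds the same elements as `source` and keeps equal-key
// elements in the same relative order.
MultiHash::iterator remapByKeyOffset(const MultiHash &source,
                                     MultiHash::const_iterator sourceIter,
                                     MultiHash &copy)
{
    if (sourceIter == source.end())
        return copy.end();
    // Offset of the element within the run of entries sharing its key.
    const auto sourceRun = source.equal_range(sourceIter->first);
    const auto offset = std::distance(sourceRun.first, sourceIter);
    // Walk the same number of steps through the matching run in the copy.
    const auto copyRun = copy.equal_range(sourceIter->first);
    return std::next(copyRun.first, offset);
}
```

    Only iterators within one key's run are ever compared or counted, so nothing depends on the overall iteration order of the hash.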



  • Zombie-ing this topic for a related question:

    From http://doc.qt.io/qt-5/containers.html#stl-style-iterators

    Implicit sharing has another consequence on STL-style iterators: you should avoid copying a container while iterators are active on that container. The iterators point to an internal structure, and if you copy a container you should be very careful with your iterators.

    The example there creates the iterator before copying, but now imagine this scenario:

    QHash<int,int> intList1 = {{1,2},{3,4},{5,6},{7,8}};
    auto intList2 = intList1; // shared
    auto iter1 = intList1.begin();
    auto iter2 = intList2.begin();
    
    *iter2 =0; //detach
    for(;iter1!=intList1.end();++iter1)
        qDebug() << *iter1;
    for(;iter2!=intList2.end();++iter2)
        qDebug() << *iter2;
    

    or:

    QHash<int,int> intList1 = {{1,2},{3,4},{5,6},{7,8}};
    auto intList2 = intList1; // shared
    auto iter1 = intList1.begin();
    auto iter2 = intList2.begin();
    
    *iter1 =0; //detach
    for(;iter1!=intList1.end();++iter1)
        qDebug() << *iter1;
    for(;iter2!=intList2.end();++iter2)
        qDebug() << *iter2;
    return 0;
    

    They all work correctly.
    Same original questions:

    • Why?
    • Can I count on this to work going forward?

  • Qt Champions 2016

    @VRonin said in iterators in QSharedData:

    Why?

    No clue.

    Can I count on this to work going forward?

    Probably, as the containers don't change that much (at least not to my knowledge), however it will still be "undocumented behaviour" ...
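
    One plausible explanation, not confirmed anywhere in this thread: if the non-const begin() of an implicitly shared container detaches before handing out the iterator, then the detach in the snippets happens at the begin() call rather than at the later write, and each iterator already refers to its own container's data. The toy class below sketches that behaviour. CowInts is a hypothetical copy-on-write wrapper built on std::shared_ptr, not Qt's implementation.

```cpp
#include <initializer_list>
#include <memory>
#include <vector>

// Hypothetical copy-on-write wrapper; not Qt's implementation.
class CowInts {
    std::shared_ptr<std::vector<int>> m_data;

    // Take a private copy of the data if it is currently shared.
    void detach()
    {
        if (m_data.use_count() > 1)
            m_data = std::make_shared<std::vector<int>>(*m_data);
    }

public:
    CowInts(std::initializer_list<int> init)
        : m_data(std::make_shared<std::vector<int>>(init)) {}

    // Mutable access detaches first, so the returned iterator refers to
    // data owned by this object alone.
    std::vector<int>::iterator begin() { detach(); return m_data->begin(); }
    std::vector<int>::iterator end() { detach(); return m_data->end(); }

    // Read-only access never detaches.
    std::vector<int>::const_iterator cbegin() const { return m_data->cbegin(); }

    bool sharesDataWith(const CowInts &other) const { return m_data == other.m_data; }
};
```

    With two CowInts sharing data, calling begin() on the first one detaches it immediately; the second is then the sole owner of the original data, so its begin() copies nothing. Under that assumption, an iterator obtained from a const_iterator-returning call before a later detach would not get the same protection.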



  • @VRonin said in iterators in QSharedData:

    auto intList2 = intList1; // shared
    auto iter1 = intList1.begin();
    auto iter2 = intList2.begin();     
    
    *iter1 =0; //detach
    

    Excuse me, isn't iter1 read-only?



  • @Taz742 said in iterators in QSharedData:

    iter1 is not just readonly ?

    No, if it were auto iter1 = intList1.cbegin(); then it would be.


  • Qt Champions 2016

    On a related note, do you mind elaborating on why you need to keep an iterator over time? That's a pretty unusual thing to do.



  • @kshegunov I'm writing a kind-of-wrapper* around QMap/QHash (safety warning: looking at this code might cause permanent brain damage), so my Container::iterator will just be a wrapper around QHash::iterator. I want to make sure my iterator does not accidentally become invalid when it shouldn't.

    Unit tests are promising, but I learned the hard way not to put too much confidence in them.


    *The reason is that a user of a program that was supposed to hold around 1k elements in the QHash abused it to hold 300k and blew it up. I needed a solution that allows the abuse without compromising too much efficiency in the usual use case and without redesigning the entire thing.


  • Qt Champions 2016

    A formidable task. I'd approach it somewhat differently however. What I'd consider is implementing my own hash table (a drop-in replacement for QHash) that implements a paging scheme internally. Something like splitting the buckets and nodes into groups (pages) and swapping them to the file when memory's needed. Also a weighting scheme can be added, similarly to QCache, to keep the "most used" pages in memory. Unfortunately this would also run into the usual problems when regrowing the table and it may need some sort of incremental rehashing. One thing off the top of my head would be to use two hash functions - one which is to determine the bucket, as is done ordinarily, and one to identify the page (similarly to the ideas stated with linear hashing).

    In fact it may be possible to skip the whole hash table implementation and to just proceed with an aggregating class, which will do the paging based on the stated scheme - using a second hash function to swap the tables to the disk. One possible problem I foresee though is that if the hash function doesn't spread the values enough some chunks may grow pretty big, so care should be taken to solve that.
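
    A rough sketch of the aggregating-class idea, under several assumptions: PagedHash and its member names are hypothetical, std::unordered_map stands in for QHash so it builds without Qt, the "second hash function" is simply a salted std::hash, and swapping pages to disk is left out entirely. It only shows how a second hash can route each key to one of several ordinary hash tables.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical aggregating container: a second hash picks the page and
// each page is an ordinary hash table. Disk swapping is omitted.
class PagedHash {
    std::vector<std::unordered_map<std::string, int>> m_pages;

    // Second hash (an assumption of this sketch): salt the key so page
    // selection differs from the bucket hash used inside each page.
    std::size_t pageOf(const std::string &key) const
    {
        return std::hash<std::string>{}("page:" + key) % m_pages.size();
    }

public:
    // pageCount must be greater than zero.
    explicit PagedHash(std::size_t pageCount) : m_pages(pageCount) {}

    void insert(const std::string &key, int value) { m_pages[pageOf(key)][key] = value; }

    bool contains(const std::string &key) const
    {
        const auto &page = m_pages[pageOf(key)];
        return page.find(key) != page.end();
    }

    int value(const std::string &key, int fallback = 0) const
    {
        const auto &page = m_pages[pageOf(key)];
        const auto it = page.find(key);
        return it == page.end() ? fallback : it->second;
    }

    std::size_t pageCount() const { return m_pages.size(); }
    std::size_t pageSize(std::size_t page) const { return m_pages.at(page).size(); }
};
```

    pageSize() is there to watch for the problem mentioned above: if the second hash spreads keys poorly, some pages grow much larger than others.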



  • Hope you don't mind but I added your post to my issues so I do not lose it: https://github.com/VSRonin/QtHugeContainer/issues/1


  • Qt Champions 2016

    Hope you don't mind but I added your post to my issues

    Not in the least. Keep us posted, as this might prove pretty useful for me at some point :p

