iterators in QSharedData
-
Hi everyone, small problem for you. Let's take this example:
class TestData : public QSharedData
{
public:
    int m_num;
    QList<int> m_numList;
    QList<int>::iterator m_iter;
    ~TestData() = default;
    TestData(const TestData& other) = default;
    TestData()
        : QSharedData()
        , m_num(0)
    {
        m_iter = m_numList.begin();
    }
};
At first I thought this would be a problem: when this object detaches, m_iter will still point to the previous list. But a test run yielded unexpected results: it works perfectly. So two questions:
- Why?
- Can I count on this to work going forward?
-
@VRonin said in iterators in QSharedData:
Why?
Could you also share the test case?
Can I count on this to work going forward?
It's probably unsafe to assume that.
-
@kshegunov said in iterators in QSharedData:
Could you also share the test case?
By building it to be minimal I actually realised that, as expected, it does not work:
#include <QExplicitlySharedDataPointer>
#include <QSharedData>
#include <QList>
#include <QDebug>

class TestData : public QSharedData
{
    QList<int> m_numList;
    QList<int>::iterator m_iter;
public:
    const QList<int>& numList() const { return m_numList; }
    void setFirstValue(int val) { m_numList[0] = val; }
    void setNumList(const QList<int>& numList) { m_numList = numList; }
    const QList<int>::iterator& iter() const { return m_iter; }
    void setIter(const QList<int>::iterator& iter) { m_iter = iter; }
    ~TestData() = default;
    // Defaulted copy: m_iter is copied verbatim, so it still refers to other's list data
    TestData(const TestData& other) = default;
    TestData()
        : QSharedData()
        , m_numList({1, 2, 3, 4})
    {
        m_iter = m_numList.begin();
    }
};

class TestWrap
{
    QExplicitlySharedDataPointer<TestData> m_d;
public:
    TestWrap(const TestWrap& other) = default;
    TestWrap() : m_d(new TestData) {}
    void setFirstValue(int val)
    {
        if (m_d->numList().at(0) == val)
            return;
        m_d.detach(); // copies TestData through its copy constructor
        m_d->setFirstValue(val);
    }
    int getFirstVal() { return *(m_d->iter()); }
};

int main(/*int argc, char *argv[]*/)
{
    TestWrap container1;
    auto container2 = container1;
    container1.setFirstValue(5);
    Q_ASSERT(container1.getFirstVal() == 5);
    Q_ASSERT(container2.getFirstVal() == 1);

    TestWrap container3;
    auto container4 = container3;
    container4.setFirstValue(5);
    Q_ASSERT(container4.getFirstVal() == 5);
    Q_ASSERT(container3.getFirstVal() == 1);
    return 0;
}
I need to specialise the copy constructor of the data this way to make it work:
TestData(const TestData& other)
    : QSharedData(other)
    , m_numList(other.m_numList)
{
    m_iter = m_numList.begin();
    for (auto i = other.m_numList.constBegin(); i != other.m_iter; ++i, ++m_iter) {}
}
Problem now is: what if instead of a QList I had a QSet or QHash. How would I implement that copy?
-
@VRonin said in iterators in QSharedData:
By building it to be minimal I actually realised that, as expected, it does not work
That's why I asked for the test case, seemed suspicious. ;)
for(auto i=other.m_numList.constBegin();i!=other.m_iter;++i,++m_iter){}
Sweet Mary, holy mother of Jesus, why the hell are you doing this? Just do the usual pointer arithmetic (wrapped in tidy classes):
m_iter = m_numList.begin() + (other.m_iter - other.m_numList.constBegin());
if (m_iter > m_numList.end())
    m_iter = m_numList.end();
Problem now is: what if instead of a QList I had a QSet or QHash. How would I implement that copy?
m_iter = m_set.find(other.m_iter.key());
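For the list case, the offset arithmetic can be sketched without Qt. This is a minimal illustration, assuming std::vector as a stand-in for QList so that it builds with only the standard library; the names Data, values and cursor are hypothetical. The copy constructor copies the container first and then re-seats the iterator at the same offset in the copy. It converts the stored iterator to a const_iterator before subtracting, because `other` is const.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the shared data class: a container plus an
// iterator that must always point into this object's own container.
struct Data
{
    std::vector<int> values;
    std::vector<int>::iterator cursor;

    Data()
        : values{1, 2, 3, 4}
        , cursor(values.begin())
    {}

    // Copy the container, then re-seat the iterator at the same offset.
    // `values` is declared before `cursor`, so it is initialised first.
    Data(const Data& other)
        : values(other.values)
        , cursor(values.begin()
                 + (std::vector<int>::const_iterator(other.cursor) - other.values.cbegin()))
    {}

    // Assignment would need the same treatment; disabled to keep the sketch short.
    Data& operator=(const Data&) = delete;
};
```

With this in place, writing through the copy's cursor changes only the copy, which is the behaviour the failing test case expects after a detach.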
-
@kshegunov said in iterators in QSharedData:
Sweet Mary, holy mother of Jesus, why the hell are you doing this? Just do the usual pointer arithmetic
I normally use std::distance, but it complained about one being a const_iterator and the other being a (non-const) iterator. I didn't even think about using + and -, but it seems not to work anyway, hence my workaround.
m_iter = m_set.find(other.m_iter.key());
That doesn't work for QHash, as multiple values can have the same key but different iterators.
-
@VRonin said in iterators in QSharedData:
but it seems not to work anyway
Well, I have a typo: other.m_iter - other.m_numList.constBegin() should be other.m_iter - other.m_numList.begin(), as your iterator isn't a constant one. Besides that it should work just fine; operator- returns an integer.
@VRonin said in iterators in QSharedData:
That doesn't work for QHash as multiple values can have the same key but different iterators
You use QHash::insertMulti?
-
@kshegunov said in iterators in QSharedData:
Well I have a typo
Nope, other is const TestData& so even begin would return a const_iterator. :-P
You use QHash::insertMulti?
I do not, and I think it's evil, but we are talking generally now.
-
@VRonin said in iterators in QSharedData:
Nope, other is const TestData& so even begin would return a const_iterator. :-P
Yeah, I forgot that little detail, but reconstruct your iterator and you should be fine:
QList<int>::const_iterator(other.m_iter) - other.m_numList.constBegin()
I do not and I think it's evil but we are talking generally now.
In this case you are stuck with iterating from the first element stored under the key (similarly to your original approach with the list) to the pointed-to element, counting the offset as you go. Then reapply the procedure on the copy: take the first of the multivalued iterators and increment it that number of times. Something like this:
int offset = 0;
// other is const, so find() yields a const_iterator
QHash<...>::const_iterator start = other.m_hash.find(other.m_iter.key());
for (offset = 0; start != other.m_iter; offset++, start++)
    ;
m_iter = m_hash.find(other.m_iter.key());
while (offset > 0) {
    m_iter++;
    offset--;
}
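The find-then-offset idea can also be sketched with the standard library. This is a minimal illustration, assuming std::unordered_multimap as a stand-in for a multi-valued QHash; the helper name reseat is hypothetical. It relies on equal keys being adjacent in iteration order, and it assumes the copy keeps the duplicates of a key in the same relative order as the source, which is worth verifying for the container actually in use.

```cpp
#include <cassert>
#include <iterator>
#include <unordered_map>

using MultiHash = std::unordered_multimap<int, int>;

// Given an iterator `it` into `from`, return the iterator at the same
// position within the run of equal keys in `to` (a copy of `from`).
// Returns to.end() when `it` is the end iterator or the key run in `to`
// is too short.
MultiHash::const_iterator reseat(const MultiHash& from,
                                 MultiHash::const_iterator it,
                                 const MultiHash& to)
{
    if (it == from.end())
        return to.end();

    // Offset of `it` from the first element stored under its key.
    const auto srcRange = from.equal_range(it->first);
    const auto offset = std::distance(srcRange.first, it);

    // Same key in the copy, advanced by the same offset.
    const auto dstRange = to.equal_range(it->first);
    if (offset >= std::distance(dstRange.first, dstRange.second))
        return to.end();
    return std::next(dstRange.first, offset);
}
```

Unlike begin() plus a distance, this never depends on how the two tables happen to order different keys; only the order within one key's run matters.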
PS. I really, really hate the hungarian notation ...
-
@kshegunov said in iterators in QSharedData:
Yeah, I forgot that little detail
And I forgot that constructor. Good spot!
Unfortunately QHash seems a lost cause; see the main below:
#include <QExplicitlySharedDataPointer>
#include <QSharedData>
#include <QHash>
#include <QDebug>
#include <iterator>

class TestData : public QSharedData
{
    QHash<int,int> m_numList;
    QHash<int,int>::iterator m_iter;
public:
    const QHash<int,int>& numList() const { return m_numList; }
    void setFirstValue(int val) { m_numList.begin().value() = val; }
    void setNumList(const QHash<int,int>& numList) { m_numList = numList; }
    const QHash<int,int>::iterator& iter() const { return m_iter; }
    void setIter(const QHash<int,int>::iterator& iter) { m_iter = iter; }
    ~TestData() = default;
    TestData(const TestData& other)
        : QSharedData(other)
        , m_numList(other.m_numList)
    {
        m_iter = m_numList.begin() + std::distance(other.m_numList.constBegin(), QHash<int,int>::const_iterator(other.m_iter));
    }
    TestData()
        : QSharedData()
        , m_numList({std::make_pair(1,1), std::make_pair(2,2), std::make_pair(3,3), std::make_pair(4,4)})
    {
        m_iter = m_numList.begin();
    }
};

class TestWrap
{
    QExplicitlySharedDataPointer<TestData> m_d;
public:
    TestWrap(const TestWrap& other) = default;
    TestWrap() : m_d(new TestData) {}
    void setFirstValue(int val)
    {
        if (m_d->numList().begin().value() == val)
            return;
        m_d.detach();
        m_d->setFirstValue(val);
    }
    int getFirstVal() { return *(m_d->iter()); }
};

int main(/*int argc, char *argv[]*/)
{
    TestWrap container1;
    auto container2 = container1;
    container1.setFirstValue(5);
    Q_ASSERT(container1.getFirstVal() == 5); //ok
    Q_ASSERT(container2.getFirstVal() == 1); //fail randomly

    TestWrap container3;
    auto container4 = container3;
    container4.setFirstValue(5);
    Q_ASSERT(container4.getFirstVal() == 5);
    Q_ASSERT(container3.getFirstVal() == 1);
    return 0;
}
It looks like the detached-from container might end up rehashing itself anyway.
P.S. This is not hungarian. QExplicitlySharedDataPointer<TestData> esdptdm_d is proper hungarian. I think that is dead, with modern IDEs easily telling you the type of a variable.
-
@VRonin said in iterators in QSharedData:
m_iter=m_numList.begin() + std::distance(other.m_numList.constBegin(),QHash<int,int>::const_iterator(other.m_iter));
You can't do that! Not with hashes anyway: your iterators are not guaranteed to be contiguous or ordered, and that it sometimes works is mere luck. You need to do something like what I wrote, moving to and fro around the iterators you get from QHash::find.
-
Zombie-ing this topic for a related question:
From http://doc.qt.io/qt-5/containers.html#stl-style-iterators
Implicit sharing has another consequence on STL-style iterators: you should avoid copying a container while iterators are active on that container. The iterators point to an internal structure, and if you copy a container you should be very careful with your iterators.
The example there creates the iterator before copying, but now imagine this scenario:
QHash<int,int> intList1 = {{1,2},{3,4},{5,6},{7,8}};
auto intList2 = intList1; // shared
auto iter1 = intList1.begin();
auto iter2 = intList2.begin();
*iter2 = 0; //detach
for (; iter1 != intList1.end(); ++iter1)
    qDebug() << *iter1;
for (; iter2 != intList2.end(); ++iter2)
    qDebug() << *iter2;
or:
QHash<int,int> intList1 = {{1,2},{3,4},{5,6},{7,8}};
auto intList2 = intList1; // shared
auto iter1 = intList1.begin();
auto iter2 = intList2.begin();
*iter1 = 0; //detach
for (; iter1 != intList1.end(); ++iter1)
    qDebug() << *iter1;
for (; iter2 != intList2.end(); ++iter2)
    qDebug() << *iter2;
return 0;
They all work correctly.
Same original questions:
- Why?
- Can I count on this to work going forward?
-
@VRonin said in iterators in QSharedData:
Why?
No clue.
Can I count on this to work going forward?
Probably, as the containers don't change that much (at least not to my knowledge), however it will still be "undocumented behaviour" ...
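One plausible explanation, offered as a sketch and not as a statement about what Qt guarantees: in the Qt 5 sources the non-const begin() detaches the container before returning an iterator, so in both snippets intList1 already owns a private copy by the time iter1 exists, and writing through an iterator never detaches anything. The toy class below models that behaviour with std::shared_ptr and std::vector; CowInts and all of its members are hypothetical names, not Qt API.

```cpp
#include <cassert>
#include <initializer_list>
#include <memory>
#include <vector>

// Toy copy-on-write container (hypothetical; it models the idea, not Qt's code).
class CowInts
{
    std::shared_ptr<std::vector<int>> m_d;

    // Take a private copy of the payload if it is shared with another instance.
    void detach()
    {
        if (m_d.use_count() > 1)
            m_d = std::make_shared<std::vector<int>>(*m_d);
    }

public:
    CowInts(std::initializer_list<int> values)
        : m_d(std::make_shared<std::vector<int>>(values))
    {}

    // Non-const access detaches first, so the iterator refers to our own copy.
    std::vector<int>::iterator begin() { detach(); return m_d->begin(); }
    std::vector<int>::iterator end() { detach(); return m_d->end(); }

    bool sharesWith(const CowInts& other) const { return m_d == other.m_d; }
    int first() const { return m_d->front(); }
};
```

In this model the dangerous order is the one the documentation warns about: take an iterator first, copy the container afterwards, then write through the iterator, which modifies the payload both instances still share.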
-
@kshegunov I'm writing a kind-of-wrapper* around QMap/QHash (safety warning: looking at this code might cause permanent brain damage). My Container::iterator will just be a wrapper around QHash::iterator, so I want to make sure my iterator does not accidentally become invalid when it shouldn't.
Unit tests are promising, but I learned the hard way not to put too much confidence in them.
*The reason is that a user of a program that was supposed to hold around 1k elements in the QHash abused it to hold 300k and blew it up. I needed a solution that allows the abuse without compromising too much efficiency in the usual use case and without redesigning the entire thing.
-
A formidable task. I'd approach it somewhat differently, however. What I'd consider is implementing my own hash table (a drop-in replacement for QHash) that implements a paging scheme internally: something like splitting the buckets and nodes into groups (pages) and swapping them to a file when memory is needed. A weighting scheme could also be added, similarly to QCache, to keep the "most used" pages in memory. Unfortunately this would also run into the usual problems when regrowing the table, and it may need some sort of incremental rehashing. One thing off the top of my head would be to use two hash functions: one to determine the bucket, as is done ordinarily, and one to identify the page (similarly to the ideas behind linear hashing).
In fact it may be possible to skip the whole hash table implementation and just proceed with an aggregating class that does the paging based on the stated scheme, using a second hash function to swap the tables to disk. One possible problem I foresee, though, is that if the hash function doesn't spread the values enough, some chunks may grow pretty big, so care should be taken to solve that.
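The aggregating-class variant can be sketched roughly as follows. This is a minimal illustration under several assumptions: std::unordered_map stands in for QHash, keys and values are plain ints, the class name PagedIntHash and the mixing constants in pageOf are arbitrary hypothetical choices, and swapping pages to disk is left out entirely. It only shows a second hash function selecting the page while each page does its ordinary bucket hashing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical aggregating class: a fixed number of ordinary hash tables
// ("pages"), with a second hash function choosing the page for each key.
class PagedIntHash
{
    std::vector<std::unordered_map<int, int>> m_pages;

    // Second hash: an arbitrary integer mix, independent of the hash each
    // page uses internally to pick its buckets.
    std::size_t pageOf(int key) const
    {
        std::uint32_t h = static_cast<std::uint32_t>(key);
        h ^= h >> 16;
        h *= 0x45d9f3bu;
        h ^= h >> 16;
        return h % m_pages.size();
    }

public:
    explicit PagedIntHash(std::size_t pageCount)
        : m_pages(pageCount == 0 ? 1 : pageCount)
    {}

    // Insert or overwrite the value stored under key.
    void insert(int key, int value) { m_pages[pageOf(key)][key] = value; }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const int* find(int key) const
    {
        const auto& page = m_pages[pageOf(key)];
        const auto it = page.find(key);
        return it == page.end() ? nullptr : &it->second;
    }

    std::size_t size() const
    {
        std::size_t total = 0;
        for (const auto& page : m_pages)
            total += page.size();
        return total;
    }

    std::size_t pageCount() const { return m_pages.size(); }
    std::size_t pageSize(std::size_t page) const { return m_pages.at(page).size(); }
};
```

Comparing pageSize across pages is a cheap way to watch for the uneven spread mentioned above; a real version would also need a policy for which pages stay in memory.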
-
Hope you don't mind but I added your post to my issues so I do not lose it: https://github.com/VSRonin/QtHugeContainer/issues/1