On understanding QList copy-on-write

leonidwang

Take the sample code below as an example:

@
void g(QList<QVariant> &list); // forward declaration so that f() can call g()

void f()
{
    ...
    QList<QVariant> myList;
    g(myList);
    modify_variant_appended_by_g();
    ...
}

void g(QList<QVariant> &list)
{
    ...
    QVariant v(...);
    list.append(v);
    ...
}
@

Function f() creates a QList<QVariant>, then calls function g() to append a QVariant.

This QVariant v will always be valid in f's scope because the append operation added a reference to that QVariant, right?

Then, is this QVariant v allocated on g's stack or on the heap?
What happens when g returns and its stack gets unwound, if v is allocated on the stack?

Under what condition does c-o-w take place? I mean, if v is allocated on the heap, then after g returns the only reference to v will be the one held by the QList in f. If I then call modify_variant_appended_by_g(), is it necessary for c-o-w to take place?

ludde

One of the beauties of implicit sharing is that you very rarely need to worry about how it actually works. It's usually enough to know that it does work.
The code above does work - it would work if QVariant did not use implicit sharing, and it does work now that it does use implicit sharing. Do you really need to understand the exact details of why?

andre

-First of all, I am not sure what you mean by c-o-w.-

What happens is this:

you create a QList<QVariant>

you pass this list by reference to g()

inside the scope of g, you create a new QVariant v

you insert the variant into the list

#* a copy of v is inserted into the list!
#* but note that QVariant itself is also an implicitly shared class: the copy is very cheap

g() finishes, and v goes out of scope. The sole remaining copy of v (referring to the internal data) is in the QList

your modification function may have a hard time: don't think that just getting the variant out of the list and changing its value will modify the value in the list itself!

#* modification only works this way if you use the [] operator (which gives back a reference)
#* otherwise, you are operating on a (shared) copy, which will be detached from the internal value shared with the version in the list as soon as you modify it.

Edit: I guess you mean copy-on-write with c-o-w.
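To illustrate the point about modifying a copy versus modifying through operator[], here is a minimal sketch in plain C++. It is an assumption-laden stand-in, not Qt code: std::vector<std::string> plays the role of QList<QVariant>, and the names modifyCopy, modifyInPlace, afterModifyCopy and afterModifyInPlace are made up for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// std::vector<std::string> is only a stand-in for QList<QVariant> here.
using List = std::vector<std::string>;

// Takes the element out of the list by value and changes that copy.
// The element stored in the list is not affected.
void modifyCopy(const List &list)
{
    assert(!list.empty());
    std::string element = list.at(0); // independent copy of the element
    element += " (changed)";
}

// Changes the element through the reference returned by operator[].
void modifyInPlace(List &list)
{
    assert(!list.empty());
    list[0] += " (changed)";
}

// Hypothetical helper: run modifyCopy on a fresh list, return the stored element.
std::string afterModifyCopy()
{
    List list{"value"};
    modifyCopy(list);
    return list[0];
}

// Hypothetical helper: run modifyInPlace on a fresh list, return the stored element.
std::string afterModifyInPlace()
{
    List list{"value"};
    modifyInPlace(list);
    return list[0];
}
```

Only the second variant changes what is stored in the list. With a QList the same distinction should apply: a value taken out of the list is a separate object, whereas the non-const operator[] hands back a reference to the stored element.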

dangelog

[quote author="leonidwang" date="1314175469"]
This QVariant v will always be valid in f's scope because the append operation added a reference to that QVariant, right?[/quote]

No, it added a copy of that variant. If it added a "reference" (whatever that means), the variant would be destroyed as soon as it falls out of scope (say, you return from g), therefore being invalid in f's scope.

[quote]
Then, is this QVariant v allocated on g's stack or on the heap?
[/quote]

This has nothing to do with QList... (in your opinion?)

[quote]
What happens when g returns and its stack gets unwound, if v is allocated on the stack?
[/quote]

This has nothing to do with QList... (in your opinion?)

[quote]
Under what condition does c-o-w take place? I mean, if v is allocated on the heap, then after g returns the only reference to v will be the one held by the QList in f. If I then call modify_variant_appended_by_g(), is it necessary for c-o-w to take place?[/quote]

Where exactly v is allocated has nothing to do with the fact that a QList is managed with cow. You are appending it, and that makes a copy of the value (that's why QList requires your type to be copyable).
If you're talking about the fact that QVariant is another implicitly shared class, well, actually no cow takes place in your code:

creating v will set the refcount to 1
adding v to the QList will increase the refcount (you took a copy) to 2
destroying v will decrease it back to 1
a subsequent modification in modify_variant_appended_by_g will not detach since the refcount is 1.
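The refcount sequence above can be replayed with a toy model. This is only a sketch under stated assumptions: SharedValue is a made-up class built on std::shared_ptr, not QVariant and not how Qt implements sharing, and appendValue, Trace and runScenario are hypothetical names standing in for g() and f().

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy implicitly shared value (NOT QVariant): copies share one data block,
// and a write deep-copies the block first only if it is shared.
class SharedValue
{
public:
    explicit SharedValue(std::string text)
        : d(std::make_shared<std::string>(std::move(text))) {}

    // Number of SharedValue objects currently sharing the same data.
    long refCount() const { return d.use_count(); }

    // Write access. Returns true if a detach (deep copy) was needed.
    bool setText(const std::string &text)
    {
        bool detached = false;
        if (d.use_count() > 1) {
            d = std::make_shared<std::string>(*d); // detach from the other owners
            detached = true;
        }
        *d = text;
        return detached;
    }

private:
    std::shared_ptr<std::string> d;
};

// Plays the role of g(): returns the refcount seen while both the local
// value and the copy in the list exist.
long appendValue(std::vector<SharedValue> &list)
{
    SharedValue v("from g"); // refcount 1
    list.push_back(v);       // the copy shares the data: refcount 2
    return v.refCount();
}                            // v is destroyed here: refcount back to 1

struct Trace { long duringG; long afterG; bool detachedOnWrite; };

// Plays the role of f(): append via appendValue(), then modify the element.
Trace runScenario()
{
    std::vector<SharedValue> list;
    Trace t{};
    t.duringG = appendValue(list);
    assert(!list.empty());
    t.afterG = list[0].refCount();
    t.detachedOnWrite = list[0].setText("modified in f");
    return t;
}
```

In this model the count goes 1, 2, 1, and the later write does not detach because the list holds the only remaining owner.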

goetz

leonidwang, you're almost right.

Function g has a reference to the list; thus, list is just an alias for myList in function f. It refers to the very same object in memory.

In function g, the QVariant v is created on the stack and destroyed on leaving g. What happens behind the scenes is that list.append creates a copy of v (probably on the heap) and puts that into the list.

-To my knowledge, QVariant itself is not an implicitly shared class, so there is no copy-on-write taking place.-
[According to the "API docs":http://developer.qt.nokia.com/doc/qt-4.7/implicit-sharing.html it actually is implicitly shared, Volker]

A better example would be using a QString:

@
#include <QDebug>
#include <QString>

// Pass by value: str starts out as a shallow copy of the caller's string.
void g(QString str)
{
    qDebug() << "g got str" << str;
    str.append(" is cool"); // modifies only the local copy
    qDebug() << "g result" << str;
}

// Pass by non-const reference: str is the caller's string itself.
void h(QString &str)
{
    qDebug() << "h got str" << str;
    str.append(" is REALLY cool!");
    qDebug() << "h result" << str;
}

void f()
{
    QString myString("DevNet");
    qDebug() << "f myString 1:" << myString;

    g(myString);
    qDebug() << "f myString 2:" << myString;

    h(myString);
    qDebug() << "f myString 3:" << myString;
}
@

The output is:

@
f myString 1: "DevNet"

g got str "DevNet"
g result "DevNet is cool"

f myString 2: "DevNet"

h got str "DevNet"
h result "DevNet is REALLY cool!"

f myString 3: "DevNet is REALLY cool!"
@

f constructs a QString the usual way.

g takes a QString argument by value, which is a shallow copy of the original string, i.e. its internal data is still the data of myString from f. Once you append to the string, the data is copied, so you do not modify the original string.

h takes a non-const reference to a QString. Effectively you are appending to the myString object from f.

The same holds for all the container classes. The actual contents of a shared container are only copied on the first attempt to write to it (append, remove elements, etc.).
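For anyone curious what "shallow copy first, deep copy on the first write" looks like mechanically, here is a rough sketch in plain C++. It is not QString and makes no claim about Qt's internals; CowString, Result and copyThenWrite are invented for this example, with std::shared_ptr standing in for the shared data block.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Toy copy-on-write string (NOT QString).
class CowString
{
public:
    explicit CowString(const std::string &s)
        : d(std::make_shared<std::string>(s)) {}

    const std::string &str() const { return *d; }

    // True while both objects still point at the same data block.
    bool sharesDataWith(const CowString &other) const { return d == other.d; }

    // Write operation: detach first, then modify our own data.
    void append(const std::string &suffix)
    {
        detach();
        d->append(suffix);
    }

private:
    // Deep-copies the data, but only if somebody else shares it.
    void detach()
    {
        assert(d != nullptr);
        if (d.use_count() > 1)
            d = std::make_shared<std::string>(*d);
    }

    std::shared_ptr<std::string> d;
};

struct Result
{
    bool sharedBeforeWrite;
    bool sharedAfterWrite;
    std::string original;
    std::string copy;
};

// Copies a string the way pass-by-value would, then writes to the copy.
Result copyThenWrite()
{
    CowString original("DevNet");
    CowString copy = original; // shallow copy: both share one data block

    Result r;
    r.sharedBeforeWrite = copy.sharesDataWith(original);
    copy.append(" is cool");   // first write: the copy detaches
    r.sharedAfterWrite = copy.sharesDataWith(original);
    r.original = original.str();
    r.copy = copy.str();
    return r;
}
```

The copy shares its data until the append, and the original keeps its old contents afterwards, which matches what g() does to myString in the output above.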

leonidwang

That's the problem I'm not sure about...

Andre and peppe imply that after g() returns, v is still valid and no cow takes place, while Volker states that a deep copy is necessary when g() returns.
Hope I didn't get you guys wrong, and personally, I'm with the no cow opinion.

As ludde said, I don't need to understand every detail of how implicit sharing is implemented.
I just want to avoid cow (deep copy) as much as I can. If there's no cow in my code, I'm happy to go with that; otherwise, I'll try other ways. Am I thinking too much? :)

andre

Indeed, no actual deep copies occur in the code you showed us. And, IMHO, you are thinking about this too much at this stage of development. You are trying to do premature optimization, I think. That is seldom a good idea. Build, test and profile. Then go hunt for bottlenecks. I doubt this will be it.

goetz

There is no such concept of "to avoid cow(deep copy) as much as I can"! It's the other way round! Use it as much as you can, to save CPU cycles!

It is up to you to decide what you want to do semantically:

If you want to modify the original object, use a non-const reference.
If you want to modify a second instance of the object but leave the original untouched, pass it by value. You get a shallow copy at first, and the actual data is copied in memory only if you do modify the copy (copy-on-write!).
If you never want to modify the copy, use a const reference. This does not copy the object and makes the object read-only in the called method.

So, decide what you need and choose the appropriate tool for it. Don't do it the other way round, choosing the tool first and then checking whether it fits your needs. I have the impression that you're doing the latter right at the moment.