
Two Quick Implicit Sharing Questions



  • Hi, I have a couple of quick implicit sharing questions:

    1. Is implicit sharing only relevant when passing instances of implicitly shared classes as value parameters to a function? So when you pass by reference no deep copies ever happen because the code uses the same object in memory right?
    2. Same idea for using an object in one function only - you can index using brackets without worrying about deep copies slowing down code right?

  • Moderators

    @Crag_Hack said in Two Quick Implicit Sharing Questions:

    1. Is implicit sharing only relevant when passing instances of implicitly shared classes as value parameters to a function?

    Not quite. Passing by reference and implicit sharing are actually very separate concepts.

    • Implicit sharing lets you create shallow copies.
    • Passing by reference creates no copies, regardless of whether the data is implicitly shared or not.

    Another term for "implicit sharing" is "copy-on-write". This means when you copy a variable, the data itself doesn't get copied until you try to write/modify one of the copies.

    QString is implicitly shared while std::string is not. Here is how implicit sharing affects data copies, when there is no parameter passing involved:

    QString longQString = ...
    std::string longStdString = ...
    
    QString longQStringB = longQString; // Creates a shallow copy -- Cheap
    std::string longStdStringB = longStdString; // Creates a deep copy -- Expensive
    
    /* At this point, longStdString and longStdStringB point to different data internally,
       while longQString and longQStringB point to the same data internally */
    
    longQStringB[10000] = 'X'; // Creates a deep copy and modifies the copy -- Expensive
    longStdStringB[10000] = 'X'; // No copying, because the data was already copied earlier -- Very cheap
    
    /* At this point, longQString and longQStringB also point to different data internally */
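
    The share-then-detach behaviour described in the comments above can be imitated in plain C++. This is only a rough sketch: CowString, sharesBufferWith() and cowDemo() are hypothetical names invented for illustration, and the code is not how QString is actually implemented (it is also not thread-safe). It just shows that the copy shares a buffer until the first write.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <memory>
    #include <string>
    #include <utility>

    // Hypothetical copy-on-write wrapper; NOT Qt's actual implementation.
    class CowString {
    public:
        explicit CowString(std::string text)
            : data_(std::make_shared<std::string>(std::move(text))) {}

        // The implicit copy constructor copies only the shared_ptr (shallow copy).

        // Read access never copies the underlying buffer.
        char at(std::size_t i) const { return (*data_)[i]; }

        // Write access detaches first if the buffer is shared.
        void set(std::size_t i, char c) {
            detach();
            (*data_)[i] = c;
        }

        // True if both objects currently point to the same buffer.
        bool sharesBufferWith(const CowString &other) const {
            return data_ == other.data_;
        }

    private:
        void detach() {
            if (data_.use_count() > 1)
                data_ = std::make_shared<std::string>(*data_); // the deep copy
        }

        std::shared_ptr<std::string> data_;
    };

    void cowDemo() {
        CowString a(std::string(20000, 'a'));
        CowString b = a;                  // shallow copy: cheap
        assert(a.sharesBufferWith(b));

        b.set(10000, 'X');                // first write: deep copy happens here
        assert(!a.sharesBufferWith(b));
        assert(a.at(10000) == 'a');       // the original is unchanged
        assert(b.at(10000) == 'X');
    }
    ```

    Calling cowDemo() from main() runs the checks; the only expensive step is the first set() on the shared copy.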
    

    So when you pass by reference no deep copies ever happen because the code uses the same object in memory right?

    Correct. But more importantly, passing by-reference avoids making shallow copies too!

    Suppose you have these 4 functions:

    void processStdStringByValue(const std::string data);
    void processQStringByValue(const QString data);
    void processStdStringByRef(const std::string &data);
    void processQStringByRef(const QString &data);
    

    Here is the difference between passing by-value or passing by-reference:

    std::string longStdString = ...
    QString longQString = ...
    
    processStdStringByValue(longStdString); // Creates a deep copy -- Expensive
    processQStringByValue(longQString); // Creates a shallow copy -- Cheap
    processStdStringByRef(longStdString); // Creates no copy -- Very cheap
    processQStringByRef(longQString);  // Creates no copy -- Very cheap
    

    Again, when passing by-reference, it doesn't matter if the data is implicitly shared or not. Either way, no copies are made.
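
    If you want to see this for yourself, one way is to count copy-constructor calls. The sketch below assumes a made-up type Tracked (not a Qt or standard class) whose copy constructor increments a counter; lengthByValue(), lengthByRef() and passingDemo() are likewise hypothetical names.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <string>
    #include <utility>

    // Hypothetical type that counts how often it is copied.
    struct Tracked {
        static int copies;
        std::string payload;

        explicit Tracked(std::string p) : payload(std::move(p)) {}
        Tracked(const Tracked &other) : payload(other.payload) { ++copies; }
    };
    int Tracked::copies = 0;

    // By value: the argument is copied into the parameter.
    std::size_t lengthByValue(const Tracked data) { return data.payload.size(); }

    // By const reference: the function works on the caller's object.
    std::size_t lengthByRef(const Tracked &data) { return data.payload.size(); }

    void passingDemo() {
        Tracked big(std::string(100000, 'x'));
        Tracked::copies = 0;

        lengthByRef(big);
        assert(Tracked::copies == 0);   // no copy at all

        lengthByValue(big);
        assert(Tracked::copies == 1);   // exactly one copy of the argument
    }
    ```

    For an implicitly shared type that one copy would be shallow, but it is still a copy that the by-reference version never makes.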

    2. Same idea for using an object in one function only - you can index using brackets without worrying about deep copies slowing down code right?

    Indexing has nothing to do with implicit sharing. Indexing is just one of many ways to write/modify data. See my first example above:

    • Without implicit sharing, a deep copy is made as soon as you copy the std::string.
    • With implicit sharing, the deep copy is not made when you copy the QString, but it is made when you try to modify the copied QString.
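
    To make the indexing case concrete, here is a small sketch of a copy-on-write array in plain C++. CowArray, deepCopies() and indexingDemo() are hypothetical names and this is not Qt's container code; it only illustrates the idea that writable indexing on an object nobody else shares costs nothing extra, while writable indexing on a shared object triggers the deep copy. As far as I know, Qt's containers behave similarly, which is why the Qt documentation recommends at() for read-only access.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <memory>
    #include <vector>

    // Hypothetical copy-on-write array; a sketch, not Qt's container code.
    class CowArray {
    public:
        explicit CowArray(std::size_t n, int value = 0)
            : data_(std::make_shared<std::vector<int>>(n, value)) {}

        // const overload: read-only, never copies.
        const int &operator[](std::size_t i) const { return (*data_)[i]; }

        // non-const overload: returns a writable reference, so it must
        // detach first whenever the buffer is shared.
        int &operator[](std::size_t i) {
            if (data_.use_count() > 1) {
                data_ = std::make_shared<std::vector<int>>(*data_);
                ++deepCopies_;
            }
            return (*data_)[i];
        }

        int deepCopies() const { return deepCopies_; }

    private:
        std::shared_ptr<std::vector<int>> data_;
        int deepCopies_ = 0;
    };

    void indexingDemo() {
        CowArray only(1000, 7);
        only[0] = 1;                      // not shared: no deep copy
        assert(only.deepCopies() == 0);

        CowArray copy = only;             // shallow copy, buffer now shared
        const CowArray &view = copy;
        assert(view[0] == 1);             // const read: still no deep copy
        assert(copy.deepCopies() == 0);

        copy[0] = 2;                      // write through a shared object: deep copy
        assert(copy.deepCopies() == 1);
        assert(only[0] == 1);             // the original is untouched
    }
    ```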


  • Thanks JKSH. Well what's the point of implicit sharing then? If you're passing data to a function and you need to modify, just pass by value; if you don't need to modify, just pass by reference. If you're making a copy of a variable and you plan on modifying the copy, you don't need implicit sharing, just make a normal copy; if you don't plan on modifying the copy, just use the original. So I'm obviously missing something here :) - I hope...


  • Qt Champions 2017

    The point is that it shares the data in a thread-safe manner. What I mean by this is that if you have a signal from one thread connected to a slot in another, and the variable is local to the point where the signal is emitted, you get at least one fewer deep copy from using copy-on-write. Without it, you'd need to copy the data for each of the slots you're sending it to. Now imagine you pass an image from one thread to another for processing and, to make it worse, imagine you're sending it to several threads: that's a lot of unnecessary copies you avoid.


  • Moderators

    @Crag_Hack said in Two Quick Implicit Sharing Questions:

    Well what's the point of implicit sharing then?

    @kshegunov has given you a good point. Here's another: Think of all the functions that return QString, QList, QImage, and other implicitly-shared types.

    QImage prepareImageFromFile(const QString &filePath)
    {
        QImage myImage(filePath);
        
        /* ...
           Long, complex algorithm to validate and then modify the image
           ... */
    
        if (processingWasSuccessful)
            return myImage;
        else
            return QImage(); // Return null image to indicate an error
    }
    

    With implicit sharing, returning data like this is cheap. Without implicit sharing, returning data like this can create an unnecessary deep copy -- this is wasteful since the local variable myImage is getting destroyed anyway, so we might as well re-use its internal data!

    We cannot return a reference, since myImage will be gone. The only other option to return the data is to allocate myImage on the heap and then return by-pointer. This is not nice because the caller has to manually delete the data later.

    If you're passing data to a function and you need to modify, just pass by value

    I disagree from an API design point-of-view. A function's signature should reflect what it intends to do with the caller's copy of the data.

    • If the caller's copy of the data will be modified, then the function should accept a pointer or non-const-reference parameter.
    • If the caller's copy of the data will NOT be modified, then the function should accept a const-reference. (Exception: Small types like int, enum, or bool can be accepted by-value)

    The rules-of-thumb above have been conventional C++ wisdom for a long time now (e.g. see https://stackoverflow.com/questions/5060137/passing-as-const-and-by-reference-worth-it ). (Note: C++11 introduced a new way: passing data by rvalue-reference. That is a discussion for another day.)

    Anyway, if the data will be modified within the function only, this is an implementation detail and it should not be visible to the function's caller. Imagine this scenario: You release a library where one of the functions is void myFunc(QString data) -- you accept a parameter by-value because myFunc() modifies the string internally. One day, you find a way to optimize your function and realize that you don't need to modify the data anymore. What do you do? If you change the signature to void myFunc(const QString &data), you break binary compatibility, which means that when an end-user upgrades their version of your library, their app will stop working.
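
    Here is a rough sketch of what I mean by keeping the modification internal. The names toUpperCopy() and signatureDemo() are hypothetical, and I use std::string instead of QString so that it compiles without Qt (with a QString, the local copy would stay shallow until the first write). The signature promises not to touch the caller's data, and the working copy is a private detail that can be removed later without changing the signature.

    ```cpp
    #include <algorithm>
    #include <cassert>
    #include <cctype>
    #include <string>

    // Hypothetical library function: the caller's string is never modified,
    // so the parameter is a const reference.
    std::string toUpperCopy(const std::string &data) {
        std::string working = data;   // the copy is an internal detail
        std::transform(working.begin(), working.end(), working.begin(),
                       [](unsigned char c) {
                           return static_cast<char>(std::toupper(c));
                       });
        return working;
    }

    void signatureDemo() {
        const std::string original = "qt";
        assert(toUpperCopy(original) == "QT");
        assert(original == "qt");     // the caller's data is unchanged
    }
    ```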



  • Thanks guys. Hopefully this will rank on Google with your enlightening information.

