Subclass QReadWriteLock as a struct and delete it while it is locked?

CJha

Hi, I have a data struct that is accessed by multiple threads for calculations and its lifetime is maintained by the main GUI thread. To provide worry-free access to its data I am considering using this struct as a subclass of QReadWriteLock, my idea is to implement something like this and then pass around a pointer of an instance of this struct to worker threads.

struct Data : QReadWriteLock
{
    QVector<double> dataValues;
    QVector<double> timeStamps;
};

In the main GUI thread, the data is created as:

Data* data = new Data;

In the Worker threads, data is accessed as:

if(data != nullptr){
    if(data->tryLockForRead()){
        /* Perform Calculations */
        data->unlock();
    }
}

and in main GUI thread data is always destroyed through a function:

void safeDeleteData(Data* data)
{
    data->lockForWrite(); // should I do this?
    delete data;
    data = nullptr;
}

The question is should I data->lockForWrite() before deleting it and setting its pointer to null? I want to do this as this will guarantee that when I am deleting the data no worker threads are able to access it and so none of them would be in between their calculations. When the data is deleted its pointer is set to nullptr and all workers check the data pointer for null before trying to lock the data for the read. But deleting a locked QReadWriteLock is something that I am not sure of if I should do or not?

Or is this better?

void safeDeleteData(Data* data)
{
    QWriteLocker locker(data)
    delete data;
    data = nullptr;
}

SGaist

Hi,

Why would this data shared between threads be deleted while they are accessing it ?

Chris Kawa

QWriteLocker accesses the QReadWriteLock object in its destructor, so no, you can't delete the lock object while the write locker still exists.
You'd need to do a pointer dance of this kind:

Data* temp = data;
temp->lockForWrite();
data = nullptr;
temp->unlock();
delete temp;

But this doesn't really save you from anything, because you've got a threading logic error in your worker thread code:

GUI                      WORKER
...                      if(data != nullptr){              // ok, data is alive
safeDeleteData(data);                                      // GUI thread sets data to nullptr
...                         if(data->tryLockForRead()){    // crash, data is already nullptr

You can't assume anything about any object outside of a lock and your first check for nullptr is outside of a lock.

Usually worker threads are destroyed before the main thread destroys the shared data, so the entire problem can be avoided. If that's not an option you need to guard access to the data variable itself with additional lock, not only the data it points to.

CJha

@SGaist It depends on the user, they will be the one loading and unloading the data. I think I can create a "Junk" data structure where deleted data will be stored for a while after a user deletes it, just so that if a thread is still working on it then no crash happens. Is it a good solution?

CJha

@Chris-Kawa Thanks for pointing it out. My threads remain active while the data is destroyed and loaded, so I guess I will have to properly guard the data at each access points.

CJha

@SGaist @Chris-Kawa How about this architecture:

struct Data : QReadWriteLock
{
    QVector<double> dataValues;
    QVector<double> timeStamps;
};

In the main GUI thread, the data is created as:

auto dataPtr = std::shared_ptr<Data>();

To the worker threads, it is passed as

std::weak_ptr<Data> dataWeak = dataPtr;

In worker threads, it is used as

if(auto ptr = dataWeak.lock()){
    QReadLocker locker(ptr);
    /* Perform Calculations */
}

And to delete it I will just let it go out of scope, I keep the deleted data in QQueue and I dequeue it every time it crosses 100 MB of data. Are there any downsides to this? Especially, implementing the struct as a subclass of QReadWriteLock and then using it with std::shared_ptr? Till now C++ is not showing me any errors but I have just started and implementing these changes will take me more than a day as my QWidget class is huge and very complicated.

Christian Ehrlicher

Don't derive from QReadWriteLock and create a proper class which holds your data and your lock. Deriving from QReadWriteLock is no good idea and not needed.

Chris Kawa

@CJha Well, you still need to check what weak_ptr::lock() returns. If the data is not there anymore it will return a default constructed shared_ptr. I think you decided on a faulty data structure and now trying to design your app around it, complicating it more than it needs to be.
As @Christian-Ehrlicher said don't make your data structure a lock too. data and thread safety are two separate concerns. Don't mix them.
If you separate it you get a simple scheme

QReadWriteLock data_lock;
Data* data = ...;

and simple access in gui:

QWriteLocker locker(&data_lock)
delete data;
data = nullptr;

and in the worker:

QWriteLocker locker(&data_lock)
if (data)
   /* Perform calculations */

CJha

@Chris-Kawa @Christian-Ehrlicher I understand the recommendation, but I am still curious as to why it is a bad idea to derive from a lock for my data structure? It would be nice to know why it is not recommended :)

@Chris-Kawa

Well, you still need to check what weak_ptr::lock() returns. If the data is not there anymore it will return a default constructed shared_ptr.

It is true, but as stated on cppreference.com for std::weak_ptr<T>::lock:

Creates a new std::shared_ptr that shares ownership of the managed object. If there is no managed object, i.e. *this is empty, then the returned shared_ptr also is empty.
Effectively returns expired() ? shared_ptr<T>() : shared_ptr<T>(*this), executed atomically.

I am using an if statement, and it will only be true when the weak_ptr is not pointing to an expired object. The default constructed object is accessed in the else condition of the below statement and I am not even implementing that so the default constructed shared_ptr will be out of scope as soon as it is constructed.

if(auto ptr = dataWeak.lock()){
    /* Perform Calculations */
}

Chris Kawa

@CJha said:

I understand the recommendation, but I am still curious as to why it is a bad idea to derive from a lock for my data structure?

It's a wider problem of mixing concerns. The general idea is to have different roles handled by different objects. For example you can make a widget class that derives from QString because why not. You can have an item model that derives from QMediaPlayer because why not etc. but it's a bad idea.

Ok, but that sounds a bit like "because I said so", so for actual reasons why you can google "separation of concerns", but here are just a couple of points why it's an important design principle.

Mixing functionalities in a single class creates coupling between those functionalities and those are often artificial. If you have some data payload it has nothing to do with threading so if you combine those two functionalities you entangle two things that have nothing to do with each other, except that you just happened to made them that way. Why is it wrong? It might not seem like a problem in this particular moment. But let's say some day you want to copy that data somewhere, for example to a function that saves it to a file or plots it on a chart or make a copy to send to a gpu buffer or whatever else. None of these things have anything to do with threading and yet you have to copy that lock object around and force dependency in form of include files on parts of code that wouldn't need it otherwise.

It also makes your data harder to reason about - for example what should now happen with the lock when you copy that data - should it be copied too or shared. Should you be able to copy a locked object at all? If not then nothing in this design prevents it, so this might cause a subtle bug somewhere without you even knowing it. Sure you can implement a copy constructor/assignment operator to implement correct behavior, but then you're just producing additional code to workaround an unnecessary design decision made earlier. Those are not unsolvable problems, but impose artificial considerations on a data packet that is relatively simple. It leads to overengineering, because you need to write more and more code to correctly handle the weird corner cases and entanglement between those two functionalities that you created for yourself. More code = more maintenance, testing required, harder to learn, reason about and more opportunities for bugs.

To put it into more code-like perspective - which seems easier to you? 4 classes with a single if statement or a single class with 16 cases in a switch statement (all combinations of the ifs)? Which one is easier to understand, maintain and refactor? What is easier/safer to remove when you don't need it one day? One of those classes as a whole or parts of a code within that switch statement that touch that functionality?