Slots called too frequently cause scene to get choked

alogim · 1 Mar 2016, 23:16

I have a worker thread that handles long, heavy computations (up to tens of seconds). These computations produce several thousand QLines, representing the edges of a dynamically growing tree. These edges can be modified at any time, since they connect the nodes of the tree according to their cost, represented by the distance. I would like a smooth update of the QGraphicsScene containing the edges. I tried with signals and slots:

The worker thread emits two signals: one when a new line must be added to the scene, and one when an old line must be replaced with another.
These signals are caught by the main thread, but since they are emitted very often, the QGraphicsView gets choked with QLines to be added.
Is there an alternative approach to this?

The slots are:

void MainWindow::drawLine (QLine l)
{
    edges.push_back(scene->addLine (l, QPen(Qt::black)));
}

void MainWindow::updateLine (int i, QLine l)
{
    delete edges[i];
    edges.removeAt(i);
    edges.insert (i, scene->addLine (l));
}

These slots are called each time a new line must be added or an existing line must be removed and replaced with a new one.
Even if I do not add the QLine to the QList of QGraphicsLineItem (edges) and simply draw on the scene, after a couple of seconds the GUI becomes unresponsive.

kshegunov · wrote on 4 Jan 2016, 00:29

@alogim
Hello,
Maybe you should try not to add the lines to the scene at each call of the slots. For example you could put the lines on a queue and actually update the scene at regular intervals (a QTimer instance can give you the ticks). I.e.

void MainWindow::drawLine (QLine l)
{
    edges.push_back(scene->addLine (l, QPen(Qt::black)));   // < Don't call addLine here, just queue the change for later
}

void MainWindow::updateLine (int i, QLine l)
{
   // ...
    edges.insert (i, scene->addLine (l));  // < Don't call addLine here, just queue the change for later
}

// ... You could connect this slot to a timer with some update frequency (e.g. 40 fps)
void MainWindow::updateScene()
{
    // For each line you have in the queue now you add them to the scene.
    // Call scene->addLine (l) here
}
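
The queue suggested above could look roughly like the following. This is only a minimal sketch in plain C++, not code from the thread: `Line` is a stand-in for QLine so it builds without Qt, and `PendingChanges`, `queueAdd`, `queueUpdate` and `flush` are hypothetical names. The slots would only record changes, and the timer-driven `updateScene()` would call `flush()` once per tick.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Stand-in for QLine so the sketch builds without Qt (hypothetical type).
struct Line { int x1, y1, x2, y2; };

// Collects changes between two timer ticks. All names are illustrative.
class PendingChanges {
public:
    // Would be called from the drawLine() slot instead of scene->addLine().
    void queueAdd(const Line& l) {
        added_.push_back(l);
        ++total_;
    }

    // Would be called from the updateLine() slot. A later update of the
    // same index overwrites the earlier one, so only the newest is applied.
    void queueUpdate(std::size_t index, const Line& l) {
        assert(index < total_);  // the line must have been added before
        updated_[index] = l;
    }

    // Would be called once per tick. Hands every queued change to the
    // callbacks (additions first, then updates) and empties the queue.
    // Returns the number of changes that were applied.
    template <typename AddFn, typename UpdateFn>
    std::size_t flush(AddFn onAdd, UpdateFn onUpdate) {
        const std::size_t n = added_.size() + updated_.size();
        for (const Line& l : added_) onAdd(l);
        for (const auto& u : updated_) onUpdate(u.first, u.second);
        added_.clear();
        updated_.clear();
        return n;
    }

private:
    std::vector<Line> added_;
    std::map<std::size_t, Line> updated_;
    std::size_t total_ = 0;  // lines added so far, across all ticks
};
```

In a Qt program the two callbacks would presumably wrap `scene->addLine(...)` and the replacement of `edges[i]`. The point of the sketch is that several updates of the same edge within one tick collapse into a single scene change.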

Kind regards.

alogim · 1 May 2016, 13:22

So, instead of adding them immediately to the scene, I added a queue function that buffers the newly generated lines and the ones that need an update. When the buffer grows beyond 50 entries (for example), the worker thread emits a signal passing that buffer to the main thread, which in turn parses it and adds/updates the lines in the scene.
Nevertheless, the scene still gets choked after a few seconds, so I tried a different approach.
I disabled the viewport auto-update, so when I add the lines the view doesn't update immediately. Instead, I periodically call a slot in which I simply do scene->update(). This way everything is much smoother, but with 100,000 lines it gets a little laggy.
I'll post some code if you need it.
@kshegunov : if you want, you can check here a basic example: git clone git@bitbucket.org:MichaelDallago/lasttest.git
Thank you

kshegunov · replied to alogim on 5 Jan 2016, 14:28

@alogim
Sure, I'll try your code this evening after work and try to suggest something. Your idea of restricting the number of scene updates is good, and I think with a bit of tweaking it should work.

Kind regards.

alogim · 5 Jan 2016, 17:58

Thank you @kshegunov. Now I'm dealing with another problem: even if I explicitly set NoViewportUpdate, the scene still gets updated.

kshegunov · 5 Jan 2016, 18:42

@alogim
Hello,
Your code mostly ran fine on my machine. I've taken the liberty of modifying it a bit, and here's what I've come up with:

I've removed the update handler, as it's not really needed. I modified the slot executed on the start button:

void MainWindow::on_startButton_clicked()
{
	updater.start(1000 / 40);
	QObject::connect(&updater, SIGNAL(timeout()), scene, SLOT(update()));

	QThread * thread = new QThread;
	thread->start();

	QObject::connect(thread, SIGNAL(finished()), thread, SLOT(deleteLater()));

	worker = new Worker(W_UPDATE_INTERVAL);
	worker->moveToThread(thread);

	connect(worker, SIGNAL(to_be_processed(QVector<QPoint>)), this, SLOT(process(QVector<QPoint>)));


	QObject::connect(ui->pushButton, SIGNAL(clicked(bool)), thread, SLOT(quit()));
	QObject::connect(ui->pushButton, SIGNAL(clicked(bool)), &updater, SLOT(stop()));
	QObject::connect(thread, SIGNAL(finished()), worker, SLOT(stop()));

	QMetaObject::invokeMethod(worker, "start");
}

I've put a regular timer for the updates; there's no need to do a single shot and reset it every time. I've rearranged some of the connections so they're a bit clearer. I've put two slots in the worker object, start() and stop(). Here are the relevant changes:

class Worker : public QObject
{
    Q_OBJECT
public:
    Q_INVOKABLE void doWork();
signals:
    // ...
public slots:
    void start();
    void stop();
private:
    // ...
    bool active;
};

With implementations as follows:

void Worker::stop()
{
	active = false;

	qDebug() << "Finished in" << timer.elapsed();

	emit finished();
}

void Worker::start()
{
	active = true;

	QMetaObject::invokeMethod(this, "doWork", Qt::QueuedConnection);

	qDebug() << "Started ...";
	timer.start();
}

void Worker::doWork()
{
	if (!active)
		return;

	QPoint p(rand() % 800, rand() % 500);

	bool found = false;
	for (int j = 0; j < store.size() && !found; ++j)
		if (store[j] == p)
			found = true;

	if (!found)
	{
		store.push_back(p);
		queue(p);
	}

	QMetaObject::invokeMethod(this, "doWork", Qt::QueuedConnection);
}

Note that each piece of work is done in the doWork() method and then another event is scheduled at the end. The active variable is just a simple way to control how and when the worker object should perform its work. Then I've modified the queue() method very slightly to account for the lack of timing the execution (which is mostly unnecessary):

void Worker::queue(QPoint p)
{
	buffer.push_back(p);
	if (buffer.size() > 100)
	{
		emit to_be_processed(buffer);
		buffer.clear();
	}
}

I can't really tell if the program is sluggish; it seems to run fine on my machine.
Kind regards.

alogim · 5 Jan 2016, 19:52

@kshegunov Thank you very much for your time and efforts.

Just a couple questions:

Is Q_INVOKABLE necessary? Why do we want to go through the meta-object system? Can't we just call the method directly?
If you let this program run for a minute, you will see the GUI is a little bit slow, but still responsive, that was my problem.
It seems if there are too many objects on a QGraphicsScene, it takes a lot of resources to, for example, resize window/move the view's scrollbar (CPU usage goes up to 15% for a moment and then goes down).

kshegunov · 1 May 2016, 19:55

@alogim
Hello,
Q_INVOKABLE allows you to call a function through QMetaObject::invokeMethod. You could declare it as a slot, but since I'm not really using it as such, I prefer just to declare it invokable. The difference from calling the method directly is in the last parameter passed to invokeMethod: as you can see, Qt::QueuedConnection is used, which posts an appropriate event to the event loop. This means that control returns to the event loop after the current function, and any pending events are processed (these events being signal-slot invocations in this particular case). This way I guarantee that the doWork() method doesn't block the event loop and that the stop() slot will be called at some point. When there's enough data (by some arbitrary standard; I've put 100 points here, but it's not so important) a signal is emitted that is captured in the GUI thread, and the points are added to the scene. The visual updates run at a frequency of 40 per second (the timer in the GUI thread), but you could reduce that, since you're not really after a real-time "rendering" of the state.
The QMetaObject::invokeMethod in the GUI thread is a bit different, there I use it just instead of defining a signal that's going to be emitted only once and then connecting that signal to the start() slot in the worker. It can be done just as easily in the usual approach.
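
To illustrate why re-posting doWork() as a queued call lets stop() get through, here is a toy model in plain C++. It is a sketch under simplifying assumptions, not Qt's implementation: `ToyEventLoop` and `ToyWorker` are hypothetical names, the loop is a plain FIFO of callbacks, and everything runs on a single thread.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Toy FIFO event loop. A stand-in for the thread's event loop, not Qt's API.
class ToyEventLoop {
public:
    void post(std::function<void()> event) {
        events_.push_back(std::move(event));
    }

    // Runs events in the order they were posted until none are left.
    void run() {
        while (!events_.empty()) {
            std::function<void()> event = std::move(events_.front());
            events_.pop_front();
            event();
        }
    }

private:
    std::deque<std::function<void()>> events_;
};

// Mirrors the shape of the Worker above: one unit of work per event.
class ToyWorker {
public:
    explicit ToyWorker(ToyEventLoop& loop) : loop_(loop) {}

    void start() {
        assert(!active_);  // starting twice would schedule doWork() twice
        active_ = true;
        loop_.post([this] { doWork(); });
    }

    void stop() { active_ = false; }

    void doWork() {
        if (!active_)
            return;
        ++stepsDone_;                      // one small piece of work
        loop_.post([this] { doWork(); });  // re-queue, then return to the loop
    }

    bool active() const { return active_; }
    int stepsDone() const { return stepsDone_; }

private:
    ToyEventLoop& loop_;
    bool active_ = false;
    int stepsDone_ = 0;
};
```

Because each doWork() call returns after one step, a stop request posted while the worker is running is reached by the loop right after the current step. If doWork() looped internally instead, that request would sit in the queue until the loop ended.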

On your second point, do note that the longer the program runs, the more points/lines it has stored in the worker's array, meaning that the time to check whether a point exists in that array grows. Additionally, this means that the "display buffer" fills less and less frequently, which gives the visual impression that the GUI is a bit slow.
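
One possible way to keep that membership check from growing with the number of stored points is to pair the array with a hash set. This is a different technique from the linear search in doWork() above, shown here as a sketch in plain C++: `Point` is a stand-in for QPoint, and `PointHash` and `PointStore` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_set>
#include <vector>

// Stand-in for QPoint so the sketch builds without Qt (hypothetical type).
struct Point {
    int x, y;
    bool operator==(const Point& o) const { return x == o.x && y == o.y; }
};

// Hashes a point by packing both coordinates into one 64-bit key.
struct PointHash {
    std::size_t operator()(const Point& p) const {
        const std::uint64_t key =
            (static_cast<std::uint64_t>(static_cast<std::uint32_t>(p.x)) << 32) |
            static_cast<std::uint64_t>(static_cast<std::uint32_t>(p.y));
        return std::hash<std::uint64_t>{}(key);
    }
};

// Keeps points in insertion order and rejects duplicates in expected O(1).
class PointStore {
public:
    // Returns true if p was new and has been stored, false if it was known.
    bool insertIfNew(const Point& p) {
        if (!seen_.insert(p).second)
            return false;
        ordered_.push_back(p);
        assert(ordered_.size() == seen_.size());  // both containers in sync
        return true;
    }

    const std::vector<Point>& points() const { return ordered_; }

private:
    std::unordered_set<Point, PointHash> seen_;
    std::vector<Point> ordered_;
};
```

The vector is kept only in case the insertion order or indices matter to the rest of the program. If they don't, the set alone would do, at the cost of losing the ordering.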

As for the scalability of the QGraphicsScene class I could not tell for sure. I guess that it's normal to expect some heavy usage when resizing/moving the view, since this will cause the whole scene to be repainted (which would not be the case usually).

Kind regards.

alogim · 5 Jan 2016, 20:28

@kshegunov Hmm, I tried removing the part where I check whether the point already exists in the vector and... boom, it literally ate my RAM, and after a while the program crashed. Plus, it was completely blocked. That's probably because there were too many points, but since the computation is detached from the graphics, I don't understand why the GUI (not the scene), in particular the button, gets almost blocked. Each time, the scene only has to add a few points, so the update should affect only the part of the scene that has changed.

kshegunov · 1 May 2016, 20:29

@alogim
Look at that picture: http://postimg.org/image/69v2j2dof/full/
The threads are not really clear, but with a little imagination I think it'll become clear. You have one thread (the worker) that works tirelessly to produce points. When there are 100 points, it gives them to the GUI thread to add to the scene. Now, everything in both threads is queued in an event loop. So at some point your signal from the worker to the GUI thread gets executed and the points are added to the scene. Now consider what the timer does: it generates events that are put into the GUI thread's event loop as well (in addition to the events generated by the worker thread). So what happens is that you're flooding the event loop of the GUI thread with events to add points to the scene, which is mostly fine. Then at some point the event loop reaches an event saying that the scene's update slot should be called, and all the data accumulated in the scene has to be drawn. This takes a lot of time (and more and more as you add more points). While the view is drawing the scene, even more events arrive requesting that additional points be added. When the event loop finally continues, it starts processing these, and there is just not enough time to process the widgets' resize events, the buttons' click events and whatever else you might try to do with the GUI. The painting of the scene, coupled with the number of signal-slot calls adding more points to the scene, eats all the time, and there's none left to process other events. Everything in the GUI runs in the same thread, so the buttons and widgets can't possibly compete with your scene for event processing.

PS.
The plateaus you see in memory usage are (most probably) where the scene's view is being updated and the GUI thread is not currently processing any events to add additional points.
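
One way to avoid flooding the GUI event loop as described above is to let the GUI pull data instead of having the worker push a signal per batch. This is not what the code in this thread does; it is a hedged sketch in plain C++, where `SharedBatch`, `push` and `drain` are hypothetical names and `Point` is a stand-in for QPoint. The worker appends to a mutex-protected buffer, and the GUI's timer slot takes a bounded number of points per tick.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <vector>

// Stand-in for QPoint so the sketch builds without Qt (hypothetical type).
struct Point { int x, y; };

// Buffer shared between the worker (producer) and the GUI (consumer).
class SharedBatch {
public:
    // Producer side: a cheap append. No event is posted to the GUI thread.
    void push(const Point& p) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push_back(p);
    }

    // Consumer side, once per timer tick: removes and returns at most
    // maxItems of the oldest points, so the work per tick stays bounded.
    std::vector<Point> drain(std::size_t maxItems) {
        assert(maxItems > 0);
        std::lock_guard<std::mutex> lock(mutex_);
        const std::size_t n = std::min(maxItems, pending_.size());
        const auto last = pending_.begin() + static_cast<std::ptrdiff_t>(n);
        std::vector<Point> out(pending_.begin(), last);
        pending_.erase(pending_.begin(), last);
        return out;
    }

private:
    std::mutex mutex_;
    std::deque<Point> pending_;
};
```

With this shape the GUI thread handles one timer event per tick regardless of how fast the worker produces points, which leaves room for resize and click events. The buffer itself can still grow without bound if the worker outpaces the GUI, so a real program would also need some limit on the producer side.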

alogim · 5 Jan 2016, 21:17

@kshegunov You are completely right. I wish there were some method to queue the slot calls, because it really seems to me this system isn't well implemented. If a window is laggy with 100,000 static points, how can modern video games run smoothly with millions and billions of triangles moving in three dimensions?
By the way, I read that QGLWidget is much more efficient as far as graphics are concerned.

kshegunov · 1 May 2016, 21:18

@alogim
It really depends on what exactly you're trying to achieve. I have a 4-core i5 processor, which works fine for most intents and purposes, and an average graphics card that has 640 cores, with internal memory that is faster than my RAM and with instructions specifically tailored for calculating those exact triangles you're talking about. Nonetheless, my video card couldn't possibly run the whole computer, nor could it implement SHA cryptography at the hardware level. :)
Additionally (as I understand it), the GPU executes the shaders and the rendering as a whole asynchronously, meaning that you submit a request for something to be computed, give the GPU a place in RAM to put the result, and at some point it does so; meanwhile you continue with your work on the CPU. When it's ready, you read that piece of RAM and voila, you have a rendered scene. This is of course an oversimplification of the issue, but in the end it's not realistic to compare the CPU with the GPU; they're just very different. In my work (I'm a PhD student in nuclear physics) I usually need large amounts of computational power, but GPUs don't work for me because of their very simplistic support of floating-point operations (I need to calculate special functions), so the only way is either to have my applications threaded or, when this is not enough, to use OpenMPI (or some other similar library) and run them on a cluster on many cores at the same time.

You could try to use QOpenGLWidget if you wish, but since I have no idea what exactly you're developing, I couldn't say what the gain from that could possibly be.

Kind regards.