QML/SceneGraph rendering performance on eglfs
-
I'm currently observing strange behavior on Qt-5.0.1 running on a TI AM3517 with the eglfs platform.
With a very simple QML app (a NumberAnimation on the rotation of a 400x400 Rectangle) I obtain a render frequency of 22 Hz (measured using the QQuickWindow::beforeRendering() signal), but the rendering looks very jerky (at an estimated 10 Hz).
@
import QtQuick 2.0

Rectangle {
    id: root

    property int frameCounter: 0
    property int fps: 0

    width: 800
    height: 480

    Rectangle {
        id: image
        color: "green"
        width: 400; height: 400
        anchors.centerIn: parent

        NumberAnimation on rotation {
            from: 0.0; to: 360.0
            duration: 5000
            loops: Animation.Infinite
        }

        onRotationChanged: frameCounter++
    }

    MouseArea {
        anchors.fill: parent
        onClicked: Qt.quit()
    }
}
@
Measuring the call timing of the QSGDefaultRenderer::render() method shows that every third frame takes about 100 ms, whereas all the others finish after ~4 ms:
QSGDefaultRenderer::render period: 3.00449e+07 ms (first measurement invalid)
QSGDefaultRenderer::render period: 154.114 ms
QSGDefaultRenderer::render period: 6.62231 ms
QSGDefaultRenderer::render period: 5.24902 ms
QSGDefaultRenderer::render period: 4.60815 ms
QSGDefaultRenderer::render period: 103.455 ms
QSGDefaultRenderer::render period: 4.60815 ms
QSGDefaultRenderer::render period: 4.18091 ms
QSGDefaultRenderer::render period: 100.555 ms
QSGDefaultRenderer::render period: 4.57764 ms
QSGDefaultRenderer::render period: 4.24194 ms
QSGDefaultRenderer::render period: 100.586 ms
QSGDefaultRenderer::render period: 4.54712 ms
QSGDefaultRenderer::render period: 4.21142 ms
...

Well, on eglfs the threaded renderer is said to be activated by default, so I tried QML_BAD_GUI_RENDER_LOOP=1. The good news is that the animation runs much smoother (a real 43 fps); the bad news is that from time to time (every ~2 s) I still observe a render period of ~100 ms:
QSGDefaultRenderer::render period: 3.00853e+07 ms
QSGDefaultRenderer::render period: 153.351 ms
QSGDefaultRenderer::render period: 20.4163 ms
QSGDefaultRenderer::render period: 12.7869 ms
QSGDefaultRenderer::render period: 16.4795 ms
...
92 times QSGDefaultRenderer::render period: ~15.5 ms
...
QSGDefaultRenderer::render period: 15.1672 ms
QSGDefaultRenderer::render period: 105.499 ms
QSGDefaultRenderer::render period: 17.1509 ms
...
105 times QSGDefaultRenderer::render period: ~15.5 ms
...
QSGDefaultRenderer::render period: 15.3503 ms
QSGDefaultRenderer::render period: 100.342 ms
QSGDefaultRenderer::render period: 17.0593 ms
...
I tracked it further down to the glDrawElements() call in QSGRenderer::draw(), which is responsible for ~96 ms of the ~100 ms mentioned above. This call seems to be blocked somewhere on a very regular basis. Any hints as to what the reason could be? And why does the threaded renderer sometimes achieve a rendering period of ~4 ms? Shouldn't it never be shorter than 16.66 ms?
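For reference, the render period was measured roughly like this (a minimal sketch rather than my exact instrumentation; the qrc path is hypothetical):
@
#include <QGuiApplication>
#include <QQuickView>
#include <QElapsedTimer>
#include <QDebug>

// Print the time between consecutive beforeRendering() signals.
class RenderPeriodProbe : public QObject
{
    Q_OBJECT
public slots:
    void frame()
    {
        if (m_timer.isValid())
            qDebug() << "render period:" << m_timer.nsecsElapsed() / 1e6 << "ms";
        m_timer.restart();
    }
private:
    QElapsedTimer m_timer;
};

int main(int argc, char **argv)
{
    QGuiApplication app(argc, argv);
    QQuickView view;
    RenderPeriodProbe probe;
    // beforeRendering() is emitted on the render thread, so connect directly
    QObject::connect(&view, SIGNAL(beforeRendering()),
                     &probe, SLOT(frame()), Qt::DirectConnection);
    view.setSource(QUrl("qrc:/main.qml"));
    view.show();
    return app.exec();
}

#include "main.moc"
@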
Flobe.
-
Hi,
I can't really help with this, except to point to Gunnar's blogs: http://blog.qt.digia.com/blog/author/gunnar/
One possibility: if something in the animation occasionally requires a whole heap of GL draw calls to be uploaded to the GPU, that could be causing the huge 100 ms spike. You could try using the overlap renderer or something similar to try to optimise this.
Otherwise, just talk to Gunnar (gunnar or sletta) or Samuel Roedal (capisce) on #qt-labs IRC.
Cheers,
Chris.
-
I'm not that familiar with OpenGL rendering, but I assume the reason for the 100 ms spike is an overload of the render queue. As can be seen in the renderer log above, rendering frequently completes in intervals shorter than 16 ms, so it seems to me that the animation timer does not fire at regular intervals.
I have now implemented a custom animation driver that is vsync'ed on /dev/fb0. And indeed, this removes the jerks even with the threaded renderer.
Here is my code; maybe someone can comment on whether it makes sense. VsyncAnimationDriver derives from QAnimationDriver and uses a QThread-derived class to wait for vsync via an ioctl on /dev/fb0:
vsyncanimationdriver.h:
@
#ifndef VSYNCANIMATIONDRIVER_H
#define VSYNCANIMATIONDRIVER_H

#include "vsynctimer.h"

#include <QAnimationDriver>

class VsyncAnimationDriver : public QAnimationDriver
{
    Q_OBJECT

public:
    static void install();

private slots:
    void startTimer();
    void stopTimer();
    void tickSlot();

private:
    VsyncAnimationDriver();

    VsyncTimer m_vsyncTimer;
};

#endif // VSYNCANIMATIONDRIVER_H
@

vsyncanimationdriver.cpp:
@
#include "vsyncanimationdriver.h"VsyncAnimationDriver::VsyncAnimationDriver()
: QAnimationDriver()
{
// Connect to QAnimationDriver signals
connect(this, SIGNAL(started()), this, SLOT(startTimer()));
connect(this, SIGNAL(stopped()), this, SLOT(stopTimer()));// Connect to VsyncTimer::timerEvent signal connect(&m_vsyncTimer, SIGNAL(tickSignal()), this, SLOT(tickSlot()));
}
void VsyncAnimationDriver::startTimer()
{
m_vsyncTimer.start(QThread::HighPriority);
}void VsyncAnimationDriver::stopTimer()
{
m_vsyncTimer.stop();
}void VsyncAnimationDriver::tickSlot()
{
advance();
}void VsyncAnimationDriver::install()
{
static VsyncAnimationDriver driver;
driver.QAnimationDriver::install();
}
@vsynctimer.h:
@
#ifndef VSYNCTIMER_H
#define VSYNCTIMER_H

#include <QThread>

class VsyncTimer : public QThread
{
    Q_OBJECT

public:
    explicit VsyncTimer(QObject *parent = 0);
    ~VsyncTimer();

    void start(Priority priority);
    void stop();

signals:
    void tickSignal();

private:
    virtual void run();

    bool m_stopFlag;
    int m_fbdev;
};

#endif // VSYNCTIMER_H
@

vsynctimer.cpp:
@
#include "vsynctimer.h"#include <QDebug>
#include <linux/fb.h>
#include <sys/ioctl.h>
#include <fcntl.h>VsyncTimer::VsyncTimer(QObject *parent)
: QThread(parent),
m_stopFlag(true),
m_fbdev(-1)
{
const char *framebufferDevice = "/dev/fb0";
m_fbdev = open(framebufferDevice, O_RDWR);
if (m_fbdev < 0)
qWarning() << "VsyncTimer: Framebuffer device" << framebufferDevice << "could not be opened";
}VsyncTimer::~VsyncTimer()
{
if (m_fbdev >= 0)
close(m_fbdev);
}void VsyncTimer::start(QThread::Priority priority)
{
m_stopFlag = false;
QThread::start(priority);
}void VsyncTimer::run()
{
while (!m_stopFlag && m_fbdev >= 0) {
// Wait for vsync pulse
int arg = 0;
ioctl(m_fbdev, FBIO_WAITFORVSYNC, &arg);
emit tickSignal();
}
}void VsyncTimer::stop()
{
m_stopFlag = true;
wait();
}
@ -
The animation driver based on the vsync ioctl makes a lot of sense when the ioctl is present. I could even accept it into the customcontext in ssh://codereview.qt-project.org:29418/playground/scenegraph.git :)
However, there is one thing you will want to fix. Right now you emit a queued signal from the vsync thread to the GUI thread to tick the animation. Since this event gets queued in the GUI thread, you don't know exactly when it will be processed: it might be 10 ms after the last animation tick, or it might be 20 ms. What you want to do instead is use the vsync ioctl to figure out the vsync delta and send that to the animation driver in the GUI thread once. Then, in the GUI thread, you advance the animation by this fixed increment for every frame, using the overload of QAnimationDriver::advance() that takes a quint64. The result is that the animation is advanced to the exact time at which the frame shows up on screen, instead of the time at which it was calculated.
Then there is a small quirk: the application can skip frames because something took longer than 16 ms to render. In that case you want the animation driver to catch up to the skipped time.
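Roughly like this (an untested sketch; the class name and the skipFrames() helper are made up, and I use the protected QAnimationDriver::advanceAnimation(qint64) to advance to an explicit animation time):
@
#include <QAnimationDriver>

// Sketch: advance by a fixed, externally measured vsync delta
// instead of by wall-clock time.
class FixedStepDriver : public QAnimationDriver
{
public:
    FixedStepDriver(qint64 vsyncDeltaMs, QObject *parent = 0)
        : QAnimationDriver(parent), m_delta(vsyncDeltaMs), m_time(0) { }

    // Called once per frame by the render loop.
    void advance()
    {
        m_time += m_delta;
        advanceAnimation(m_time); // advance to the time the frame appears on screen
    }

    // Call this when frames were skipped so the animation catches up.
    void skipFrames(int count) { m_time += count * m_delta; }

    qint64 elapsed() const { return m_time; }

private:
    qint64 m_delta;
    qint64 m_time;
};
@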
The swaplistenanimationdriver in the playground/scenegraph repository does this, but it relies on swapBuffers rather than the vsync ioctl and therefore needs some extra logic to find a stable vsync delta.
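With the ioctl available, finding the delta is much simpler; it could be measured once at startup, roughly like this (hypothetical helper, untested):
@
#include <QElapsedTimer>

#include <linux/fb.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <unistd.h>

// Time a number of FBIO_WAITFORVSYNC calls and return the average
// frame duration in milliseconds.
qint64 measureVsyncDeltaMs(const char *device = "/dev/fb0", int frames = 60)
{
    int fd = open(device, O_RDWR);
    if (fd < 0)
        return 16; // fall back to ~60 Hz

    int arg = 0;
    ioctl(fd, FBIO_WAITFORVSYNC, &arg); // align with the next vsync pulse

    QElapsedTimer timer;
    timer.start();
    for (int i = 0; i < frames; ++i)
        ioctl(fd, FBIO_WAITFORVSYNC, &arg);

    close(fd);
    return timer.elapsed() / frames;
}
@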
-
I don't really understand what the animation drivers in the playground repo are doing. Where is the QAnimationDriver class they derive from? Is it the same as in Qt-5.1? If so, I don't understand where they get their timer events from.
The other question I had is why we would need to catch up after skipped frames at all. If we miss a frame deadline, couldn't we simply skip it and calculate the scene for the next one?
The great challenge in preventing skipped frames altogether is knowing, at the time of the animation tick, how long the rendering will take. That requires nothing less than a crystal ball in software... ;)
-
I tried to explain the general concepts of the animation driver here: http://blog.qt.digia.com/blog/2012/08/01/scene-graph-adaptation-layer/
-
Is there any reason why QSGThreadedRenderLoop installs its own animation driver, replacing the default one (QDefaultAnimationDriver)? And if so, why does it replace the default Qt::PreciseTimer with a Qt::CoarseTimer?
I observe what seems to me to be an OpenGL queue overload (see my original post) on a TI SoC with the QSGThreadedRenderLoop animation driver, both with and without QML_FIXED_ANIMATION_STEP set (setting this env var yields slightly better performance, but there are still regular jerks).
With a simple custom animation driver based on a 16 ms QBasicTimer (Qt::PreciseTimer) I get no jerks at all.
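For reference, that test driver looks roughly like this (a minimal sketch; the class name is my own):
@
#include <QAnimationDriver>
#include <QBasicTimer>
#include <QTimerEvent>

// Minimal sketch: tick animations from a 16 ms Qt::PreciseTimer.
class PreciseTimerAnimationDriver : public QAnimationDriver
{
public:
    explicit PreciseTimerAnimationDriver(QObject *parent = 0)
        : QAnimationDriver(parent) { }

protected:
    void start()
    {
        QAnimationDriver::start();
        m_timer.start(16, Qt::PreciseTimer, this);
    }

    void stop()
    {
        m_timer.stop();
        QAnimationDriver::stop();
    }

    void timerEvent(QTimerEvent *e)
    {
        if (e->timerId() == m_timer.timerId())
            advance(); // tick the animations once per timer event
    }

private:
    QBasicTimer m_timer;
};
@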
-
The threaded render loop replaces the default driver so it can tick once, and exactly once, per frame. It doesn't use timers for animations at all. What do you mean by replacing Qt::PreciseTimer with Qt::CoarseTimer? There is no reference to CoarseTimer in qsgthreadedrenderloop.cpp.
Is this system properly vsynced? If it is, then fixed animations + the threaded/windows render loop will give the smoothest visuals.