Rendering video using Qt and OpenGL

vochinin · wrote on 31 Aug 2011, 16:09

Hi, everyone. I decided to post this because I spent a lot of time looking for something like it. Someone may find this information useful, and someone may be able to advise me on how to improve my program (I am new to both OpenGL and Qt).

The aim was to build a GUI for several video cameras that communicate with a desktop computer via Gigabit Ethernet. My problem was finding a way to receive and render 3 streams (25 fps) of video frames with resolutions of 1600x1200, 640x480 and 512x512. The target OS was Windows and the target desktop computer had rather powerful hardware. Maybe the best way would have been to use DirectShow for rendering, but I found it rather difficult, so I implemented my GUI in Qt using OpenGL.
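To get a feel for the load, the raw data rate of the three streams can be estimated. This is only a back-of-the-envelope sketch: it assumes 2 bytes per pixel (matching the GL_UNSIGNED_SHORT luminance format used for drawing in the code in this post), and the names StreamSpec, bytesPerSecond and totalBytesPerSecond are made up for illustration. Whether the cameras really send 16 bits per pixel over the wire is not stated.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one camera stream (names are illustrative).
struct StreamSpec {
    std::uint32_t width;
    std::uint32_t height;
    std::uint32_t bytesPerPixel; // assumption: 2, i.e. 16-bit luminance
    std::uint32_t fps;
};

// Raw payload rate of one stream in bytes per second (no protocol overhead).
std::uint64_t bytesPerSecond(const StreamSpec& s)
{
    assert(s.bytesPerPixel > 0);
    return static_cast<std::uint64_t>(s.width) * s.height * s.bytesPerPixel * s.fps;
}

// Sum over the three streams mentioned in the post.
std::uint64_t totalBytesPerSecond()
{
    const StreamSpec streams[] = {
        {1600, 1200, 2, 25},
        { 640,  480, 2, 25},
        { 512,  512, 2, 25},
    };
    std::uint64_t total = 0;
    for (const StreamSpec& s : streams)
        total += bytesPerSecond(s);
    return total;
}
```

Under these assumptions the total comes to roughly 124 MB/s, which is close to the 125 MB/s a Gigabit link can carry at most, so there would be little headroom whenever the receiving side stalls.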

The program has 2 threads:
1. a thread for receiving images from the network
2. the main thread, which handles the GUI and renders the video.
The main window is a form fully covered by a QGLWidget. A timer fires 25 times a second and draws the last received frames on the screen. The images are rendered in this way

@glPixelZoom( Xzoom, -Yzoom ); //zooming
glRasterPos2d(Xpos,WidgetHeight - Ypos);//screen position
glDrawPixels(Width,Height,GL_LUMINANCE, GL_UNSIGNED_SHORT, image_buffer); //draw image
glFlush();@
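The hand-off between the receiving thread and the GUI timer is not shown in the post. A minimal sketch of one way to do it, assuming standard C++ threading primitives rather than Qt's and a hypothetical class name LatestFrameSlot, is a single-slot mailbox that always holds the newest complete frame:

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical single-slot mailbox: the network thread publishes frames,
// the GUI timer takes whatever is newest. Older unread frames are dropped.
class LatestFrameSlot {
public:
    // Called from the receiving thread once a full frame has arrived.
    void publish(std::vector<std::uint16_t> frame)
    {
        assert(!frame.empty());
        std::lock_guard<std::mutex> lock(mutex_);
        frame_ = std::move(frame);
        fresh_ = true;
    }

    // Called from the GUI timer. Returns true and swaps the newest frame
    // into 'out' if one arrived since the last call; otherwise leaves 'out'
    // untouched so the previous image keeps being shown.
    bool takeLatest(std::vector<std::uint16_t>& out)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!fresh_)
            return false;
        out.swap(frame_);
        fresh_ = false;
        return true;
    }

private:
    std::mutex mutex_;
    std::vector<std::uint16_t> frame_;
    bool fresh_ = false;
};
```

The lock is held only for a move or a swap, so the receiving thread is never blocked for the duration of a draw call.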

As expected, this way of rendering turned out to be hardware dependent, though the target computer was powerful enough to render 2 video streams (512x512 and 640x480) at 25 fps. I didn't test it with the high-resolution camera as it isn't ready yet. I found that frames had random black lines when rendering at full speed (25 fps), which did not appear when I set the timer period to 500 ms (2 fps). I think the reason is that the glDrawPixels call blocks the CPU for some time and causes packet loss. Frame transfer over the network never stops, and the CPU delays probably cause the incoming buffer of the network adapter to overflow. Then I set the thread priorities as follows
@TransferThread->setPriority(QThread::TimeCriticalPriority); // network receiving thread
this->thread()->setPriority(QThread::LowestPriority); // GUI thread@

This significantly decreased the number of gaps in frames caused by packet loss, but still almost every fifth frame had missing lines.
Searching the internet, I found that a PBO can be used to keep the CPU from waiting while data is copied to video memory.
As explained in this "article":http://www.songho.ca/opengl/gl_pbo.html, using Pixel Buffer Objects avoids wasting CPU cycles on copying data to video card memory.
After applying the described technique, rendering is performed in this way:

First, two buffer objects are created for every video stream to be rendered:
@glGenBuffersARB(2, pboIds);
glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, pboIds[0]);
glBufferDataARB(GL_PIXEL_UNPACK_BUFFER_ARB, frame_size, 0, GL_STREAM_DRAW_ARB);
glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, pboIds[1]);
glBufferDataARB(GL_PIXEL_UNPACK_BUFFER_ARB, frame_size, 0, GL_STREAM_DRAW_ARB);@
Every frame is then rendered in this way:

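@// draw from the buffer that was filled earlier
glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, pboIds[index]);
glPixelZoom(zoomX, -zoomY);
glRasterPos2d(Xpos, WidgetHeight - Ypos);
glDrawPixels(width, height, GL_LUMINANCE, GL_UNSIGNED_SHORT, 0); // 0 = offset into the bound PBO
// upload the newest frame into the other buffer
glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, pboIds[nextIndex]);
// re-specify the storage so the driver need not wait for a pending draw
glBufferDataARB(GL_PIXEL_UNPACK_BUFFER_ARB, frame_size, 0, GL_STREAM_DRAW_ARB);
GLubyte* ptr = (GLubyte*)glMapBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, GL_WRITE_ONLY_ARB);
if (ptr)
{
    // update data directly on the mapped buffer
    memcpy(ptr, Outbuffer, frame_size);
    glUnmapBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB); // release pointer to mapping buffer
}
// unbind so that later pixel calls read from client memory again
glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, 0);
glFlush();@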
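The snippet above does not show how index and nextIndex change between frames. A CPU-only sketch of the usual ping-pong scheme, assuming the two indices simply alternate on every timer tick, could look like this. No OpenGL is involved: plain vectors stand in for the two PBOs, and PingPongBuffers and tick are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// CPU-only stand-in for the two pixel buffer objects (illustrative names).
class PingPongBuffers {
public:
    explicit PingPongBuffers(std::size_t frameSize)
    {
        assert(frameSize > 0);
        buffers_[0].assign(frameSize, 0);
        buffers_[1].assign(frameSize, 0);
    }

    // One timer tick: "draw" from buffers_[index] (here: return a pointer to
    // it) and copy the new frame into buffers_[nextIndex], then swap roles.
    const std::uint8_t* tick(const std::uint8_t* newFrame)
    {
        const std::size_t index = index_;
        const std::size_t nextIndex = (index_ + 1) % 2;
        std::memcpy(buffers_[nextIndex].data(), newFrame, buffers_[nextIndex].size());
        index_ = nextIndex;            // the next tick draws what was just uploaded
        return buffers_[index].data(); // what this tick draws
    }

private:
    std::vector<std::uint8_t> buffers_[2];
    std::size_t index_ = 0;
};
```

With this scheme the picture on screen is always one tick behind the newest frame, which is the price for letting the upload of one buffer overlap with drawing from the other.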

This code works, but for some reason it gave no significant improvement in my case.
NOTE: I needed to download and link the glew32 library, as LinusA "advised":http://developer.qt.nokia.com/forums/viewthread/6954/#47930, in order to use the PBO functions, as they are not available by default.

Currently I hide the gaps in video frames by filling them with the content of the previous frame, though that is not a good solution. Since the screen refreshes 25 times a second, the user normally doesn't notice anything wrong, but he could if an object moves rapidly in front of the camera lens. I tried rendering textures (glTexSubImage2D) instead of pixel arrays (glDrawPixels), but that made no difference either. I think the main problem is the time needed to copy image data from system memory to video card memory, not the way it is rendered afterwards. If someone has any idea how I could improve my program's performance, I would be very pleased. I would also be glad if my experience is helpful to anyone. You are welcome to ask me questions.
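Filling gaps from the previous frame can be sketched as follows. How missing lines are detected is not described in the post, so the per-row flag vector rowReceived is an assumption, as are the function name fillMissingRows and the 16-bit pixel type.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fills rows of 'current' that never arrived with the same rows of 'previous'.
// 'rowReceived[r]' is a hypothetical per-row flag set by the receiving code.
// Returns the number of rows that were patched.
std::size_t fillMissingRows(std::vector<std::uint16_t>& current,
                            const std::vector<std::uint16_t>& previous,
                            const std::vector<bool>& rowReceived,
                            std::size_t width)
{
    assert(width > 0);
    assert(current.size() == previous.size());
    assert(current.size() == rowReceived.size() * width);

    std::size_t patched = 0;
    for (std::size_t row = 0; row < rowReceived.size(); ++row) {
        if (rowReceived[row])
            continue; // this row arrived intact
        const std::size_t offset = row * width;
        std::copy(previous.begin() + offset,
                  previous.begin() + offset + width,
                  current.begin() + offset);
        ++patched;
    }
    return patched;
}
```

The returned count makes it easy to log how many lines were lost per frame, which helps when judging whether a change on the network side actually reduced the gaps.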

e_dauth · wrote on 15 Jun 2012, 17:44

Hi, just came across this thread.
Any chance you can post a complete sample?
Thanks!

vochinin · wrote on 15 Jun 2012, 21:57

[quote author="e_dauth" date="1339782243"]Hi, just came across this thread.
Any chance you can post a complete sample?
Thanks![/quote]

Hi. The complete project is too complicated to understand. But some time ago I made a sample to give an idea of how I render video. I think I can send it to your email, as it would be hard to post.

By the way, since the way the video data is transferred has changed, I no longer have problems with these "gaps".

karlox · wrote on 11 Jan 2013, 18:31

Very interesting. Does anyone have other links about OpenGL and Qt 5?

INeedMySpace · wrote on 12 Jan 2013, 02:00

Hi! When I decided to output video frames through OpenGL I used a slightly different solution.
First, I chose the programmable OpenGL pipeline. My first try was drawing two triangles (a quad with a point in each window corner) and applying a texture (from the video frame data) to it. The corners are the same in every frame, so you only need to bindTexture from the image data. In my case bindTexture took too long (from 17 to 40 msec for an HD-size image). Maybe some speed improvements are possible, but I put this variant aside and used the following solution. I generate two VBOs. The first one is for vertices and holds all vertex coordinates line by line; it stays the same for every frame (as long as the frame is not resized). The second one holds the color of each pixel, which is the same thing as the image itself (supposing pixels are coded as RGB24 or RGB32). So every frame we write the image data to the second VBO, and all data from those buffers goes through two basic vertex and fragment shaders.
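As an illustration of the first VBO, here is a small sketch that generates one (x, y) pair per pixel, line by line. It assumes one vertex per pixel in pixel units (for example drawn as points, which the post does not state), and the function name buildPixelGrid is made up. The coordinates would still need to be mapped to clip space, for instance by a matrix in the vertex shader.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds (x, y) pairs for every pixel of a width x height frame, row by row,
// in pixel units. The result would be uploaded once into the position VBO
// and reused until the frame size changes.
std::vector<float> buildPixelGrid(std::size_t width, std::size_t height)
{
    assert(width > 0 && height > 0);
    std::vector<float> positions;
    positions.reserve(width * height * 2);
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            positions.push_back(static_cast<float>(x));
            positions.push_back(static_cast<float>(y));
        }
    }
    return positions;
}
```

Because the vertex order matches the row-by-row order of the pixels in the image, the image buffer can be written to the color VBO as it is, without any reordering.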
With this approach I got frame times as low as 17 ms but no lower, and it was driving me mad until I remembered a post from this site. If you use double buffering, your frames are updated only as fast as your monitor refresh rate, which in my case is 60 Hz. So we get 16.6... ms per frame. To test this I switched off vertical sync in my NVidia settings panel. Currently it is not possible to turn off vsync through QSurfaceFormat (which is what I use; QGLFormat, by contrast, has a setSwapInterval method). With vsync off, the frame time dropped to 3-4 msec.
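The arithmetic behind the 16.6 ms floor can be written down as a tiny model. It is a simplification that assumes a buffer swap always waits for the next display refresh when vsync is on; the function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Duration of one display refresh in milliseconds.
double refreshPeriodMs(double refreshHz)
{
    assert(refreshHz > 0.0);
    return 1000.0 / refreshHz;
}

// Simplified model: with vsync on, the observed frame time is the render
// time rounded up to a whole number of refresh periods (at least one).
double observedFrameTimeMs(double renderMs, double refreshHz)
{
    assert(renderMs >= 0.0);
    const double period = refreshPeriodMs(refreshHz);
    const double periods = std::fmax(1.0, std::ceil(renderMs / period));
    return periods * period;
}
```

In this model a frame that renders in 3.5 ms on a 60 Hz display is still reported as about 16.7 ms, while one that takes 20 ms jumps to about 33.3 ms.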
I believe my experience will be helpful to you.

INeedMySpace · wrote on 12 Jan 2013, 02:23

For example, you can use the following shader programs. The first is the vertex shader and the second is the fragment shader. Each vertex has two values (x, y) as coordinates and three values (r, g, b) as color.

@#version 410 core
in vec2 inPosition;
in vec3 inColor;
uniform mat4 inMatrix;

out vec3 varColor;

void main(void)
{
    varColor = inColor;
    gl_Position = inMatrix * vec4(inPosition, 0.0, 1.0);
}@

@#version 410 core
in vec3 varColor;

out vec4 fragColor;

void main(void)
{
    fragColor = vec4(varColor, 1.0);
}@
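The post does not say what inMatrix contains. One plausible choice, and purely an assumption here, is an orthographic matrix that maps pixel coordinates (origin in the top-left corner) to clip space. The sketch below builds such a matrix in column-major order and applies it on the CPU in the same way the vertex shader would; Mat4, pixelToClipMatrix and transformPoint are illustrative names.

```cpp
#include <array>
#include <cassert>

// Column-major 4x4 matrix, the layout glUniformMatrix4fv expects when its
// 'transpose' argument is GL_FALSE.
using Mat4 = std::array<float, 16>;

// Maps x in [0, width] to [-1, 1] and y in [0, height] to [1, -1],
// so that image row 0 ends up at the top of the viewport.
Mat4 pixelToClipMatrix(float width, float height)
{
    assert(width > 0.0f && height > 0.0f);
    Mat4 m{};                 // all zeros
    m[0]  =  2.0f / width;    // scale x
    m[5]  = -2.0f / height;   // scale y, flipped
    m[10] =  1.0f;
    m[12] = -1.0f;            // translate x
    m[13] =  1.0f;            // translate y
    m[15] =  1.0f;
    return m;
}

// Applies m to (x, y, 0, 1) as the vertex shader does; returns clip x and y.
std::array<float, 2> transformPoint(const Mat4& m, float x, float y)
{
    return std::array<float, 2>{{ m[0] * x + m[4] * y + m[12],
                                  m[1] * x + m[5] * y + m[13] }};
}
```

The matrix only has to be rebuilt and uploaded when the frame size changes, which fits the idea of a vertex buffer that stays constant between frames.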