Drawing camera frames produced by CUDA (as OpenGL textures) with overlay.

alexmarch

I have a simple application that captures camera frames (GStreamer), converts them to RGBA, and runs a face detection network, all on the GPU. The output is a rectangle with the face location. I also have a GPU pointer to the image frame being processed, and I can copy it back to the CPU if needed. Additionally, I have code that uses the GPU frame to create OpenGL textures and draw them in an X window.

My goals are:

Efficiently draw frames in QML from GPU perhaps using OpenGL textures. I don't need to modify the image in any way or record the video.
Receive a signal when the face detector is done and use QML to draw the rectangle, not directly on the image, but overlaying other QML items.

This would help our designer to focus on the UI while I could continue working on the meat of the application and detection algos.

I'm a little confused with the variety of options I found so far:

Implement a VideoSource and provide a QAbstractVideoSurface to draw the frames, and add my face detector by implementing QAbstractVideoFilter. This seems great, although I'm not sure how to chain the filters or whether this method can use OpenGL (https://forum.qt.io/post/560035)
Try to follow the article about the Jetson TK1 (I'm using a Jetson Nano), but how do I make this work with QML? (https://www.qt.io/blog/2015/03/03/qt-weekly-28-qt-and-cuda-on-the-jetson-tk1)
Use QtQuick Scene Graph, but unsure if I need to use QOpenGLFramebufferObject, although I can clearly overlay other QML items on top (https://doc.qt.io/qt-5/qtquick-scenegraph-openglunderqml-example.html)
Just use the VideoFilters with OpenGL interop and let the default Camera/VideoOutput handle the drawing, but I assume that's not gonna use the GPU and will still copy frames to and fro.

I don't mind implementing any combination as long as it's easy to split the backend and UI between me and the designer. Any suggestions would be much appreciated, I hope I'm not repeating the question that was already answered eons ago :)

cirquit

I have a slightly similar problem and I can at least provide you with my experiences.

I also wanted GPU-accelerated video analysis with an overlay. In my case it was not performance critical, as the videos were already recorded, but they are encoded in a RAW format, so file sizes can be around 200 GB.

I settled for using OpenCV and the cv::VideoCapture to read the video files into a cv::Mat, doing the analysis on the cv::Mat, converting it into a QImage and drawing onto it via QPainter painter; painter.begin(&qml_image). The visualization on the QML side is done via a QQuickPaintedItem (called viewer in my case) which implements a Q_INVOCABLE setImage(QImage image) function, which is called by the previous module:

Connections {
    target: exporter
    onImageReady: { gc(); viewer.setImage(image) }
}

So basically Video -> cv::Mat -> QImage -> QPainter (draws on QImage) -> QQuickPaintedItem (visualizes QImage) in QML

I did not get OpenGL working, as I couldn't get the visualization (e.g. QQuickPaintedItem) to run in QML with GPU acceleration. I also don't have any experience with GPU acceleration, so I didn't look into it much because my problem is not real-time dependent.

The only question of yours this solves is that you can receive a signal from the face detector when it's done and use QML to draw whatever you like, either directly on the image or on a copy of it.

I just now found this link which may be of interest to you if you read the frames with OpenCV to get them into the GPU memory.

If you have any questions to get my proposed pipeline running I can provide you with the working snippets, or you can check out my project by yourself.

If you are running Darknet on the Jetson: I had some experience with it on the TX1/TX2, and we used OpenCV to read the video, encode it manually into the darknet image format, and then convert it into anything we needed (e.g. QImage, .png) after the processing.

rrlopez

Hi @alexmarch, from what I understand of your post, you're displaying the result of a detection network through GStreamer. My company created an open source project called GstInference that may help you improve efficiency, since it's a GPU-based application. Here is an example of an overlay element that we use to display the detection result using OpenCV.

We also provide a GStreamer plug-in named QT Overlay that can overlay QT elements using QML with OpenGL. This element is not open source but you can request an evaluation binary at support@ridgerun.com.

Also, if you need assistance on your application development, we provide PO hours.
We have experience with GStreamer, QT and Tegra platforms.

alexmarch

@cirquit Thank you for your suggestions. As you mention, not having a real-time requirement does make the implementation a bit smoother. I also initially used cv::Mat as a frame container. This worked fine up until the point where the number of neural networks running on the Jetson Nano went over 3 or 4 :)

The input to neural nets is a CUDA float4*, or float** which is held in the GPU memory as long as possible to delay memcpy. I don't really need to draw over the frames during inference, simply outputting bounding box QRect and drawing items in QML is enough.

I currently have the following, using QAbstractVideoSurface set from the QML VideoOutput
GStreamer Camera -> float4* RGBA to RGB888 QImage -> emit frameReady(QImage) -> surface->present(QVideoFrame(QImage))

This works okay, but the VideoSurface with a QVideoFrame still copies the image bits from GPU to CPU. I've also had some problems displaying RGBA images directly, hence the RGB888 conversion. Ideally I want to just display the RGBA image as an OpenGL texture; at least that should be possible in my understanding.
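For reference, the float4 RGBA to RGB888 step in the pipeline above can be sketched on the CPU as follows. This is a minimal illustration, not the actual application code: `Float4` is a hypothetical stand-in for CUDA's `float4`, `rgbaFloatToRgb888` is an illustrative name, and channel values are assumed to be in the 0..255 range (a 0..1 range would need a multiply by 255 first). In a real pipeline the same per-pixel logic would more likely run in a CUDA kernel before the copy to host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for CUDA's float4 so the sketch builds without CUDA headers.
struct Float4 { float x, y, z, w; };

// Round one float channel to 8 bits, assuming a 0..255 input range.
// Values outside the range are clamped; NaN maps to 0.
inline std::uint8_t toByte(float v) {
    if (!(v > 0.0f)) return 0;
    if (v >= 255.0f) return 255;
    return static_cast<std::uint8_t>(v + 0.5f);
}

// Pack `count` RGBA float pixels into tightly packed RGB888, dropping alpha.
std::vector<std::uint8_t> rgbaFloatToRgb888(const Float4* src, std::size_t count) {
    assert(src != nullptr || count == 0);
    std::vector<std::uint8_t> dst;
    dst.reserve(count * 3);
    for (std::size_t i = 0; i < count; ++i) {
        dst.push_back(toByte(src[i].x));  // R
        dst.push_back(toByte(src[i].y));  // G
        dst.push_back(toByte(src[i].z));  // B
    }
    return dst;
}
```

The resulting buffer has 3 bytes per pixel and no row padding, so wrapping it in a QImage with Format_RGB888 would need the constructor overload that takes bytesPerLine (width * 3 here).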

This is the QML that I have now:

ApplicationWindow {
    MyApp.Capture {
        id: capture
        device: 0
        size: Qt.size(640, 480)
    }

    MyApp.CaptureOutput {
        id: output
        source: capture
    }

    MyApp.FaceDetector {
        source: capture
        onDetectionsChanged: {
            // access to QQmlListProperty<QRect> with detected faces 
            // these are then forwarded to different visualizers written in QML
        }
    }
}

GStreamer is wrapped in the MyApp.Capture QObject, MyApp.CaptureOutput implements setVideoSurface (as per https://doc.qt.io/archives/qt-5.13/videooverview.html#working-with-low-level-video-frames), and finally MyApp.FaceDetector also receives the QImages sent by the Capture object, does the inference, and outputs a list of results.
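Since the detections are reported in frame pixels while the QML visualizers live in item coordinates, the rectangles need to be rescaled before they are drawn as overlay items. A minimal sketch of that mapping, assuming the output item letterboxes the frame (scaled to fit, aspect ratio preserved, centered); `RectF` and `mapToItem` are illustrative names and not part of the application above:

```cpp
#include <algorithm>
#include <cassert>

// Plain rectangle type for the sketch; in the application this would be a QRect/QRectF.
struct RectF { double x, y, w, h; };

// Map a rectangle from frame pixel coordinates into the coordinates of an item
// that shows the frame scaled to fit, aspect ratio preserved, and centered.
RectF mapToItem(const RectF& r, double frameW, double frameH,
                double itemW, double itemH) {
    assert(frameW > 0.0 && frameH > 0.0);
    // Uniform scale that makes the whole frame fit inside the item.
    const double scale = std::min(itemW / frameW, itemH / frameH);
    // Offsets of the scaled frame inside the item (the letterbox bars).
    const double offX = (itemW - frameW * scale) / 2.0;
    const double offY = (itemH - frameH * scale) / 2.0;
    return { offX + r.x * scale, offY + r.y * scale, r.w * scale, r.h * scale };
}
```

For example, a 640x480 frame shown in a 1280x720 item is scaled by 1.5 and shifted 160 pixels to the right, so a face at (100, 50) with size 200x100 lands at (310, 75) with size 300x150.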

alexmarch

Hello @rrlopez,

I'm somewhat familiar with RidgeRun products from your dev pages. This does indeed seem like what I am trying to implement in our product. We use both Caffe and TensorFlow engines with TensorRT, with pre/post processing implemented in C++/CUDA. Adopting GStreamer is something that I'm highly interested in, as well as ease of integration with QML. My main question would be: what are the advantages over the DeepStream framework? This is in the context of NVIDIA Tegra platforms only; we're not planning to target others for now. I must admit that NVIDIA sample code is frequently a mess and support is not the most reliable.

Imran B

Can you share the code that uses the GPU frame to create OpenGL textures? I'm working on a Qt application that needs to show GpuMat images, but I can't feed GpuMat images to it directly. For now I'm downloading the GpuMat images and feeding them in as numpy arrays. If you could share the code that turns a GpuMat frame into an OpenGL texture, it would be very useful. Thanks in advance!

joshwiden

I'm trying to solve similar problems in an application:

Display realtime video that is already present in GPU memory (in my case, I have a cv::cuda::GpuMat) without having to copy the data back to host memory first
Be able to draw overlays on top of the video as desired using QML

@alexmarch - Did you ever figure out a good solution for this? I'd be much appreciative if you have any pointers! 😀