Qt image scaling taking a long time
-
I have written a Qt app with PyQt that runs several processes via multiprocessing. I am trying to minimize the inference time of some ML algorithms running in a separate process from the GUI process.
What I notice is that when I take an image from a camera and display it, my inference times increase significantly. When trying to zero in on which part of the image display was taking the most time, I noticed that QImage.scaled takes 4-8 ms per call. I was hoping there was a faster way to still display the images.
Here is the function I have that displays images:
@pyqtSlot(np.ndarray)
def update_image(self, cv_img):
    def convert_cv_qt(cv_img):
        """Convert from an opencv image to QPixmap"""
        h, w = cv_img.shape
        convert_to_Qt_format = QtGui.QImage(cv_img, w, h, w, QtGui.QImage.Format_Grayscale8)
        p = convert_to_Qt_format.scaled(self.vw.disply_width, self.vw.display_height, Qt.KeepAspectRatio)
        return QPixmap.fromImage(p)

    """Updates the image_label with a new opencv image"""
    qt_img = convert_cv_qt(cv_img)
    self.vw.image_label.setPixmap(qt_img)
This function scales a 200x720 image up to roughly 1000x3000 on a 4K monitor.
I do have some leeway in image quality.
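Since I can tolerate lower quality, one idea I have been considering (but have not benchmarked) is doing the resize in OpenCV with nearest-neighbour interpolation before converting to QImage, so the scaled() call goes away. An untested sketch, reusing the attribute names from my function above:

import cv2

def convert_cv_qt_fast(self, cv_img):
    # Untested sketch of a cheaper conversion: resize the numpy array with
    # nearest-neighbour interpolation (fast, lower quality). Unlike
    # Qt.KeepAspectRatio, this stretches the image to the target size.
    target_w, target_h = self.vw.disply_width, self.vw.display_height
    resized = cv2.resize(cv_img, (target_w, target_h), interpolation=cv2.INTER_NEAREST)
    h, w = resized.shape
    qimg = QtGui.QImage(resized.data, w, h, w, QtGui.QImage.Format_Grayscale8)
    # QPixmap.fromImage copies the data, so 'resized' does not need to outlive it
    return QPixmap.fromImage(qimg)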
-
@kgenbio That is about 3,000,000 interpolations per run. I am not sure how many calls your app makes, but 4-8 ms per run does not look bad.
Possible acceleration tools: OpenMP, GPU.
I am not sure Qt provides either for this; if not, you would have to code it yourself.
-
Hi and welcome to devnet,
If you want to go that fast, then you should move to OpenGL directly for the rendering of your images. If memory serves well, OpenCV also has GPU support, so you might be able to leverage both.
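On the OpenCV side, something along these lines might already help (untested sketch; it only pays off if your OpenCV build has OpenCL support). The transparent API with cv2.UMat lets cv2.resize run on the GPU:

import cv2

def resize_on_gpu(cv_img, target_w, target_h):
    # Untested sketch: upload to a UMat so OpenCV's transparent API can
    # dispatch the resize to OpenCL (GPU) when available, then download
    # the result for display.
    gpu_img = cv2.UMat(cv_img)
    gpu_resized = cv2.resize(gpu_img, (target_w, target_h), interpolation=cv2.INTER_LINEAR)
    return gpu_resized.get()   # back to a numpy array

You can check cv2.ocl.haveOpenCL() to see whether that path is actually available; for images this small the upload/download cost may cancel the gain, so profile it.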
-
To display an image in a Qt app, you need to use QImage/QPixmap, correct? So the idea is that I could do the image manipulation on the GPU and then pass the result into the QImage/QPixmap code I already have?
Is that right?
@kgenbio You don't need that unless you want to use QLabel.
Since you already want to use your GPU to manipulate your image, my suggestion is to stay on the GPU. Qt also supports OpenGL, see QWindow, QOpenGLWidget and their friends. Going back and forth between the GPU and the CPU is a performance killer, so don't do it unless you really need to. Otherwise, build a processing pipeline that avoids those round trips and enjoy the resulting performance.
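As a very rough illustration of the QOpenGLWidget route (untested sketch, the class and method names are just placeholders): let the OpenGL paint engine stretch the image into the widget rectangle instead of calling QImage.scaled on the CPU.

from PyQt5 import QtGui, QtWidgets

class GLImageView(QtWidgets.QOpenGLWidget):
    """Displays a QImage, letting the GL paint engine do the scaling."""
    def __init__(self, parent=None):
        super().__init__(parent)
        self._image = QtGui.QImage()

    def setImage(self, image):
        self._image = image
        self.update()            # schedule a repaint

    def paintGL(self):
        if self._image.isNull():
            return
        painter = QtGui.QPainter(self)
        # Drawing into the widget rect lets the GPU stretch the image;
        # no QImage.scaled call (note: aspect ratio is not kept here).
        painter.drawImage(self.rect(), self._image)
        painter.end()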
-
The functionality that requires precise timing is sending a small image (200x720) to an ML process that does inference on the CPU. CPU inference is better than GPU for this model since it's only a 5-layer NN and the CPU->GPU transfer time is too long. So for my inference time to be as fast as possible, I need a free CPU.
The only reason to display images is so the user of the app can verify that focus is still good and that the object we are flowing still exists. So if the rendering of the images for the user display is slow, or even loses a lot of quality, that is not a big deal.
So it does seem like a good idea to offload the rendering of the image onto the GPU to free up the CPU for inference.
Would you agree?
-
@kgenbio Then you have to optimize your pipeline.
What you currently do is create a new QImage and convert it to a QPixmap for each and every frame you process. This is a waste of CPU. Also, I see you are using numpy arrays, so it seems you have an extra step through OpenCV. So now the question is: what exactly are the mandatory steps you need? Can you use numpy directly to transform the image data? What about keeping a QPixmap at hand and updating its content rather than doing all the conversions?
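As a sketch of what I mean (untested, attribute names taken from your snippet): wrap the numpy buffer in a QImage without copying, reuse one QPixmap, and let the label stretch the pixmap at paint time so scaled() disappears entirely. Whether the stretch at paint time is actually cheaper depends on the paint backend, so profile it.

def update_image(self, cv_img):
    h, w = cv_img.shape
    # The QImage is only a view on the numpy buffer, no copy is made here;
    # the data is copied into the pixmap by convertFromImage below.
    qimg = QtGui.QImage(cv_img.data, w, h, w, QtGui.QImage.Format_Grayscale8)
    if not hasattr(self, "_pixmap"):
        self._pixmap = QtGui.QPixmap()
        # Let the label stretch the pixmap when it paints (set once).
        # Note: this ignores the aspect ratio, unlike Qt.KeepAspectRatio.
        self.vw.image_label.setScaledContents(True)
    self._pixmap.convertFromImage(qimg)      # reuse the same QPixmap object
    self.vw.image_label.setPixmap(self._pixmap)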
-
Just want to start by saying thank you for all this information; you have been so helpful and so quick to respond.
I have a camera callback function that gets externally triggered to take an image. I then put the frame into numpy so I can squeeze an axis off and do (what I think are) fast manipulations. Here is my callback:
""" @details: image handling during trigger mode, callback called on new image available """ def frameReadyCallback(self, hGrabber, pBuffer, framenumber, pData): if self.trigger_mode == False: return Width = ctypes.c_long() Height = ctypes.c_long() BitsPerPixel = ctypes.c_int() colorformat = ctypes.c_int() # Query the image description values self.ic.IC_GetImageDescription(hGrabber, Width, Height, BitsPerPixel, colorformat) # Calculate the buffer size bpp = int(BitsPerPixel.value/8.0) buffer_size = Width.value * Height.value * BitsPerPixel.value if buffer_size > 0: image = ctypes.cast(pBuffer, ctypes.POINTER( ctypes.c_ubyte * buffer_size)) cvMat = np.ndarray(buffer=image.contents, dtype=np.uint8, shape=(Height.value, Width.value, bpp)) cv_img = np.squeeze(cvMat, axis=2) self.image_queue.put(cv_img)
Then I only want to display the image, so it seems there is a ton of room for improvement if I can get away from QImage entirely. I think I just wanted to get something working originally and didn't read too much into what the steps were. I thought you HAD to use a QPixmap to display an image with a QLabel.
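One thing I realise I should double-check in that callback (my assumption, not something I have verified): cvMat is only a view on the grabber's buffer, so if the driver reuses pBuffer after the callback returns, the frame on the queue could be overwritten before the other process reads it. An explicit copy before queueing would rule that out:

        # Explicit copy so the queued frame no longer references the
        # driver-owned buffer (assumption: the grabber may reuse pBuffer).
        cv_img = np.squeeze(cvMat, axis=2).copy()
        self.image_queue.put(cv_img)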