2200% CPU Usage from processed camera feed implementation
-
[QML] Poor FPS and 15–20s camera control delay when displaying ROS2-processed feed (YOLOv8 + LiDAR) in QML – Qt 5.15
Hi everyone,
I'm working on a student project where I'm redesigning an existing robot GUI from Qt Widgets to QML. I'm fairly new to QML, Qt, and ROS2, so bear with me.
Setup:
- Qt 5.15.13 (fixed version to match the existing system)
- ROS2 for inter-process communication
- A classification node (camera_classified.py) that fuses LiDAR data and YOLOv8 to produce a processed camera feed, published on the ROS topic camera/classified (bounding boxes around targets)
- The raw feed comes in on camera/raw
The problem:
My new QML-based GUI suffers from noticeably lower FPS than the previous Widgets-based GUI, and camera control inputs have a 15–20 second delay. Both systems share the exact same backend (camera_classified.py), so the issue must be in my QML implementation.
Note: the high CPU usage (~2200%) from the classification node is expected and present in both the old and new system, so that is not the issue. We have no GPU on our dev PCs, but the camera feed in the old GUI runs at ≈15 FPS and its delay is under 2 seconds.
My question:
Has anyone integrated a ROS2 image topic into QML, especially where the image source involves heavy ML preprocessing? I'm looking for advice on:
- Best practices for bridging ROS2 image messages to a QML image provider
- Whether QQuickImageProvider or a custom QAbstractVideoSurface is the right approach at Qt 5.15
- How to avoid bottlenecks that could cause frame drops or input lag (e.g. unnecessary frame copies, blocking the UI thread, missing throttling)
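For reference, my current mental model of the image-provider route is something like this (untested sketch; the class name, method names, and provider id are placeholders, not our actual code):

```cpp
#include <QQuickImageProvider>
#include <QImage>
#include <QMutex>
#include <QMutexLocker>

// Hypothetical provider that serves the most recent ROS frame to QML.
class RosImageProvider : public QQuickImageProvider
{
public:
    RosImageProvider() : QQuickImageProvider(QQuickImageProvider::Image) {}

    // Called from the ROS subscription thread; stores a deep copy so the
    // subscriber's buffer can be reused safely.
    void setLatestFrame(const QImage &frame)
    {
        QMutexLocker lock(&m_mutex);
        m_frame = frame.copy();
    }

    // Called by the QML engine whenever an Image element requests
    // a URL of the form "image://ros/...".
    QImage requestImage(const QString &, QSize *size, const QSize &) override
    {
        QMutexLocker lock(&m_mutex);
        if (size)
            *size = m_frame.size();
        return m_frame;
    }

private:
    QMutex m_mutex;
    QImage m_frame;
};
```

As I understand it, the provider would be registered with engine.addImageProvider("ros", provider), and the QML Image would need cache: false plus a changing URL suffix (e.g. a frame counter) to force a refetch on every frame.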
I'm happy to share relevant code snippets — just let me know what would be most helpful to see.
Thanks in advance!
-
so the issue must be in my QML implementation.
Maybe tell us a bit more about your QML implementation.
-
I'm using a class that inherits from QQuickPaintedItem.
This is the function used for painting:
void paint(QPainter *painter) override {
    std::lock_guard<std::mutex> lock(m_mutex);
    if (!m_frame.isNull())
        painter->drawImage(boundingRect(), m_frame);
}
I believe this might be the issue. My theory is that Qt uses the ffmpeg library by default to enable the camera feed. This method routes all video through the CPU, not the GPU.
My dev PC uses a powerful i9 and there is no GPU, which could explain the lag issues. If my theory is right, however, installing a GPU won't fix it that easily.
-
@OyvindNordbo I see; in fact, the paint event is called each second. Your problem is that you do not release the memory occupied by the image you are drawing, hence the CPU activity will continue to increase. You can free that memory using the constructor of the image. Something like this might help:
QImage Image;
drawImage(rect, Image);
Image = QImage();
-
@OyvindNordbo said in 2200% CPU Usage from processed camera feed implementation:
I'm using a class inherited by QQuickPaintedItem
Yeah that won't get you far.
In Qt 5.15, use VideoOutput with your own QObject as its source and expose a videoSurface property accepting a QAbstractVideoSurface.
-
Thank you both for the suggestions — I'll work through them and report back.
Unfortunately, despite what I initially said, I can't share much code due to project confidentiality, but here's as much context as I can give (feel free to ask for more if anything is unclear):
Architecture overview:
- The backend runs a ROS2 classification node using Ultralytics YOLOv8 + PyTorch. This is understandably CPU-heavy.
- My GUI class subscribes to a ROS topic that fires a signal each time the classification of one frame completes.
- A camera bridge class is connected to the GUI class and forwards frames to QML.
- The QML GUI itself runs as a dedicated ROS2 node, launched inside a QThread.
Comparison with the old Widget-based GUI:
The previous implementation used QPixmap inside a standard Qt Widgets GUI. It worked — the FPS dropped from ~30 to ~10 due to the ML overhead, but it was fully usable. My QML implementation is significantly worse: ~1 FPS with a ~20-second rendering delay, making camera control completely unusable. Since the backend is identical in both cases, the bottleneck must be in how I'm bridging frames from the ROS2 subscription into QML.
Potential issue I'm investigating:
I suspect the QML rendering pipeline is either blocking on frame delivery, accumulating a backlog of unprocessed frames, or doing unnecessary work on the UI thread — but I haven't pinpointed it yet. Could there be a fault in how I configure the main.cpp file? I've provided the core logic from my implementation below.
int main(int argc, char *argv[])
{
    qputenv("QSG_RENDER_LOOP", "threaded"); // tried "basic", virtually no difference
    rclcpp::init(argc, argv);
    qRegisterMetaType<QImage>("QImage");
    .....
    QApplication app(argc, argv);
    .....
    QNode qnode;
    if (!qnode.init()) return -1;
    .... // Initializing camera
    auto cameraProvider = new CameraBridge();
    engine.rootContext()->setContextProperty("cameraBridge", cameraProvider);
    ......
    engine.load(url);
    if (engine.rootObjects().isEmpty()) return -1;
    int result = app.exec();
    rclcpp::shutdown();
    qnode.wait();
    return result;
}
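On the backlog theory: one idea I'm considering is a single-slot "mailbox" between the ROS subscriber thread and the GUI, so the painter always gets the newest frame and older unpainted frames are dropped instead of queued. A plain C++ sketch (the class name is made up, not from our codebase):

```cpp
#include <mutex>
#include <optional>
#include <utility>

// Single-slot mailbox: the producer overwrites the previous frame instead of
// queueing it, so the consumer always sees the newest frame and a backlog
// can never build up.
template <typename Frame>
class LatestFrameBox {
public:
    // Called from the ROS subscriber thread.
    void push(Frame f)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_frame = std::move(f); // older unpainted frame is simply dropped
    }

    // Called from the GUI thread before painting; empties the slot.
    std::optional<Frame> take()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return std::exchange(m_frame, std::nullopt);
    }

private:
    std::mutex m_mutex;
    std::optional<Frame> m_frame;
};
```

With this, two frames arriving between two paints cost one overwrite instead of two queued paint requests, which is exactly the failure mode I suspect.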
@Ronel_qtmaster — I tried applying your suggestion to my paint function but didn't see an improvement. It's possible I'm not applying it at the right point in the pipeline, but I believe my implementation is correct.
@GrecKo — I'll implement your approach and report back today or tomorrow.
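For reference, here is how I currently understand the suggested pattern. This is an untested sketch with placeholder names (FrameSource, onNewFrame), not our actual code:

```cpp
#include <QObject>
#include <QImage>
#include <QAbstractVideoSurface>
#include <QVideoFrame>
#include <QVideoSurfaceFormat>

// Exposes a videoSurface property; QML's VideoOutput assigns its internal
// surface here, and we push frames into that surface.
class FrameSource : public QObject
{
    Q_OBJECT
    Q_PROPERTY(QAbstractVideoSurface *videoSurface READ videoSurface WRITE setVideoSurface)
public:
    QAbstractVideoSurface *videoSurface() const { return m_surface; }

    void setVideoSurface(QAbstractVideoSurface *surface)
    {
        if (m_surface && m_surface->isActive())
            m_surface->stop();
        m_surface = surface;
    }

public slots:
    // Invoke this (via a queued connection from the ROS thread) per frame.
    void onNewFrame(const QImage &image)
    {
        if (!m_surface)
            return;
        if (!m_surface->isActive()) {
            QVideoSurfaceFormat format(
                image.size(),
                QVideoFrame::pixelFormatFromImageFormat(image.format()));
            m_surface->start(format);
        }
        m_surface->present(QVideoFrame(image));
    }

private:
    QAbstractVideoSurface *m_surface = nullptr;
};
```

On the QML side this would pair with VideoOutput { source: frameSource }, with frameSource exposed via setContextProperty as in my main.cpp.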
Fallback plan:
If I can't resolve this in QML, I may fall back to a hybrid: keep the old QPixmap-based camera widget in a Widgets window and host the new QML UI next to it via QQuickWidget or similar. I know mixing Widgets and QML is generally discouraged, but it may be the pragmatic solution here. Any experience or tips with that approach would be welcome.
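For what it's worth, my rough sketch of that hybrid (hypothetical widget name and QML path; note QQuickWidget hosts QML inside a Widgets window, so the old camera widget would stay a plain QWidget):

```cpp
#include <QApplication>
#include <QHBoxLayout>
#include <QQuickWidget>
#include <QUrl>
#include <QWidget>

int main(int argc, char *argv[])
{
    QApplication app(argc, argv);

    QWidget window;
    auto *layout = new QHBoxLayout(&window);

    // Hypothetical: the existing, known-good QPixmap-based camera widget
    // would be added here unchanged, e.g.:
    // layout->addWidget(new CameraWidget);

    // Host the rest of the new QML UI inside the Widgets window.
    auto *quick = new QQuickWidget;
    quick->setResizeMode(QQuickWidget::SizeRootObjectToView);
    quick->setSource(QUrl(QStringLiteral("qrc:/MainPanel.qml"))); // placeholder path
    layout->addWidget(quick);

    window.show();
    return app.exec();
}
```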