QtConcurrent vs QThread CPU Usage
-
Hi,
How do you know that you only have 5 of these ?
For a video with a classic 25 frames per second frame rate, I would expect to have 50 of them created during a second since you have two videos.
Unless they can do their processing below the duration of a frame, I wouldn't be surprised about your result.
-
@SGaist Each cycle / frame would have 5 created, and the number you mentioned sounds right once adjusted for the frame rate. At 30 fps it would be 150 of them per second, but still only 3 threads per frame, since frame reading / processing for each video has 2 threads working in series and output processing has its own thread.
So is the large number of threads created every second what drives CPU usage so high? If that's the case, it would make sense to use the worker QThread model and have 3 long-lived workers running start to finish.
-
I don't know how exactly your pipeline works so I can't really comment on that.
Did you try to use a profiler to see what is happening where in your application ?
Did you try to measure the performance of the methods you apply to video frame processing ?
-
@SGaist I've done some benchmarking using std::chrono to see how much time each section of the process takes, but I'm not familiar with profiling tools. Any tools you would recommend? I came across this page:
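For reference, sectioned std::chrono timing like this is usually done with a small scoped timer. A minimal sketch (class and label names are illustrative, not from the original code):

```cpp
#include <chrono>
#include <cstdio>
#include <string>

// Prints the elapsed time for a scope when the timer is destroyed.
class ScopedTimer {
public:
    explicit ScopedTimer(std::string label)
        : label_(std::move(label)),
          start_(std::chrono::steady_clock::now()) {}

    long long elapsed_us() const {
        return std::chrono::duration_cast<std::chrono::microseconds>(
            std::chrono::steady_clock::now() - start_).count();
    }

    ~ScopedTimer() {
        std::printf("%s: %lld us\n", label_.c_str(), elapsed_us());
    }

private:
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};

// Usage:
//   { ScopedTimer t("addVideoEffectsMultiChannel"); addVideoEffectsMultiChannel(vid); }
```

Note that steady_clock is preferable to high_resolution_clock for interval measurement, since it is guaranteed monotonic.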
-
Maybe you need to provide more detailed content and data.....
-
hi @rtavakko said in QtConcurrent vs QThread CPU Usage:
profiling tools
see GammaRay: https://www.kdab.com/development-resources/qt-tools/gammaray/
https://doc.qt.io/GammaRay/index.html
-
Thanks guys, I'll check out the cache profile. To give you a bit more detail about the overall process:
void Mixer::mix(unsigned int vid)
{
    videoProcessStart[vid] = std::chrono::high_resolution_clock::now();

    QFutureWatcher<void>* inProcessWatcher = new QFutureWatcher<void>;
    QFuture<void> inProcess = QtConcurrent::run(this, &Mixer::processInput, vid);

    QObject::connect(inProcessWatcher, &QFutureWatcher<void>::finished, this,
                     [=]() { videoEffectsFinished(vid); delete inProcessWatcher; });

    inProcessWatcher->setFuture(inProcess);
}

void Mixer::processInput(unsigned int vid)
{
    videoFrameRead[vid] = readInput(vid);
    addVideoEffectsMultiChannel(vid);
}

void Mixer::videoEffectsFinished(unsigned int vid)
{
    mix(vid);
}
In Mixer::addVideoEffectsMultiChannel, I process each frame using OpenCV functions (main operations here are cv::split, cv::addWeighted and cv::merge to allow for RGB processing of a 3-channel cv::Mat). This function was time-consuming if called in series for both frames so I decided to split the processing into two parallel threads.
After each frame is processed it's pushed onto a "buffer" (an std::deque&lt;cv::Mat&gt;) so the output stage doesn't need to wait for both frames to be finished. The output stage functions essentially create a constantly spinning thread in the same way as above; it takes the oldest frame from each buffer, mixes them using cv::addWeighted, and emits the resulting data array to be displayed.
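Since the processing threads and the output thread touch that deque concurrently, it needs to be guarded. A minimal sketch of a mutex-guarded frame buffer (templated here only to keep the sketch free of the OpenCV dependency; T would be cv::Mat in the pipeline above, and the bound is an assumption):

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <optional>

// Mutex-guarded FIFO buffer between a producer (frame processing)
// and a consumer (output/mixing) thread.
template <typename T>
class FrameBuffer {
public:
    void push(T frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        frames_.push_back(std::move(frame));
        // Cap the queue so a slow consumer can't grow it unbounded;
        // dropping the oldest frame keeps latency low.
        if (frames_.size() > maxFrames_)
            frames_.pop_front();
    }

    // Returns the oldest frame if one is ready, otherwise nothing.
    std::optional<T> popOldest() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (frames_.empty())
            return std::nullopt;
        T frame = std::move(frames_.front());
        frames_.pop_front();
        return frame;
    }

private:
    std::mutex mutex_;
    std::deque<T> frames_;
    std::size_t maxFrames_ = 4;  // assumption: small bound to limit latency
};
```

Without a lock (or an equivalent QMutex on the Qt side), concurrent push/pop on a raw std::deque is a data race.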
So I have 3 threads running in parallel at this point. I've confirmed that each one is unique by comparing the IDs of the threads and also the priority of each thread does not affect CPU usage.
At this point I think the per-element OpenCV operations themselves are causing the high load: with two 1920x1080 frames being processed constantly, that's a lot of work per second. If anyone has any tips / ideas I would definitely appreciate them.
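A rough back-of-envelope supports that: 1920 × 1080 = 2,073,600 pixels × 3 channels is about 6.2 million values per frame, and with 2 videos at 25 fps and roughly 3 full per-element passes per frame (split, addWeighted, merge), that's close to a billion element operations per second. A sketch of the arithmetic (the "3 passes" figure is an assumption; real per-pass costs vary by OpenCV build and SIMD support):

```cpp
// Rough count of element operations per second for the pipeline.
long long elementOpsPerSecond(long long width, long long height,
                              long long channels, long long fps,
                              long long videos, long long passes) {
    return width * height * channels * fps * videos * passes;
}

// elementOpsPerSecond(1920, 1080, 3, 25, 2, 3) is 933,120,000,
// i.e. nearly a billion values touched per second on the CPU.
```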
Cheers!
-
Did you consider moving the image processing stuff to the GPU ?