How to debug a hard to find memory leak in PyQt
-
Hi,
@SGaist Python 3.8.5, PyQt5==5.15.1, running on Ubuntu 20.04.1 LTS, here's a link to a version of the code that's leaking memory: https://github.com/argosopentech/argos-translate/tree/db9bac106c17330f0bfb3c708da1d7dc5a016dda .
Thanks
-
I've been able to narrow the problem down to occurring when I create a CTranslate Translator inside of a PyQt QThread that is created from a QWidget. If I remove the CTranslate Translator and do something else that allocates a large amount of memory there is no leak. If I create the CTranslate Translator from the QWidget with no QThread there is also no leak. If I run the QThread outside of a QWidget there is no leak. The leak only happens with the combo of all three.
My best guess is that there is either a bug, or a mistake on my part, in how Python, Qt, and CTranslate memory management interact. Python uses automatic reference counting, while Qt in native C++ uses parent-based ownership. On top of that, CTranslate uses C++ extensions to Python, so there are a lot of places where the problem could be appearing.
I made an example script demonstrating the leak. To run it you need a CTranslate model and need to provide a path to it in the script. Here's a Google Drive link where you can download a package for my project that, if extracted (it's just a renamed .zip archive), has a CTranslate model at /model. When this script runs it leaks ~5GB of memory.
I also posted this on the OpenNMT Forum.
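One way to put a number on a leak like this is to record the process's peak resident memory after each iteration. The sketch below is a minimal illustration using only the standard library on a Unix system. `allocate_and_release` is a hypothetical stand-in for the suspect code path, not the real script. Peak RSS is used rather than `tracemalloc` because `tracemalloc` only tracks allocations made through Python's allocators, so memory allocated natively by a C++ library would not show up there.

```python
import resource
import sys


def peak_rss_mib():
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports ru_maxrss in kilobytes, macOS reports it in bytes.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def measure_growth(step, iterations=5):
    """Call step() repeatedly and record the peak RSS after each call."""
    readings = []
    for _ in range(iterations):
        step()
        readings.append(peak_rss_mib())
    return readings


def allocate_and_release():
    # Hypothetical stand-in for the code under suspicion: allocate a
    # buffer and drop it again, which should not leak.
    buffer = bytearray(8 * 1024 * 1024)
    del buffer


if __name__ == "__main__":
    for i, mib in enumerate(measure_growth(allocate_and_release)):
        print(f"iteration {i}: peak RSS {mib:.1f} MiB")
```

Because peak RSS is a high-water mark, it never goes down. A code path that does not leak should level off after the first few iterations, while a leaking one keeps climbing.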
-
@argosopentech
Since this seems to be a PyQt5 issue, have you tried the folks over at https://riverbankcomputing.com/mailman/listinfo/pyqt, by joining that mailing list and asking your question? The author of PyQt5 is there. I don't know whether they will take the time to examine your code/issue, but it's where I would try.
-
@argosopentech
I have seen your post. I don't know how people there will react to you only referring to this post here --- they like the information in their own forum posts. If it were me I might post a follow-up (i.e. Reply to all) there yourself in which you include the first, second & third of your posts here above.
-
@argosopentech
I see that you have just received a reply from the mailing list :) And that guy is the PyQt5 author, like I said, and probably knows exactly what he is talking about! :) He is not very chatty, just terse and to-the-point, so you just have to act as best you can on what he tells you.
-
@JonB thanks for the help, I'm looking at his suggestions now. Here's his response for anyone curious:
I don't see how the above can be expected to work, no matter what the
run() method is doing. Your WorkerThread objects are likely to be
garbage collected before they are finished and the __del__ won't protect
the thread.

Try making the GUIWindow the parent of the WorkerThreads and see if that
makes a difference.

Phil
-
My understanding of what he's saying is that, because the QThread is managed by Qt, blocking in the Python __del__ function isn't going to protect your QThread from being deleted while you're still using it. This didn't seem to be the problem I was having: I've been getting memory leaks, not crashes from the QThreads being prematurely deleted, though maybe them getting cleaned up early is leading to other memory being leaked. I based my original code on a tutorial I found on PyQt QThreads that uses the __del__ function this way, but it makes sense that it isn't a great way of doing things.
I tried making the QMainWindow the parent of the QThread like he suggested but that didn't seem to fix the problem. I also connected the QThread's finished signal with its deleteLater slot, in line with the Qt documentation's example of subclassing QThread.

    from PyQt5.QtWidgets import QMainWindow, QApplication
    from PyQt5.QtCore import QThread
    import ctranslate2

    class WorkerThread(QThread):
        def run(self):
            # A new Translator is created on every run; this is what leaks.
            translator = ctranslate2.Translator('/path/to/ctranslate/model')

    class GUIWindow(QMainWindow):
        def translate(self):
            # The window is the thread's parent, as Phil suggested.
            new_worker_thread = WorkerThread(self)
            # Schedule the thread object for deletion once it has finished.
            new_worker_thread.finished.connect(new_worker_thread.deleteLater)
            new_worker_thread.start()

    app = QApplication([])
    main_window = GUIWindow()
    main_window.show()
    for i in range(120):
        print(i)
        main_window.translate()
    app.exec_()
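Phil's point about objects being garbage collected before they are finished can be seen in plain Python, without Qt. The sketch below is only an illustration under that assumption: `Worker` and `Window` are hypothetical stand-ins, not real QThread or QMainWindow classes, and it does not model Qt's parent ownership. It shows that an object whose only reference is a local variable is gone as soon as the method returns, while one stored on a longer-lived owner survives.

```python
import gc
import weakref


class Worker:
    """Hypothetical stand-in for a thread-like object; not a real QThread."""


class Window:
    """Hypothetical stand-in for the widget that starts workers."""

    def __init__(self):
        self.workers = []

    def start_unreferenced(self):
        # The local variable is the only strong reference, so the worker
        # becomes collectable as soon as this method returns.
        worker = Worker()
        return weakref.ref(worker)

    def start_referenced(self):
        # Storing the worker on the window keeps it alive afterwards.
        worker = Worker()
        self.workers.append(worker)
        return weakref.ref(worker)


window = Window()
dropped = window.start_unreferenced()
kept = window.start_referenced()
gc.collect()

print("unreferenced worker collected:", dropped() is None)
print("referenced worker collected:", kept() is None)
```

In real code the stored reference would also need to be removed once the work is done, otherwise the list itself keeps every worker alive.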
I'm going to look into his suggestion more but I'm not sure it solves the issue. The other example in the Qt documentation puts the work in a worker object and moves it to a QThread using moveToThread, so I'm going to try structuring the threading like that and see what happens.
-
@argosopentech said in How to debug a hard to find memory leak in PyQt:
like he suggested but that didn't seem to fix the problem
I suggest you go back to him with this fix, then, and ask him very politely to have a look and see if he can suggest anything else. Throw yourself at his mercy, nicely :) (But do try anything else you can think of from his suggestion before doing so.)
-
Do you really need to create a new translator each time?
I am wondering whether you should reconsider your architecture.
It looks like you could make use of QtConcurrent to manage translation tasks rather than doing your own thread management.
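As a rough illustration of that idea: as far as I know, QtConcurrent is a C++ API that PyQt5 does not expose, so the sketch below swaps in Python's standard `concurrent.futures.ThreadPoolExecutor` as a stand-in. `FakeTranslator` is a hypothetical placeholder for `ctranslate2.Translator`, and `run_translations` is a name made up for this example. The point is only that one translator is created once and shared by pooled tasks, instead of creating a new thread and a new translator per request.

```python
from concurrent.futures import ThreadPoolExecutor


class FakeTranslator:
    """Hypothetical stand-in for ctranslate2.Translator."""

    instances = 0  # counts how many translators were constructed

    def __init__(self, model_path):
        FakeTranslator.instances += 1
        self.model_path = model_path

    def translate(self, text):
        # Placeholder work; a real translator would run the model here.
        return text.upper()


# Created once and reused by every task.
translator = FakeTranslator("/path/to/ctranslate/model")


def run_translations(texts, max_workers=2):
    """Translate each text on a pooled thread, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(translator.translate, texts))


if __name__ == "__main__":
    print(run_translations(["hello", "world"]))
```

Whether a real Translator can safely be shared between threads is something to check in the CTranslate2 documentation. In a Qt application, results would also still have to be handed back to the GUI thread, for example through a signal.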
-
@JonB thanks that's what I was thinking too. I just wanted to post here first to make sure I wasn't missing something obvious or misunderstanding him before I sent another email.
@SGaist This was something that was mentioned on the OpenNMT forum thread I started too. It seems like not making a new Translator every time would be good for performance too. I'm going to try this and I think it should at least drastically slow down my memory leak. However, since users can switch between translations some will still need to be garbage collected so I'd like to figure out what's causing this to leak.
-
Switching to a different language should be part of your API rather than a constraint. That way you can reload or replace the translator when appropriate.
-
@SGaist I just changed the code to reuse the same CTranslate Translator, and this seems to prevent the memory leak from becoming a problem. Right now I save every Translator that has been used, so I never have to create one more than once. This prevents them from being leaked, but ideally I'd like to save only the most recently used one, so that if someone does a large number of translations without restarting the application they don't all have to be kept in memory. My concern is that if I did that, whatever has been causing this memory leak would also cause the discarded Translator objects to be leaked.
Saving the Translators seems to mostly work around the problem but there does seem to be either something wrong with the way I was using PyQt/CTranslate in the example above or a bug in one of them.
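The "keep only the most recently used" idea could be sketched as a small bounded cache. This is a minimal illustration under stated assumptions, not the project's actual code: `TranslatorCache`, `fake_factory`, and the model paths are all hypothetical, and the factory stands in for whatever constructs a real Translator.

```python
from collections import OrderedDict


class TranslatorCache:
    """Keep at most max_size translators, evicting the least recently used."""

    def __init__(self, factory, max_size=1):
        self._factory = factory    # callable that builds a translator
        self._max_size = max_size
        self._cache = OrderedDict()

    def get(self, model_path):
        if model_path in self._cache:
            # Mark as most recently used and reuse the existing object.
            self._cache.move_to_end(model_path)
            return self._cache[model_path]
        translator = self._factory(model_path)
        self._cache[model_path] = translator
        while len(self._cache) > self._max_size:
            # Drop the cache's reference to the oldest entry.
            self._cache.popitem(last=False)
        return translator

    def __len__(self):
        return len(self._cache)


created = []


def fake_factory(model_path):
    # Hypothetical stand-in for constructing a real Translator.
    created.append(model_path)
    return object()


cache = TranslatorCache(fake_factory, max_size=1)
first = cache.get("/models/en_es")
again = cache.get("/models/en_es")   # reused, not rebuilt
other = cache.get("/models/en_fr")   # evicts the en_es entry
print(first is again, created, len(cache))
```

Eviction only removes the cache's own reference. Whether the memory behind an evicted Translator is actually released depends on whatever is causing the leak above, so the measurement would need repeating with this in place.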