How to speed up a function using QProcess and reading from stdout?
-
I am building an app in C++ and Qt (and no, this is not a Qt question). One of the features I have built lets the user specify a directory. Then:
- The function recursively enters subdirectories, finding files.
- It creates a vector of the extensions found and a vector of the file names found.
- The user chooses what kind of files he wants.
- It iterates over the file name vector, checking the extension, and inserts the matching files into the SQLite db within a transaction.
This operation is fast enough for me; I tried it on 19,000 files (3-10 s).
The problem is that I want to get the page count of PDFs before inserting them into the db.
The best option I have found so far is to download the xpdf binaries (exe files on Windows), call the binary with the file path, read what it prints to stdout, and find the page count in that output.
This slows the whole operation down unbearably, to the point where the application freezes, even though I have a progress bar that should tell me how far into the process I have gotten.
I need suggestions to speed this up. If you need any more clarification, I'll be happy to provide it.
EDIT: To the fine gentlemen who considered using a hashmap, a clarification: the QVector of extensions only holds a single instance of each extension. The reason I keep it separate is that I pass it to a dialog, which creates a list widget of "found extensions".
Additional clarification: I have seen PoDoFo and XPDF, and I don't know how to set them up for use in my code, since I am currently using qmake instead of CMake.
This is the function, and it's kind of "hacky":
int getPageCount(QString path)
{
    QFileInfo file(path);
    int pages = 0;
    if(file.suffix() == "pdf")
    {
        QProcess process;
        process.start("./xpdf/bin64/pdfinfo.exe ", QStringList() << path);
        process.waitForReadyRead();
        QList<QByteArray> output = process.readAllStandardOutput().simplified().split(' ');
        pages = output[output.indexOf("Pages:") + 1].toInt();
    }
    return pages;
}
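A side note on the parsing step: when "Pages:" is missing from the output, `indexOf` returns -1, so `output[output.indexOf("Pages:") + 1]` reads element 0 instead of a page count. Below is a minimal sketch of a stricter parser in plain C++, assuming pdfinfo prints one `Key: value` pair per line. The name `parsePageCount` is made up for illustration and is not part of Qt or xpdf.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the original post): scans "Key: value"
// output line by line and returns the number that follows "Pages:".
// Returns std::nullopt when no usable "Pages:" line is present.
std::optional<int> parsePageCount(const std::string& pdfinfoOutput)
{
    const std::string key = "Pages:";
    std::istringstream lines(pdfinfoOutput);
    std::string line;
    while (std::getline(lines, line)) {
        // Only accept lines that begin with the key, so a document title
        // that happens to contain "Pages:" is not mistaken for the field.
        if (line.compare(0, key.size(), key) != 0)
            continue;

        // Skip the padding between the key and the value.
        std::size_t pos = key.size();
        while (pos < line.size() && std::isspace(static_cast<unsigned char>(line[pos])))
            ++pos;

        // Collect the run of digits; a trailing '\r' from Windows line
        // endings stops the scan naturally.
        std::size_t end = pos;
        while (end < line.size() && std::isdigit(static_cast<unsigned char>(line[end])))
            ++end;
        if (end == pos)
            return std::nullopt; // key found, but no number after it

        try {
            return std::stoi(line.substr(pos, end - pos));
        } catch (const std::out_of_range&) {
            return std::nullopt; // digit run too large for int
        }
    }
    return std::nullopt;
}
```

In the function above this could be called as `parsePageCount(process.readAllStandardOutput().toStdString())`. Note that `waitForReadyRead()` returns as soon as the first chunk of output is available, so the "Pages:" line is only guaranteed to be there once the process has finished.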
-
Use the QProcess::finished() signal rather than waitForReadyRead(). Waiting should be completely avoided in the UI thread and discouraged with Qt in general.
-
@mario12136
As @jeremy_k has written, using the signal instead of blocking will keep your UI application "responsive". You can also use the signal emitted when new data arrives on stdout to read it in real time, if the subprocess produces and flushes output as it goes along.
However, I am not clear what your question is. If you need to get the result back from your subprocess, then ultimately you cannot get it until the subprocess has finished, and if that takes a noticeable amount of time then that's how it is. You cannot "speed up" how long the process takes to run, and reading from stdout is a negligible overhead in itself. Maybe that program can accept multiple file paths at a time on its command line and produce output for each of them in a single invocation; that might reduce the overall time for many files.
Maybe you can do this faster without going via an external process, but that requires you to open and read the files yourself, perhaps through some library. But that is not a Qt issue.
-