Solved Problem processing output from QProcess
-
Edit: The problem was that MKVToolNix 14+ requires the LANG environment variable to be set. In MacOS LANG is only set in the process environment when running programs from a console; when you run a program from Finder, Launchpad or Qt Creator, LANG is not set. Setting LANG in the process environment with QProcess::setEnvironment() resolves the problem.
I'm running a command line program (MKVToolNix) in a QProcess and trying to read the output. Unfortunately, I'm experiencing a strange problem on MacOS where the output terminates as soon as it reaches the first Unicode character.
I've created the following connections to read the output:
connect(&qProcess, SIGNAL(readyReadStandardOutput()), this, SLOT(OutputText()));
connect(&qProcess, SIGNAL(readyReadStandardError()), this, SLOT(ErrorText()));
connect(&qProcess, SIGNAL(finished(int, QProcess::ExitStatus)), this, SLOT(Finished(int, QProcess::ExitStatus)));
connect(&qProcess, SIGNAL(errorOccurred(QProcess::ProcessError)), this, SLOT(Error(QProcess::ProcessError)));
In OutputText() I print the output to a QPlainTextEdit:
void IUIInfoDisplay::OutputText()
{
    qTextEdit->insertPlainText(qProcess.readAllStandardOutput());
}
The process runs without errors, and calls Finished() with exit code 0, so there are no issues running the process. However, if the output contains Unicode text it cuts off immediately when the Unicode text starts.
For example, if I run MKVToolNix from a terminal in MacOS I get this output, which correctly displays the Japanese text:
{
  "attachments": [
    {
      "content_type": "application/x-truetype-font",
      "description": "",
      "file_name": "いくつかの日本語テキスト.ttf",
      "id": 1,
      "properties": {
        "uid": 10967064899104715426
      },
      "size": 60656
    }
  ],
However, when I run the program from a QProcess the output ends as soon as it reaches the Japanese text:
{
  "attachments": [
    {
      "content_type": "application/x-truetype-font",
      "description": "",
      "file_name": "
Obviously, the readyReadStandardOutput signal can be emitted multiple times, with the output delivered in blocks. However, that is not the issue here, because readyReadStandardOutput is emitted only once and no further output arrives.
The issue also does not appear to be an encoding problem; the data simply isn't there. The size() and length() of the QByteArray returned by readAllStandardOutput() are both 125, whereas the size() and length() of the full output should be 2358. The output after the first Unicode character simply isn't present.
Other observations:
- The program runs without issues on Windows and Linux, and all output, including Unicode characters, is returned by readAllStandardOutput(). It is therefore a Mac-only issue.
- If there are no Unicode characters in the output, the full output will be read successfully on the Mac, even when it is hundreds of thousands of lines.
- The program returns all output with earlier versions of MKVToolNix, up to version 13. It only experiences problems with versions of MKVToolNix from v14 to v20. This might indicate a problem with MKVToolNix. However, when run in a terminal MKVToolNix does show the full output, including Unicode characters. If it outputs text correctly to a terminal you would expect Qt to be able to read that text.
- Output also appears correct if you run MKVToolNix in a terminal and pipe the output to a file. If you cat that file, its contents are displayed correctly. If you run cat in a QProcess with the file as the parameter, readAllStandardOutput() will read the full contents of the file, including the Unicode characters. It seems the QProcess only has a problem reading the Unicode characters from MKVToolNix.
- I tried creating the below test program, which outputs a line containing Unicode text. I then ran that from a QProcess on MacOS. readAllStandardOutput() was able to read back the full line including the Unicode text. Again then, it appears that there are only problems when reading the output from MKVToolNix.
#include <QCoreApplication>
#include <iostream>

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);
    QString test = QString::fromUtf8("\t\"file_name\": \"いくつかの日本語テキスト.ttf\"");
    std::cout << qPrintable(test) << std::endl;
    return a.exec();
}
Can anybody suggest why the output is being cut off?
-
Hi and welcome to devnet forum
You are reading in the slot connected to readyReadStandardOutput. In the documentation you find:
void QProcess::readyReadStandardOutput()
This signal is emitted when the process has made new data available through its standard output channel (stdout). It is emitted regardless of the current read channel.
Most likely the signal is only emitted once. When the signal triggers your slot routine, the output of your process is not finished yet. Therefore, you have only a part. Most likely your slot routine is still busy when the process ends. Therefore, no second readyRead signal is emitted and your app does not read the rest.
You can either add the reading functionality to your routine connected to the finished signal, or at least read the rest of the output there before you presumably destroy the QProcess object.
-
@koahnig Thanks for the suggestion. Unfortunately, I don't think that's the problem and it seems to be an issue with non-English text. I've been testing with Japanese and Russian text, and the output always stops as soon as it reaches the Unicode text.
I assume it's something to do with character encodings, but I'm not sure how to resolve it. I can't see anything relevant to character encodings in the QProcess documentation, and there's no way to reinterpret the QByteArray because the text isn't in it.
What's strange is that I'm only having this problem on MacOS, and reading from a process on Windows has no problems with Unicode text.
-
Hi and welcome to devnet,
Do you mean that if you dump your QByteArray with qDebug() you don't see the whole data either?
-
@SGaist Yes that's correct. Here's the debug output:
Debugging starts "{\n \"attachments\": [\n {\n \"content_type\": \"application/x-truetype-font\",\n \"description\": \""
As you can see, it just stops when it reaches the unicode text. The size() of the QByteArray is only 102, so the remainder of the output simply isn't there.
-
What is that command line application?
-
It's MKVToolNix. If you run mkvmerge -J file.mkv it prints details of the elements contained in the file.
On Windows it will run fine from a QProcess and readAllStandardOutput() returns the full output, including Unicode characters. However, on MacOS the output cuts off as soon as it encounters a Unicode character. If you run the same command from a terminal in MacOS it does display the full output, so I'm not sure why it is being cut off when running from a QProcess.
If all else fails I could do a workaround where I redirect the mkvmerge output to a temporary file, read it back and then delete the file, but that wouldn't be an ideal solution.
-
@RichardC said in Problem processing output from QProcess:
QByteArray test = qProcess.readAllStandardOutput();
qTextEdit->insertPlainText(test);
What if you try:
void IUIInfoDisplay::OutputText()
{
    QByteArray test = qProcess.readAllStandardOutput();
    qTextEdit->insertPlainText(QString::fromUtf8(test));
}
-
@jsulm I had tried that, but I gave it another go. Unfortunately it doesn't work. The text doesn't appear to be in the QByteArray to begin with, so the conversion doesn't help.
Out of interest, I wrote a small program on the Mac to output Unicode text:
#include <QCoreApplication>
#include <iostream>

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);
    QString test = QString::fromUtf8("\t\"description\": \"Импортирован шрифт из\"");
    std::cout << qPrintable(test) << std::endl;
    return a.exec();
}
I then ran that from a QProcess and printed the output with:
qTextEdit->insertPlainText(qProcess.readAllStandardOutput());
In that case it read and printed the Russian text without issues, so the above line of code should work and it doesn't appear to be an issue with my program.
Another interesting observation is that the output from versions of MKVToolNix up to v13 will all print correctly when run from a QProcess, but with v14-20 the output cannot be read from the QProcess. That seems to indicate a problem with MKVToolNix, but MKVToolNix v20 will print the output correctly when run in a terminal on the Mac, and if it prints to a terminal you would expect Qt to be able to read the output.
I'm therefore not sure if this is an issue with MKVToolNix or an issue with Qt.
-
@RichardC said in Problem processing output from QProcess:
The text doesn't appear to be in the QByteArray to begin with
You could try to print the length of the array to check that.
But most probably it isn't UTF8 but something else like UTF16 (Windows uses it as far as I know).
I'm quite sure it is an encoding problem, as it cuts off at the first non-ASCII character. Unicode characters can contain 0 bytes, which are interpreted as "end of string" if you treat the array as an ASCII string. And it looks like exactly that is happening in your case.
-
@RichardC
As @jsulm says, 99% certainly a UTF decoding issue from the QByteArray returned from readAllStandardOutput(). You must look at QByteArray::length/size().
I don't know whether the following is unnecessarily complex for your case/MacOS/Russian, but here's what I found I have to use for my readyReadStandardOutput slot (PyQt I'm afraid, but I'm sure you can manage), food for thought:
def processReadyReadStandardOutput(self):
    # read all output available at this point
    byteArray = self.process.readAllStandardOutput()
    # convert QByteArray to str
    # Linux: the decoder should always be "utf-8"
    # Windows: after *enormous* investigations "utf-8" *mostly* works
    # but if the output contains a "funny" character like "£"
    # it will cause a conversion error
    # then the correct decoder to use is what the command "chcp" says
    # (e.g. "cp850" for Code Page 850 in UK) to avoid the error *and* correctly display the £
    text = ""
    try:
        try:
            text = byteArray.data().decode('utf-8')
        except UnicodeDecodeError:
            from common import osfunctions
            if osfunctions.isWindows():
                text = byteArray.data().decode('cp850')
    except:
        # if all this fails for whatever reason (just in case)
        # just output a load of "?"s the length of the output
        # anything is better than throwing an exception here
        text = "?" * len(byteArray)
    self.appendInformativeText(text)
(BTW, if you haven't done so already you'll want to include a call to QProcess::setProcessChannelMode(QProcess::MergedChannels).)
-
@jsulm said in Problem processing output from QProcess:
You could try to print the length of the array to check that.
But most probably it isn't UTF8 but something else like UTF16 (Windows uses it as far as I know).
I'm quite sure it is encoding problem as it cuts at the first non ASCII character. Unicode characters can contain 0 bytes which are interpreted as "end of string" if you do treat the array as ASCII string. And it looks like exactly that is happening in your case.
Both the size() and length() of the QByteArray are 102. The size() and length() of the full output should be 5066.
It therefore seems not to be an encoding issue, just that the data isn't there.
I did try converting it to UTF16 out of interest, but that changed the English text into a series of Chinese characters. It therefore seems that the text is UTF8, and the issue is not the encoding but that the data isn't there.
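For anyone following the encoding argument, here is a minimal sketch in plain C++ (no Qt) of the two effects discussed above. The helper names are made up for illustration. It shows that a zero byte only hides data from C-string functions while the byte count of the buffer stays intact, and why a pair of lowercase ASCII letters read as one little-endian UTF-16 code unit lands in the CJK ideograph range.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: the length a C-string function would report.
// It stops at the first 0 byte, even if the buffer holds more data.
std::size_t cStringLength(const std::string& bytes) {
    return std::strlen(bytes.c_str());
}

// Hypothetical helper: interpret two consecutive bytes as one UTF-16
// code unit in little-endian order (first byte is the low byte).
std::uint16_t asUtf16LittleEndian(unsigned char first, unsigned char second) {
    return static_cast<std::uint16_t>(first | (second << 8));
}

// True if the code unit lies in the CJK Unified Ideographs block
// (U+4E00 to U+9FFF).
bool isCjkIdeograph(std::uint16_t unit) {
    return unit >= 0x4E00 && unit <= 0x9FFF;
}
```

With the UTF-16LE bytes for "at" (61 00 74 00), the buffer size is still 4 while cStringLength() reports 1, so a truncated display would not shrink size(). A size() of 102 against an expected 5066 therefore points to missing data rather than a hidden terminator. Going the other way, the UTF-8 bytes 'a' 't' read as one UTF-16LE unit give 0x7461, which is in the CJK block and matches the "series of Chinese characters" seen above.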
-
@RichardC
That is presumably because signal readyReadStandardOutput and function readAllStandardOutput only deliver/read whatever happens to be presently available in some buffer, i.e. chunks, not the complete output. It is expected that you will receive the signal multiple times, with subsequent chunks, and you have to handle that.
My processReadyReadStandardOutput() only has to handle a chunk at a time. You may need to append/collect/buffer for your own purposes, depending on what you want to do with the complete output as a whole.
-
@JonB Thanks for the suggestion, but unfortunately that's not the cause of the problem. I am handling the fact that the output is sent in chunks. Some of the output runs to hundreds of thousands of lines, which gets delivered in hundreds of separate chunks. That's not a problem, and the program handles that without issues.
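For readers who have not dealt with chunked delivery before, one possible way to collect chunks is sketched below in plain C++. LineBuffer is a hypothetical class, not part of Qt and not the code used in this thread. Each chunk would be the bytes from one readAllStandardOutput() call, and only complete lines are handed back.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collects output chunks and returns complete lines.
class LineBuffer {
public:
    // Append one chunk and return every line that is now complete,
    // without its trailing '\n'. Incomplete data stays buffered.
    std::vector<std::string> append(const std::string& chunk) {
        pending_ += chunk;
        std::vector<std::string> lines;
        std::size_t start = 0;
        for (std::size_t nl;
             (nl = pending_.find('\n', start)) != std::string::npos;
             start = nl + 1) {
            lines.push_back(pending_.substr(start, nl - start));
        }
        pending_.erase(0, start);  // keep only the unfinished tail
        return lines;
    }

    // Return and clear whatever is left, e.g. once the process has
    // finished and the last line had no trailing '\n'.
    std::string takeRemainder() {
        std::string rest;
        rest.swap(pending_);
        return rest;
    }

private:
    std::string pending_;
};
```

Splitting at '\n' is safe for UTF-8 because the byte 0x0A never occurs inside a multi-byte sequence, so a character that straddles two chunks is reassembled before the line is decoded.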
The problem is, once it reaches a unicode character the output ends. No further readyReadStandardOutput signals are emitted so no further output arrives.
I'll edit the first post to try and explain the situation better, because I don't think I did a very good job and I've also found out a few things since then.
-
@RichardC Maybe it's a bug in Qt? You can check Qt bug tracker and file a bug report if there isn't any.
-
@RichardC said in Problem processing output from QProcess:
Another interesting observation is that the output from versions of MKVToolNix up to v13 will all print correctly when run from a QProccess, but running v14-20 the output cannot be read from the QProccess.
Hmm, is MKVToolNix an open source program? Can you check what was changed from v13 to v14 and could cause that problem?
but MKVToolNix v20 will print the output correctly when run in a terminal on the Mac, and if it prints to a terminal you would expect Qt to be able to read the output.
I would not bet on this. There are programs out there that check if they run in a shell or with redirected output and behave differently then.
As you know which output you expect, and already created a test program you run under QProcess, could you check what happens if you output the expected test from your test program? That way you could verify if it's a Qt problem and already have a test setup for the bug report.
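As an illustration of how a program can tell a terminal from a pipe or file, here is a minimal POSIX sketch. isTerminal is a made-up wrapper around the real isatty() call; whether MKVToolNix does anything like this has not been established in this thread.

```cpp
#include <unistd.h>  // isatty, STDOUT_FILENO (POSIX)

// Hypothetical wrapper: true if the file descriptor refers to a terminal.
// isatty() returns 1 for a terminal and 0 otherwise.
bool isTerminal(int fd) {
    return isatty(fd) == 1;
}
```

A program could call isTerminal(STDOUT_FILENO) at startup and pick a different output path when the result is false, which is the case both for a QProcess pipe and for output redirected to a file.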
-
@RichardC
As @aha_1980 says, you cannot be sure how a program will behave when launched from a UI vs the command-line/terminal. A couple of thoughts:
- I would try running the MKVToolNix program with output redirected to a file and input (probably) closed, as either of these might affect its output.
- I would see if the environment variables passed from command-line terminal vs Qt UI are the same or different. Could e.g. LC_LANG be having some effect?
-
@aha_1980 said in Problem processing output from QProcess:
@RichardC said in Problem processing output from QProcess:
Hmm, is MKVToolNix an open source program? Can you check what was changed from v13 to v14 and could cause that problem?
It is open source (source here), but I must confess to finding it difficult to understand. I have asked the developer, but he doesn't support the MacOS build. I'm going to try it on Linux and see if it has the same problem, since the Linux version is supported.
I would not bet on this. There are programs out there that check if they run in a shell or with redirected output and behave differently then.
The program is intended to be run from a GUI, so I don't think it would do anything unexpected when run from a GUI. On Windows it behaves the same whether it's run from the command line or from a QProcess.
As you know which output you expect, and already created a test program you run under QProcess, could you check what happens if you output the expected test from your test program? That way you could verify if it's a Qt problem and already have a test setup for the bugreport.
I haven't tried outputting the full expected output from the test program, but Qt appears to have no problem reading Unicode characters from the test program. I could try it, but I don't think the same problem would arise. It seems to be a strange incompatibility between Qt and MKVToolNix on the Mac. I'll try it on Linux first, and then I might try doing the full output from a test program.
@JonB said in Problem processing output from QProcess:
@RichardC
As @aha_1980 says, you cannot be sure how a program will behave when launched from a UI vs the command-line/terminal. A couple of thoughts:
- I would try running the MKVToolNix program with output redirected to a file and input (probably) closed, as either of these might affect its output.
I tried redirecting the MKVToolNix output to a file, and the full output appears including all Unicode characters.
- I would see if the environment variables passed from command-line terminal vs Qt UI are the same or different. Could e.g. LC_LANG be having some effect?
That could possibly be a problem, and I'll have to look into that. For now I'm going to try it on Linux just to see if it works there.
-
@JonB has thrown some more interesting questions into the ring, especially about the environment variables.
I know for example that LC_NUMERIC=C is set if you run a program in Qt Creator's debugger (as the stupid GDB otherwise expects locale dependent decimal points).
That of course has an effect on your program and also on all programs started by QProcess.
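Since a QProcess with no explicit environment uses the environment of the calling process, one way to rule out a missing LANG is to give the parent process a fallback before the child is started. The following is only a sketch using POSIX calls. ensureLang is a hypothetical helper, and the locale value you pass (for example "en_US.UTF-8") is an assumption that has to match a locale actually available on the machine.

```cpp
#include <cstdlib>   // std::getenv
#include <stdlib.h>  // setenv (POSIX)
#include <string>

// Hypothetical helper: if LANG is unset or empty, set it to fallbackLocale
// for this process so that children started afterwards inherit it.
// Returns the LANG value that is in effect after the call.
std::string ensureLang(const std::string& fallbackLocale) {
    const char* current = std::getenv("LANG");
    if (current == nullptr || *current == '\0') {
        setenv("LANG", fallbackLocale.c_str(), 1);  // 1 = overwrite
        return fallbackLocale;
    }
    return current;  // an existing value is left untouched
}
```

In a Qt program the same idea can be limited to the child alone: take QProcessEnvironment::systemEnvironment(), insert() a LANG entry if it is missing, and pass the result to QProcess::setProcessEnvironment() before calling start().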
-
I tried redirecting the MKVToolNix output to a file, and the full output appears including all Unicode characters.
You now have a couple of things you can play with, to discover where your actual problem lies:
- cat the file in a terminal. Do all the characters display correctly?
- Change your QProcess command to cat that file. Do you get the output bytes back correctly or not? This tells you whether it's running the MKVToolNix sub-process or whether it's the content of the output which is problematic.
- Compare the output bytes in the file against what you see in the readyReadStandardOutput (as far as it goes before getting cut off). Are they identical or is there a difference?
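The byte comparison in the last step can be sketched as follows in plain C++. firstDifference is a made-up helper name; it assumes both the file contents and the bytes collected from the process have been loaded into std::string buffers.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: offset of the first byte at which the two buffers
// differ. If one buffer is a prefix of the other, the length of the
// shorter one is returned. Identical buffers give std::string::npos.
std::size_t firstDifference(const std::string& expected,
                            const std::string& received) {
    const std::size_t common = std::min(expected.size(), received.size());
    for (std::size_t i = 0; i < common; ++i) {
        if (expected[i] != received[i]) {
            return i;
        }
    }
    return expected.size() == received.size() ? std::string::npos : common;
}
```

If the returned offset equals the size of the received buffer, the process output is a clean prefix of the file, which means the data stops arriving rather than being altered on the way.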