QByteArray to string?
Yeah, that's precisely the conclusion I came to too! Hence the question for confirmation it really is that obscure :)
Riverbank have obviously already made clear what they want to do, having apparently changed it deliberately from PyQt4 to PyQt5, so there seems little point in asking them. Unless I am mistaken, they are unlikely to respond to a PyQt query from me?
Why wouldn't they? You are not asking them to bring back QString, just what they recommend as best practice to convert a QByteArray to a Python string following your example.
I meant, I thought I had tried posting to the PyQt forum before and just got no replies. Maybe I'm mistaken though...
The standard python3 way to handle this would be:
This is actually to do with changes between Python versions 2 and 3. In Python 2 str could be used for both text and binary data and was considered 'brittle' by the core devs, so in Python 3 it was decided that str (incl. unicode) would be used only for text and bytes would be used for binary (see more here). Therefore, in PyQt5, when a Qt type containing binary data (QByteArray, the clue is in the name) is converted to a native type, bytes is used rather than str, giving the developer the choice of which encoding to use if it is string data.
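A minimal sketch of the str/bytes split described above, using only the standard library. It assumes the binary data has already been obtained as a Python bytes object (in this thread that is what qba.data() hands back); the sample text is made up for illustration.

```python
# In Python 3, text (str) and binary data (bytes) are separate types.
text = "caf\u00e9"              # str: a sequence of Unicode code points
raw = text.encode("utf-8")      # bytes: the same text encoded as UTF-8

# Going back from bytes to str requires naming an encoding explicitly.
decoded = raw.decode("utf-8")

# The two types never compare equal, even for plain ASCII content.
same_type = ("abc" == b"abc")

print(type(raw).__name__, raw)
print(type(decoded).__name__, decoded)
```

The point of the design is that the caller, not the library, decides which encoding applies to the bytes.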
Also, as an active member of the PyQt mailing list, I can say it is normally pretty responsive and helpful so, in future, please think of giving us a second chance :).
Hope this helps :)
Thank you, but I'm sorry, I don't see how. The whole point of the PyQt 4 to 5 changes document (or is it Python 2 to 3, I can't recall) is that it says the QByteArray.decode() method was removed? It's not there if I try to use it. Have you tried your suggestion with PyQt5/Python3?
Apologies, that should have been
(That'll teach me to read things properly...!)
OK, that does work, thank you! Now then, may I ask:
bytes. Where was I supposed to come across documentation for bytes.decode() (e.g. in PyQt?)? [EDIT: I'm a newbie to both Python & Qt. I spend my time looking around the Qt documentation to do this stuff. I'm beginning to guess this is a Python issue, not Qt, but it's a lot to take in!]
(Because of #1) I don't know the arguments to decode(). I have used my utf8 and as far as I can see both work the same. Which is "right"/"preferable"?
Can you comment (briefly :) ) on why str(encoding=...) is preferable/nicer/more Pythonic?
bytes is a Python standard type and is fully documented in the Python docs; the particular information you require re. bytes.decode() can be found here.
- In the documentation linked above you will find a link to Standard Encodings (also part of the Python docs) which will tell you all you ever wanted to know about encodings (and more!). utf8 are simply aliases of one another; both are perfectly acceptable (as detailed/listed in the docs) as are
- Semantics, but Python is considered to be primarily an object-oriented language and therefore you should use an object's own methods (yes, str are objects as are all 'types' in Python) rather than a function. In fact, the str() function just invokes an object's own __str__() method as that defines how the object should be represented as a string (true for all types).
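A small standard-library sketch of the points above: alias spellings resolve to one codec, and the method form and the str(..., encoding=...) form give the same text. The sample bytes are an assumption chosen for illustration. One nuance worth hedging: str() only falls back to the object's own string representation when no encoding is given, so the two call styles are not interchangeable.

```python
import codecs

raw = b"\xc2\xa3"  # UTF-8 bytes for the pound sign (illustrative sample)

# Different spellings of the encoding name resolve to the same codec.
names = {codecs.lookup(alias).name for alias in ("utf8", "utf-8", "UTF8", "utf_8")}

# Method form and constructor form decode to the same str.
via_method = raw.decode("utf8")
via_str = str(raw, encoding="utf-8")

# Without an encoding, str() returns the printable representation of the
# bytes object rather than decoding it.
no_encoding = str(raw)

print(names, via_method, via_str, no_encoding)
```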
Yep, all good stuff, makes sense, thank you very much!
As I edited against #1, I now realise that certain things from Qt via PyQt require me to look at Python documentation rather than Qt.
Since you happen to be here, and are so kind, would you care to comment on one issue which was raised in posts above. In PyQt 4, apparently, you could go s = QString() if you wanted to. Is it indeed correct that in PyQt 5 there really is no such thing as QString anywhere, and you have to deal in Python types like str in every situation? (Doubtless same applies to, say, bytes, and for other such Qt types where you have decided only to allow the Python type.)
Finally, don't suppose you could make Python be just like C# instead for me, then I'd be much happier? ;-)
@jazzycamel long time no see! Thanks for the thorough explanation :-)
Parts of it would be a welcome addition to the PyQt5 documentation.
There is indeed no such thing as QString() in PyQt5. It shouldn't be necessary as the library takes care of type marshalling between the Python and Qt (C++) types. In fact, while there is a QVariant(), it's generally not necessary to use it for the same reason. QByteArray() does exist also, but I would steer clear of it if possible and let PyQt5 deal with via
No, I will never (and no one else should!) ever make Python like C#!! :)
@jazzycamel, or anyone else
qba.data().decode('utf8') as directed, I have now come across a situation where the QByteArray data returned by QProcess.readAllStandardOutput() from an OS command run under Windows causes the Python/PyQt code to generate a UnicodeDecodeError, as detailed in my post https://forum.qt.io/topic/85493/unicodedecodeerror-with-output-from-windows-os-command
This makes it impossible to convert the data, which blocks my whole use case.
My belief is that this would not be happening at all from C++ where I would simply use whatever methods of QString or the language. The problem is precisely that I am being forced to use a "Python/PyQt" way of doing this, causing the error in Python/PyQt only, which is exactly why I didn't want to have to do that but cannot get access to the necessary types/methods of Qt from PyQt...?
Can you show the code you use?
I promise you all you'll see is a QByteArray being returned with the sub-process's output, and I'm trying to convert that to a QString to put into a QTextEdit. That's all the question is. And I get a UnicodeDecodeError, probably when robocopy echoes the name of a file which has that 0x9c character in it via PyQt's
can't decode byte 0x9c in position 32: invalid start byte
So presumably all you have to do is create a QByteArray, put a 0x9c in its first byte, and try qba.data().decode('utf8'). That's what this thread is about.
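The reproduction suggested above does not need Qt at all. A minimal sketch, assuming the QByteArray content is a single raw byte 0x9c and standing in a plain bytes object for it:

```python
# A lone 0x9c byte is a UTF-8 continuation byte, so it cannot start a character.
raw = bytes([0x9C])

try:
    raw.decode("utf8")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# errors="replace" avoids the exception but loses the original character:
# the undecodable byte becomes U+FFFD (the replacement character).
lenient = raw.decode("utf8", errors="replace")

print(error)
print(repr(lenient))
```

Using errors="replace" keeps the program running, but it only hides the symptom; the output still would not show the character the user expects.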
This whole issue where I'm discussing the code is in https://forum.qt.io/topic/85493/unicodedecodeerror-with-output-from-windows-os-command. If you'd be kind enough to look at that, I think that's a more appropriate place to discuss the code than here? If you still want more code there, let me know, and I'll supply.
I don't have a Windows machine at hand. Doing this on macOS yields correct results
>>> from PyQt5.QtCore import QByteArray
>>> ba = QByteArray()
>>> ba.append(u"\u009C")
PyQt5.QtCore.QByteArray(b'\xc2\x9c')
>>> ba.data().decode('utf-8')
'\x9c'
>>> ba.data().decode('utf-16')
'鳂'
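A standard-library sketch of what the session above appears to do, under the assumption that appending a str stores its UTF-8 encoding (which matches the b'\xc2\x9c' shown). It illustrates that the test data is two bytes of valid UTF-8, not the single raw byte 0x9c that a sub-process writes.

```python
as_text = "\u009c"                      # the code point U+009C as text
appended = as_text.encode("utf-8")      # two bytes: valid UTF-8 for U+009C
on_the_wire = bytes([0x9C])             # one raw byte, as in the error report

round_trip = appended.decode("utf-8")   # succeeds, because the input is valid UTF-8

# Reading the same two bytes as little-endian UTF-16 gives one unrelated character.
as_utf16 = appended.decode("utf-16-le")

print(appended, on_the_wire, repr(round_trip), hex(ord(as_utf16)))
```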
I'm afraid I don't believe that relates to the situation.
I now have information from the client:
The exception occurs (only) when a filename --- robocopy is echoing filenames as it goes --- contains the £ (UK pound sterling) character (I am in the UK, you may not be). In that situation, QProcess.readAllStandardOutput()) results in:
Unhandled Exception: 'utf-8' codec can't decode byte 0x9c in position 32: invalid start byte
<class 'UnicodeDecodeError'>
File "C:\HJinn\widgets\messageboxes.py", line 289, in processReadyReadStandardOutput
output = output.data().decode('utf-8')
Now, armed with that information:
- In a Command Prompt I type in:
echo £ > file
- I dump the file and I see:
9C 20 0D 0A
- So the £ character is a single byte with value 0x9C
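The byte dump above can be cross-checked against Python's bundled codecs. This is only a sketch: the list of encodings is a selection made for illustration, not something the thread prescribes.

```python
pound = "\u00a3"  # the pound sterling sign

# How the same character is represented in a few common encodings.
encodings = ("cp850", "cp437", "cp1252", "latin-1", "utf-8")
table = {name: pound.encode(name) for name in encodings}

# The four dumped bytes (9C 20 0D 0A) decoded with the code page that maps 0x9C to the pound sign.
dumped = bytes([0x9C, 0x20, 0x0D, 0x0A])
line = dumped.decode("cp850")

for name, encoded in table.items():
    print(name, encoded.hex())
print(repr(line))
```

Only the DOS-style code pages in this selection produce the single byte 0x9C; the Windows "ANSI" and Latin-1 encodings use 0xA3, and UTF-8 uses two bytes.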
What do you get if you use unicode_escape in place of
I don't know, because I don't have access to the code right now, but I will tomorrow.
Thank you, your suggestion is much more like what I have been looking for. We are now discussing the argument to
- I believe utf-8 is definitely right for Linux, where I develop.
- I'm beginning to learn (whether I like it or not) that it is not for Windows.
- Under Windows utf-8 does work 99% of the time, but not always, and now I know not for the £ character.
- I believe that either windows_1252 may be able to handle this correctly.
- I will also try your unicode_escape if you think it's worthwhile.
I believe what I am seeking from you is: Haven't I seen that Qt has some function to "get the current system encoding", but I can't spot it?
Then my code would be:
and everything would just work....
[EDIT: Ooohhhh, is http://doc.qt.io/qt-5/qtextcodec.html#codecForLocale what I'm looking for, perhaps?
Returns a pointer to the codec most suitable for this locale.
On Windows, the codec will be based on a system locale. On Unix systems, the codec will might fall back to using the iconv library if no builtin codec for the locale can be found.
Or, was I thinking of the Python
But that seems filename-specific, my output could be anything, not especially file names.
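For reference, a sketch of the standard-library calls that report "the system's" encodings, plus a hypothetical helper (decode_output is not from the thread) that uses one of them as a default. Whether either value matches what a console program actually writes is an assumption that has to be checked per platform; the thread's later finding is that on Windows it may not.

```python
import locale
import sys

# Encoding Python considers preferred for text data in the current locale.
preferred = locale.getpreferredencoding(False)

# Encoding Python uses to convert file names; specific to file names.
filesystem = sys.getfilesystemencoding()


def decode_output(raw, encoding=None):
    """Hypothetical helper: decode sub-process bytes, never raising on bad input."""
    return raw.decode(encoding or preferred, errors="replace")


print(preferred, filesystem)
print(decode_output(b"plain ascii"))
print(decode_output(bytes([0x9C]), "cp850"))
```

Passing the encoding explicitly, as in the last call, keeps the helper usable when the locale default turns out to be the wrong guess.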
[This post cross-posted to https://forum.qt.io/topic/85493/unicodedecodeerror-with-output-from-windows-os-command/18 ]
For the record, I have done exhaustive investigation, and there is only one solution which "correctly" displays the £ character under Windows. I am exhausted so will keep this brief:
To create a file name with a £ in it: Go into, say, Notepad and use its Save to name a file like abc£.txt. This is in the UK, using a UK keyboard and a standard UK-configured Windows.
Note that at this point if you view the filename in either Explorer or, say, via dir you do see a £, not some other character. That's what my user will want to see in the output of the command he will run.
Run an OS command like dir, which will include the filename in its output.
Read the output with QProcess.readAllStandardOutput(). I'm saying the £ character will arrive as a single byte of value 0x9c.
For the required Python/PyQt decoding QByteArray->QString) line, the only thing which works (does not raise an exception) AND represents the character as a
That is the "Code Page 850", used in UK/Western Europe (so I'm told). It is the output you get if you open a Command Prompt and execute just
Any other decoding either raises an exception (e.g. utf-8) or decodes but represents it with another character (e.g. if
I still haven't found a way of getting that cp850 encoding name programmatically from anywhere --- if you ask Python for, say, the "system encoding" or "user's preferred encoding" you get the cp1252 --- so I've had to hard-code it. [EDIT: If you want it, it's
So there you are. I don't have C++ as opposed to Python for Qt, but I have a suspicion that if anyone tries it using the straight C++ Qt way of text = QString(process.readAllStandardOutput()) they'll find they do not actually get to see the
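The outcome described in this post can be checked without Qt. The sketch below decodes the single byte 0x9c with three candidate encodings, and adds a hypothetical helper (oem_encoding is not from the thread) that asks Windows for its OEM code page through the Win32 GetOEMCP call via ctypes. That the OEM code page is the one a given console command writes in is an assumption here, not something the thread confirms.

```python
import sys


def oem_encoding(default="cp850"):
    """Hypothetical helper: name the Windows OEM code page, else return default."""
    if sys.platform == "win32":
        import ctypes  # ctypes.windll is only available on Windows
        return "cp{}".format(ctypes.windll.kernel32.GetOEMCP())
    return default  # assumption: caller supplies a sensible fallback elsewhere


raw = bytes([0x9C])
candidates = {}
for name in ("cp850", "cp1252", "utf-8"):
    try:
        candidates[name] = raw.decode(name)
    except UnicodeDecodeError:
        candidates[name] = None  # this encoding cannot decode the byte at all

print(oem_encoding())
print(candidates)
```

Of the three candidates, only cp850 yields the pound sign; cp1252 decodes without error but to a different character, and utf-8 refuses the byte.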