QByteArray to string?

    jazzycamel
    wrote on 16 Nov 2017, 16:57 last edited by
    #13

    @JNBarchan
    Apologies, that should have been

    print(qba.data().decode('utf8'))
    

    (That'll teach me to read things properly...!)

    For the avoidance of doubt:

    1. All my code samples (C++ or Python) are tested before posting
    2. As of 23/03/20, my Python code is formatted to PEP-8 standards using black from the PSF (https://github.com/psf/black)
      JonB
      wrote on 16 Nov 2017, 17:24 last edited by JonB
      #14

      @jazzycamel
      OK, that does work, thank you! Now then, may I ask:

      1. QByteArray.data() returns bytes. Where was I supposed to come across documentation for bytes.decode() (e.g. in PyQt?)? [EDIT: I'm a newbie to both Python & Qt. I spend my time looking around the Qt documentation to do this stuff. I'm beginning to guess this is a Python issue, not Qt, but it's a lot to take in!]

      2. (Because of #1) I don't know the arguments to decode(). I have used my utf-8 and your utf8 and as far as I can see both work the same. Which is "right"/"preferable"?

      3. Can you comment (briefly :) ) on why decode() vs str(encoding=...) is preferable/nicer/more Pythonic?

        jazzycamel
        wrote on 16 Nov 2017, 17:40 last edited by jazzycamel
        #15
        1. bytes is a Python standard type and is fully documented in the Python docs; the particular information you require re. bytes.decode() can be found here.
        2. In the documentation linked above you will find a link to Standard Encodings (also part of the Python docs), which will tell you all you ever wanted to know about encodings (and more!). utf-8 and utf8 are simply aliases of one another; both are perfectly acceptable (as listed in the docs), as are U8 and UTF (I think...!).
        3. Semantics, but Python is considered to be primarily an object-oriented language, and therefore you should use an object's own methods (yes, bytes and str are objects, as are all 'types' in Python) rather than a function. In fact, when called with a single argument, the str() function just invokes an object's own __str__() method, as that defines how the object should be represented as a string (true for all types). When called on bytes with an encoding argument, it decodes instead, giving the same result as bytes.decode().
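
Both points can be checked with the standard library alone. The sketch below is an illustration rather than part of the original reply; the sample string and the variable names are made up for the purpose.

```python
import codecs

# Hypothetical sample data: any UTF-8 encoded bytes would do.
raw = "£5".encode("utf-8")

# The different spellings are aliases that resolve to one codec.
codec_names = {codecs.lookup(alias).name for alias in ("utf-8", "utf8", "U8", "UTF")}

# Decoding through the method and through str(..., encoding=...) is equivalent.
via_method = raw.decode("utf-8")
via_str = str(raw, encoding="utf-8")

# Without an encoding, str() gives the repr-style text of the bytes object.
no_encoding = str(b"abc")

print(codec_names, via_method == via_str, no_encoding)
```

So the choice between decode() and str(encoding=...) is mostly one of style; the thing to avoid is str() on bytes with no encoding at all.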

          JonB
          wrote on 16 Nov 2017, 17:58 last edited by
          #16

          @jazzycamel
          Yep, all good stuff, makes sense, thank you very much!

          As I edited against #1, I now realise that certain things from Qt via PyQt require me to look at Python documentation rather than Qt.

          Since you happen to be here, and are so kind, would you care to comment on one issue which was raised in posts above. In PyQt 4, apparently, you could go s = QString() if you wanted to. Is it indeed correct that in PyQt 5 there really is no such thing as QString anywhere, and you have to deal in Python types like str in every situation? (Doubtless same applies to, say, QByteArray type and bytes, and for other such Qt types where you have decided only to allow the Python type.)

          Finally, don't suppose you could make Python be just like C# instead for me, then I'd be much happier? ;-)

            SGaist
            Lifetime Qt Champion
            wrote on 16 Nov 2017, 19:52 last edited by
            #17

            @jazzycamel long time no see! Thanks for the thorough explanation :-)
            Parts of it would be a welcome addition to the PyQt5 documentation.

            Interested in AI ? www.idiap.ch
            Please read the Qt Code of Conduct - https://forum.qt.io/topic/113070/qt-code-of-conduct

              jazzycamel
              wrote on 16 Nov 2017, 20:02 last edited by
              #18

              @JNBarchan
              There is indeed no such thing as QString() in PyQt5. It shouldn't be necessary, as the library takes care of type marshalling between the Python and Qt (C++) types. In fact, while there is a QVariant(), it's generally not necessary to use it, for the same reason. QByteArray() does exist also, but I would steer clear of it if possible and let PyQt5 deal with it via bytes().

              No, I will never (and no one else should!) ever make Python like C#!! :)

                JonB
                wrote on 30 Nov 2017, 09:47 last edited by
                #19

                @jazzycamel , or anyone else

                Having implemented qba.data().decode('utf8') as directed, I have now come across a situation where the QByteArray data returned by QProcess.readAllStandardOutput() from an OS command run under Windows causes the Python/PyQt code to generate a UnicodeDecodeError error, as detailed in my post https://forum.qt.io/topic/85493/unicodedecodeerror-with-output-from-windows-os-command

                This makes it impossible to convert the data, blocking the whole behaviour of my usage.

                My belief is that this would not be happening at all from C++, where I would simply use whatever methods of QByteArray/QString or the language are available. The problem is precisely that I am being forced to use a "Python/PyQt" way of doing this, causing the error in Python/PyQt only, which is exactly why I didn't want to have to do that, but I cannot get access to the necessary types/methods of Qt from PyQt...?

                  SGaist
                  Lifetime Qt Champion
                  wrote on 30 Nov 2017, 14:16 last edited by
                  #20

                  Can you show the code you use?

                    JonB
                    wrote on 30 Nov 2017, 15:59 last edited by JonB
                    #21

                    @SGaist
                    I promise you all you'll see is a QByteArray being returned with the sub-process's output, and I'm trying to convert that to a QString to put into a QTextEdit. That's all the question is. And I get a UnicodeDecodeError, probably when robocopy echoes the name of a file which has that 0x9c character in it via PyQt's decode():

                    can't decode byte 0x9c in position 32: invalid start byte
                    

                    So presumably all you have to do is create a QByteArray, put a 0x9c in its first byte, and try qba.data().decode('utf8'). That's what this thread is about.

                    This whole issue where I'm discussing the code is in https://forum.qt.io/topic/85493/unicodedecodeerror-with-output-from-windows-os-command. If you'd be kind enough to look at that, I think that's a more appropriate place to discuss the code than here? If you still want more code there, let me know, and I'll supply.
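
The failure described above can be reproduced without Qt or a sub-process. This is a minimal sketch, assuming that qba.data() hands back a plain Python bytes object, so a bytes literal stands in for the QByteArray:

```python
# Assumption: QByteArray.data() returns bytes, so this literal plays the
# role of the data read from the sub-process.
raw = b"\x9c"

try:
    raw.decode("utf8")
    error = None
except UnicodeDecodeError as exc:
    error = exc  # keep a reference; 'exc' is cleared when the block ends

print(error)
```

The byte 0x9c can only appear in UTF-8 as a continuation byte, never as the first byte of a character, which is what "invalid start byte" is reporting.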

                      SGaist
                      Lifetime Qt Champion
                      wrote on 30 Nov 2017, 20:01 last edited by
                      #22

                      I don't have a Windows machine at hand. Doing this on macOS yields correct results:

                      >>> from PyQt5.QtCore import QByteArray
                      >>> ba = QByteArray()
                      >>> ba.append(u"\u009C")
                      PyQt5.QtCore.QByteArray(b'\xc2\x9c')
                      >>> ba.data().decode('utf-8')
                      '\x9c'
                      >>> ba.data().decode('utf-16')
                      '鳂'
                      

                        JonB
                        wrote on 30 Nov 2017, 20:19 last edited by JonB
                        #23

                        @SGaist
                        I'm afraid I don't believe that relates to the situation.

                        I now have information from the client:

                        The exception occurs (only) when a filename robocopy encounters --- robocopy is echoing filenames as it goes --- contains the £ (UK pound sterling) character (I am in the UK, you may not be). In that situation, ba.data().decode('utf-8') (where ba is the QByteArray from QProcess.readAllStandardOutput()) results in:

                        Unhandled Exception:
                        
                        'utf-8' codec can't decode byte 0x9c in position 32: invalid start byte
                        
                        <class 'UnicodeDecodeError'>
                        File "C:\HJinn\widgets\messageboxes.py", line 289, in processReadyReadStandardOutput
                        output = output.data().decode('utf-8')
                        

                        Now, armed with that information:

                        • In a Command Prompt I type in: echo £ > file
                        • I dump the file and I see: 9C 20 0D 0A
                        • So the £ character is single byte with value 0x9C
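
One way to narrow down where a single 0x9C byte could come from is to encode the pound sign under a handful of codecs and compare. The list of candidates below is an assumption chosen for illustration, not something taken from the posts above:

```python
# Encode the pound sign under a few codecs to see which ones yield 0x9C.
candidates = ("utf-8", "latin-1", "cp1252", "cp437", "cp850")
encoded = {name: "£".encode(name) for name in candidates}

for name, data in encoded.items():
    print(f"{name:8} {data.hex()}")

# Codecs whose encoding of the pound sign is exactly the byte seen in the dump.
matches = [name for name, data in encoded.items() if data == b"\x9c"]
print(matches)
```

Only the OEM (console) code pages in this list produce 0x9C, which points away from both UTF-8 and the Windows "ANSI" code page as the encoding of the command's output.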
                          SGaist
                          Lifetime Qt Champion
                          wrote on 30 Nov 2017, 20:24 last edited by
                          #24

                          What do you get if you use unicode_escape in place of utf-8?

                            JonB
                            wrote on 30 Nov 2017, 20:30 last edited by
                            #25

                            @SGaist
                            I don't know, because I don't have access to the code right now, but I will tomorrow.

                            Thank you, your suggestion is much more like what I have been looking for. We are now discussing the argument to decode():

                            • I believe utf-8 is definitely right for Linux, where I develop.
                            • I'm beginning to learn (whether I like it or not) that it is not for Windows.
                            • Under Windows utf-8 does work 99% of the time, but not always, and now I know not for the £ character.
                            • I believe that either latin-1 or windows_1252 may be able to handle this correctly.
                            • I will also try your unicode_escape if you think it's worthwhile.
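
The candidates in the list above can be compared directly on the offending byte. A rough sketch, using a bytes literal in place of the real process output:

```python
raw = b"\x9c"  # the byte reported in the traceback

results = {}
for name in ("utf-8", "latin-1", "windows_1252", "unicode_escape"):
    try:
        results[name] = raw.decode(name)
    except UnicodeDecodeError:
        results[name] = None  # this codec rejects the byte

for name, text in results.items():
    print(name, repr(text))
```

Decoding without an exception is not the same as decoding correctly: the codecs that accept the byte here map it to a control character or to a different letter, not to the pound sign.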
                              JonB
                              wrote on 30 Nov 2017, 20:40 last edited by JonB
                              #26

                              @SGaist
                              I believe what I am seeking from you is: Haven't I seen that Qt has some function to "get the current system encoding", but I can't spot it?

                              Then my code would be:

                              ba.data().decode(Qt.getCurrentSystemEncoding())

                              and everything would just work....

                              [EDIT: Ooohhhh, is http://doc.qt.io/qt-5/qtextcodec.html#codecForLocale what I'm looking for, perhaps?

                              QTextCodec *QTextCodec::codecForLocale()

                              Returns a pointer to the codec most suitable for this locale.

                              On Windows, the codec will be based on a system locale. On Unix systems, the codec might fall back to using the iconv library if no builtin codec for the locale can be found.

                              Or, was I thinking of the Python sys.getfilesystemencoding() https://docs.python.org/3/library/sys.html#sys.getfilesystemencoding ?
                              But that seems filename-specific; my output could be anything, not especially file names.]
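
For reference, these are the encodings Python itself will report. The sketch only prints them; whether any of them matches what a given child process writes to its output is a separate question, and none of them is guaranteed to.

```python
import locale
import sys

# Each of these answers a slightly different question.
preferred = locale.getpreferredencoding(False)      # the locale's preferred text encoding
filesystem = sys.getfilesystemencoding()            # used to convert file names
stdout_enc = getattr(sys.stdout, "encoding", None)  # this process's own stdout, if any

print(preferred, filesystem, stdout_enc)
```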

                                JonB
                                wrote on 1 Dec 2017, 15:35 last edited by JonB 12 Apr 2017, 08:34
                                #27

                                [This post cross-posted to https://forum.qt.io/topic/85493/unicodedecodeerror-with-output-from-windows-os-command/18 ]

                                For the record, I have done exhaustive investigation, and there is only one solution which "correctly" displays the £ character under Windows. I am exhausted so will keep this brief:

                                1. To create a file name with a £ in it: Go into, say, Notepad and use its Save to name a file like abc£.txt. This is in the UK, using a UK keyboard and a standard UK-configured Windows.

                                2. Note that at this point if you view the filename in either Explorer or, say, via dir you do see a £, not some other character. That's what my user will want to see in the output of the command he will run.

                                3. Run an OS command like robocopy or even dir, which will include the filename in its output.

                                4. Read the output with QProcess.readAllStandardOutput(). I'm saying the £ character will arrive as a single byte of value 0x9c.

                                5. For the required Python/PyQt decoding bytes->str (QByteArray->QString) line, the only thing which works (does not raise an exception) AND represents the character as a £ is: ba.data().decode("cp850").

                                That is "Code Page 850", used in UK/Western Europe (so I'm told). It is what you get as output if you open a Command Prompt and execute just chcp.

                                Any other decoding either raises UnicodeDecodeError (e.g. if utf-8) or decodes but represents it with another character (e.g. if windows_1252 or cp1252).

                                I still haven't found a way of getting that cp850 encoding name programmatically from anywhere --- if you ask Python for, say, the "system encoding" or "user's preferred encoding" you get the cp1252 --- so I've had to hard-code it. [EDIT: If you want it, it's ctypes.cdll.kernel32.GetConsoleOutputCP().]

                                So there you are. I don't have C++ as opposed to Python for Qt, but I have a suspicion that if anyone tries it using the straight C++ Qt way of text = QString(process.readAllStandardOutput()) they'll find they do not actually get to see the £ symbol....
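
The GetConsoleOutputCP() call mentioned in the edit above could be wrapped roughly as follows. This is a sketch under assumptions, not tested against the original application: console_output_encoding is a hypothetical helper name, it goes through ctypes.windll rather than ctypes.cdll, it treats a return value of 0 as failure, and the fallback for other platforms is a choice made here.

```python
import codecs
import ctypes
import locale
import sys


def console_output_encoding() -> str:
    """Return a codec name for the console's output code page (hypothetical helper)."""
    if sys.platform == "win32":
        code_page = ctypes.windll.kernel32.GetConsoleOutputCP()
        # Assumption: 0 is treated as failure, e.g. when no console is attached.
        if code_page:
            return f"cp{code_page}"
    # Assumption: elsewhere, or on failure, fall back to the locale's preference.
    return locale.getpreferredencoding(False)


encoding = console_output_encoding()
codecs.lookup(encoding)  # raises LookupError if Python has no such codec
print(encoding)
```

The result would then be passed to decode() in place of the hard-coded "cp850". A GUI application started without a console may well hit the fallback branch, so the hard-coded value might still be needed as a last resort.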

                                  germyrinn
                                  wrote on 17 Mar 2020, 07:32 last edited by
                                  #28

                                  Python makes a clear distinction between bytes and strings. Bytes objects contain raw data (a sequence of octets), whereas strings are Unicode sequences. Conversion between these two types is explicit: you encode a string to get bytes, specifying an encoding (which defaults to UTF-8); and you decode bytes to get a string. Clients of these functions should be aware that such conversions may fail, and should consider how failures are handled.

                                  We can convert bytes to a string using the decode() instance method of the bytes class, so you need to decode the bytes object to produce a string. In Python 3, the default encoding is "utf-8", so you can use it directly:

                                  b"python byte to string".decode("utf-8")
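
On the point that conversions may fail: decode() takes an errors argument that controls what happens to bytes the codec cannot handle. A small sketch with a made-up byte string:

```python
raw = b"abc\x9c.txt"  # hypothetical output containing a byte that is not valid UTF-8

strict_failed = False
try:
    raw.decode("utf-8")  # errors="strict" is the default
except UnicodeDecodeError:
    strict_failed = True

replaced = raw.decode("utf-8", errors="replace")          # U+FFFD marks the bad byte
escaped = raw.decode("utf-8", errors="backslashreplace")  # keeps the byte value visible

print(strict_failed, replaced, escaped)
```

Neither handler recovers the intended character; they only keep the rest of the text usable when the correct encoding is not known.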
                                  
                                    JonB
                                    wrote on 17 Mar 2020, 08:16 last edited by
                                    #29

                                    @germyrinn
                                    Hi, this was an old post of mine.

                                    As I wrote, the problem is that for the £ sign e.g. read from a file created in the way I describe, decode("utf-8") gives me a UnicodeDecodeError. I found the only conversion which works is decode("cp850").
