QByteArray and char type
-
Good morning @J-Hilk,
that said, std::byte would make that Qt Version require c++17 or later. I'm not sure, that's ok, or not ?
That is no problem, as Qt 6 requires C++17.
But as std::byte also has a limited scope (no arithmetic), I'm not sure it is a general solution...
-
Hi @J-Hilk ,
we've missed you at the 2019 Contributors' Summit ;)
-
@J-Hilk said in QByteArray and char type:
@JKSH said in QByteArray and char type:
I don't see much point in switching from char to unsigned char. If we're to initiate a switch, let's do things properly and switch to std::byte.
if someone(tm) were to make the changes, like @aha_1980 suggested in this bug report: https://bugreports.qt.io/browse/QTBUG-64746
you would prefer std::byte over unsigned char ?
Actually, I take that back. I just tried playing with std::byte and found that it's not easy to work with:
    std::byte b = 0xFF; // Error: cannot initialize a variable of type 'std::byte' with an rvalue of type 'int'
    auto x = std::byte{0xFF};
    auto y = uchar{0xFF};
    qDebug() << (x == y); // Error: Invalid operands to binary expression
We also can't pass std::byte to a function that expects unsigned char without casting, so it isn't any more interoperable than the existing char.
Because Thiago was against scope creep, and adding std::byte and unsigned char probably falls in that category
I was originally thinking of adding functions that operate on std::byte and omitting functions that operate on unsigned char. I'm no longer convinced that's helpful.
std::byte would make that Qt Version require c++17 or later. I'm not sure, that's ok, or not ?
As @aha_1980 pointed out, this part isn't an issue.
The bigger issue is reaching a consensus on how far we should go:
- Thiago Macieira wants to keep things as-is but is open to letting "someone(TM)" add a few convenience functions for interop with unsigned char.
- Marc Mutz wants to rework QByteArray completely to use std::byte under the hood: https://lists.qt-project.org/pipermail/development/2020-May/039532.html
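For readers who want to see what "casting" means in practice here, the following is a minimal sketch (C++17, no Qt) of the explicit conversions std::byte requires. The helper names toUnsigned, equalsUChar and fromUChar are made up for illustration; std::to_integer and the bitwise operators on std::byte are standard.

```cpp
#include <cstddef>  // std::byte, std::to_integer

// Hypothetical helper: read a std::byte as an unsigned number.
constexpr unsigned int toUnsigned(std::byte b)
{
    return std::to_integer<unsigned int>(b);
}

// Hypothetical helper: compare a std::byte with an unsigned char.
// A direct (b == u) does not compile, so one side is converted explicitly.
constexpr bool equalsUChar(std::byte b, unsigned char u)
{
    return std::to_integer<unsigned char>(b) == u;
}

// Hypothetical helper: build a std::byte from an unsigned char.
constexpr std::byte fromUChar(unsigned char u)
{
    return static_cast<std::byte>(u);
}
```

With these, std::byte x{0xFF} satisfies equalsUChar(x, 0xFF), and masking still works with the built-in operators, e.g. toUnsigned(x & std::byte{0x0F}).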
@aha_1980 said in QByteArray and char type:
Hi @J-Hilk ,
we've missed you at the 2019 Contributors' Summit ;)
There's also the blog post at https://www.qt.io/blog/first-qt-6.0-snapshot-available
P.S. Anyone signed up for the virtual Qt World Summit? :-D
-
@JKSH said in QByteArray and char type:
Actually, I take that back. I just tried playing with std::byte and found that it's not easy to work with:
You guys know more about C++ than I, but my reading of std::byte() is that it is effectively just a representation of an 8-bit pattern. You are not supposed to do any arithmetic on it, or natively compare it to unsigned char etc. It's just a "blob" of data?
-
@JonB said in QByteArray and char type:
my reading of std::byte() is that it is effectively just a representation of an 8-bit pattern.
I agree.
(Caveat: A byte is defined as the smallest accessible unit of data in memory. It's usually 8 bits on today's common architectures, but it doesn't actually have to be 8 bits.)
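As a small illustration of that caveat, the number of bits in a byte is exposed as CHAR_BIT in <climits>. This sketch only reports what the current platform uses; the function names are made up for illustration.

```cpp
#include <climits>  // CHAR_BIT, UCHAR_MAX

// The standard only guarantees at least 8 bits per byte.
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");

// Hypothetical helper: bits per byte on this platform.
constexpr int bitsPerByte()
{
    return CHAR_BIT;
}

// Hypothetical helper: largest value an unsigned char can hold
// (255 when CHAR_BIT == 8).
constexpr unsigned int maxByteValue()
{
    return UCHAR_MAX;
}
```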
You are not supposed to do any arithmetic on it
I agree. And I think programmers shouldn't normally try to do arithmetic on QByteArray elements either. (Exception: If you have a low-level efficiency hack in mind, you really know what you're doing, and you document it clearly, then go ahead)
...or natively compare it to unsigned char etc. It's just a "blob" of data?
Wasn't your original point of this thread that a "blob" of data should be unsigned char?
-
@J-Hilk said in QByteArray and char type:
I'm not a source code contributor (yet :) )
I don't think that's a precondition. You contribute in many other places.
And you have good knowledge of the library and a vision of where it should go.
And that's what counts :)
Regards
-
@aha_1980 said in QByteArray and char type:
@J-Hilk said in QByteArray and char type:
I'm not a source code contributor (yet :) )
I don't think that's a precondition. You contribute in many other places.
And you have good knowledge of the library and a vision of where it should go.
And that's what counts :)
+1 @J-Hilk is definitely a Contributor to the Qt community.
-
@JKSH said in QByteArray and char type:
I agree. And I think programmers shouldn't try to do arithmetic on QByteArray elements either.
That's fine if I receive some QByteArray data and just want to store it/forward it onto something else. It's not fine if I need to look at its content and act on it for some purpose. Then I may need to, say, see if it's greater than 200 or whatever. At which point I think I need to cast away from std::byte() to achieve that.
Wasn't your original point of this thread that a "blob" of data should be unsigned char?
I was not the person who introduced the discussion about representing it via std::byte, for good or for bad! I want to be able to examine the bytes and do, for example, greater-than operations on them. For that, my original point was that I did not expect something referring to "bytes" --- using at least what I have found usage of that word in other languages to be, viz. an unsigned quantity in range 0--255 --- to have an interface only offering (signed) chars; I expected unsigned chars to be available. Else one must be careful about comparison code, for instance.
-
Let me bring even more confusion into this and point to Timur Doumler's excellent talk at CppCon 2019 about type punning, where he outlines that this:
    #include <iostream>

    void printBitRepresentation(float f)
    {
        auto* buf = reinterpret_cast<unsigned char*>(&f);
        for (int i(0); i < sizeof(float); i++) {
            std::cout << buf[i];
        }
    }
is actually undefined behavior.
https://youtu.be/_qzMpk-22cc?t=2626
@JKSH thanks :D
-
@J-Hilk
I did have a look at that (frightening) discussion. I was "perturbed" by the answer that you have to rely on what he said was a "magic" implementation of memcpy(), which you can't know anything about, to achieve it! And didn't really understand how that resolves whatever the issue is anyway.
-
@J-Hilk
I still didn't understand how using memcpy() between addresses (void * received by memcpy()) resolved the problem, as opposed to just moving it elsewhere. Perhaps I would have had to read the whitepaper he showed if I wanted to understand. Unless you feel like explaining why memcpy() from one address to another, and then back in code accessing the destination address as an unsigned char * but not so for the source address, would make it "work correctly"...?
-
@JonB well as I understand it:
reinterpret_cast does not change the pointer. You previously pointed to the float object, and after the reinterpret_cast you still do. And now you want to do pointer arithmetic on that object, which is undefined behavior.
Now with memcpy you actually copy the bytes from one pointer to another. How that's done, only the compiler vendor knows :D but after the copy you have defined behavior, because the char array is actually there!
But it makes no difference
take a look at this Compiler Explorer output: https://gcc.godbolt.org/z/7673av
the 2 functions produce identical assembler code
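A minimal sketch of the memcpy variant described above, assuming the goal is just to inspect the bytes of a float. The names bytesOf and floatFrom are made up for illustration. The bytes are copied into a real array of unsigned char, so indexing that array is ordinary array access.

```cpp
#include <array>
#include <cstring>  // std::memcpy

// Hypothetical helper: copy the object representation of a float
// into an actual array of unsigned char.
std::array<unsigned char, sizeof(float)> bytesOf(float f)
{
    std::array<unsigned char, sizeof(float)> buf{};
    std::memcpy(buf.data(), &f, sizeof(float));
    return buf;
}

// Hypothetical helper: the reverse direction, rebuilding the float
// from the copied bytes.
float floatFrom(const std::array<unsigned char, sizeof(float)>& buf)
{
    float f;
    std::memcpy(&f, buf.data(), sizeof(float));
    return f;
}
```

A loop such as for (unsigned char c : bytesOf(3.14f)) then visits each byte in turn, and floatFrom(bytesOf(x)) gives back the same value for ordinary (non-NaN) inputs.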
-
@J-Hilk
I do realize in practice the code generation is OK. Not my point. memcpy() takes a void *src and a void *dest. It doesn't know what they point to. It copies a number of bytes from one area to the other. Now afterward back in your code you are allowed to access/array the bytes at dest *, yet not at src *. Makes no sense to me....
-
@JonB said in QByteArray and char type:
I still didn't understand how using memcpy() between addresses (void * received by memcpy()) resolved the problem, as opposed to just moving it elsewhere.
Technically it does because black magic™. You have that kind of nonsense sprinkled all around the standard, it just doesn't get too much exposure. To give you an example through a simple question:
What's the actual type of a lambda function?
Or to expand:
That is, how does one define that a function is going to take a lambda as parameter?
Conventional wisdom is to use the STL (std::function). The ideological problem is that the latter is a template which needs to have a specified type as a template parameter, however a lambda has an undefined type, so the instantiation happens with the magic ClosureType, which is implementation defined.
Here's how the Callable magic works:
https://en.cppreference.com/w/cpp/named_req/Callable
Basically you define a Callable as anything that can be used through the STL's related types, but then the STL types require the template argument to be callable to make the instantiation - so it boils down to compiler incantations. (I'm not talking about the way compilers implement this though, just the ideas and the wording).
PS. As a side note the lambdas are inlined extremely aggressively by the compiler. In release you don't get even a notion of such a construct.
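To make the lambda question concrete, here is a minimal sketch of the two usual ways to accept one as a parameter. The names applyTemplate, applyErased, first and second are made up for illustration. Note that std::function is instantiated on the call signature (here int(int)); the unnamed closure type is only seen by its converting constructor, or is deduced directly in the template version.

```cpp
#include <functional>
#include <type_traits>
#include <utility>

// Two lambdas never share a type, even with identical bodies:
// each lambda expression gets its own unnamed closure type.
const auto first = [](int v) { return v; };
const auto second = [](int v) { return v; };
static_assert(!std::is_same<decltype(first), decltype(second)>::value,
              "each lambda has a distinct closure type");

// Option 1: let the compiler deduce the closure type.
template <typename Callable>
int applyTemplate(Callable&& fn, int x)
{
    return std::forward<Callable>(fn)(x);
}

// Option 2: type-erase any callable that matches int(int).
int applyErased(const std::function<int(int)>& fn, int x)
{
    return fn(x);
}
```

Both accept a lambda at the call site, e.g. applyTemplate([](int v) { return v + 1; }, 1). The template version keeps the concrete type visible to the optimizer, while std::function trades that for a single, nameable parameter type.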
-
@JonB said in QByteArray and char type:
It's not fine if I need to look at its content and act on it for some purpose. Then I may need to, say, see if it's greater than 200 or whatever.
...
I want to be able to examine the bytes and do, for example, greater-than operations on them
...
I expected unsigned chars to be available. Else one must be careful about comparison code, for instance.
I think we have divergent ideas on what a byte is and what we expect of them. May I ask,
1. What is your detailed definition of a byte?
2. Can you provide a concrete example where you'd want to check that a byte is greater than 200 or whatever? (And I mean a byte, not a number, not an ASCII character)
3. Does unsigned char fit your definition in #1?
4. Does std::byte fit your definition in #1?
-
@J-Hilk said in QByteArray and char type:
Let me bring even more confusion into this and point to Timur Doumler's excellent talk at CppCon 2019 about type punning, where he outlines that this:
    #include <iostream>

    void printBitRepresentation(float f)
    {
        auto* buf = reinterpret_cast<unsigned char*>(&f);
        for (int i(0); i < sizeof(float); i++) {
            std::cout << buf[i];
        }
    }
is actually undefined behavior.
https://youtu.be/_qzMpk-22cc?t=2626
Wow, that's wild.
The same kind of thing happens in law -- hence why lawyers have job security!
-
@JKSH
We'll have to be careful. I realize this discussion will get out of hand, you know more than I do about correct definitions.
What is your detailed definition of a byte?
About twice a "nibble" ;-) Also, if I get a mosquito nibble it doesn't hurt so much, but if I get a mosquito byte it really itches.
In a nutshell, I see for example in Python
Return a new "bytes" object, which is an immutable sequence of small integers in the range 0 <= x < 256
Wikipedia:
The modern de facto standard of eight bits, as documented in ISO/IEC 2382-1:1993, is a convenient power of two permitting the binary-encoded values 0 through 255 for one byte
Assuming 8-bits to keep it simple, I have always taken "byte" as meaning an unsigned quantity 0--255, as opposed to a signed one, -128--127. That is the nub. It's just that's how I see it used elsewhere.
Can you provide a concrete example where you'd want to check that a byte is greater than 200 or whatever? (And I mean a byte, not a number, not an ASCII character)
Nope, nothing practical :) I have an imaginary piece of hardware sending me a stream of byte values. For whatever reason (the joystick is faulty in one direction), I wish to ignore the ones larger than 200. I don't want to worry about casting/sign extension.
    QByteArray b; if (b.at(0) > 200) ...
Does unsigned char fit your definition in #1?
Yep. And I don't have to worry about sign!
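To make the sign concern concrete, here is a minimal sketch that uses a plain char in place of the value returned by QByteArray::at(), so it compiles without Qt. The names charIsSigned, exceedsNaive and exceedsAsUnsigned are made up for illustration. Whether plain char is signed is implementation-defined; where it is signed, the byte 0xC9 (201) typically reads back as a negative number and the naive comparison never succeeds.

```cpp
#include <limits>

// True where plain char is a signed type (common on x86 toolchains).
constexpr bool charIsSigned = std::numeric_limits<char>::is_signed;

// Naive check: compares the char directly, so a signed char holding
// the bit pattern 0xC9 is treated as negative and never exceeds 200.
constexpr bool exceedsNaive(char c, int threshold)
{
    return c > threshold;
}

// Safer check: view the same bits as an unsigned value (0..255) first.
constexpr bool exceedsAsUnsigned(char c, int threshold)
{
    return static_cast<unsigned char>(c) > threshold;
}
```

In Qt code the second form would correspond to something like static_cast<unsigned char>(b.at(0)) > 200, which is exactly the cast the post above would rather not have to write.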
Does std::byte fit your definition in #1?
It does when I don't look at the content. It's a bit useless when I do want to look at it (as I have to cast all over the place). So all in all it turns out it's a bit like a quantum object :)
Do you think in common parlance that a "byte" implies to you a value between 0--255 (just assume 8-bit)? Or perhaps it just as much suggests -128--127 to you?