QByteArray and char type
-
@fcarney said in QByteArray and char type:
The definition of byte is that it is 8 bits
No, it's not. A byte is the smallest unit addressable by the CPU.
On most architectures it is 8bit, but not on all.
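For what it's worth, C and C++ expose the byte width of the target as CHAR_BIT, so code does not have to guess. A minimal sketch in standard C++ only; the helper name bits_per_byte is made up for this example:

```cpp
#include <climits>  // CHAR_BIT
#include <limits>   // std::numeric_limits

// Hypothetical helper: number of bits in one byte on the target platform.
constexpr int bits_per_byte() { return CHAR_BIT; }

// The language guarantees at least 8 bits per byte, not exactly 8.
static_assert(bits_per_byte() >= 8, "a byte has at least 8 bits");

// unsigned char has no padding bits, so all CHAR_BIT bits carry value.
static_assert(std::numeric_limits<unsigned char>::digits == CHAR_BIT,
              "unsigned char uses every bit of the byte");
```

On the usual desktop and mobile targets bits_per_byte() returns 8; the point is only that the language lets you check instead of assuming.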
https://en.wikipedia.org/wiki/Byte
-
@JonB said in QByteArray and char type:
I know it doesn't work, that's why I wrote it. This whole thread is (supposed to be) a discussion of why that is the case in something named a QByteArray.
As noted, it is called QByteArray and not QUnsignedByteArray or QSignedByteArray, so there is nothing in the name which implies signed or unsigned.
And I found returning signed octets a natural choice, because when you use char, short or long in your code, they are signed by default. You always have to specify unsigned to get an unsigned value.
It also made sense to me because QByteArray is designed to work in combination with strings, which are signed char.
@jsulm You are right: byte was not, in the beginning, the definition of a data structure, but for decades byte and octet have had the same meaning in the programming world.
-
@KroMignon said in QByteArray and char type:
but for decades byte and octet have had the same meaning in the programming world
Yes, but there is no "official" specification that it always has to be 8 bits. It is a "de facto standard".
-
@jsulm said in QByteArray and char type:
Yes, but there is no "official" specification that it always has to be 8 bits. It is a "de facto standard".
Yes, I agree with you, but as often happens, it is the "de facto standard" that prevails.
When you look at many binary protocol specifications, in most cases "byte" is used instead of "octet".
It is wrong, but it is the reality.
-
@KroMignon said in QByteArray and char type:
And I found returning signed octets a natural choice, because when you use char, short or long in your code, they are signed by default.
Note that char may be signed or unsigned; this is implementation-defined.
-
@KroMignon said in QByteArray and char type:
As noted, it is called QByteArray and not QUnsignedByteArray or QSignedByteArray, so there is nothing in the name which implies signed or unsigned.
That is what this thread is about. I have offered a couple of examples --- I could have sought more --- of what I believe illustrates that in common parlance, and in other programming languages/libraries, the word "byte" does imply unsigned. The examples quoted a range of "0--255" where they might equally well have quoted "-128--127", but in practice they did not.
Maybe that's my opinion, or the opinion of some, but not shared by others.
At which point we have probably exhausted the debate.
-
OK, we'll stick with 1 byte == 8 bits for simplicity
@JonB said in QByteArray and char type:
In a nutshell, I see for example in Python
Return a new "bytes" object, which is an immutable sequence of small integers in the range 0 <= x < 256
...
I have always taken "byte" as meaning an unsigned quantity 0--255, as opposed to a signed one, -128--127. That is the nub. It's just that's how I see it used elsewhere.
...
Do you think in common parlance that a "byte" implies to you a value between 0--255 (just assume 8-bit). Perhaps it just as much suggests -128--127 to you?
We both agree that a byte should not be treated as a signed number -128 -- 127.
After this discussion and after some extra reading, I realize now that it's common for a byte to be treated as an unsigned number 0 -- 255.
I understand now that your definition of a byte is "an unsigned 8-bit integer". In this light, your original post makes sense: char is not a suitable datatype to store an unsigned 8-bit integer, and I agree with you on this point.
Personally though, I prefer to think of a byte as an 8-bit blob of data, distinct from an 8-bit number. That's why I have no problem with QByteArray storing chars -- because the signedness of the implementation has no effect on the meaning of the blob. It only affects people who want to implicitly convert the blob into a number (which you do).
There is no unanimous consensus, however:
| Language | Byte-ish Datatype | What is it? |
|---|---|---|
| C | unsigned char | unsigned 8-bit integer |
| C++ | unsigned char | unsigned 8-bit integer |
| C++ | std::byte | 8-bit blob |
| C# | byte | unsigned 8-bit integer |
| Go | byte | unsigned 8-bit integer |
| Java | byte | signed 8-bit integer |
| JavaScript | (element of an ArrayBuffer) | 8-bit blob |
| Python | (element of a bytes-like object) | unsigned 8-bit integer |
| R | (element of a raw vector) | 8-bit blob |
| Swift | (element of Data) | unsigned 8-bit integer |
| Visual Basic | Byte | unsigned 8-bit integer |
| Web IDL | byte | signed 8-bit integer |
| Web IDL | octet | unsigned 8-bit integer |

(4 languages above don't let you create a singular variable with a byte type; the bytes always come in an array and extracting the byte involves conversion)
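To make the blob-versus-number distinction concrete for the two C++ rows, here is a rough sketch in standard C++17. std::string stands in for QByteArray (both hand back plain char per element), and the overloaded helper as_unsigned is a made-up name, not a Qt or standard function:

```cpp
#include <cstddef>  // std::byte, std::to_integer, std::size_t
#include <string>

// Read one element of a char buffer as a number in 0..255, regardless of
// whether plain char is signed on this platform. The cast to unsigned char
// happens before the widening to int, so no sign bit is carried along.
inline int as_unsigned(const std::string& blob, std::size_t i) {
    return static_cast<unsigned char>(blob[i]);
}

// std::byte is a blob type with no arithmetic; turning it into a number
// has to be spelled out with std::to_integer.
inline int as_unsigned(std::byte b) {
    return std::to_integer<int>(b);
}
```

Either way the stored bits are the same; the conversion step is where the reader of the blob decides that it is a number.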
What is your detailed definition of a byte?
About twice a "nibble" ;-) Also, if I get a mosquito nibble it doesn't hurt so much, but if I get a mosquito byte it really itches.
Haha, good one!
-
@kshegunov said in QByteArray and char type:
. In C/C++ this return value should've been promoted to int as 200 is an int literal, but I didn't take into account that the ax registers are already integers, so this is going to be pruned when optimizing.
Again, I don't see any issue here: as the value is unsigned, promoting it to int will not propagate the sign bit.
Suppose b.at(0) = 0x81, which is 129 in base 10 when unsigned or -127 in base 10 when the value is signed.
If promoted to an int value (32 bit):
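A minimal sketch of what that promotion does to 0x81, assuming a two's-complement target. The function names are made up for illustration, and std::int8_t / std::uint8_t stand in for "char when it is signed" and "char when it is unsigned":

```cpp
#include <cstdint>
#include <type_traits>

// Which of the two cases plain char falls into is implementation-defined.
constexpr bool plain_char_is_signed = std::is_signed_v<char>;

// Reinterpret the 8 bits as a signed value, then widen to int.
// The sign bit is propagated: 0x81 becomes -127 on two's-complement targets.
constexpr int promote_signed(std::uint8_t bits) {
    return static_cast<std::int8_t>(bits);
}

// Widen the 8 bits as an unsigned value: zero-extended, 0x81 stays 129.
constexpr int promote_unsigned(std::uint8_t bits) {
    return bits;
}

// View the promoted value as a 32-bit pattern (conversion is modulo 2^32).
constexpr std::uint32_t bits32(int value) {
    return static_cast<std::uint32_t>(value);
}
```

With these, bits32(promote_signed(0x81)) is 0xFFFFFF81 while bits32(promote_unsigned(0x81)) is 0x00000081, which is why a comparison against an int literal such as 200 can only succeed on the unsigned path.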
-
@jsulm said in QByteArray and char type:
@fcarney said in QByteArray and char type:
The definition of byte is that it is 8 bits
No, it's not. A byte is the smallest unit addressable by the CPU.
On most architectures it is 8bit, but not on all.
https://en.wikipedia.org/wiki/Byte
Agreed.
Desktop/mobile devs will probably only encounter 8-bit bytes. But DSP programmers often deal with 16-bit bytes: https://processors.wiki.ti.com/index.php/Byte_Accesses_with_the_C28x_CPU
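As a rough illustration of why this matters for buffer sizes: if octets are packed tightly into addressable units (an assumption, real DSP code may well store one octet per unit instead), the unit count depends on the byte width. The helper units_for_octets is hypothetical:

```cpp
#include <climits>  // CHAR_BIT
#include <cstddef>  // std::size_t

// Hypothetical helper: how many addressable units ("bytes" in the C sense)
// are needed to hold `octets` 8-bit values when each unit has `unit_bits`
// bits and the octets are packed without gaps. Defaults to this platform.
constexpr std::size_t units_for_octets(std::size_t octets,
                                       std::size_t unit_bits = CHAR_BIT) {
    return (octets * 8 + unit_bits - 1) / unit_bits;  // round up
}

static_assert(units_for_octets(3, 8) == 3, "8-bit bytes: one octet per unit");
static_assert(units_for_octets(3, 16) == 2, "16-bit bytes: two octets per unit, rounded up");
```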
-
@jsulm said in QByteArray and char type:
@fcarney said in QByteArray and char type:
The definition of byte is that it is 8 bits
No, it's not. A byte is the smallest unit addressable by the CPU.
On most architectures it is 8bit, but not on all.
https://en.wikipedia.org/wiki/Byte
THERE ARE FOUR LIGHTS! ;-)