Qdatastreams and binary files.
-
How do I read chunks or sections with QByteArray? Should I store the whole file in one buffer and then read each section with QFile::read()?
Bytes 1-4 are just padding, I believe; byte 5 is the number of books written to the struct.
0xcb 0x3b 0x8d is the bookhash
0x12 is the bookhashId
@Styx said in Qdatastreams and binary files.:
1-4 bytes are just padding
OK
I believe byte 5 is the number of books written to the struct
So the maximum number of books is 255?
What are bytes 6-12?
-
The null bytes are just null padding. The maximum number of books is 5; each book has its own bookhash, bookhashId, booktype, bookdir and bookfilename.
I would use QByteArray::mid() to grab the number of books (at most 5), then loop through the rest of the QByteArray to grab each book's information.
-
@Styx said in Qdatastreams and binary files.:
The null bytes are just null padding. The maximum number of books is 5; each book has its own bookhash, bookhashId, booktype, bookdir and bookfilename.
OK
I would use QByteArray::mid() to grab the number of books (at most 5), then loop through the rest of the QByteArray to grab each book's information.
Sounds good
-
Is there a way to loop through the byte array without using mid()?
I was using...
Books bookinfo;
bookinfo.bookCount = (bytearray.at(5) & 0xFFFF);
@Styx said in Qdatastreams and binary files.:
Is there a way to loop through the byte array without using mid()?
Sure (https://doc.qt.io/qt-5/qbytearray.html):
QByteRef operator[](int i)
char operator[](int i) const
char operator[](uint i) const
QByteRef operator[](uint i)
So
for (int i = 0; i < 3; ++i) bytearray[i];
-
So once I readAll() the file into the QByteArray, I would use mid() to slice out each byte range and copy it to another byte array?
Is there a way to avoid using so many QByteArray copies?
How would seek and read work with a QFile?
02 - book count
00 00 00 00 00 00 00 - (padding)
fb 2b 7d 13 - bookhash
09 - bookhashid
00 00 00 00 00 00 00 - (padding)
44 6e 49 4f 43 44 61 62 64 - booktype
42 - booktypeid
00 00 00 00 00 00 00 - (padding)
00 00 00 00 00 00 00 00 00 00 00 00 - bookDir and bookFileName (Qstring)
00 00 00 00 00 00 00 - (padding)
1a 10 a2 ae - bookhash
08 - bookhashid
00 00 00 00 00 00 00
41 6S 64 4f 47 61 49 44 - booktype
3c - booktypeid
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - bookDir and bookFileName (Qstring)
This is an example of a file I am reading. I was trying to read it into a struct, but I'm not sure that is the correct method. QDataStream can't be used because the file wasn't written by QDataStream.
As you can see in the dump, the per-book fields repeat according to the book count.
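One way to walk a layout like the dump above without slicing it into many small arrays is a cursor that reads forward through a single buffer. The following is only a sketch: `ByteCursor`, `BookHeader` and `readFirstBook` are hypothetical names, the field widths (1-byte count, 7 bytes of padding, 4-byte hash, 1-byte id) are read off the example dump rather than from a real format definition, and it assumes the host byte order matches the file. Plain standard C++ is used so it stands alone; with Qt, the pointer and size would come from `QByteArray::constData()` and `QByteArray::size()`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: walks a byte buffer front to back without copying it.
class ByteCursor {
public:
    ByteCursor(const unsigned char* data, std::size_t size)
        : m_data(data), m_size(size) {}

    // Copies sizeof(T) bytes into `out` and advances the cursor.
    // Returns false (and does not advance) if too few bytes remain.
    // Assumption: host byte order matches the byte order of the file.
    template <typename T>
    bool read(T& out) {
        if (m_size - m_pos < sizeof(T)) return false;
        std::memcpy(&out, m_data + m_pos, sizeof(T));
        m_pos += sizeof(T);
        return true;
    }

    // Skips `count` bytes, e.g. padding. Returns false on overrun.
    bool skip(std::size_t count) {
        if (m_size - m_pos < count) return false;
        m_pos += count;
        return true;
    }

    std::size_t position() const { return m_pos; }

private:
    const unsigned char* m_data;
    std::size_t m_size;
    std::size_t m_pos = 0;
};

// Hypothetical record: only the first three fields of the example dump.
struct BookHeader {
    std::uint8_t  bookCount;
    std::uint32_t bookHash;
    std::uint8_t  bookHashId;
};

// Reads: count (1 byte), padding (7 bytes), hash (4 bytes), hash id (1 byte).
bool readFirstBook(const std::vector<unsigned char>& bytes, BookHeader& out) {
    ByteCursor cur(bytes.data(), bytes.size());
    return cur.read(out.bookCount)
        && cur.skip(7)
        && cur.read(out.bookHash)
        && cur.read(out.bookHashId);
}
```

Reading the remaining books would then be a loop of the same `read`/`skip` calls, run `bookCount` times, once the real field sizes are known.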
-
Hi
What app produces the file?
Is it not open source, so that you could get the actual record definitions?
-
@Styx said in Qdatastreams and binary files.:
Is there a way to avoid using so many QByteArray copies?
Work with a plain const char * pointer
-
@Christian-Ehrlicher said in Qdatastreams and binary files.:
Work with a plain const char * pointer
To add to @Christian-Ehrlicher's point: Call QByteArray::data() or QByteArray::constData() to get a raw pointer to your data. Then, you can use pointer arithmetic to extract your data.
QByteArray ba = file.readAll();
const char* data = ba.constData();
// Assuming that your file is little-endian...
memcpy(&m_binaryVersion, data + 0, sizeof(quint8 ));
memcpy(&bookCount, data + 5, sizeof(quint8 ));
memcpy(&bookHash, data + 13, sizeof(quint32));
EDIT: Code above changed from reinterpret_cast<> to memcpy() for cross-platform safety
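The memcpy() approach can be wrapped in a small bounds-checked helper so that a short or truncated file does not cause a read past the end of the buffer. This is a sketch under assumptions: `readAt` is a hypothetical name, not a Qt function, and like the code above it assumes the host byte order matches the file. With Qt, `data` and `size` would be `ba.constData()` and `ba.size()`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies sizeof(T) bytes starting at `offset` into `out`.
// memcpy has no alignment requirement, so odd offsets such as 13 are fine.
// Returns false instead of reading past the end of the buffer.
template <typename T>
bool readAt(const char* data, std::size_t size, std::size_t offset, T& out) {
    if (offset > size || size - offset < sizeof(T)) return false;
    std::memcpy(&out, data + offset, sizeof(T));
    return true;
}
```

A call would look like `readAt(data, size, 13, bookHash)`, with the return value checked before `bookHash` is used.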
-
@JKSH Since .data() is null terminated, I think it would be better to use shift left.
// Assuming that your file is little-endian...
m_binaryVersion = *reinterpret_cast<const quint8* >(data + 0 ) >> 8;
bookCount = *reinterpret_cast<const quint8* >(data + 5) >> 12;
bookHash = *reinterpret_cast<const quint32*>(data + 13) >> 16;
// example bookCount=256 the first byte is '\0' then all the rest will be undetermined.
I shouldn't have issues calling the index and then looping through the QByteArray to print out the data as well.
-
@Styx said in Qdatastreams and binary files.:
@JKSH Since .data() is null terminated. Think it would be better to use shift left.
m_binaryVersion = *reinterpret_cast<const quint8* >(data + 0 ) >> 8;
bookCount = *reinterpret_cast<const quint8* >(data + 5) >> 12;
bookHash = *reinterpret_cast<const quint32*>(data + 13) >> 16;
// example bookCount=256 the first byte is '\0' then all the rest will be undetermined.
I don't get it. Could you please explain how this works?
@Styx said in Qdatastreams and binary files.:
How would I use seek and read dynamically to read each file (QFile API)?
Take the code that reads one file and put it in a loop. Pass a different filename each loop iteration.
-
@Styx said in Qdatastreams and binary files.:
@JKSH Since .data() is null terminated. Think it would be better to use shift left.
// Assuming that your file is little-endian...
m_binaryVersion = *reinterpret_cast<const quint8* >(data + 0 ) >> 8;
bookCount = *reinterpret_cast<const quint8* >(data + 5) >> 12;
bookHash = *reinterpret_cast<const quint32*>(data + 13) >> 16;
// example bookCount=256 the first byte is '\0' then all the rest will be undetermined.
I don't know what you're trying to achieve here (as @JKSH said), but:
- You are using shift right, not left.
- *reinterpret_cast<const quint8* >(data + 0 ) returns a quint8. Since that is (unsigned) 8 bits in size, >> 8 always returns 0 regardless of content.
- Similarly for *reinterpret_cast<const quint8* >(data + 5) >> 12, except that >> 12 makes even less sense for an 8-bit value.
- QByteArray::data() is indeed (extra) \0 terminated, but that has no relevance to any of the lines of code you wrote.
The code without any shifts written by @JKSH makes sense. I'm afraid yours does not!
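The points about the shifts can be checked at compile time. In this sketch `std::uint8_t` and `std::uint32_t` stand in for `quint8` and `quint32`, the function names are made up for illustration, and the 32-bit sample value is just the first bookhash from the dump read as little-endian.

```cpp
#include <cstdint>

// An 8-bit operand is promoted to int before the shift, but it still only
// carries 8 significant bits, so shifting right by 8 or more leaves nothing.
constexpr int shiftedBy8(std::uint8_t value)  { return value >> 8; }
constexpr int shiftedBy12(std::uint8_t value) { return value >> 12; }

// For a 32-bit value, >> 16 throws away the two low bytes.
constexpr std::uint32_t shifted32By16(std::uint32_t value) { return value >> 16; }

static_assert(shiftedBy8(0xFF) == 0, "any 8-bit value >> 8 is 0");
static_assert(shiftedBy12(0xCB) == 0, "any 8-bit value >> 12 is 0");
static_assert(shifted32By16(0x137d2bfbu) == 0x137du, "low 16 bits are discarded");
```

So the shifted versions of the first two lines always produce 0, and the third keeps only the upper half of the hash.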
-
@JKSH said in Qdatastreams and binary files.:
bookHash = *reinterpret_cast<const quint32*>(data + 13);
Have you actually tried this line? Because I would assume it will "segment fault" (or whatever, probably something else). You are trying to dereference a 32-bit int from data + 13, which will be an odd numbered address. Whoops! :) [I have a feeling static_cast<> would warn/prohibit this at compile-time?]
You must be very careful recommending to treat a binary block like this as though you can index into it directly for the types you know were serialized there, for this kind of reason. Here you need to pull the 4 bytes out of the buffer (e.g. memcpy() directly into a quint32 if you know endian-ness is the same on host as in file), or use some other safe approach.
-
@JonB said in Qdatastreams and binary files.:
@JKSH said in Qdatastreams and binary files.:
bookHash = *reinterpret_cast<const quint32*>(data + 13);
Have you actually tried this line? Because I would assume it will "segment fault" (or whatever, probably something else). You are trying to dereference a 32-bit int from data + 13, which will be an odd numbered address. Whoops! :)
Thanks for the heads-up. I tried compiling it using MinGW 7.3.0 32-bit, MSVC 2017 32-bit, and MSVC 2017 64-bit (all with Qt 5.14.0, release mode) and got the expected results every time. However, your comment prompted me to do some digging, which led me to this question: Should I worry about the alignment during pointer casting?
I'll update my sample code.
[I have a feeling static_cast<> would warn/prohibit this at compile-time?]
Static casting cannot be used to convert a byte array into an integer at all, no matter where the bytes sit in memory.
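If the host's endianness is not known to match the file, another safe approach is to assemble the value one byte at a time. This sketch sidesteps both alignment and host byte order; `readU32LE` and `readU32BE` are hypothetical helper names, and whether the real file is little- or big-endian is still an assumption that has to be checked against known data.

```cpp
#include <cstdint>

// Builds a 32-bit value from four bytes stored least-significant byte first.
// Only single bytes are read, so the pointer may sit at any (odd) address
// and the result is the same on little- and big-endian hosts.
inline std::uint32_t readU32LE(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Same idea for files that store the most-significant byte first.
inline std::uint32_t readU32BE(const unsigned char* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24)
         | (static_cast<std::uint32_t>(p[1]) << 16)
         | (static_cast<std::uint32_t>(p[2]) << 8)
         |  static_cast<std::uint32_t>(p[3]);
}
```

With a QByteArray, the `const char*` from `constData()` would need a `reinterpret_cast<const unsigned char*>` first; casting between character pointer types is permitted, unlike the cast to `const quint32*`. The caller is still responsible for making sure four bytes are available at `p`.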
-
@JKSH said in Qdatastreams and binary files.:
You are trying to dereference a 32-bit int from data + 13, which will be an odd numbered address.
This works fine on x86_64, only slowly. It does not work on some ARM processors, see e.g. here: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15414.html
-
@JKSH , @Christian-Ehrlicher
Very interesting! I thought processors just "bus-dumped" or whatever on an odd address, I didn't know they would "trap" the alignment and "recover", and thereby work but run slowly. I wonder what the last "friendly" processor architecture I saw (Motorola 68000 family, like 68010 or 68020, not this x86-type stuff) would have done? :)
-
So I have some 3000 files to go through and read. Currently I have been using indexOf() and mid() to find strings and variables.
QByteArray filedata = file.readAll();
int j = 0;
while ((j = filedata.indexOf("books", j)) != -1) {
    qDebug() << "Found string at index position" << j;
    ++j;
    // put the QByteArray into a QString
}
This method can get ugly, as some of the files have over 50 strings inside them, and that would make the source code look ugly.
Should I just seek to the start position and then read from that point on? Should I use readLine()? Or read()? Or readAll() into a QByteArray and then store it in another buffer? Is there a way to extract strings from a QByteArray?
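For the string question, one option is to keep the search and extraction logic in a pair of small functions and call them for every file, rather than repeating indexOf()/mid() blocks per string. The sketch below uses `std::string` as the byte buffer so it stands alone; `findAll` and `printableRuns` are hypothetical names, and treating "a string" as a run of printable ASCII bytes is an assumption that may not match how the real files store text.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the offset of every occurrence of `marker` in `data`
// (overlapping matches included), in ascending order.
std::vector<std::size_t> findAll(const std::string& data, const std::string& marker) {
    std::vector<std::size_t> hits;
    if (marker.empty()) return hits;
    for (std::size_t pos = data.find(marker);
         pos != std::string::npos;
         pos = data.find(marker, pos + 1)) {
        hits.push_back(pos);
    }
    return hits;
}

// Collects every run of printable ASCII bytes (0x20..0x7e) that is at least
// `minLength` bytes long, similar in spirit to the Unix `strings` tool.
std::vector<std::string> printableRuns(const std::string& data, std::size_t minLength) {
    std::vector<std::string> runs;
    std::string current;
    for (unsigned char byte : data) {
        if (byte >= 0x20 && byte <= 0x7e) {
            current.push_back(static_cast<char>(byte));
        } else {
            if (current.size() >= minLength) runs.push_back(current);
            current.clear();
        }
    }
    if (current.size() >= minLength) runs.push_back(current);
    return runs;
}
```

With Qt, a buffer from `readAll()` could be handed over via `QByteArray::toStdString()`, or the same loops could be written directly against `QByteArray::indexOf()` and `QByteArray::at()`.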
-
Stop, guys... As I remember, you can read simple data types (int, uint, etc.) using QDataStream, and even your own structures that were not written by QDataStream (use a raw read for this). You can even read strings as raw objects.