QDataStream and binary files
-
@JKSH, @Christian-Ehrlicher
Very interesting! I thought processors just "bus-dumped" or whatever on an odd address, I didn't know they would "trap" the alignment and "recover", and thereby work but run slowly. I wonder what the last "friendly" processor architecture I saw --- Motorola 68000 family, like 68010 or 68020, not this x86-type stuff --- would have done? :)
-
So I have some 3000 files to go through and read. Currently I have been using indexOf() and mid() to find strings and variables.

    QByteArray filedata = file.readAll();
    int j = 0;
    while ((j = filedata.indexOf("books", j)) != -1) {
        qDebug() << "Found string at index position" << j;
        ++j;
        // put the QByteArray into a QString
    }

This method can get ugly, as some of the files contain over 50 strings and repeating it for each one would clutter the source code.
Should I just seek to the start position and then read from that point on? Should I use readLine()? Or read()? Or readAll() into a QByteArray and then store it in another buffer? Is there a way to extract strings from a QByteArray?
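To illustrate the last question, here is a minimal sketch of two ways to pull strings out of a byte buffer: finding every occurrence of a known keyword, and collecting every run of printable characters. It is plain standard C++ so that it stands on its own; `findAll` and `extractStrings` are hypothetical helper names, and a `std::string` stands in for the `QByteArray` (the same loops could presumably be written with `indexOf()` and `mid()`).

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Return every offset at which `needle` occurs in `data`.
// std::string is used as a byte buffer here; it may hold embedded '\0' bytes.
std::vector<std::size_t> findAll(const std::string &data, const std::string &needle)
{
    std::vector<std::size_t> hits;
    if (needle.empty())
        return hits;
    for (std::size_t pos = data.find(needle); pos != std::string::npos;
         pos = data.find(needle, pos + 1)) {
        hits.push_back(pos);
    }
    return hits;
}

// Collect every run of printable ASCII characters that is at least
// `minLength` bytes long, similar in spirit to the Unix `strings` tool.
std::vector<std::string> extractStrings(const std::string &data, std::size_t minLength)
{
    std::vector<std::string> found;
    std::string current;
    for (unsigned char c : data) {
        if (std::isprint(c)) {
            current.push_back(static_cast<char>(c));
        } else {
            // A non-printable byte ends the current run.
            if (current.size() >= minLength)
                found.push_back(current);
            current.clear();
        }
    }
    if (current.size() >= minLength)
        found.push_back(current);
    return found;
}
```

Putting the keywords into a list and looping over it with one helper like `findAll` keeps the source short even when a file holds 50 or more strings.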
-
Stop, guys... As I remember, you can read simple data types (int, uint, etc.) using QDataStream. And even your own structures that were not written by QDataStream (use raw reads for this). You can even read strings as raw objects.
-
@kuzulis Can you explain what you're talking about?
I always thought QDataStream couldn't parse padded structures, and that you could only read data with it if the data had been written by Qt.
Mind showing an example?
@Styx
Despite what @kuzulis has written, it does not follow that you can use QDataStream to deserialize, say, an int, even if there is no padding at all in its serialization (I don't know what QDataStream does or does not put in). The point is you have said that the file format you are trying to read is produced by someone else, not using QDataStream to serialize, right? In that case, as an example, even if it outputs, say, a 32-bit int in 4 bytes you do not know whether that means low->high or high->low bytes. And nor does QDataStream. So how can that correctly deserialize if the way it was saved differs from however QDataStream int-order deserialization works?
EDIT See @kuzulis's code below which shows that you must tell QDataStream which order to expect if the output was not produced with the default QDataStream order, which then allows you to proceed.
-
What? Here is an example of how to parse a WAV file using QDataStream:

    ...
    const quint32 kRiffId = 0x52494646;
    const quint32 kWaveId = 0x57415645;
    const quint32 kFmtId = 0x666d7420;
    const quint32 kPcmFmtSize = 16; // for PCM only
    const quint16 kAudioFormatId = 1; // WAVE_FORMAT_PCM
    const quint32 kDataId = 0x64617461;
    ...

    bool readFormat()
    {
        file.seek(0);
        QDataStream in(&file);

        quint32 chunkId = 0;
        in.setByteOrder(QDataStream::BigEndian);
        in >> chunkId;
        if (chunkId != kRiffId) // "RIFF"
            return false;

        quint32 chunkSize = 0;
        in.setByteOrder(QDataStream::LittleEndian);
        in >> chunkSize; // file size

        quint32 formatId = 0;
        in.setByteOrder(QDataStream::BigEndian);
        in >> formatId;
        if (formatId != kWaveId) // "WAVE"
            return false;

        quint32 subchunk1Id = 0;
        in.setByteOrder(QDataStream::BigEndian);
        in >> subchunk1Id;
        if (subchunk1Id != kFmtId) // "fmt "
            return false;

        quint32 subchunk1Size = 0;
        in.setByteOrder(QDataStream::LittleEndian);
        in >> subchunk1Size;
        if (subchunk1Size != kPcmFmtSize) // for PCM format only
            return false;

        quint16 audioFormat = 0;
        in.setByteOrder(QDataStream::LittleEndian);
        in >> audioFormat;
        if (audioFormat != kAudioFormatId) // for PCM format only
            return false;

        quint16 numChannels = 0;
        in.setByteOrder(QDataStream::LittleEndian);
        in >> numChannels;
        if (numChannels == 0)
            return false;

        quint32 sampleRate = 0;
        in.setByteOrder(QDataStream::LittleEndian);
        in >> sampleRate;
        if (sampleRate == 0)
            return false;

        quint32 byteRate = 0;
        in.setByteOrder(QDataStream::LittleEndian);
        in >> byteRate;

        quint16 blockAlign = 0;
        in.setByteOrder(QDataStream::LittleEndian);
        in >> blockAlign;
        if (blockAlign == 0)
            return false;

        quint16 bitsPerSample = 0;
        in.setByteOrder(QDataStream::LittleEndian);
        in >> bitsPerSample;
        if (bitsPerSample == 0)
            return false;

        quint32 subchunk2Id = 0;
        in.setByteOrder(QDataStream::BigEndian);
        in >> subchunk2Id;
        if (subchunk2Id != kDataId) // "data"
            return false;

        quint32 subchunk2Size = 0;
        in.setByteOrder(QDataStream::LittleEndian);
        in >> subchunk2Size;
        if (subchunk2Size == 0)
            return false;

        // The sample data starts right after the header fields read above.
        startDataOffset = sizeof(chunkId) + sizeof(chunkSize) + sizeof(formatId)
                + sizeof(subchunk1Id) + sizeof(subchunk1Size) + sizeof(audioFormat)
                + sizeof(numChannels) + sizeof(sampleRate) + sizeof(byteRate)
                + sizeof(blockAlign) + sizeof(bitsPerSample) + sizeof(subchunk2Id)
                + sizeof(subchunk2Size);

        format.setCodec(QLatin1String(kAudioCodec));
        format.setChannelCount(numChannels);
        format.setSampleRate(sampleRate);
        format.setSampleSize(bitsPerSample);
        format.setSampleType(QAudioFormat::SignedInt); // TODO: Is this correct?
        format.setByteOrder(QAudioFormat::LittleEndian);

        return file.seek(startDataOffset);
    }
-
@kuzulis
Sorry, what I meant was, this works because you know at each stage whether you expect little-endian or big-endian order in the external data stream format, and you explicitly code to tell QDataStream that. I meant that a pure call to whichever way round the default of QDataStream expects will not work, whereas it would if QDataStream (with default settings) had also been used at the output side.
I won't delete my post now, but I will amend it to point out that what you have written allows it to be accomplished.
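To make the byte-order point concrete, here is a small sketch in plain standard C++ (no Qt, so it stands on its own) that decodes the same four bytes both ways. `readBigEndian32` and `readLittleEndian32` are hypothetical helper names. QDataStream's default byte order is big-endian, so a little-endian field read with default settings would come out as the "wrong" one of these two values.

```cpp
#include <cstdint>

// Interpret four bytes as a 32-bit unsigned integer, most significant byte first.
std::uint32_t readBigEndian32(const unsigned char *bytes)
{
    return (static_cast<std::uint32_t>(bytes[0]) << 24) |
           (static_cast<std::uint32_t>(bytes[1]) << 16) |
           (static_cast<std::uint32_t>(bytes[2]) << 8) |
           static_cast<std::uint32_t>(bytes[3]);
}

// Interpret the same four bytes with the least significant byte first.
std::uint32_t readLittleEndian32(const unsigned char *bytes)
{
    return static_cast<std::uint32_t>(bytes[0]) |
           (static_cast<std::uint32_t>(bytes[1]) << 8) |
           (static_cast<std::uint32_t>(bytes[2]) << 16) |
           (static_cast<std::uint32_t>(bytes[3]) << 24);
}
```

For example, the bytes `'R' 'I' 'F' 'F'` decode to 0x52494646 (the kRiffId constant above) only when read big-endian, while a size field stored as `10 00 00 00` is 16 only when read little-endian. That is why the WAV code switches the order per field.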
-
Some things are confusing. What about an unknown format?
What do you mean?
What is startDataOffset?
It is the position in the WAVE file where the data samples begin (after the WAVE header).
PS: It is just an example. You have to know the format of your file yourself. Then you can use the stream operators for 1-, 2-, 4- and 8-byte integers and for 4- and 8-byte floats/doubles, readRawData() to read BLOBs, and seek() on the file if needed. Then you can use QDataStream.
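The same read-by-known-layout idea can be sketched without Qt as a small cursor over a byte buffer. This is only an illustration in plain standard C++: `ByteReader` and its methods are hypothetical names, not a Qt API, and the layout used with it (a 2-byte count, a 4-byte length, then raw bytes) is invented for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Minimal cursor over an in-memory buffer: typed little-endian reads,
// raw reads, and seeking, each with a bounds check.
class ByteReader {
public:
    explicit ByteReader(std::vector<std::uint8_t> data) : data_(std::move(data)) {}

    std::size_t pos() const { return pos_; }

    void seek(std::size_t pos)
    {
        if (pos > data_.size())
            throw std::out_of_range("seek past end of buffer");
        pos_ = pos;
    }

    std::uint16_t readU16LE() { return static_cast<std::uint16_t>(readLE(2)); }
    std::uint32_t readU32LE() { return readLE(4); }

    // Read `n` raw bytes (a "BLOB" or a fixed-length string) unchanged.
    std::string readRaw(std::size_t n)
    {
        need(n);
        std::string out(data_.begin() + pos_, data_.begin() + pos_ + n);
        pos_ += n;
        return out;
    }

private:
    // Throw instead of reading past the end of the buffer.
    void need(std::size_t n) const
    {
        if (n > data_.size() - pos_)
            throw std::out_of_range("read past end of buffer");
    }

    // Assemble `n` bytes (n <= 4), least significant byte first.
    std::uint32_t readLE(std::size_t n)
    {
        need(n);
        std::uint32_t value = 0;
        for (std::size_t i = 0; i < n; ++i)
            value |= static_cast<std::uint32_t>(data_[pos_ + i]) << (8 * i);
        pos_ += n;
        return value;
    }

    std::vector<std::uint8_t> data_;
    std::size_t pos_ = 0;
};
```

With a reader like this, parsing a record becomes a sequence of calls that mirrors the file layout, which is the same shape as the `in >> field` lines in the WAV example.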
-
The format is just some custom-made binary file. I'm just trying to create a parser that converts the binary to XML. I can open and read the file fine; it's just that my method of finding data and converting it isn't the best. There aren't many good examples of QFile or QByteArray doing all the things I'm trying to do.