[Solved] How to skip chunks using QDataStream?
-
@Crag_Hack
You can save a compressed QByteArray as a BLOB type. -
@Crag_Hack
Hi
There is
https://wiki.qt.io/How_to_Store_and_Retrieve_Image_on_SQLite
that stores an image. It could also be any binary blob. -
Will the performance of an SQLite file be comparable to using the QDataStream QByteArray data chunk skip method? Does the SQLite setup do sequential processing behind the scenes?
With SQLite indexing will I see better performance? Do I need to recreate the index every time I update the file? Can I store the index in the file or do I have to create it on the fly every time?
The size of the file storing the data in either method will probably be comparable right?
Anything else that might be relevant?
-
Hi
Will the performance of an SQLite file be comparable to using the QDataStream QByteArray data chunk skip method?
Depending on how you actually need to access the data, it should be as fast as, or even faster than, QDataStream if we are talking about a GB-sized file. The advantage is that you can select exactly the data you want with an SQL statement and read only that subset, compared to having to open the whole GB file and do lots of skipping.
Does the SQLite setup do sequential processing behind the scenes?
I'm not sure what you mean here, what type of sequential processing?
With SQLite indexing will I see better performance?
Only if you can create an index on a column that helps you look up a row faster. For the blobs themselves, I don't think it will help at all.
Do I need to recreate the index every time I update the file? Can I store the index in the file or do I have to create it on the fly every time?
You basically just tell SQLite to create the index once; it is stored in the database file and kept up to date automatically.
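To make that concrete: an index is a one-time CREATE INDEX statement, and SQLite maintains it from then on. The thread's code is Qt/C++, but the SQL is identical from QSqlQuery; below is a minimal sketch using Python's built-in sqlite3 module, with made-up table and column names:

```python
import sqlite3

con = sqlite3.connect(":memory:")  # in-memory DB, just for the example
con.execute("CREATE TABLE main_table (name TEXT, timestamp INTEGER, compressed_bytes BLOB)")
# Create the index once; SQLite updates it automatically on every
# insert/update/delete from here on. No need to recreate it.
con.execute("CREATE INDEX idx_name ON main_table (name)")
con.execute("INSERT INTO main_table VALUES ('sensor_a', 12345678, x'00ff')")

# The query planner confirms this lookup uses the index
# instead of scanning the whole table.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM main_table WHERE name='sensor_a'"
).fetchall()
print(plan)
```

The EXPLAIN QUERY PLAN output will mention `USING INDEX idx_name`, showing that the index is picked up transparently.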
The size of the file storing the data in either method will probably be comparable right?
Yes, the overhead of the DB is not that much.
Anything else that might be relevant?
-
Use transactions when you update data. They will save your file
if the app crashes or the power cuts out while writing.
That is one of the benefits of using a DB system versus writing a file directly. -
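The rollback behaviour can be sketched as follows. This is a Python sqlite3 illustration (with Qt you would call QSqlDatabase::transaction() and commit() instead); the table name is made up:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE main_table (name TEXT, data BLOB)")

# Using the connection as a context manager commits on success and
# rolls back if an exception interrupts the batch, so a crash never
# leaves a half-written update behind.
try:
    with con:
        con.execute("INSERT INTO main_table VALUES ('a', x'00')")
        raise RuntimeError("simulated crash mid-batch")
except RuntimeError:
    pass

count = con.execute("SELECT COUNT(*) FROM main_table").fetchone()[0]
print(count)  # the partial insert was rolled back, so this prints 0
```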
Make sure you check the path of the db file when opening the db. If you point to a nonexistent file, SQLite will create a new empty one, which can be mighty confusing.
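A quick guard against that, sketched with Python's sqlite3 module (with Qt you would test QFile::exists() before QSqlDatabase::open(); the file name here is hypothetical):

```python
import os
import sqlite3

def open_existing_db(db_path):
    # sqlite3.connect() silently creates a new empty file when the
    # path does not exist, so check first when you expect an
    # already-populated database.
    if not os.path.exists(db_path):
        raise FileNotFoundError(f"database not found: {db_path}")
    return sqlite3.connect(db_path)

try:
    open_existing_db("no_such_file.db")  # hypothetical wrong path
    err = ""
except FileNotFoundError as e:
    err = str(e)
print(err)
```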
This tool is really helpful for managing/testing the db
https://sqlitebrowser.org/ -
-
@Crag_Hack said in How to skip chunks using QDataStream?:
Does the SQLite setup do sequential processing behind the scenes?
No, with a database there will be no "sequential processing", it will be "random access" instead, meaning it will "jump to" the data you want directly, without reading through other data.
You could create a table with two columns: an "id" column (an integer that identifies each row uniquely) and a "blob" column that holds a piece of associated data of any size. SQLite will handle the "id" column for you if you make it auto-increment; it will simply assign the numbers 1, 2, 3... for you. You then have to know, somehow, that what you want is "row number #2", and you can ask for that row's "blob" column directly. SQLite can hand the value back without reading through any other rows or columns.
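That two-column layout can be sketched like this, using Python's sqlite3 module as a stand-in for the Qt code (table and payload names are invented for the example):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# "id INTEGER PRIMARY KEY" aliases SQLite's internal rowid; when we
# omit it on insert, SQLite assigns 1, 2, 3, ... automatically.
con.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, blob BLOB)")
for payload in (b"first", b"second", b"third"):
    con.execute("INSERT INTO chunks (blob) VALUES (?)", (payload,))

# Random access: jump straight to row #2 without reading rows 1 or 3.
row = con.execute("SELECT blob FROM chunks WHERE id = 2").fetchone()
print(row[0])  # → b'second'
```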
-
Thanks guys.
For this data set I will be first looking up records based on a QString with a name identifying a subset of the data, then further with a qint64 representing a QDateTime identifying a further subset of that data. Sounds to me like the aforementioned rules still apply.
Also I might need to delete some of the records in the beginning of the database if it gets too large. Will SQLite reconstruct the autoincrement id column with up to date values?
And I need to iterate over the entire database pulling out names and qint64 datetime values but I guess this won't complicate anything will it?
Anything else that might be relevant? :)
-
@Crag_Hack said in How to skip chunks using QDataStream?:
I will be first looking up records based on a QString with a name identifying a subset of the data, then further with a qint64 representing a QDateTime identifying a further subset of that data....
If the name and timestamp are guaranteed to be unique, then you can use those as the "key". Your SQL query code will look like this:
SELECT compressed_bytes FROM main_table WHERE name='XYZ' AND timestamp=12345678;
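And if those lookups are frequent, a composite index on (name, timestamp) lets that WHERE clause avoid a full table scan. A hedged sketch with Python's sqlite3 module (from Qt you would run the same SQL through QSqlQuery with bound values; the data is invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE main_table (name TEXT, timestamp INTEGER, compressed_bytes BLOB)")
# A composite index covering both lookup columns.
con.execute("CREATE INDEX idx_name_ts ON main_table (name, timestamp)")
con.execute("INSERT INTO main_table VALUES (?, ?, ?)", ("XYZ", 12345678, b"\x1f\x8b"))

# Bind the values rather than pasting them into the SQL string.
row = con.execute(
    "SELECT compressed_bytes FROM main_table WHERE name=? AND timestamp=?",
    ("XYZ", 12345678),
).fetchone()
print(row[0])  # → b'\x1f\x8b'
```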
Also I might need to delete some of the records in the beginning of the database if it gets too large. Will SQLite reconstruct the autoincrement id column with up to date values?
No, SQLite won't auto-update the IDs when you delete some data.
What's the purpose of updating the IDs? Why not just keep the original values?
And I need to iterate over the entire database pulling out names and qint64 datetime values but I guess this won't complicate anything will it?
You don't "iterate over the database". You ask the database to give you all the data that you're interested in, using only a single query.
After that, you can iterate over the returned dataset :)
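For instance, pulling out just the names and timestamps is a single SELECT that never touches the blob column, so it stays cheap even when the blobs are huge. A sketch with Python's sqlite3 module (same SQL from QSqlQuery; the sample rows are made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE main_table (name TEXT, timestamp INTEGER, compressed_bytes BLOB)")
con.executemany("INSERT INTO main_table VALUES (?, ?, ?)",
                [("a", 1, b"x"), ("a", 2, b"y"), ("b", 3, b"z")])

# One query returns only the two lightweight columns; then you
# iterate over the returned dataset, not over the database itself.
pairs = list(con.execute("SELECT name, timestamp FROM main_table"))
for name, ts in pairs:
    print(name, ts)
```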
-
Awesome JKSH. Time to get to work.
So after some more reading, it looks like indexing is only needed for searching on columns other than the AUTOINCREMENT column or the default rowid.
And for locating a database record by its rowid or autoincrement column, a binary search is used, which is far better than sequentially parsing the whole file with QDataStream to find your data.
-
@SGaist Do you happen to know if HDF5 is any faster than SQLite for lots of QByteArray storage as mentioned in this thread?
-
I don't know, sorry. The best thing to do is write a benchmark so you can select what works best for you.