Non-blocking local file IO in Qt5
-
I'm looking at how I can implement non-blocking file I/O in Qt5. My application reads large data files and displays some summarised information about them, and I'd like to do this (a) incrementally, and (b) without freezing the GUI as the files are read from disk.
I note that there are non-blocking TCP, Bluetooth and even RS-232 serial interfaces, but none for local files. (This isn't just an absence in Qt, I know GTK and Twisted are also missing this functionality.) Is there a third party library or piece of idiomatic code that shows how to do this? Can it be done with higher level structures (eg. thread-safe queues) rather than messing around with threads and low-level threading constructs?
For example, what would the non-blocking fortune client example look like using a local file? What would the readFortune() slot be connected to, or how would it be implemented?
Searching for various terms around this just leads me to a Stack Overflow post that says "use threads," but doesn't really demonstrate how. The forum search seems to drop the "IO" term, or include it in larger words, so it's not too helpful.
(Also: is there a tag for I/O related questions? Obviously "io" doesn't meet the 3 character minimum.)
-
@detly
There is no such thing as a non-blocking IO, IO operations are synchronous. That said, judging from points a) and b), you want your file to be read/written in the background (which doesn't translate to non-blocking IO semantics), and you could indeed achieve that by threading your application. As for the implementation, it depends: are you trying to read the whole file and commit it to memory, or just schedule a read and be notified when the operation has completed? Here is a very simple example that reads a whole file in the background:

```cpp
#include <QByteArray>
#include <QFile>
#include <QObject>
#include <QString>
#include <QThread>

class Reader : public QObject
{
    Q_OBJECT
public:
    // ...
signals:
    void fileRead(QString fileName, QByteArray data);
public slots:
    void readFile(QString fileName);
};

void Reader::readFile(QString fileName)
{
    QByteArray result;
    QFile file(fileName);
    if (!file.open(QFile::ReadOnly))
        return; // the file could not be opened, so there is nothing to report
    // ... read the data and fill the result buffer
    file.close();
    emit fileRead(fileName, result);
}

// You use that simply by starting a thread and moving the worker object to it
QThread *thread = new QThread();
thread->start();
Reader *reader = new Reader();
reader->moveToThread(thread);
QObject::connect(...); // Connect the signals and slots to your reader object
```
This utilizes the low-level threading API. There are multiple approaches suitable depending on your needs.
-
Basically, what I'm getting at is: I want to deal with data from a local file just like I would from a TCP socket. Qt has a QTcpSocket class for TCP reading which has a readyRead() signal, after which you can read data from the socket without blocking. There is no such mechanism for local files. (Is there a third-party library that provides them?)
The files are far too big to read entirely into memory to operate on, I'd have to schedule my reads and be notified at "sensible" points (sensible being defined by the domain here).
Do I have to build this up from threading primitives, from scratch? Are there higher-level constructs, such as thread-safe queues, that I could push data to in one thread and pull it from in another?
I have to admit to being confused by comments like this:
There is no such thing as a non-blocking IO, IO operations are synchronous.
(which doesn't translate to non-blocking IO semantics)
The example I link to is explicitly contrasted with a blocking fortune client, implying that non-blocking I/O is a thing, and that the semantics of incremental reading and buffering without blocking the GUI is what it refers to. Where have I gone wrong?
-
@detly said:
Qt has a QTcpSocket class for TCP reading which has a readyRead() signal, after which you can read data from the socket without blocking.
Yes, because the socket is threaded internally and you get data from time to time. When data arrives it's put into a buffer and then the signal is emitted. You read from the socket's buffer, so it seems that it is not blocking. This is not applicable to files: the data in a file is always ready to be read, because files are random-access devices. One thing to consider is the classical producer-consumer problem. You start a thread that reads the file and fills a buffer, and another thread reads it and parses it or does whatever you want with the data. Another point is that a thread-safe queue is not really a high-level primitive. You can, if you wish, implement such a thing (it's done with a mutex and semaphore for synchronization) but as far as I know Qt does not provide any such thing.
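To make the producer-consumer idea concrete, here is a minimal sketch of a thread-safe queue using only the C++ standard library. It is an illustration, not a Qt facility: `BlockingQueue` and `produceAndConsume` are hypothetical names, and a `std::condition_variable` is used in place of the semaphore mentioned above. The same structure could be built with Qt's own mutex and wait-condition classes.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper, not a Qt class: a minimal thread-safe FIFO queue.
template <typename T>
class BlockingQueue {
public:
    // Producer side: append an item and wake one waiting consumer.
    void push(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!closed_ && "push() after close() is a programming error");
            items_.push_back(std::move(item));
        }
        ready_.notify_one();
    }

    // Producer side: announce that no more items will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Consumer side: block until an item is available and move it into `out`.
    // Returns false once the queue has been closed and fully drained.
    bool pop(T &out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty() || closed_; });
        if (items_.empty())
            return false;
        out = std::move(items_.front());
        items_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<T> items_;
    bool closed_ = false;
};

// Demo: a producer thread pushes 0..count-1 while the caller consumes them.
std::vector<int> produceAndConsume(int count) {
    BlockingQueue<int> queue;
    std::thread producer([&queue, count] {
        for (int i = 0; i < count; ++i)
            queue.push(i);
        queue.close();
    });

    std::vector<int> received;
    int item = 0;
    while (queue.pop(item))
        received.push_back(item);
    producer.join();
    return received;
}
```

In the file-reading case the producer would be the thread that reads and parses the file, and the items would be the parsed structs. Note that a GUI thread should not sit in a blocking `pop()`, which is one reason the signal/slot route discussed later in the thread is usually more convenient.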
PS:
The example I link to is explicitly contrasted with a blocking fortune client, implying that non-blocking I/O is a thing
Look at the first line of my comment about sockets and you'll understand what I mean. The socket is internally threaded; the situation is similar for serial ports, Bluetooth, USB and so on.
-
One thing to consider is the classical producer-consumer problem. You start a thread that reads the file and fills a buffer, and another thread reads it and parses it or does whatever you want with the data.
Here, the blocking code is a file reader that reads data from disk and populates some structs appropriately. A queue seemed like the best way to get those structs to the GUI code. I don't know if I can use signals/slots to do it directly (eg. the struct is just an argument of the signal being emitted), because I don't know what their guarantees are with ordering, especially over thread boundaries.
-
@detly
Hello,
Here, the blocking code is a file reader that reads data from disk and populates some structs appropriately.
That actually clears it a bit.
A queue seemed like the best way to get those structs to the GUI code. I don't know if I can use signals/slots to do it directly (eg. the struct is just an argument of the signal being emitted), because I don't know what their guarantees are with ordering, especially over thread boundaries.
You can use any type as a signal or slot argument, provided you've registered it with the meta-object system (Q_DECLARE_METATYPE and qRegisterMetaType() for queued connections). For your particular case, you could use my example where a thread is started, and do the parsing while reading the file. When a structure with information is ready, you emit a signal that is connected to some slot in your GUI. This should do the trick. If your structures are not POD and hold dynamically allocated data, consider writing a wrapper that uses implicit sharing. The good thing about signals and slots is that connections across threads are thread-safe, so you won't need to do any explicit synchronization.
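The "parse while reading, then notify per structure" loop described above could look roughly like the following sketch. It uses only the standard library so that it compiles without Qt. `Record`, `readInChunks` and `collectChunks` are hypothetical names, the fixed-size chunking is an assumption standing in for the domain-specific parsing, and in a Qt worker object the callback would simply be an `emit` of a signal carrying the record.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type standing in for the domain structs in the question.
struct Record {
    std::size_t offset;   // byte offset of this chunk within the stream
    std::string payload;  // raw bytes of the chunk
};

// Reads `input` in chunks of at most `chunkSize` bytes and calls `onRecord`
// once per chunk, so the caller is notified incrementally instead of waiting
// for the whole file. Returns the number of records delivered.
std::size_t readInChunks(std::istream &input, std::size_t chunkSize,
                         const std::function<void(const Record &)> &onRecord) {
    assert(chunkSize > 0);
    std::vector<char> buffer(chunkSize);
    std::size_t offset = 0;
    std::size_t count = 0;
    while (input) {
        input.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const std::streamsize got = input.gcount();
        if (got <= 0)
            break;  // end of stream, nothing more to deliver
        const std::size_t bytes = static_cast<std::size_t>(got);
        onRecord(Record{offset, std::string(buffer.data(), bytes)});
        offset += bytes;
        ++count;
    }
    return count;
}

// Demo helper: gathers the payloads that the callback receives.
std::vector<std::string> collectChunks(const std::string &data, std::size_t chunkSize) {
    std::istringstream input(data);
    std::vector<std::string> chunks;
    readInChunks(input, chunkSize,
                 [&chunks](const Record &r) { chunks.push_back(r.payload); });
    return chunks;
}
```

With a real file, `input` would be a `std::ifstream` opened in binary mode (or a `QFile` read in the worker's slot), and the loop would run in the worker thread so the GUI thread only ever sees the finished records.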
I hope that helps.
Kind regards.
-
Bah, in my rush before I forgot a vital detail: the structs will also be accompanied by blobs of binary data, in the realm of 32kB or more. Not something you'd normally pass on the stack, although I realise that signal data is marshalled and queued internally. Is this a "sensible" thing to use signals/slots for, or is 32kB+ the point at which I should be looking at more serious data sharing strategies?
Also, I'm assuming that if one thread emits a series of identical signals via the default connection type (which acts like a QueuedConnection across thread boundaries), they will be received in the same order. Is this correct?
-
@detly said:
Bah, in my rush before I forgot a vital detail: the structs will also be accompanied by blobs of binary data, in the realm of 32kB or more. Not something you'd normally pass on the stack, although I realise that signal data is marshalled and queued internally. Is this a "sensible" thing to use signals/slots for, or is 32kB+ the point at which I should be looking at more serious data sharing strategies?
It is possible, although I advise you to implement implicit sharing. It's done quite easily through the QSharedDataPointer and QSharedData classes. Many of the Qt classes use those two to minimize actual data copying (including but not limited to all containers, images and the like). There are examples in the documentation showing how to implement your own implicitly shared class, so I won't go into it here. With sharing you can easily (and, most importantly, cheaply) copy around objects that have payloads of megabytes without actually duplicating the data.
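For readers unfamiliar with implicit sharing, the idea can be sketched with the standard library alone. This is not how QSharedDataPointer is implemented, only an illustration of the copy-on-write behaviour it provides. `SharedBlob` and `demoImplicitSharing` are hypothetical names, and the sketch is not hardened for several threads writing to copies at the same time.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical class illustrating implicit sharing (copy-on-write).
class SharedBlob {
public:
    explicit SharedBlob(std::vector<unsigned char> bytes)
        : data_(std::make_shared<std::vector<unsigned char>>(std::move(bytes))) {}

    // Read access never copies the payload.
    std::size_t size() const { return data_->size(); }
    unsigned char at(std::size_t i) const { return data_->at(i); }

    // Write access detaches first, so other copies keep the old bytes.
    void set(std::size_t i, unsigned char value) {
        detach();
        data_->at(i) = value;
    }

    // True while both objects still point at the same underlying buffer.
    bool sharesDataWith(const SharedBlob &other) const { return data_ == other.data_; }

private:
    // Make a private deep copy only if someone else also holds the buffer.
    void detach() {
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::vector<unsigned char>>(*data_);
    }

    std::shared_ptr<std::vector<unsigned char>> data_;
};

// Usage sketch: copying shares the buffer; writing to the copy detaches it.
void demoImplicitSharing() {
    std::vector<unsigned char> bytes(32 * 1024, 0);  // a 32 kB blob of zeros
    SharedBlob original(std::move(bytes));
    SharedBlob copy = original;                      // cheap: only the pointer is copied
    assert(copy.sharesDataWith(original));
    copy.set(0, 42);                                 // triggers the one deep copy
    assert(!copy.sharesDataWith(original));
    assert(original.at(0) == 0 && copy.at(0) == 42);
}
```

Passing such an object through a signal copies only the pointer, which is why payloads of 32 kB or more are not a problem in themselves.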
Also, I'm assuming that if one thread emits a series of identical signals via the default connection type (which acts like a QueuedConnection across thread boundaries), they will be received in the same order. Is this correct?
Yes, you are correct. The slot invocations are in actuality events in the event queue (for queued connections) of the thread, so they will be performed in the order of their arrival (the order in which signals had been emitted).
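The ordering argument can be illustrated with a toy event queue. This is only an analogy built on the standard library, not Qt's actual event loop: `EventQueue` and `emitAndDeliver` are hypothetical names. A worker thread posts pending "slot invocations" and the receiving side runs them oldest first.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a thread's event queue; not Qt's implementation.
class EventQueue {
public:
    // Called from any thread: enqueue a pending slot invocation.
    void post(std::function<void()> event) {
        std::lock_guard<std::mutex> lock(mutex_);
        events_.push(std::move(event));
    }

    // Called from the receiving thread: run pending events, oldest first.
    void processAll() {
        for (;;) {
            std::function<void()> event;
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (events_.empty())
                    return;
                event = std::move(events_.front());
                events_.pop();
            }
            event();  // run outside the lock so a handler may post new events
        }
    }

private:
    std::mutex mutex_;
    std::queue<std::function<void()>> events_;
};

// A worker thread "emits" 0..count-1; the caller then delivers them in order.
std::vector<int> emitAndDeliver(int count) {
    assert(count >= 0);
    EventQueue queue;
    std::vector<int> delivered;
    std::thread worker([&queue, &delivered, count] {
        for (int i = 0; i < count; ++i)
            queue.post([&delivered, i] { delivered.push_back(i); });
    });
    worker.join();       // every "signal" has been emitted at this point
    queue.processAll();  // the receiver's loop runs the queued "slots"
    return delivered;
}
```

Because a single emitting thread appends to the queue in emission order and the receiver removes from the front, the values arrive as 0, 1, 2, and so on.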