Multiple threads with QNetworkAccessManager
-
I have a file with 10000 lines of data, and I have to send each line to a local intranet service for processing. The service replies with JSON. If the current line of data is approved, the intranet returns:
{ "data": "success" }
otherwise:
{ "data": "error" }
I want to collect all the "success" items and put them in a list view, but I want to speed up the process using multiple threads (like 5 or 10 or even more), each with its own QNetworkAccessManager. How can I do this?
-
Hi,
Do you mean you want to send your file in several chunks in parallel?
In any case, IIRC, QNetworkAccessManager already handles up to six requests in parallel (for HTTP, per host/port combination) and queues the rest.
-
No. I have a file with 10000 lines of data. I read the entire file into a list view, and for each item I want to use QNetworkAccessManager to connect to the intranet API and get the response for that item. For example, if the first item in the list is "maria", I want to send it to the intranet and get its JSON reply. The thing is that the user can select how many concurrent threads do the job, so he won't need to wait on a single QNetworkAccessManager for the response; he can create like 100 threads and each thread making requests... I also have another list view for the items the intranet API reported as "success", meaning I want to add every item that a thread processed as "success" to that other list view.
-
@Volebab said:
he can create like 100 threads and each thread making requests...
Which in reality is pointless. Once you go over the number of physical cores on a system, creating more threads is not beneficial for purposes of speed. Two threads running on the same CPU core will compete for time allocation, and you will only get overhead from the OS's scheduler, but not a grain of speed.
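To make the point about core counts concrete, here is a minimal sketch of capping a user-chosen thread count. It uses only the standard library (std::thread::hardware_concurrency()); in a Qt program QThread::idealThreadCount() plays the same role. The helper name clampWorkerCount is made up for illustration, and note that hardware_concurrency() reports logical cores, which may be more than the physical ones.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical helper: limits a user-requested worker count to the number
// of hardware threads passed in.
unsigned clampWorkerCount(unsigned requested, unsigned hardwareThreads)
{
    // hardware_concurrency() may return 0 when the value is unknown;
    // fall back to a single worker in that case.
    const unsigned upper = std::max(1u, hardwareThreads);
    // Never go below one worker, never above the hardware limit.
    return std::min(std::max(requested, 1u), upper);
}

// Convenience overload that asks the system for the hardware limit.
unsigned clampWorkerCount(unsigned requested)
{
    return clampWorkerCount(requested, std::thread::hardware_concurrency());
}
```

With this, a request for 100 threads on a machine reporting 8 hardware threads would be reduced to 8.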
-
@kshegunov Yes, you are right. I did some research, and most people say that too many threads may kill performance and use more resources than needed, although using more than 2 threads, or even 10, will be harmless on some computers. Anyhow, I wanted to know how to do what I described above: the user will specify how many threads should do the job. (I will probably limit the maximum number of threads. It depends; most of the computers that will run this are servers within the network with a really nice configuration.)
-
Also depending on your server/network setup starting a hundred connections might not be the best idea.
Like I wrote, QNetworkAccessManager already handles several concurrent connections. What you can do is write a worker object that will contain a QNetworkAccessManager and you pass it the data chunk you'd like to send.
-
@SGaist It's local, and servers can normally handle thousands of requests per second, so that is not a problem.
You said that I can create a worker object, but how do I share the same list view between multiple workers? I mean, the list has 10000 items and each worker can take an item; how can I be sure that it won't be a mess?
-
It will depend on your architecture. You can have your worker object query a "task list" to get a job. Or have a "task manager" that will generate the workers as needed and give them the tasks they should do.
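A "task list" of this kind can be sketched without any Qt at all. The version below is only an illustration using the standard library: TaskList, claimNext and drainWithWorkers are hypothetical names, and the comment marks where a real worker would issue its network request.

```cpp
#include <atomic>
#include <cstddef>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical "task list": the names are loaded once and never modified.
// Workers only share an atomic index, so no two workers get the same item.
class TaskList
{
public:
    explicit TaskList(std::vector<std::string> names)
        : m_names(std::move(names)), m_next(0) {}

    // Returns a pointer to the next unclaimed name, or nullptr when the
    // list is exhausted. Safe to call from several threads at once.
    const std::string *claimNext()
    {
        const std::size_t index = m_next.fetch_add(1);
        return index < m_names.size() ? &m_names[index] : nullptr;
    }

private:
    const std::vector<std::string> m_names;
    std::atomic<std::size_t> m_next;
};

// Runs workerCount threads that drain the list and returns how many items
// were claimed in total.
std::size_t drainWithWorkers(TaskList &tasks, unsigned workerCount)
{
    std::atomic<std::size_t> claimed(0);
    std::vector<std::thread> workers;
    for (unsigned i = 0; i < workerCount; ++i) {
        workers.emplace_back([&tasks, &claimed] {
            while (tasks.claimNext() != nullptr) {
                // A real worker would send the request for this name here.
                ++claimed;
            }
        });
    }
    for (std::thread &worker : workers)
        worker.join();
    return claimed.load();
}
```

Because each call to claimNext() hands out a distinct index, five workers draining a 10000-item list process every item exactly once, with no locking around the list itself.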
-
Which part?
-
@Volebab
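Here's how a worker object is defined and used (barebone code only):
```cpp
#include <QNetworkAccessManager>
#include <QObject>

class MyWorkerObject : public QObject
{
    Q_OBJECT

public:
    MyWorkerObject() : QObject(NULL), nam(NULL) { }
    ~MyWorkerObject() { delete nam; }

public slots:
    // Connected to QThread::started() in the usage code, so the manager is
    // created in the worker thread and gets that thread's affinity.
    void initialize()
    {
        nam = new QNetworkAccessManager;
    }

    void sendRequest()
    {
        // Send your requests here (connect to the proper controlling signal)
    }

private:
    QNetworkAccessManager * nam;
};
```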
Which you use like this:
```cpp
QThread * workerThread = new QThread();
MyWorkerObject * workerObject = new MyWorkerObject;
workerObject->moveToThread(workerThread);

// Create the network access manager inside the worker thread once it has started
QObject::connect(workerThread, SIGNAL(started()), workerObject, SLOT(initialize()));
// Clean up both objects when the thread finishes
QObject::connect(workerThread, SIGNAL(finished()), workerObject, SLOT(deleteLater()));
QObject::connect(workerThread, SIGNAL(finished()), workerThread, SLOT(deleteLater()));
// Stop the thread's event loop when the application is about to quit
QObject::connect(QCoreApplication::instance(), SIGNAL(aboutToQuit()), workerThread, SLOT(quit()));

workerThread->start();

// Connect the appropriate signal(s) to MyWorkerObject::sendRequest
```
PS. I should really sit down and write a threading tutorial for the wiki ...
-
@kshegunov In the absolute, there's already an example in QThread's documentation, but improvements to the docs/examples are always a good idea :)
-
I will try to explain it better:
I have a file with 10000 items; each line is the full name of a person, like "Johnnie Doe". We have an intranet with an API that we can send requests to, for example http://192.168.182.10/user.php?name=Johnnie Doe, and it returns JSON with a valid property that specifies whether the customer is validated in the system, along with other information about her/him, like name, documents and so on. The things that I want to know are:
- How, and whether it is possible, to read a big file with thousands of lines into a list without crashing or making the software slow?
- How to use multiple threads (I can specify how many) in order to speed up the process of connecting with the intranet API.
- How to get the JSON result of each thread and put it in a table with only the name of the customer and the result of the valid property. For example: I query the customer "Johnnie Doe" and he is validated, so I want to put his name and true for the validation in a table.
- How to make each thread get items from the list without making a mess. For example, if I have 5 threads getting items from the list, isn't it going to get messy? How do I make each one get an item from the list without problems?
I work in a company where I have to do this mostly manually, and I want to automate it; it's a pain.
@kshegunov - Thank you for the example.
-
@SGaist
I suppose so, but this code I can write in my sleep. I can't seem to remember the number of times I've written it here. Also, the documentation doesn't really cover some finer points, like actually waiting for the thread to finish, or running a loop through the event loop (although this should be simple to gather by yourself if you understand how the API works in the first place).
-
@Volebab said:
How, and whether it is possible, to read a big file with thousands of lines into a list without crashing or making the software slow?
You put the reading of the file and the heavy lifting in (a) worker object's slot(s) (like above). You control the worker object through signals (that are connected to its slots). The worker object notifies the GUI (or main thread) again by raising signals, which you connect to slots in the widgets.
How to use multiple threads (I can specify how many) in order to speed up the process of connecting with the intranet API.
As already discussed, use one thread for the file reading, leave the network access manager to thread the network requests itself. Use the asynchronous API of the NAM, so you don't block your worker thread (it emits signals when a reply was received, connect those to slots in your worker object).
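One detail when building such requests: the sample URL earlier in the thread contains a space in the name, which has to be percent-encoded before it is sent. In Qt, QUrlQuery normally takes care of this; the sketch below only illustrates the idea with the standard library. percentEncode and buildUserUrl are hypothetical helpers, and the host and path are simply the ones from the example URL in the question.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: percent-encodes everything except the characters
// that are always safe in a URL query value (letters, digits, - _ . ~).
std::string percentEncode(const std::string &value)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string encoded;
    for (unsigned char c : value) {
        if (std::isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') {
            encoded += static_cast<char>(c);
        } else {
            // Emit the byte as %XX using its two hexadecimal digits.
            encoded += '%';
            encoded += hex[c >> 4];
            encoded += hex[c & 0x0F];
        }
    }
    return encoded;
}

// Builds the request URL for one customer name (address taken from the
// example in the question).
std::string buildUserUrl(const std::string &name)
{
    return "http://192.168.182.10/user.php?name=" + percentEncode(name);
}
```

For the name "Johnnie Doe" this yields a query value of Johnnie%20Doe.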
How to get the JSON result of each thread and put it in a table with only the name of the customer and the result of the valid property. For example: I query the customer "Johnnie Doe" and he is validated, so I want to put his name and true for the validation in a table.
If talking about a table widget (GUI) then emit a signal from the worker object with the customer's information and connect that signal to a slot in the widget/controller object that will add it to the UI.
How to make each thread get items from the list without making a mess. For example, if I have 5 threads getting items from the list, isn't it going to get messy? How do I make each one get an item from the list without problems?
You are thinking even lower level. With multiple threads accessing the same data you need to protect that data. The most basic thing to do in that situation is to have a thread-safe queue (which is done with the help of a mutual exclusion lock, QMutex, and a semaphore, QSemaphore), but really, this is low-level stuff that requires experience. If you limit your threads and objects to communicating through signals and slots, you don't need to worry about that.
Kind regards.
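For the curious, the thread-safe queue mentioned above could be sketched roughly like this. This is a standard-library illustration, not Qt code: std::mutex plays the role of QMutex, and a std::condition_variable is swapped in where the post mentions QSemaphore. SafeQueue and NameQueue are hypothetical names.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical thread-safe queue: every access to m_items happens while
// m_mutex is held, and consumers sleep until there is something to take.
template <typename T>
class SafeQueue
{
public:
    // Adds an item and wakes one waiting consumer.
    void push(T item)
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_items.push_back(std::move(item));
        }
        m_available.notify_one();
    }

    // Marks the queue as finished so blocked consumers can return.
    void close()
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_closed = true;
        }
        m_available.notify_all();
    }

    // Blocks until an item is available or the queue is closed.
    // Returns false only when the queue is closed and empty.
    bool pop(T &out)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_available.wait(lock, [this] { return !m_items.empty() || m_closed; });
        if (m_items.empty())
            return false;
        out = std::move(m_items.front());
        m_items.pop_front();
        return true;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_available;
    std::deque<T> m_items;
    bool m_closed = false;
};

// Queue of customer names, as it might be shared between a reader and
// several request workers.
using NameQueue = SafeQueue<std::string>;
```

A reader thread would call push() for every line and close() at the end, while each worker loops on pop() until it returns false. Compared with signals and slots, this is exactly the kind of bookkeeping you have to get right yourself.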
-
Amazing answer and I get most of what you said, the only thing though that I have no idea how to accomplish is:
You are thinking even lower level. With multiple threads accessing the same data you need to protect that data. The most basic thing to do in that situation is to have a thread safe queue (which is done with a help of a mutual exclusion lock QMutex and a semaphore QSemaphore) but really, this is low-level stuff that requires experience. If you limit your threads and objects to communicate through signals and slots you don't need to worry about that.
I have a QListWidget full of names and a QTableWidget to show the information. I have a worker that reads the big file and adds each line as a QListWidget item, and I have a worker that uses QNetworkAccessManager to connect to the API.
The question now is: how does the worker using QNetworkAccessManager get an item from the QListWidget without colliding with another worker getting one at the same time?
I know that it might be asking too much, but if you could provide an example I would be really happy. I really need to get this working so I can ease my job here at the company.
-
@Volebab
Aha! Well that's a bit counter-intuitive indeed. The worker can emit a signal that it can process data (send request or w/e). The GUI is subscribed to that signal and in the slot that handles it, it raises its own signal with the data. The signal that the GUI emits is connected to the worker and the worker gets its data. It sounds like a rollercoaster, but is actually quite simple. Something like this (again bare-bone code only):
```cpp
class MyWorker : public QObject
{
    Q_OBJECT

signals:
    void canProcessData();

public slots:
    void process(MyDataContainer data)
    {
        // Send requests w/e
        // If you want to repeat, emit canProcessData() at the end
    }
};

class GuiClass : public QObject // Can be widget or QObject, depending on the exact architecture of your application
{
    Q_OBJECT

signals:
    void startDataProcessing(); // This will start the data processing
    void dataToProcess(MyDataContainer data);

public slots:
    void dataRequested()
    {
        MyDataContainer data;
        // Collect the data from the GUI side
        // After collecting the data emit the appropriate signal
        emit dataToProcess(data);
    }
};
```
You connect those signals in a loop between the two objects:
```cpp
// These two are only for completeness, you have to adjust the pointers so they are referencing the correct objects
MyWorker * worker;
GuiClass * gui;

// Delegating the signal so you can start the loop
QObject::connect(gui, SIGNAL(startDataProcessing()), worker, SIGNAL(canProcessData()));
// The worker requests a new batch of data
QObject::connect(worker, SIGNAL(canProcessData()), gui, SLOT(dataRequested()));
// The GUI provides new data for the worker (the receiving slot in MyWorker is process())
QObject::connect(gui, SIGNAL(dataToProcess(MyDataContainer)), worker, SLOT(process(MyDataContainer)));
```
This is how you can "pull" data (as opposed to the usual "push") from another thread.
If you have two workers, you can connect one directly to the other. Suppose readWorker is the worker object that reads the data, netWorker is the one making the requests, and gui is the GUI class. The power of the signal-slot mechanism should become obvious here:
```cpp
class ReadWorker : public QObject
{
    Q_OBJECT

signals:
    void haveCustomerName(CustomerName);

public slots:
    void startReading()
    {
        // Read the file, for each set of customer name emit the signal
        while ( /*... reading the file ... */) {
            CustomerName name; // You get a customer from the file
            emit haveCustomerName(name);
        }
    }
};

class NetworkWorker : public QObject
{
    Q_OBJECT

signals:
    void haveCustomerData(CustomerData);

public slots:
    void getCustomerData(CustomerName name)
    {
        // Make the network request, connect other signals if need etc.
        // At the end of the day, this slot should emit haveCustomerData when it has received it from the network
    }
};

class GuiClass : public QObject // Can be widget or QObject, depending on the exact architecture of your application
{
    Q_OBJECT

signals:
    void start(); // This will start the data processing (i.e. reading the file and sending requests)

public slots:
    void onNewCustomerName(CustomerName name)
    {
        // This will give you the customer name while waiting for response from the network. You can use it to pre-populate the GUI
    }

    void onNewCustomerData(CustomerData data)
    {
        // The data had arrived and the worker has notified you. Use it to update the GUI
    }
};
```
You can connect those three "in a triangle":
```cpp
// These are only for completeness, you have to adjust the pointers so they are referencing the correct objects
ReadWorker * readWorker;
NetworkWorker * netWorker;
GuiClass * gui;

// Starting the reading when the GUI emits start()
QObject::connect(gui, SIGNAL(start()), readWorker, SLOT(startReading()));
// One worker notifies the other that a customer name has been read. The network worker can start making requests with that
QObject::connect(readWorker, SIGNAL(haveCustomerName(CustomerName)), netWorker, SLOT(getCustomerData(CustomerName)));
// The worker that reads the data also notifies the GUI for the customer name, so the GUI can pre-populate (if it wishes) some widgets or w/e
QObject::connect(readWorker, SIGNAL(haveCustomerName(CustomerName)), gui, SLOT(onNewCustomerName(CustomerName)));
// The worker that handles the network notifies the GUI that customer data is available, GUI should then update itself to reflect that.
QObject::connect(netWorker, SIGNAL(haveCustomerData(CustomerData)), gui, SLOT(onNewCustomerData(CustomerData)));
```
Well, it turned out a bit longer than I intended initially ... anyway, I think this should help.
Kind regards.
-
What is the purpose of the GuiClass and MyDataContainer classes?
GuiClass is a name for the class managing the user interface. It can be a QObject, or a MainWindow subclass, or practically anything that has a QObject ancestor. If you have derived from QMainWindow, which is the usual approach, then GuiClass is exactly your main window class.
MyDataContainer, again, is a generic name I chose when writing the example. It's the class that contains the data (whence the name) to be transferred between the threads. As you can see, there are no declarations for it; it can be QVector, QImage, QString or some integral type such as int. It can also be a complex user-defined type (a structure or a class, but then it has to be registered with Qt's meta-type system, e.g. with Q_DECLARE_METATYPE and qRegisterMetaType(), before it can be passed through queued connections).
I mean, I have QMainWindow, isn't it already a gui class?
It is and you can use that instead of GuiClass.
And why MyDataContainer if I will be reading using the ReadWorker?
So you can transfer the data safely between the threads.
If you give access directly (that is, you don't use signals and slots) to an object's methods/properties, then you have to put locks around each method call and/or property access that might happen from two threads.
Kind regards.
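To illustrate that last point, this is roughly what the direct-access route looks like: every accessor that two threads might call has to take the lock. The sketch uses std::mutex from the standard library (QMutex would be the Qt counterpart), and SharedCustomer is a made-up class, not something from the examples above.

```cpp
#include <mutex>
#include <string>

// Hypothetical object shared directly between threads. Each accessor
// locks the same mutex before touching the members.
class SharedCustomer
{
public:
    void setName(const std::string &name)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_name = name;
    }

    // Returns a copy, so the caller never holds a reference into the
    // protected data after the lock is released.
    std::string name() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_name;
    }

    void setValid(bool valid)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_valid = valid;
    }

    bool isValid() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_valid;
    }

private:
    mutable std::mutex m_mutex; // mutable so const getters can lock it
    std::string m_name;
    bool m_valid = false;
};
```

Forgetting the lock in a single accessor is enough to introduce a data race, which is why passing copies of the data through signals and slots is usually the easier route.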