Qt OpenMPI wrap

kshegunov · wrote on 5 Feb 2016, 01:22

Hello,
Sometime in the next six months to a year I will have to put together a computationally intensive application, and I'm thinking of writing a fairly simple wrapper around the OpenMPI library. The best-case scenario would be to have that Qt signal-slot sweetness available out of the box for each computational node. I know that interest in such a thing will probably be on the low side, but I was wondering how I could go about designing it. Suppose I put my code in a dynamic library where I derive from QCoreApplication and do the OpenMPI init/deinit there. What I'm not quite sure about is how to design a smooth transition from the OpenMPI messages to Qt's signal-slot mechanism. Does anyone have an idea or a suggestion?

Kind regards.

SGaist · 5 Feb 2016, 23:19

Hi,

Just to be sure I understand you correctly, do you want to use Qt to write MPI jobs ?

kshegunov · 2 May 2016, 23:29

@SGaist
Yep! I know it's possible since I've already done it for another project. There, however, I handled the messages (receiving, sending) the "OpenMPI" way, and only had an intermediary that serialized/deserialized my Qt objects to byte arrays, so I could move them between the nodes.

EDIT:
What do you mean by jobs? If you meant job management for the cluster, no, there's a queue manager for that. I mean to write an application that uses both the OpenMPI library and Qt. Each OpenMPI node runs its own copy of the program's main(), and the nodes communicate with each other via messages. You could think of it as inter-process communication, which I'd like to make a bit easier for me (and possibly others) by wrapping the message passing between the nodes as signal-slot connections.

SGaist · 7 Feb 2016, 01:48

So a Qt version of mpirun ?

Nice new avatar, reminds me of the old forum highest rank badge :)

kshegunov · 2 Jul 2016, 01:51

@SGaist

Nice new avatar, reminds me of the old forum highest rank badge

Thanks. Although I don't know the badge itself, it's strange it would resemble an atom ... :)

So a Qt version of mpirun ?

Well ... no. mpirun is just fine. I shall provide some example code (from an old project), and maybe it'll become clear what I'd like to do exactly. When one makes a program that runs on a cluster (or a PC using OpenMPI) one ordinarily has something like this (this is extracted from a class, so don't take note of the undefined/unused variables):

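#include <mpi.h>
#include <QMutex>
#include <QMutexLocker>
#include <QString>
#include <QTextStream>
#include <QWaitCondition>

// Defined elsewhere in the original class: receives and handles the probed message, throws a QString on error
void processMessage(MPI_Status & status);

int main(int argc, char ** argv)
{
	MPI_Init(&argc, &argv);

	QTextStream cerr(stderr);     // Assumed: the original class wrote errors through a QTextStream
	bool hasMessageLoop = true;   // A class member in the original code, cleared when the node should quit
	int messageAvailable = 0;     // Set to non-zero by MPI_Iprobe when a message is pending

	// Waiting variables
	QMutex waitMutex;
	QMutexLocker dummyLock(&waitMutex);
	QWaitCondition waitCondition;

	// Process messages
	MPI_Status status;
	while (hasMessageLoop)  {
		if (MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &messageAvailable, &status) != MPI_SUCCESS)  {
			cerr << QStringLiteral("Error probing for MPI message, process exiting!") << '\n';
			break;
		}

		if (!messageAvailable)  {
			// No message, sleep 1 second and try again.
			// NOTE: Since each task is quite slow to process there is no real reason for the busy wait implemented by MPI_Probe! Instead just poll for messages each second.
			waitCondition.wait(&waitMutex, 1000);
			continue;
		}

		try  {
			processMessage(status);
		}
		catch (const QString & message)
		{
			cerr << message << '\n';
			break;
		}
	}

	MPI_Finalize();

	return 0;
}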

As you can see, this is just an ordinary event loop. Similar to how events are generated for Qt by the window system, here they are in fact "generated" by OpenMPI. They are not what you'd call spontaneous in Qt's terms, but the principle of handling them is pretty much the same. Now suppose each node knows about every other node (they have int identifiers, so they can be distinguished) and at some point while processing data you want to send a message to another node. You'd issue something like:

MPI_Send(&pointIndex, 1, MPI_INT32_T, status.MPI_SOURCE, MSG_TAG_PROCESS_POINT, MPI_COMM_WORLD);  // The count is in elements of MPI_INT32_T, not in bytes

However, probing (as in the event loop above) is simply not enough. To actually get the message that was sent, you'd call a corresponding receive, like this:

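	// Ask how many bytes the probed message holds
	int size = 0;
	if (MPI_Get_count(&status, MPI_CHAR, &size) != MPI_SUCCESS || size == MPI_UNDEFINED)
		throw QStringLiteral("Can't get size for data");

	// Receive the initial data; QByteArray owns the buffer, so nothing leaks if a later step throws
	QByteArray data(size, Qt::Uninitialized);
	if (MPI_Recv(data.data(), size, MPI_CHAR, status.MPI_SOURCE, status.MPI_TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE) != MPI_SUCCESS)
		throw QStringLiteral("Can't receive data");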

And so you have communication between the nodes. This is obviously a bit cumbersome, and I believe it could be greatly improved by using Qt and its signal-slot mechanism. I envision a "syntax" like this:

class MyMpiController : public QObject
{
	// ...
};

int main(int argc, char ** argv)
{
	QMpiApplication application(argc, argv);
	MyMpiController controller;

	QObject::connect(&application, &QMpiApplication::initialize, [&controller] ()  {
		QObject::connect(QMpiNode::currentNode(), SIGNAL(messageReceived(QMpiMessage)), &controller, SLOT(processMessage(QMpiMessage)));
	});

	return QMpiApplication::exec();
}

Obviously, this could use a lot of improvement, but let's say it's a starting point. Now, for this to be feasible I'd have to hack up Qt's event loop and replace it with my own, which I'm still not quite clear how to do. I should also transmit some messages between the nodes for internal purposes (like reporting which nodes are online, and requesting them to quit). This, I believe, would be pretty straightforward. The thing is that it'd be best if I could somehow connect a dedicated messageReceived slot for each of the message types, which I'm also not very clear how to do. The MPI send/receive specifics wouldn't be a problem, since everything can be serialized before sending and deserialized after receiving.
I hope that cleared it up.
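
The "one slot per message type" idea above can be sketched in plain C++, without Qt or MPI, so that it stands alone. This is only a rough illustration under stated assumptions: `Message`, `MessageRouter`, `connect` and `dispatch` are hypothetical names made up for the example, not part of Qt, OpenMPI, or the planned wrapper.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a received MPI message
struct Message
{
	int source;                 // Rank of the sending node
	int tag;                    // Message type
	std::vector<char> payload;  // Serialized body
};

// Hypothetical router: one list of handlers ("slots") per message tag
class MessageRouter
{
public:
	using Handler = std::function<void(const Message &)>;

	// Attach a handler to a single message type
	void connect(int tag, Handler handler)
	{
		assert(handler);  // An empty std::function would throw when called
		handlers[tag].push_back(std::move(handler));
	}

	// Call every handler registered for the message's tag; returns how many ran
	std::size_t dispatch(const Message & message) const
	{
		const auto it = handlers.find(message.tag);
		if (it == handlers.end())
			return 0;

		for (const Handler & handler : it->second)
			handler(message);
		return it->second.size();
	}

private:
	std::map<int, std::vector<Handler>> handlers;
};
```

In such a design, something like `router.connect(MSG_TAG_PROCESS_POINT, handler)` would play the role of connecting a slot for one message type, and the message loop would call `dispatch` once for every message it has received.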

Kind regards.

SGaist · 10 Feb 2016, 22:05

Way clearer, thanks for the detailed explanation.

Do you plan to support both sync and async message passing ?

kshegunov · 2 Oct 2016, 22:11

@SGaist
Indeed, I do. It's pretty simple to add a blocking/nonblocking flag for the message. I'm advancing with the event dispatcher as well. I'm now pulling the messages from MPI and soon I'll be implementing the application-level event processing. I'll be using the metatype ID for the message classes as the "message tag" (what OMPI guys call the message type) and will be using the meta system heavily to create objects based on that on reception of the messages.
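
To illustrate the "type ID as message tag" idea, here is a minimal sketch in plain C++. It deliberately uses a hand-rolled registry instead of Qt's meta-type system, so every name in it (`MpiMessage`, `ProcessPoint`, `MessageFactory`, the ID value 1001) is a hypothetical stand-in rather than an existing API. The raw `memcpy` serialization also assumes both nodes share the same byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <functional>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical base class for anything that travels between nodes
struct MpiMessage
{
	virtual ~MpiMessage() = default;
	virtual int typeId() const = 0;                   // Doubles as the MPI tag
	virtual std::vector<char> serialize() const = 0;  // Bytes to hand to the send call
};

// Hypothetical message type carrying a single point index
struct ProcessPoint : MpiMessage
{
	enum : int { Id = 1001 };  // Stand-in for a metatype ID
	std::int32_t pointIndex = 0;

	int typeId() const override { return Id; }

	std::vector<char> serialize() const override
	{
		std::vector<char> bytes(sizeof(pointIndex));
		std::memcpy(bytes.data(), &pointIndex, sizeof(pointIndex));
		return bytes;
	}

	// Rebuild the object from received bytes; nullptr if the size is wrong
	static std::unique_ptr<MpiMessage> deserialize(const std::vector<char> & bytes)
	{
		if (bytes.size() != sizeof(std::int32_t))
			return nullptr;

		std::unique_ptr<ProcessPoint> message(new ProcessPoint);
		std::memcpy(&message->pointIndex, bytes.data(), sizeof(std::int32_t));
		return std::unique_ptr<MpiMessage>(std::move(message));
	}
};

// Maps a type ID (the received tag) to a function that creates the object
class MessageFactory
{
public:
	using Creator = std::function<std::unique_ptr<MpiMessage>(const std::vector<char> &)>;

	void registerType(int typeId, Creator creator)
	{
		creators[typeId] = std::move(creator);
	}

	// Returns nullptr for unknown type IDs or malformed payloads
	std::unique_ptr<MpiMessage> create(int typeId, const std::vector<char> & bytes) const
	{
		const auto it = creators.find(typeId);
		if (it == creators.end())
			return nullptr;
		return it->second(bytes);
	}

private:
	std::map<int, Creator> creators;
};
```

On reception, the tag reported by the probe would be passed to `create` together with the received bytes, and the resulting object could then be handed to whatever is connected for that type.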

The only problem I see is that MPI_Probe (a blocking check for a pending message) does not support timeouts, and I can't select on a file descriptor to get real async behavior. However, MPI_Probe implements a busy wait itself, so I'll just run the event loop without blocking and use the non-blocking probe instead; everything should be fine.
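
A rough sketch of what one pass of such a non-blocking loop could look like, again in plain C++ so it can be tried without MPI. The probe is injected as a callback; in a real build it would presumably wrap MPI_Iprobe, but here it is only a stand-in, and `PollingDispatcher`, `post` and `processOnce` are hypothetical names.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical dispatcher that interleaves message polling with local events
class PollingDispatcher
{
public:
	using Probe = std::function<bool(int & tag)>;  // Stand-in for a non-blocking probe
	using MessageHandler = std::function<void(int tag)>;
	using Event = std::function<void()>;

	PollingDispatcher(Probe probe, MessageHandler onMessage)
		: m_probe(std::move(probe)), m_onMessage(std::move(onMessage))
	{
		assert(m_probe && m_onMessage);
	}

	// Queue a local (application-level) event
	void post(Event event)
	{
		m_events.push_back(std::move(event));
	}

	// One pass: probe once without waiting, then drain the local events.
	// Returns true if any work was done, so the caller can decide whether to sleep.
	bool processOnce()
	{
		bool didWork = false;

		int tag = 0;
		if (m_probe(tag))  {
			m_onMessage(tag);
			didWork = true;
		}

		while (!m_events.empty())  {
			Event event = std::move(m_events.front());
			m_events.pop_front();
			event();
			didWork = true;
		}
		return didWork;
	}

private:
	Probe m_probe;
	MessageHandler m_onMessage;
	std::deque<Event> m_events;
};
```

The caller would invoke `processOnce` in a loop and sleep briefly whenever it returns false, which keeps the loop from spinning at full speed while no messages or events are pending.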

Kind regards.

SGaist · wrote on 10 Feb 2016, 22:13

Sounds good !