I did some research on this in January.
First, QNetworkAccessManager alone does not seem to be a solution, even though it is a good HTTP source.
The received content still has to go into a browser-like environment. Parsing HTML is not trivial; there is a tag soup implementation that would do the job, but some links are generated through JavaScript, so you really need to load the page in something browser-like: QtWebKit.
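To make the JavaScript problem concrete, here is a minimal sketch. It uses Python's standard html.parser rather than Qt or a tag soup library, and the sample page and the names LinkCollector and extract_static_links are made up for illustration. A purely static parser only sees the link that is written in the markup; the one the script would create at runtime never shows up.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags present in the static markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_static_links(html):
    """Return every href found without executing any script."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Hypothetical page: one link is in the markup, one is built by a script.
SAMPLE_PAGE = """
<html><body>
<a href="/static-page">Static</a>
<script>
  var a = document.createElement("a");
  a.href = "/generated-page";
  document.body.appendChild(a);
</script>
</body></html>
"""

print(extract_static_links(SAMPLE_PAGE))  # prints ['/static-page']
```

The "/generated-page" link only exists once the script has run, which is why a real browser engine is needed to find it.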
QtWebKit offers a lot of useful functionality for crawling; for example, it can extract all <a> tags (i.e. links).
The problem is that QtWebKit is not thread-safe, so to speed up crawling you would have to run multiple processes that share the work.
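The multi-process idea could look roughly like the sketch below. It is plain Python with the standard subprocess module, not Qt, and everything in it is an assumption for illustration: split_round_robin, crawl_in_processes and the stand-in worker are hypothetical names, and the worker only echoes its batch back instead of loading pages. In a real crawler each worker process would own its own browser engine instance.

```python
import json
import subprocess
import sys


def split_round_robin(urls, worker_count):
    """Deal the URLs out into worker_count batches, one per process."""
    return [urls[i::worker_count] for i in range(worker_count)]


# Stand-in worker (hypothetical): a real one would render each page in its
# own browser engine. This one just reports its process id and its batch.
WORKER_SOURCE = """
import json, os, sys
batch = json.loads(sys.argv[1])
print(json.dumps({"pid": os.getpid(), "urls": batch}))
"""


def crawl_in_processes(urls, worker_count):
    """Start one OS process per batch, then collect what each one reports."""
    procs = []
    for batch in split_round_robin(urls, worker_count):
        # The batch travels as a command-line argument, so every worker can
        # start on its share immediately and they all run side by side.
        procs.append(subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE, json.dumps(batch)],
            stdout=subprocess.PIPE,
            text=True,
        ))
    results = []
    for proc in procs:
        out, _ = proc.communicate()  # wait for this worker and read its output
        results.append(json.loads(out))
    return results
```

Calling crawl_in_processes(["u1", "u2", "u3"], 2) starts two workers, one handling u1 and u3 and the other u2. Since no engine state is shared between processes, the thread-safety issue does not come up; the price is the overhead of starting processes and passing results back.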