QNetworkAccessManager: download page resources
-
Hi, Qt Project.
I have a task to download a whole webpage and store it on the computer, so I need to download the page itself and all its resources (css, img, js).
@
image.png
style.css
common.js
@
Questions:
How can I download the whole webpage with all its resources?
What's the best way to implement some kind of cache manager (my task) that saves the page and its resources so they can be loaded again in the future?
Thanks!
-
Check this similar "thread with ideas how to download a whole web site":http://qt-project.org/forums/viewthread/20957 using QNetworkAccessManager with the assistance of QUrl.
-
You have to find the pieces of HTML code starting with href= or src=. The value can be quoted with either " or ', so I split the code on href= and src= and removed the first character (the opening quote). Then I cut off the unneeded part after the next ' or " (as in http://example.com/style.css">some other code...) and kept only the addresses whose ending matches a regexp for the types I actually need, such as .css, .png, etc.
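A rough sketch of that extraction step, assuming plain standard C++ with std::regex rather than the manual splitting described above (with Qt you would probably reach for its own regexp classes instead). The function name extractResourceLinks and the extension list are made up for illustration, and a regexp like this will not cope with every HTML page:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect href/src attribute values that end in one of
// the wanted extensions. The extension list is only an example.
std::vector<std::string> extractResourceLinks(const std::string& html) {
    // href= or src=, then an opening quote (" or '), then everything up to
    // the next quote of either kind.
    static const std::regex attr(R"rx((?:href|src)\s*=\s*["']([^"']*)["'])rx",
                                 std::regex::icase);
    // Keep only addresses ending in an extension we care about.
    static const std::regex wanted(R"rx(\.(?:css|js|png|jpg|gif)$)rx",
                                   std::regex::icase);

    std::vector<std::string> links;
    for (std::sregex_iterator it(html.begin(), html.end(), attr), end; it != end; ++it) {
        const std::string url = (*it)[1].str();
        if (std::regex_search(url, wanted))
            links.push_back(url);
    }
    return links;
}
```

Note that addresses with a query string (style.css?v=2) are skipped by this filter, so the pattern would need loosening for such pages.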
Some links can look like "/images/blahblah.png", so you need to pick those out (easily done with url.toString().startsWith("/")) and prepend the address of the site you downloaded the page from (for example "http://example.com" + "/images/blahblah.png").
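Something like the following could do the prepending, again as a standard C++ sketch with a made-up helper name (toAbsoluteUrl). It only handles links starting with a single "/"; in Qt, QUrl::resolved() covers the general case of relative links, so this is just to show the idea:

```cpp
#include <string>

// Hypothetical helper: turn a root-relative link such as "/images/a.png"
// into an absolute URL, given the URL of the page it was found on.
std::string toAbsoluteUrl(const std::string& pageUrl, const std::string& link) {
    // Anything not starting with a single '/' is returned unchanged
    // (absolute URLs, "//host/..." links, document-relative links).
    if (link.empty() || link[0] != '/' || link.rfind("//", 0) == 0)
        return link;

    // The site address is everything up to the first '/' after "scheme://".
    const std::string::size_type schemeEnd = pageUrl.find("://");
    if (schemeEnd == std::string::npos)
        return link;  // page URL has no scheme, nothing sensible to prepend
    const std::string::size_type pathStart = pageUrl.find('/', schemeEnd + 3);
    return pageUrl.substr(0, pathStart) + link;
}
```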
And don't forget to create the folder you are saving each file to; for blahblah.png, for example, that is /images inside the directory you are saving everything to.
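For the folder part, a minimal sketch assuming C++17 and std::filesystem (in Qt, QDir::mkpath() does the same job of creating all missing parent folders). The name saveResource is hypothetical, and the path is not sanitized, so a link containing ".." could end up outside the target directory:

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: write `data` to saveRoot + urlPath, creating the
// folders in between (e.g. "images" for "/images/blahblah.png").
bool saveResource(const fs::path& saveRoot, const std::string& urlPath,
                  const std::string& data) {
    // Drop leading '/' so the path is appended under saveRoot, not treated
    // as an absolute path that replaces it.
    const std::string::size_type first = urlPath.find_first_not_of('/');
    if (first == std::string::npos)
        return false;  // no file name in the path
    const fs::path target = saveRoot / urlPath.substr(first);

    // Create every missing parent folder; an existing folder is not an error.
    const fs::path dir = target.parent_path();
    if (!dir.empty()) {
        std::error_code ec;
        fs::create_directories(dir, ec);
        if (ec)
            return false;
    }

    std::ofstream out(target, std::ios::binary);
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    return out.good();
}
```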
Hope this helps :)
-
To download a website with all its resources, use QNetworkAccessManager::setCache to set a QNetworkDiskCache with your preferred directory to store the data.
After you download the page, you can do this on the QNetworkRequest to make an "offline" request:
@
// Load only from the cache; the request fails if the URL was never cached.
QNetworkRequest rq(QUrl("http://whatever.url"));
rq.setAttribute(QNetworkRequest::CacheLoadControlAttribute, QNetworkRequest::AlwaysCache);
@
-
I have the same question. I want to download all the resources of a webpage (css, js, image) by loading the page in QWebPage.
The problem that I have is that read() in QNetworkReply is sequential and after QWebPage uses read() for its own rendering, my program gets nothing to read (and then to save to a file).
I have seen a few posts suggesting that we should use a custom QNetworkAccessManager and a custom QNetworkReply, but I'm new to Qt and don't know exactly how to do this. I would appreciate it if you can give a little bit more information about this. If you have any sample code for this, that would be great too.