Memory size increases per page load

buvintech

I spent weeks (on and off, at least) trying to find a solution for the QtWebKit memory leak. There does not seem to be one. I found endless posts and discussions on the matter going back many years, with plenty of suggestions (that don’t work!). I also tried reading the Qt source for help. Here is a link:

http://code.woboq.org/qt5/

That led me to new ideas, but no luck! If you dig far enough you’ll find that QtWebKit uses the class MemoryCache, which describes itself as follows: “This cache holds subresources used by Web pages: images, scripts, stylesheets, etc.” Here’s a link to that source:

http://code.woboq.org/qt5/qtwebkit/Source/WebCore/loader/cache/MemoryCache.h.html

This is where the problems seem to come from, and it is what the most common solutions indirectly try to manipulate, without success. If you read the description of this class (found via that link), you find that they INTENTIONALLY designed the QWebView / QGraphicsWebView to grow in memory indefinitely until the OS reclaims the resources! Read this:

“...
Dead resources in the cache are kept in non-purgeable memory.

When we prune dead resources, instead of freeing them, we mark their memory as purgeable and keep the resources until the kernel reclaims the purgeable memory.

By leaving the in-cache dead resources in dirty resident memory, we decrease the likelihood of the kernel claiming that memory and forcing us to refetch the resource (for example when a user presses back).

And by having an unbounded number of resource objects using purgeable memory, we can use as much memory as is available on the machine. The trade-off here is that the CachedResource object (and its member variables) are allocated in non-purgeable TC-malloc'd memory so we would see slightly more memory use due to this.”

While this could be an interesting OPTION to have, why in the hell would this be forced on everyone? What browser works like this? What other class have you found that purposefully consumes infinite memory???

Anyway, I followed the advice bms20 posted in November 2011: http://qt-project.org/forums/viewthread/11105

He was vague, and his solution is a nasty one, but it can be made to work. Basically, the memory stays in use until you close the application, i.e. terminate the process, and the OS then frees it. So you can use QProcess to launch a process which has its own memory space. You can communicate between processes with QSharedMemory or QLocalSocket.

Rather than “spawning off every page” as he describes (I am still rather uncertain what he means by this, or how), I basically restart my entire app every X page loads. I used about 50, so it is infrequent but still happens often enough to prevent the infinite consumption. I launch another instance of my application, size it and place it exactly where the original was, and set a few more properties, getting this info via QSharedMemory. Then I close the original app. Depending on the OS, this is more or less seamless. Basically a funky window flash occurs. Maybe there is a way to eliminate that...
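The counting part of that restart trick can be sketched in plain C++ as below. This is only an illustration under stated assumptions, not the poster's actual code: ReloadCounter and pageLoaded() are hypothetical names, and the relaunch itself (for example through QProcess, as mentioned above) is left out because it depends on the application.

```cpp
#include <cstddef>

// Hypothetical helper: counts finished page loads and reports when the
// configured threshold has been reached.
class ReloadCounter {
public:
    explicit ReloadCounter(std::size_t threshold) : threshold_(threshold) {}

    // Call once per finished page load. Returns true when it is time to
    // hand over to a fresh process; the counter then starts again at zero.
    bool pageLoaded() {
        if (++count_ < threshold_) {
            return false;
        }
        count_ = 0;
        return true;
    }

    // Number of page loads seen since the last hand-over.
    std::size_t count() const { return count_; }

private:
    std::size_t threshold_;
    std::size_t count_ = 0;
};
```

In a Qt app, pageLoaded() would presumably be called from a slot connected to the web view's loadFinished(bool) signal, with a threshold such as the 50 mentioned above. When it returns true, the app would write its state to shared memory, launch the new instance and close itself.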

To provide a little more help than he did: there are complications to keep in mind when doing this. Some I resolved, some I did not. They may or may not affect you, depending on your usage. I also had a variety of ideas that hit roadblocks. You may have similar thoughts (or not think of a few things mentioned here) when tackling this, so hopefully this will help.

Shared memory can store simple data. I could not find a way to place a widget like a QGraphicsWebView into shared memory, though, so that 2 processes could share it directly. The best you can do is manually share the properties. I even tried passing pointers via string representations of memory addresses, and this kind of worked, but the OS threw access restriction errors when I tried to use the pointers. (Even then, however, I worried the addresses probably can’t be truly shared because they are only valid within their own memory space?)
There does not seem to be a platform-independent way to launch separate processes and contain them within a parent window. (Though platform-specific options do exist.)
There is no platform-independent way to determine the memory usage of your process and then use that as a means to determine when to do this nasty trick. The safest / easiest way is to just use a page load counter.
When it comes to the webview content, just sharing a url is not good enough. Reloading the url could produce undesired effects on the web page / web app. More importantly, what if you posted a form to the page? The url does not account for this; only query string submissions would be duplicated. The best solution is instead to read & write the entire page contents across the processes after the page has been loaded to begin with.
If you use or want to support nested iframes, you will have to plan for this in your page content transfers. The source needs to be extracted / written per frame.
When you spawn off the new process you will lose: history, visited-link coloration, session storage... I’m still looking for quick ways of handling these details, but probably more manual twiddling is required.
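For the point about manually sharing properties, one possible layout is a fixed-size, trivially copyable struct that can be copied byte for byte into a shared segment. The sketch below is an illustration only: HandoverState, makeState, pack and unpack are hypothetical names, the 2048-byte url field is an arbitrary choice, and a std::vector stands in for the raw buffer that a shared memory segment would provide. It carries only window geometry and a url; the page source discussed above would need an additional variable-length field.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical flat record of what the new process needs to restore.
// Only plain data: no pointers, no QObject-derived members.
struct HandoverState {
    std::int32_t x;
    std::int32_t y;
    std::int32_t width;
    std::int32_t height;
    char url[2048];  // zero-terminated, truncated if too long
};

static_assert(std::is_trivially_copyable<HandoverState>::value,
              "HandoverState must be safe to copy with memcpy");

// Build a record, truncating the url so the terminator always fits.
HandoverState makeState(int x, int y, int w, int h, const std::string& url) {
    HandoverState s{};  // zero-initialised, so url is already terminated
    s.x = x;
    s.y = y;
    s.width = w;
    s.height = h;
    const std::size_t n = std::min(url.size(), sizeof(s.url) - 1);
    std::memcpy(s.url, url.data(), n);
    return s;
}

// Copy the record into a byte buffer (stand-in for the shared segment).
std::vector<unsigned char> pack(const HandoverState& s) {
    std::vector<unsigned char> bytes(sizeof(HandoverState));
    std::memcpy(bytes.data(), &s, sizeof(HandoverState));
    return bytes;
}

// Read the record back; rejects buffers of the wrong size.
bool unpack(const std::vector<unsigned char>& bytes, HandoverState& out) {
    if (bytes.size() != sizeof(HandoverState)) {
        return false;
    }
    std::memcpy(&out, bytes.data(), sizeof(HandoverState));
    out.url[sizeof(out.url) - 1] = '\0';  // never trust the terminator
    return true;
}
```

Because both processes are the same executable in this scenario, the struct layout matches on both sides. Sharing such a record between differently built programs would need an explicit wire format instead.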

bms20

Apologies for being vague.

I was working on a digital signage system for Camvine, which went bust shortly after my post; hence I've got no reason to be vague anymore.

You should be able to get the white flashing out; I certainly did. I believe that you may need to wait for the signal that loading has finished.

Basically the signage system consisted of an OpenGL compositor which needed to run continuously. The signage system could display videos / images / PDFs and web pages. The web pages were bastards because of the memory exhaustion issue. Irrespective of how many web views I had in existence at a point in time, there was always a persistent memory leak.

The solution was to render the web view to an offscreen buffer (which existed in shared memory). The shared memory would then be loaded on damage into an OpenGL texture for compositing.

One thing I'd point out: I stopped using QSharedMemory because it wasn't reliable (this was in the Qt 4.7 series). I observed persistent shared segment leakage. I ended up writing my own implementations for both Linux and Windows to work around this problem. Basically, QSharedMemory segments were still in existence after all processes were terminated...

You should be able to do something similar for whatever system you are building.

Alternatively, you may want to look into the WebKit2 stuff, which is designed exactly with the idea of blowing away processes in mind.

Feel free to ask me additional questions if you'd like.

-bms20

buvintech

Hi bms20,

Thanks for the reply, and all this time after your original post! I can infer that you are very skilled, and I appreciate the help!

Let me lay out what I am trying to accomplish so my limitations are clearer and we can differentiate what you did / were able to do versus the scenario I am working with.

I am using Qt 5.1, with Qt Creator as my primary IDE. Creator lets you make an “HTML5 Application” with the wizard that pops up when you create a new project. This adds and implements an html5applicationviewer class which is essentially a wrapper for a QGraphicsWebView object.

Qt Creator is set up to automatically update this code as enhancements come out. I disabled this functionality, but intend to re-enable it at times and then re-implement my extensive customizations. To this end, I am altering as little as possible of the base code (and commenting the hell out of the changes when I do). Almost all of my additions are tacked onto the end of the cpp so I can easily copy and paste them back when Creator erases them during such an auto update. I believe I need to work like this because most of the customizations are additions to the html5applicationviewerprivate class, which is defined in the cpp to restrict access (so I can’t just use a separate file in the project)...

Anyway, I therefore must use the QGraphicsWebView object as my primary interactive component. This needs to work at least on Windows, Mac, and various mobile devices, e.g. Android. (Most of this Qt-provided wrapper class in fact revolves around adjustments for mobile use.)

I am still a little unclear about the fundamentals of your solution. When you have the chance, please elaborate on this part of your post:

"rendering the web view to...shared memory ... [and] then ... loaded ... into an OpenGL texture for compositing."

So, I gather you avoided my horror of swapping out the ENTIRE application?

Did you not actually use a QWebView / QGraphicsWebView in your app for the user to interact with, or did something else appear in your “view”? Or did you do so, but somehow “move” the data into an OpenGL object as needed to manage the destruction of web pages?

If you had the webview in shared memory, this must be another benefit to your own shared memory class. I can’t figure out any way of putting a widget into QSharedMemory. I think it follows the same rules as the basic container classes regarding “assignable-data-types”:

http://qt-project.org/doc/qt-4.8/containers.html#assignable-data-type

"The values stored in the various containers can be of any assignable data type...such as int and double, pointer types, and Qt data types such as QString, QDate, and QTime, but it doesn't cover QObject or any QObject subclass (QWidget, QDialog, QTimer, etc.). "

If possible, I would like to avoid any specific OS coding that I can. The primary reason for using Qt is platform independence.

Did you put a pointer to the widget into shared memory? If so, how did you use that pointer in another memory space?

buvintech

Also bms20, did you copy the frame source/contents from one process to another versus using the url? Did you account for frame nesting? Did you manage to account for history, session storage, and other such state losses?

I only ask because I am trying to retain as much info as possible. These kinds of state losses can obviously break a web app that is being displayed via webview. I don't expect to handle every problem that arises from this tactic, but most of the big ones at least...

buvintech

Bms20 - I have reread your post more carefully and I believe I may understand it better now. This is my understanding. Please correct any confusions on my part:

Your main process had an OpenGL Texture object which displays text, images, etc.

You would start a process in the background that contained a webview, and use that to browse to a given url.

From that, you would extract either the source (or perhaps an image of the entire web page?) and place that in shared memory.

Then you would kill that background process, and take that source (or finished rendering?) and display it in the OpenGL texture.

If that is all correct, is it therefore true that the web pages were read-only, non-interactive, JavaScript-free, etc.?

That would be an excellent solution for simply displaying the page, but wouldn’t help if your goal was like mine to have a “custom browser” that acts as a shell for an interactive web application.

I am still working on my attempt to transfer the entire state of the webview across processes, as the additional pieces such as visited links, session storage, etc. are needed. I’ll just have to limit support for the things I can’t manage to transfer without potentially causing errors in the app. For instance, there doesn’t seem to be a good way to transfer history (in the natural sense), because there are no public methods for manipulating it other than clear(). I could automatically browse the entire history on each “webview reload”, but that obviously could be fraught with peril depending upon the web app...

veveve

Did anyone submit this bug to the Qt team?
Will anything change in QtWebEngine?

buvintech

I've seen posts and bug reports about this going back for years. Apparently, they like it this way! Refer to my first post in this thread.

veveve

I reported this to the Qt bug tracker: https://bugreports.qt-project.org/browse/QTBUG-36530

Bach

I think the best solution to decrease load time is a CDN (Content Delivery Network). CDNs can also speed up your website globally. You should watch this video about CDNs: https://www.youtube.com/watch?v=_4COWL7oNSw

dv879

I realize this topic is very old, but I came across this problem today and managed to figure out a work-around.
The calls

QWebSettings::setMaximumPagesInCache(0);
QWebSettings::setObjectCacheCapacities(0, 0, 0);

had absolutely no effect on the memory leak under Windows 10 and Mac OS X 10.

After reading the post above, about the QWebView's caching mechanism and memory usage, I looked for more information and came across this article: https://webkit.org/blog/516/webkit-page-cache-ii-the-unload-event/
Something caught my attention - "unload event handlers, prevent pages from going into the Page Cache". Ding, ding, ding... :)

So I decided to try and manually add an unload event handler to the source of each page that is loaded and see what happens.
So after the page loading is complete, I do:

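// Run once the page has finished loading.
// Give <body> an unload handler; per the WebKit article above, pages with
// unload handlers are kept out of the Page Cache.
QWebElement body = myWebView->page()->mainFrame()->findFirstElement("body");
body.setAttribute("onunload", "myFunction()");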

Just these two lines stopped a 1MB to 3MB memory leak per page loaded in my case. :)

I'm also doing:

QWebSettings::clearMemoryCaches();
myWebView->history()->clear();

before loading each page (because I don't need them).

I hope this work-around will be helpful to other people who may stumble upon this problem in the future.