Either `QFileInfo` caching does not work, or just its methods are unreasonable (very) slow
-
@RTEDFD said in Either `QFileInfo` caching does not work, or just its methods are unreasonable (very) slow:
QFileInfo should use caching of OS API response, so using setCaching(true) is unnecessary (Even if I use it, it changes nothing).
You have no idea how much any "OS API" caching of file information might or might not achieve time-wise.
So either caching does not work, or the problem is with something other.
One possibility might be that `QFileInfo` caching has a limit on how many files it retains in the cache. You talk about 30,000 files. That might exceed the cache size and render it ineffective. If you bring it down to, say, 100 files, how do the timings compare?
BTW, I find the new method `QFileInfo::stat` useless.
And what would that be? You don't say anything about what version of Qt you are using.
-
@ChrisW67
Might be relevant if the OP stated this, maybe this is a Qt6 issue?
Why does it return void? I'm sure it should work the same way as in other languages: it should return a data struct similar to the one returned by
Why should it? It is a class instance method, it fills the necessary class members which can then be accessed through the Qt class's members interface instead of some potentially OS-dependent structure for which Qt would have to introduce support. Seems fine to me. You seem to be pretty angry.
-
https://doc.qt.io/qt-6/qfileinfo.html#setCaching
Says:
When caching is enabled, QFileInfo reads the file information from the file system the first time it's needed, but generally not later.
Caching is enabled by default.
It clearly says that `QFileInfo` requests the data only once, on the first demand. It's the default behaviour.
`for (QFileInfo &fileInfo : fileInfoList) { fileInfo.lastModified(); }`
So the expected behaviour is that the second run of this code takes `1` ms, but no, it takes `90` ms on each run.
There is no note about any limits.
Qt 6.2. If it was Qt 5.x the code would not compile.
Okay, 100 files:
[timer][entryInfoList]: 0 ms
[timer][sort(QList<QFileInfo>)]: 4 ms
[timer][sort(QList<QFileInfo>)]: 3 ms
[timer][entryInfoList]: 0 ms
[timer][myFileInfoList]: 0 ms
[timer][sort(QList<MyFileInfo>)]: 0 ms
[timer][sort(QList<MyFileInfo>)]: 0 ms
--------------------
[timer][entryInfoList]: 0 ms
[timer][stat]: 1 ms
[timer][lastModified]: 0 ms
[timer][lastModified x2]: 0 ms
[timer][lastModified x2, birthTime]: 0 ms
`4` ms to sort, and `3` ms to sort the already sorted list, while it should take "`0`" ms.
200:
[timer][entryInfoList]: 1 ms
[timer][sort(QList<QFileInfo>)]: 10 ms
[timer][sort(QList<QFileInfo>)]: 8 ms
[timer][entryInfoList]: 0 ms
[timer][myFileInfoList]: 0 ms
[timer][sort(QList<MyFileInfo>)]: 0 ms
[timer][sort(QList<MyFileInfo>)]: 0 ms
--------------------
[timer][entryInfoList]: 0 ms
[timer][stat]: 4 ms
[timer][lastModified]: 0 ms
[timer][lastModified x2]: 1 ms
[timer][lastModified x2, birthTime]: 1 ms
Fantastic.
`for (QFileInfo &fileInfo : fileInfoList) { fileInfo.lastModified(); }` takes `0` ms. However,
`for (QFileInfo &fileInfo : fileInfoList) { fileInfo.lastModified(); fileInfo.lastModified(); }` takes `1` ms.
For 400 files: `1` ms and `2` ms.
If there is a "limit", it is `0`.
-
QFileInfo::stat() does what it should and I don't see why it should return some kind of a structure because the structure is the QFileInfo instance.
I can reproduce the non-caching of the filetimes - also on linux, need some time to see what's going on.
-
QFileInfo::stat() does what it should and I don't see why it should return some kind of a structure because the structure is the QFileInfo instance.
Does `QFileInfo` contain `dev`, `ino`, `mode`, `nlink`, `uid`, `gid`, `rdev`, `blksize`, `blocks`?
It looks like Qt just ignores that data (while it could return it), so it would be useful if `stat` returned that data.
It works that way, for example:
- In Node.js: https://nodejs.org/api/fs.html#fsstatpath-options-callback
- In Python: https://docs.python.org/3/library/os.html#os.stat
-
@RTEDFD said in Either `QFileInfo` caching does not work, or just its methods are unreasonable (very) slow:
uid, gid
Yes - https://doc.qt.io/qt-5/qfileinfo.html#groupId and https://doc.qt.io/qt-5/qfileinfo.html#ownerId
The rest is too OS-specific, as @JonB already said.
-
Ok, I think the caching works; it's the QDateTime creation. You can prove that by adding
`for (QFileInfo &fileInfo : fileInfoList) { fileInfo.setCaching(false); }`
after you have gathered the file information.
/edit:
Looks like it's the QDateTime conversion from UTC to local time here. This is done after the cache and is therefore executed every time. Removing `toLocalTime()` gives very impressive results:
with .toLocalTime()
[timer][sort(QList<MyFileInfo>)]: 35 ms
[timer][sort(QList<MyFileInfo>)]: 30 ms
[timer][sort(QList<QFileInfo>)]: 176 ms
[timer][sort(QList<QFileInfo>)]: 169 ms
without .toLocalTime()
[timer][sort(QList<MyFileInfo>)]: 3 ms
[timer][sort(QList<MyFileInfo>)]: 2 ms
[timer][sort(QList<QFileInfo>)]: 5 ms
[timer][sort(QList<QFileInfo>)]: 3 ms
Source code: https://code.woboq.org/qt5/qtbase/src/corelib/io/qfileinfo.cpp.html#_ZNK9QFileInfo8fileTimeEN11QFileDevice8FileTimeE
QDateTime::toLocalTime() calls some OS tz functions, and one of them calls getenv(), which is the bottleneck here on Linux. I think it's a similar thing on Windows.
-
Most likely it is.
Since the caching code is pretty simple*: https://github.com/qt/qtbase/blob/f29566c5a41c127eacaf13f3dbfe4624e55bc83f/src/corelib/io/qfileinfo.cpp#L191-L218
It checks whether there is a value in the array and, if there is, returns it, so the performance should be the same as when I read it from a `MyFileInfo` object.
*Also it caches ONLY the requested time, so if I need all 4 time values (M, B, C, A) I need to ask the OS 4 times, while most likely the OS call returns all 4 times at once. It's not optimal in terms of performance.
Using that `stat` method is overkill, since it is slower than calling the 4 methods (`lastModified`, `birthTime`, `metadataChangeTime`, `lastRead`). Currently, with that 30000-file directory, `stat` takes `1100` ms, `lastModified` alone takes `95` ms, and all 4 times take `360` ms.
-
@RTEDFD
I looked at the code that @Christian-Ehrlicher has been looking at. Unfortunately you cannot influence or replace the conversion to local time each time you call one of the file time functions, and it uses internal Qt calls you cannot access, so it's not possible to write a drop-in replacement.
So unless you want to wait for a possible fix (which I think would mean changing quite a lot of code, something they may not be prepared to do), if this speed issue is a blocker for you, the question is: are you prepared to use your own structure, like the one you show, so that you can hold the converted times for the files?
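The "own structure holding converted times" workaround can be sketched without Qt. This is only an illustration under stated assumptions, not the code from this thread: `Entry`, `PreparedEntry`, `slowToLocal` and `g_conversions` are hypothetical names, and `slowToLocal` merely stands in for an expensive conversion such as `QDateTime::toLocalTime()`. The point it shows is that a comparator which converts on every call runs the conversion far more often than once per file, whereas a precomputed key is converted exactly once per file.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical file entry whose timestamp is stored as UTC seconds.
struct Entry {
    std::string name;
    std::int64_t utcSeconds;
};

// Counts how often the (assumed expensive) conversion is invoked.
static std::size_t g_conversions = 0;

// Hypothetical stand-in for an expensive UTC -> local time conversion.
static std::int64_t slowToLocal(std::int64_t utcSeconds) {
    ++g_conversions;
    return utcSeconds + 3600; // assumed fixed +1h offset, for illustration only
}

// Variant 1: convert inside the comparator, i.e. twice per comparison.
void sortConvertingEachTime(std::vector<Entry> &entries) {
    std::sort(entries.begin(), entries.end(),
              [](const Entry &a, const Entry &b) {
                  return slowToLocal(a.utcSeconds) < slowToLocal(b.utcSeconds);
              });
}

// Variant 2: own structure that holds the already converted time.
struct PreparedEntry {
    std::string name;
    std::int64_t localSeconds;
};

// Converts each entry exactly once, then sorts on the stored value.
std::vector<PreparedEntry> prepareAndSort(const std::vector<Entry> &entries) {
    std::vector<PreparedEntry> prepared;
    prepared.reserve(entries.size());
    for (const Entry &e : entries)
        prepared.push_back({e.name, slowToLocal(e.utcSeconds)});
    assert(prepared.size() == entries.size());
    std::sort(prepared.begin(), prepared.end(),
              [](const PreparedEntry &a, const PreparedEntry &b) {
                  return a.localSeconds < b.localSeconds;
              });
    return prepared;
}
```

With n entries, `prepareAndSort` performs n conversions, while `sortConvertingEachTime` performs two per comparison, and a sort needs at least n - 1 comparisons. The price is the extra memory for the prepared list and the need to rebuild it when the files change.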
-
@RTEDFD said in Either `QFileInfo` caching does not work, or just its methods are unreasonable (very) slow:
*Also it caches ONLY the requested time, so if I need all 4 time values (M, B, C, A) I need to ask the OS 4 times, while most likely the OS call returns all 4 times at once. It's not optimal in terms of performance.
You should really look at and understand the code before ranting about something. You're wrong: https://code.qt.io/cgit/qt/qtbase.git/tree/src/corelib/io/qfilesystemengine_unix.cpp#n377
Most likely it is.
Not only most likely - see my numbers.
Here again are the numbers for 30k files: with QDateTime::toLocalTime() 1.3 s, without 0.07 s.
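As background to the point about the OS call: on POSIX systems a single `stat()` call fills one structure that already holds the access, modification, and status-change times, so one query can serve several time getters. The following is a minimal sketch of that fact only, not Qt's implementation; `FileTimes` and `readTimes` are hypothetical names, and a POSIX platform such as Linux is assumed. Birth time is left out because the classic `struct stat` on Linux does not carry it.

```cpp
#include <sys/stat.h> // POSIX stat(); assumed to be available on this platform
#include <cassert>
#include <ctime>

// Hypothetical holder for the timestamps obtained from one stat() call.
struct FileTimes {
    bool ok;                   // false if stat() failed (e.g. missing file)
    std::time_t accessed;      // st_atime
    std::time_t modified;      // st_mtime
    std::time_t statusChanged; // st_ctime
};

// Queries the file system once and returns all three timestamps together.
FileTimes readTimes(const char *path) {
    assert(path != nullptr);
    struct stat sb;
    if (::stat(path, &sb) != 0)
        return {false, 0, 0, 0};
    return {true, sb.st_atime, sb.st_mtime, sb.st_ctime};
}
```

Calling `readTimes(".")` once yields all three values, and no further system call is needed to read any of them afterwards.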
-
Here is the bug report: https://bugreports.qt.io/browse/QTBUG-100349
-
It looks like you ran the code from the bug report in debug mode.
When I run your code in debug mode:
getFileInfos: 62
#files: 30000
myFileInfoList 93
bencMyFileInfo 19
bencQFileInfo 2925
The difference is `150` times.
But when I run it in release mode:
getFileInfos: 64
#files: 30000
myFileInfoList 93
bencMyFileInfo 2
bencQFileInfo 2879
`MyFileInfo` sorting is `1439` times faster.
-
@RTEDFD said in Either `QFileInfo` caching does not work, or just its methods are unreasonable (very) slow:
It looks like you ran the code from the bug report in debug mode.
I don't see why it should matter - the problem was identified and the testcase properly shows it.
-
Quite old, but in Qt 6.6 QFileInfo::lastModified() gained a new 'QTimeZone' parameter, so you can pass QTimeZone::UTC to avoid the conversion to local time.
See also https://codereview.qt-project.org/c/qt/qtbase/+/437009
-
Only for `QFileInfo::lastModified()`?
The problem exists with the other times too.
For example, `birthTime()` and `metadataChangeTime()` are affected by this issue too.
-
@RTEDFD
https://bugreports.qt.io/browse/QTBUG-100349 states:
Should be fixed in Qt 6.6 https://codereview.qt-project.org/c/qt/qtbase/+/437009
You can now pass a QTimeZone to any of the QFileInfo file times related methods to specify which time zone returned times should be in
So it should apply to e.g. `birthTime()` too, but that awaits Qt 6.6.
-