[solved] Advice On Reading Lots Of Records From A File



  • I hope someone here can offer a trick or a few words of wisdom on how best to optimize reading lines from a file. I have been using QIODevice::readLine(), but for a file approaching 230K lines it seems kind of slow. Does readLine() do any caching? Should I try to do my own caching with QIODevice::readAll()? With readAll(), how do I then parse the big QByteArray to get sub-arrays terminated on EOL? Perhaps I should drop down and use POSIX calls for reading large files? One more thing concerning Qt:

    In looking at the 5.x documentation there is a virtual method:
    QIODevice::readLineData(char * data, qint64 maxSize)

    This is in fact called by QIODevice::readLine, and the documentation has the following:

    "Buffered devices can improve the performance of readLine() by reimplementing this function"

    I guess my question is: how can I reimplement this function to optimize for reading one line at a time? Thanks in advance for any help or suggestions. (Example code is always welcome!)
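    For reference, the readLine() loop being described looks roughly like this (the countLines helper and the file layout are illustrative, not from the thread):

    ```cpp
    #include <QFile>
    #include <QByteArray>
    #include <QString>

    // Baseline approach: read one line at a time with readLine().
    // Returns the number of lines, or -1 if the file cannot be opened.
    int countLines(const QString &path)
    {
        QFile file(path);
        if (!file.open(QIODevice::ReadOnly | QIODevice::Text))
            return -1;

        int count = 0;
        while (!file.atEnd()) {
            QByteArray line = file.readLine(); // keeps the trailing '\n'
            Q_UNUSED(line);                    // real code would process it here
            ++count;
        }
        return count;
    }
    ```

    With QIODevice::Text set, "\r\n" line endings are translated to "\n" before readLine() sees them.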


  • Moderators

    "seems kind of slow" is quite a broad term ;-) Are you using the code from the snippet in "QFile docs":http://qt-project.org/doc/qt-5/QFile.html?

    You can do readAll() and then split the result using "\n" as the delimiter (if you have set the QFile::Text flag first), although I am not sure you will gain anything by doing so.
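    A minimal sketch of that readAll-then-split approach (the readAllLines helper is hypothetical, not an existing Qt API):

    ```cpp
    #include <QFile>
    #include <QList>
    #include <QByteArray>
    #include <QString>

    // Read the whole file into memory in one go, then split the buffer
    // on '\n'. With QIODevice::Text set, "\r\n" has already been
    // translated to "\n" by the time we split.
    QList<QByteArray> readAllLines(const QString &path)
    {
        QFile file(path);
        if (!file.open(QIODevice::ReadOnly | QIODevice::Text))
            return QList<QByteArray>();
        return file.readAll().split('\n');
    }
    ```

    Note that a file ending in '\n' yields a final empty element after the split, which callers may want to skip.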



  • Slow being like 25 seconds to load. Yes, I read it in a loop till EOF and process each line. The readAll() should be faster as it reads the whole file in one or two big chunks. Then I can parse in memory, which has got to be very fast. My question is: once I have the QByteArray from a readAll(), how do I parse it with the "\n" delimiter as you suggested? Do I create a QStringList?

    Thanks for taking the time to reply

    -david


  • Moderators

    Ah, so I suspect your line processing is eating that time. But I might be wrong, of course.

    Here:
    @
    QByteArray allData;
    // ... fill allData, e.g. with QFile::readAll()

    const QStringList lines = QString::fromUtf8(allData).split('\n');

    foreach (const QString &line, lines) {
        // process each line
    }
    @

    Should work. It might be that QByteArray also has a split() method; I am not sure.
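    (QByteArray does in fact provide split(char). Working on the bytes directly avoids converting the whole buffer to UTF-16 via QString first; a small sketch, with splitLines as an illustrative wrapper:)

    ```cpp
    #include <QByteArray>
    #include <QList>

    // Split a raw buffer into lines without going through QString.
    // QByteArray::split(char) returns a QList<QByteArray>.
    QList<QByteArray> splitLines(const QByteArray &allData)
    {
        return allData.split('\n');
    }
    ```

    As with QString::split, a buffer ending in '\n' produces a trailing empty element.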



  • [SOLVED] Thanks for your suggestions. I tried both ways: readAll() with line parsing on the big byte array, and reading one line at a time. There was almost no difference (readAll() was actually a few hundred milliseconds slower on average for 100K records).

    For my file, which loaded 208,000 records, it took 119 seconds. Clearly the time is in the processing, so I will see if there is a way to speed that up.


  • Moderators

    Right, happy coding! :-)

