Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB
-
@TheLumbee said:
Thanks for all the responses! Didn't actually expect much here.
Can't speak for other fora, but we are quality here in this forum :)
@TheLumbee said:
If any of you are interested, I'll test each of your solutions and provide an update.
Please do :) Note that mine will be the least code for the greatest speed benefit :) You get your money back if you don't think so ;-) Note that the file reading, buffering etc. really is marginal to the whole; the single most important thing is that QDateTime::fromString() is "unusably bad" for performance, unless you move to Qt 6.3+, where it is said to be a lot better.
-
@JonB Of all the solutions, this one has the best performance, for Qt at least. Results are around 3900ms. Still quite a bit slower than Rust. The issue is that I'm looking at parsing files 1GB+ in size, so that really adds up when it takes 5x the time to complete.
-
Rust is made for performance. Qt is made for HMI, and speed is not its key feature.
If there are bottlenecks in your app, you can add a Rust lib to handle them. If you know Rust well, the big banks will want you for high-speed trading apps. Rust is getting popular now, and it is good to know.
-
@JoeCFD I'm beginning to notice that. But I haven't even found a C/C++ implementation that competes, at least in this aspect alone, with what Rust is doing.
I am actually considering just creating a Rust lib to handle this. Not sure what other options I have if I'm looking to get this kind of performance. I know Rust well enough, so let the big banks know I'm available ;-).
Given that I'm expecting much larger file sizes, using Rust would be the best choice, I think. Maybe a solution will arise with C++ that can handle it.
-
@JonB said:
And the question is what to do from Qt to get acceptable performance regardless of the reasons?
Compare Rust with std::get_time() or similar functions. And maybe open a bug report with the findings here.
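For anyone who wants to try the comparison suggested above, here is a minimal sketch of timestamp parsing with std::get_time() in plain C++. The function name parseWithGetTime and the "%Y-%m-%d %H:%M:%S" format are assumptions for illustration only; the timestamp layout of the actual file may differ, so the format string would need adjusting before any benchmarking.

```cpp
#include <ctime>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: parses a timestamp such as "2023-01-15 09:30:00"
// into a broken-down std::tm. Returns false if the text does not match.
// The format string is an assumption, not the format used in the thread.
bool parseWithGetTime(const std::string &text, std::tm &out)
{
    std::tm parsed{};
    std::istringstream stream(text);
    stream >> std::get_time(&parsed, "%Y-%m-%d %H:%M:%S");
    if (stream.fail())
        return false;   // input did not match the format
    out = parsed;       // no time zone conversion is applied here
    return true;
}
```

Creating a std::istringstream for every line likely has a cost of its own, so a fair benchmark would probably reuse one stream across lines.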
-
@Christian-Ehrlicher I'll try to get to this at some point. It just seems insane that the performance between Rust and C++ is so different when it comes to parsing these files.
-
@TheLumbee said:
@JonB @J-Hilk @DerReisende Thanks for all the responses! Didn't actually expect much here. I apologize for not providing more details. I've been dealing with this file parsing issue in C++ for years. For some odd reason, the same code takes >100x as long to complete on Windows as on Linux, which I posted about in C++ forums before using Qt, but what you've provided is actually the first significant improvement I've ever seen.
Just tested my solution on Windows 11 with VS 2022...
My cached solution finishes in 19 seconds (compared to approx. 7 on macOS), the original version takes 575 seconds...OMG! Release mode with AVX2 enabled.
-
@DerReisende Yeah, there has to be something internal to the way Windows handles I/O to explain the massive difference. I've been trying to narrow down the issue for years without success. Interesting point though.
-
One issue I'm still facing is QDateTime parsing. Even if I create the Rust lib, it doesn't have QDateTime and std::chrono interoperability. So I was thinking of returning a Tick array with a const char* for the DateTime, but then I'd still have to go through the entire Tick array and convert all the values to QDateTimes anyway. This sort of brings back the initial problems.
Honestly, this performance thing is the big reason I'm struggling to decide between this, https://github.com/fzyzcjy/flutter_rust_bridge, and Qt/QML for projects. I love Qt but performance is huge for me. Just looking for a justification here.
-
@TheLumbee Why don't you just parse the date string in a Rust function which returns the number of msecs since midnight 1970? This way you only have to store a qint64 variable. If you need a QDateTime instance, then just call QDateTime::fromMSecsSinceEpoch when needed. This should be fast.
-
@DerReisende Don't know why I didn't think of that. I'll write the Rust lib, and hopefully have some great results. I'll let you know.
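The idea above (parse the text once into milliseconds since the epoch, store only a 64-bit integer, and build a QDateTime only where one is needed) can be sketched in plain C++ as well. This is only an illustration, not the Rust lib mentioned in the thread: the names daysFromCivil, digits and toMSecsSinceEpoch are hypothetical, the field offsets are taken from the DateTimeParser lambda posted in this thread (so a "yyyyMMdd HHmmsszzz" layout is assumed), and the timestamp is assumed to be UTC.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Days from 1970-01-01 to a proleptic Gregorian date (may be negative).
// Follows Howard Hinnant's widely used days_from_civil algorithm.
constexpr std::int64_t daysFromCivil(std::int64_t y, unsigned m, unsigned d)
{
    y -= m <= 2;
    const std::int64_t era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);            // [0, 399]
    const unsigned doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;  // [0, 365]
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;           // [0, 146096]
    return era * 146097 + static_cast<std::int64_t>(doe) - 719468;
}
static_assert(daysFromCivil(1970, 1, 1) == 0, "epoch must map to day 0");

// Converts `count` ASCII digits starting at `pos` to an int; -1 on a non-digit.
// The caller must ensure that pos + count does not exceed s.size().
constexpr int digits(std::string_view s, std::size_t pos, std::size_t count)
{
    int value = 0;
    for (std::size_t i = pos; i < pos + count; ++i) {
        if (s[i] < '0' || s[i] > '9')
            return -1;
        value = value * 10 + (s[i] - '0');
    }
    return value;
}

// Milliseconds since 1970-01-01T00:00:00 for a "yyyyMMdd HHmmsszzz" string,
// treated as UTC (an assumption). Returns std::nullopt on malformed input.
// Day-of-month is only range-checked against 1..31, not against the month.
std::optional<std::int64_t> toMSecsSinceEpoch(std::string_view s)
{
    if (s.size() < 18)
        return std::nullopt;
    const int year = digits(s, 0, 4);
    const int month = digits(s, 4, 2);
    const int day = digits(s, 6, 2);
    const int hour = digits(s, 9, 2);
    const int minute = digits(s, 11, 2);
    const int second = digits(s, 13, 2);
    const int msec = digits(s, 15, 3);
    if (year < 0 || month < 1 || month > 12 || day < 1 || day > 31 ||
        hour < 0 || hour > 23 || minute < 0 || minute > 59 ||
        second < 0 || second > 59 || msec < 0)
        return std::nullopt;
    const std::int64_t days =
        daysFromCivil(year, static_cast<unsigned>(month), static_cast<unsigned>(day));
    return ((days * 24 + hour) * 60 + minute) * 60000 + second * 1000 + msec;
}
```

The resulting value could then be handed to QDateTime::fromMSecsSinceEpoch() only at the places where a QDateTime is actually required.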
-
@TheLumbee said:
Just seems insane that the performance between Rust and C++ is so different when it comes to parsing these files.
Simply read my explanation of what needs to be done when parsing a datetime string. This can be optimized in Qt, but if someone needs speed, other solutions are much better (see @J-Hilk's solution, which bypasses QDateTime::fromString() altogether), so no one is doing it.
-
@TheLumbee
From where you are now: if you comment out the datetime handling in the C++ (and the Rust if you like), is your performance timing for the Qt/C++ acceptable compared to the Rust? So is it only the QDateTime parsing which is the issue for you?
-
@JonB said:
I doubt this is significant
It is, depending on the Qt version: if 5.15, then yes, using splitRef should be significantly faster. Not sure if they optimised split in Qt 6 or simply dropped splitRef because hardly anyone used it 🤷♂️
-
@J-Hilk said:
Not sure if they optimised split in Qt 6 or simply dropped splitRef because hardly anyone used it 🤷♂️
QStringRef is gone and so is splitRef().
You should use QStringView and its split() version.
-
Here is a version of @J-Hilk's program modified to run with Qt 6.4:
#include <QDateTime>
#include <QDebug>
#include <QElapsedTimer>
#include <QFile>
#include <QString>
#include <QTextStream>

// Instrument and Tick are assumed to be defined as in @J-Hilk's program.
int main(int argc, char **argv)
{
    QFile *testFile = new QFile(R"(C:\Users\aea_t\Downloads\TestFile.txt)");
    testFile->open(QFile::ReadOnly);

    Instrument instr;
    Tick t;

    QElapsedTimer *parseTimer2 = new QElapsedTimer();
    parseTimer2->start();
    testFile->reset();
    instr.tickList.reserve(1000000);

    // Build the QDateTime from fixed character offsets instead of QDateTime::fromString().
    auto DateTimeParser = [](const QStringView string) -> QDateTime {
        const QDate date(string.left(4).toInt(), string.mid(4, 2).toInt(), string.mid(6, 2).toInt());
        const QTime time(string.mid(9, 2).toInt(), string.mid(11, 2).toInt(),
                         string.mid(13, 2).toInt(), string.mid(15, 3).toInt());
        QDateTime dt(date, time);
        return dt;
    };

    QTextStream readFile(testFile);
    QString line;
    line.reserve(100);
    const QChar semicolon(';');
    while (!readFile.atEnd()) {
        if (!readFile.readLineInto(&line, 100)) {
            break;
        }
        const auto result = line.split(semicolon);
        // Index 3 is read twice here, as posted.
        instr.tickList.append(
            {DateTimeParser(result.at(0)), result.at(1).toFloat(), result.at(2).toFloat(),
             result.at(3).toFloat(), result.at(3).toUInt()});
    }
    qDebug().noquote() << QString("Qt parse time: %1ms").arg(parseTimer2->elapsed());
}
Before we continue complaining about Qt, it would be great to know which OS @TheLumbee is running the code on.
All solutions seem to run much slower on Windows than on Linux and macOS.
I tried the above version on my Win 11 machine with VS2022 and it takes around 32 seconds to complete (compared to his 1.4 seconds on macOS). I don't have a MinGW installation for comparison, unfortunately :(
-
@DerReisende said:
const auto result = line.split(semicolon);
Please read my last post about QStringView
-
@Christian-Ehrlicher It is just a port of @J-Hilk's code to run with 6.4. I do not intend to rework it for QStringView - and my first attempt was incorrect anyway.
-
@DerReisende said:
It is just a port of @J-Hilk code to run with 6.4.
But the port is wrong - you're now creating useless QString objects ...
-
@Christian-Ehrlicher Maybe I don't get the problem: .splitRef() does not exist anymore in 6.4. .split() returns a QStringList, which is implicitly shared according to the docs. Where is the mistake, and how would you come to a QStringRef/QStringView version on Qt 6.4?
-
@DerReisende said:
Where is the mistake and how would you come to a QStringRef/QStringView version on Qt 6.4?
QString::splitRef() - creates a QList/Vector with references to parts of a QString; no QString objects are created, nothing gets copied
QString::split() - creates a QList/Vector with newly created QString objects
To avoid the creation of the QString objects and get the same behavior as before, use QStringView::split()
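The difference described above can be illustrated with a standard-library analogue. This sketch deliberately uses std::string and std::string_view rather than the Qt classes, and the function names splitCopy and splitView are hypothetical; it is only meant to show "new string objects" versus "references into the original buffer", not how Qt implements either call.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Copying split: every field becomes a separate std::string object holding
// its own copy of the characters (the analogue of a list of QString).
std::vector<std::string> splitCopy(const std::string &line, char sep)
{
    std::vector<std::string> fields;
    std::size_t start = 0;
    while (true) {
        const std::size_t end = line.find(sep, start);
        if (end == std::string::npos) {
            fields.emplace_back(line.substr(start));
            break;
        }
        fields.emplace_back(line.substr(start, end - start));
        start = end + 1;
    }
    return fields;
}

// View split: every field is a (pointer, length) pair into `line`;
// no character data is copied (the analogue of a list of views).
// The views are only valid while the underlying buffer is alive and unmodified.
std::vector<std::string_view> splitView(std::string_view line, char sep)
{
    std::vector<std::string_view> fields;
    std::size_t start = 0;
    while (true) {
        const std::size_t end = line.find(sep, start);
        if (end == std::string_view::npos) {
            fields.push_back(line.substr(start));
            break;
        }
        fields.push_back(line.substr(start, end - start));
        start = end + 1;
    }
    return fields;
}
```

One consequence of the view approach: if the line buffer is reused for the next read, the fields have to be converted before that read happens, because the views would then point at overwritten data.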
-
@DerReisende I'm running on Ubuntu 22.04 64-bit. I'm currently creating the Rust lib. Got the shared lib and C-header file generated, but having an issue linking it all. So hopefully that will be resolved soon.
-
@JonB Unfortunately, no. The solutions provided are leaps and bounds better than before, but the file sizes I'm expecting will definitely cause an issue if it's all done in Qt. I'm currently writing a Rust lib to resolve the problem for now, but even a 2-3x speedup with Rust will make a monumental difference.
-
@Christian-Ehrlicher
OK, I modified it with the following:
//const auto result = line.split(semicolon);
const QStringView sv{line};
const auto result = sv.split(semicolon);
It doesn't make a difference in runtime. And looking at the QString split code, I am almost sure that line.split already does this optimization through the implicit sharing of QStringList.
-
@DerReisende said:
I am almost sure that line.split already does this optimization through implicit sharing through QStringList.
No, implicit sharing of a container has nothing to do with creating new QString objects - QStringList is a list of string objects, not a list of QStringViews ...