Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB
-
#include <QDateTime>
#include <QDebug>
#include <QElapsedTimer>
#include <QFile>
#include <QTextStream>

// Instrument and Tick are the poster's own types (definitions not shown here).
int main(int argc, char** argv)
{
    QFile testFile(":/TestFile.txt");
    if (!testFile.open(QFile::ReadOnly)) {
        return 1;
    }
    const QByteArrayList allData = testFile.readAll().split('\n');
    Instrument instr;
    Tick t;

    // Variant 1: original approach, QByteArray::split() plus QDateTime::fromString().
    QElapsedTimer parseTimer1;
    parseTimer1.start();
    for (int ii = 0; ii < allData.size() - 1; ii++) {
        const QByteArrayList data = allData.at(ii).split(';');
        t.dt = QDateTime::fromString(data.at(0), "yyyyMMdd HHmmss zzz0000");
        t.last = data.at(1).toDouble();
        t.bid = data.at(2).toDouble();
        t.ask = data.at(3).toDouble();
        t.volume = data.at(4).toInt();
        instr.tickList.append(t);
    }
    qDebug().noquote() << QString("Qt parse time: %1ms").arg(parseTimer1.elapsed());

    // Variant 2: QTextStream, QStringRef fields (Qt 5) and a hand-written date/time parser.
    QElapsedTimer parseTimer2;
    parseTimer2.start();
    testFile.reset();
    instr.tickList.clear();
    instr.tickList.reserve(1000000);

    // Field offsets follow the layout "yyyyMMdd HHmmss zzz0000".
    auto DateTimeParser = [](const QStringRef& string) -> QDateTime {
        const QDate date(string.left(4).toInt(), string.mid(4, 2).toInt(), string.mid(6, 2).toInt());
        const QTime time(string.mid(9, 2).toInt(), string.mid(11, 2).toInt(),
                         string.mid(13, 2).toInt(), string.mid(16, 3).toInt());
        return QDateTime(date, time);
    };

    QTextStream readFile(&testFile);
    QString line;
    line.reserve(100);
    const QChar semicolon(';');
    while (!readFile.atEnd()) {
        if (!readFile.readLineInto(&line, 100)) {
            break;
        }
        const auto result = line.splitRef(semicolon);
        instr.tickList.append({DateTimeParser(result.at(0)), result.at(1).toDouble(),
                               result.at(2).toDouble(), result.at(3).toDouble(),
                               result.at(4).toInt()});
    }
    qDebug().noquote() << QString("Qt parse time: %1ms").arg(parseTimer2.elapsed());
}
result: 1440 ms, but your pc might be faster :D
-
@J-Hilk said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:
QByteArrayList allData = testFile->readAll().split('\n');
...
QByteArrayList data = allData.at(ii).split(';');
Sadly, QByteArrayView doesn't yet have a split() function like QStringView does; the two lines above are painfully slow for big datasets.
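One way to avoid the per-line and per-field copies, sketched here in plain C++17 rather than with Qt types: walk the buffer with std::string_view, so each field is only a pointer and a length into the original bytes. The helper name splitInto is made up for this illustration; with Qt, the view can be built from QByteArray::constData() and size().

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Splits `text` on `sep` without copying any bytes: every view stored in `out`
// points into `text`. `out` is reused across calls so that its allocation is
// paid once rather than once per line. An empty input yields one empty field.
void splitInto(std::string_view text, char sep, std::vector<std::string_view>& out)
{
    out.clear();
    std::size_t start = 0;
    while (true) {
        const std::size_t pos = text.find(sep, start);
        if (pos == std::string_view::npos) {
            out.push_back(text.substr(start));  // last (or only) field
            return;
        }
        out.push_back(text.substr(start, pos - start));
        start = pos + 1;
    }
}
```

The views are only valid while the underlying buffer is alive and unmodified, so the file contents have to outlive the parse loop.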
-
@JonB @J-Hilk @DerReisende Thanks for all the responses! Didn't actually expect much here. I apologize for not providing more details. I've been dealing with this file parsing issue in C++ for years. For some odd reason, the same code takes >100x longer to complete on Windows than on Linux, which I've posted about in C++ forums prior to using Qt, but what you've provided is actually the first significant improvement I've ever seen.
So thank you for that!
I've tested this with versions 5.12, 5.15, 6.0, 6.2.4, and 6.4. Never noticed a major difference between them regarding this issue. Current machine: i7-6700 with 32GB RAM. So not sure what y'all are working with but the results seem promising.
I was previously streaming into a QTextStream then reading line-by-line but came across this post: https://forum.qt.io/topic/98282/parsing-large-big-text-files-quickly and a couple of others that suggested that is more expensive than using a QByteArray. I didn't notice much difference to be quite honest.
I did comment out the QDateTime parsing just to check and it was a significant improvement. Not quite like Rust but I'll attribute that to @JonB's comment:
That "naive" means it does not do any local time/daylight etc, conversions.
If any of you are interested, I'll test each of your solutions and provide an update. But this actually woke me up and got me excited to start my day so thank you.
-
@TheLumbee said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:
If any of you are interested, I'll test each of your solutions and provide an update
Sure, feedback is always appreciated! Nothing more discouraging than getting ghosted after providing an answer :D
But this actually woke me up and got me excited to start my day so thank you
👍
-
Also, I'd like to note that I did run both Rust and Qt in release mode. Tried to give Qt the best shot I could. My initial thoughts were some type of buffering/caching Rust did internally but I've attempted tests with C++ on that front and can't match it.
-
Is there a reason why your Tick class's Rust impl uses 32-bit floats and your C++ code uses 64-bit doubles?
@DerReisende Didn't actually notice that until now. But with doubles, the performance is almost the same.
-
@TheLumbee said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:
Thanks for all the responses! Didn't actually expect much here.
Can't speak for other fora, but we are quality here in this forum :)
If any of you are interested, I'll test each of your solutions and provide an update.
Please do :) Note that mine will be the least code for the greatest speed benefit :) You get your money back if you don't think so ;-) Note that the file reading, buffering etc. is really marginal to the whole; the single most important thing is that QDateTime::fromString() is "unusably bad" for performance, unless you move to Qt 6.3+ and find it's a lot better there.
-
@TheLumbee Using floats instead of doubles should minimize memory usage. And my solution was made with Qt 6.4
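To put a rough number on the memory point, here is a small sketch. Both layouts are hypothetical: the real Tick holds a QDateTime and is not shown in this thread, so a 64-bit millisecond timestamp stands in for it.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layouts for illustration only; the thread's actual Tick differs.
struct TickDouble {
    std::int64_t timestampMs;
    double last, bid, ask;
    std::int32_t volume;
};

struct TickFloat {
    std::int64_t timestampMs;
    float last, bid, ask;
    std::int32_t volume;
};

// Bytes needed to store `count` ticks contiguously (e.g. in a vector or QList).
constexpr std::size_t bytesFor(std::size_t count, std::size_t tickSize)
{
    return count * tickSize;
}
```

On common 64-bit platforms this typically comes out at 40 bytes versus 24 bytes per tick, so a million ticks need roughly 40 MB versus 24 MB. The trade-off is precision: a float carries only about 7 significant decimal digits, which may or may not be enough for the prices in the file.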
-
@DerReisende
Purely out of interest: with your 6.4, did you try the original QDateTime::fromString() as-was, and did that show a significant improvement over the "buggy" previous one?
-
@JonB
QDateTime::fromString without any optimization takes 37 seconds to complete --> slow as hell.
I tested the performance of QDate::fromString in all Qt 6 versions on Windows and it NEVER got any faster; that's why I am using the posted cache approach to increase the performance in my app.
-
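The cache code itself is not reproduced in this part of the thread, so the following is only a guess at the general idea, written in plain C++17: tick files contain the same few date strings over and over, so the expensive parse is done once per distinct date and the result is looked up afterwards. The names Ymd, parseDateSlow and DateCache are invented for this sketch, and parseDateSlow merely stands in for the slow QDate::fromString call.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <unordered_map>

struct Ymd { int year; int month; int day; };

// Stand-in for the expensive call. Accepts exactly eight digits ("yyyyMMdd")
// and applies only coarse range checks on month and day.
std::optional<Ymd> parseDateSlow(std::string_view s)
{
    if (s.size() != 8) return std::nullopt;
    int parts[3] = {0, 0, 0};
    const std::size_t widths[3] = {4, 2, 2};
    std::size_t pos = 0;
    for (int p = 0; p < 3; ++p) {
        for (std::size_t i = 0; i < widths[p]; ++i, ++pos) {
            if (s[pos] < '0' || s[pos] > '9') return std::nullopt;
            parts[p] = parts[p] * 10 + (s[pos] - '0');
        }
    }
    if (parts[1] < 1 || parts[1] > 12 || parts[2] < 1 || parts[2] > 31) return std::nullopt;
    return Ymd{parts[0], parts[1], parts[2]};
}

// Remembers every date string parsed successfully so far.
class DateCache {
public:
    std::optional<Ymd> get(std::string_view dateText)
    {
        const std::string key(dateText);
        if (auto it = cache_.find(key); it != cache_.end()) return it->second;
        ++slowParses_;  // cache miss: fall back to the expensive parse
        const std::optional<Ymd> parsed = parseDateSlow(dateText);
        if (parsed) cache_.emplace(key, *parsed);
        return parsed;
    }
    std::size_t slowParses() const { return slowParses_; }

private:
    std::unordered_map<std::string, Ymd> cache_;
    std::size_t slowParses_ = 0;
};
```

This only pays off for the date part; the time of day changes on nearly every line and still has to be parsed each time.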
Rust is made for quicker performance. Qt is made for HMI and speed is not the key feature.
If there are bottlenecks in your app, you can add a Rust lib to handle them. If you know Rust well, the big banks will like to have you for high-speed trading apps. Rust is getting popular now and it is something good to know.
-
The problem with QDateTime is that for every call the internal format parser (QDateTimeParser, private class) is re-created and needs to re-evaluate the string. This takes a lot of time.
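Since the layout in this thread is fixed ("yyyyMMdd HHmmss zzz0000"), one way to sidestep the format parser entirely is to read the digits by position. Below is a minimal sketch in plain C++17; TickTime, fixedDigits and parseTickTime are names made up for this example, and it assumes, as the Qt format string does, that the first three digits of the last field are milliseconds.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

struct TickTime {
    int year, month, day;
    int hour, minute, second, millisecond;
};

// Reads `len` ASCII digits starting at `pos`; returns -1 if any is not a digit.
// The caller must make sure pos + len does not exceed s.size().
int fixedDigits(std::string_view s, std::size_t pos, std::size_t len)
{
    int value = 0;
    for (std::size_t i = pos; i < pos + len; ++i) {
        if (s[i] < '0' || s[i] > '9') return -1;
        value = value * 10 + (s[i] - '0');
    }
    return value;
}

// Parses "yyyyMMdd HHmmss zzz0000" purely by position; no format string is
// interpreted and no time zone or daylight-saving handling is involved.
std::optional<TickTime> parseTickTime(std::string_view s)
{
    if (s.size() < 19 || s[8] != ' ' || s[15] != ' ') return std::nullopt;
    const TickTime t{fixedDigits(s, 0, 4),  fixedDigits(s, 4, 2),  fixedDigits(s, 6, 2),
                     fixedDigits(s, 9, 2),  fixedDigits(s, 11, 2), fixedDigits(s, 13, 2),
                     fixedDigits(s, 16, 3)};
    const bool ok = t.year >= 0 && t.month >= 1 && t.month <= 12 &&
                    t.day >= 1 && t.day <= 31 &&
                    t.hour >= 0 && t.hour <= 23 &&
                    t.minute >= 0 && t.minute <= 59 &&
                    t.second >= 0 && t.second <= 59 && t.millisecond >= 0;
    if (!ok) return std::nullopt;
    return t;
}
```

If a QDateTime is needed at the end, the fields can be fed to the QDate(y, m, d) and QTime(h, m, s, ms) constructors, as the lambda in the first post does.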
-
@Christian-Ehrlicher
This is true, but you have to reflect that Rust is using its NaiveDateTime::parse_from_str(dt, "%Y%m%d %H%M%S %f") for this and is 30x-odd faster. That is not an "acceptable" difference, and the question is what to do from Qt to get acceptable performance regardless of the reasons?
-
@JonB said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:
and the question is what to do from Qt to get acceptable performance regardless of the reasons?
compare rust with std::get_time() or similar functions. And maybe open a bug report with the findings here.
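For anyone who wants to run that comparison, a starting point could look like the sketch below. The function name parseWithGetTime is made up here. Note that std::get_time has no specifier for fractional seconds, so only the leading "yyyyMMdd HHmmss" part is parsed and the trailing field would need separate handling.

```cpp
#include <ctime>
#include <iomanip>
#include <optional>
#include <sstream>
#include <string>

// Parses the leading "yyyyMMdd HHmmss" part of a timestamp with std::get_time.
// Anything after the seconds (the " zzz0000" field) is left unread.
std::optional<std::tm> parseWithGetTime(const std::string& text)
{
    std::tm parsed{};
    std::istringstream in(text);
    in >> std::get_time(&parsed, "%Y%m%d %H%M%S");
    if (in.fail()) return std::nullopt;
    return parsed;
}
```

Building a std::istringstream per line has a cost of its own, so a fair benchmark would probably reuse one stream or at least measure that overhead separately.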
-
@TheLumbee said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:
For some odd reason, the same code takes >100x longer to complete on Windows than on Linux
Just tested my solution on Windows 11 with VS 2022...
My cached solution finishes in 19 seconds (compared to approx. 7 on macOS), the original version takes 575 seconds...OMG! Release mode with AVX2 enabled.
-
@J-Hilk said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:
sure, feedback is always appreciated! Nothing more discouraging than getting ghosted after providing an answer :D
Thanks for the answer, but QStringRef is not available in Qt 6.4, so I can't test this solution as posted. I did replace it with QString, but the performance was middling.
-
@JonB said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:
Please do :) Note that mine will be the least code for the greatest speed benefit :)
Of all the solutions, this one has the best performance, for Qt at least. Results are around 3900ms. Still quite a bit slower than Rust. The issue is that I'm looking at parsing files 1GB+ in size, so it really adds up when it takes 5x the time to complete.
-
@JoeCFD said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:
Rust is made for quicker performance. Qt is made for HMI and speed is not the key feature.
I'm beginning to notice that. But I haven't even found a C/C++ implementation that competes, at least in this aspect alone, with what Rust is doing.
I am actually considering just creating a Rust lib to handle this. Not sure what other options I have if I'm looking to get this kind of performance. I know Rust well enough so let the big banks know I'm available ;-).
Given that I'm expecting much larger file sizes, using Rust would be the best choice I think. Maybe a solution will arise with C++ that can handle it.
-
@Christian-Ehrlicher said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:
compare rust with std::get_time() or similar functions. And maybe open a bug report with the findings here.
I'll try to get to this at some point. Just seems insane that the performance between Rust and C++ is so different when it comes to parsing these files.
-
@DerReisende said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:
Just tested my solution on Windows 11 with VS 2022...
My cached solution finishes in 19 seconds (compared to approx. 7 on macOS), the original version takes 575 seconds...OMG! Release mode with AVX2 enabled.
Yeah, there has to be something in the way Windows handles I/O internally that explains the massive difference. I've been trying to narrow down the issue for years without success. Interesting point though.