Solved Split string every two chara
-
Hi,
I have a string that looks like this
6628286628443028289c8368294829282828282828733300
What would be the best way to split this into a string list that would look like this
66 28 28 66 28 44 30 28 28 9c 83 68 29 48 29 28 28 28 28 28 28 73 33 00
I need pairs of two characters.
Tnx,
Zgembo
-
This post is deleted!
-
@lelev I have found out this way:
QRegularExpression rx("(..)");
QRegularExpressionMatchIterator rxIterator = rx.globalMatch(datas.toHex());
QStringList stringData;
while (rxIterator.hasNext()) {
    QRegularExpressionMatch match = rxIterator.next();
    QString word = match.captured(1);
    stringData << word;
}
-
@zgembo is datas a QByteArray? Then simply use
datas.toHex(' ');
Fastest and easiest.
Regards
Edit: Ah, it should be in a string list. No problem, you use:
const QString s = datas.toHex(' ');
const QStringList list = s.split(' ');
-
Guys, I know regular expressions are sexy and all, but they are terrible, terrible performance and memory monsters. Don't use them for simple tasks like splitting arrays. It just hurts the eyes to see.
If you really need the strings a simple for loop is a lot better:
QByteArray datas_hex = datas.toHex();
QStringList result;
result.reserve(datas_hex.size() / 2);
for (int i = 0; i < datas_hex.size(); i += 2)
    result.push_back(QString::fromLatin1(datas_hex.data() + i, 2));
If you can keep the original data around it's even better to not make any copies at all:
QByteArray datas_hex = datas.toHex();
QVector<QLatin1String> result;
result.reserve(datas_hex.size() / 2);
// The QLatin1String entries point into datas_hex, so it must outlive result.
for (int i = 0; i < datas_hex.size(); i += 2)
    result.push_back(QLatin1String(datas_hex.data() + i, 2));
And if you can use more efficient container it's even better:
QByteArray datas_hex = datas.toHex();
std::vector<QLatin1String> result;
result.reserve(datas_hex.size() / 2);
for (int i = 0; i < datas_hex.size(); i += 2)
    result.emplace_back(datas_hex.data() + i, 2);
I did some timings for you. On a 10 MB data sample on my machine:
regex: 6572ms
string copy: 1021ms
string ref: 155ms
string ref + std::vector: 105ms
aha_1980 solution: 1322ms
Please, please, please mind our battery lives and electrical bills.
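For anyone reading along without Qt at hand, the "string ref + std::vector" idea can be sketched in plain C++17, with std::string_view standing in for QLatin1String. The function name splitPairs is made up for illustration, and this is only a sketch of the technique, not the code that produced the timings above.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Split a hex string into two-character views without copying any characters.
// The views point into `hex`, so the underlying buffer must outlive the result.
// A trailing odd character, if any, is ignored.
std::vector<std::string_view> splitPairs(std::string_view hex)
{
    std::vector<std::string_view> result;
    result.reserve(hex.size() / 2);              // single allocation up front
    for (std::size_t i = 0; i + 1 < hex.size(); i += 2)
        result.push_back(hex.substr(i, 2));      // a view only, no copy
    return result;
}
```

The only allocation here is the vector's own storage. The price is the lifetime rule: the views dangle as soon as the original buffer goes away.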
-
@chris-kawa Looking at the OP's last code, datas contains the raw bytes, because they are converted with toHex() first. So you will need to adapt your code.
Regards
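To make the raw-bytes vs. hex-text distinction concrete, here is a minimal Qt-free sketch, assuming the raw data sits in a std::vector<unsigned char>. Each raw byte yields exactly one two-character pair, so the hex text is twice as long as the raw data. The name toHexPairs is hypothetical, and building the pairs straight from the bytes is a variation nobody in this thread proposed or timed.

```cpp
#include <string>
#include <vector>

// Turn raw bytes into one lowercase two-character hex string per byte.
std::vector<std::string> toHexPairs(const std::vector<unsigned char>& bytes)
{
    static const char digits[] = "0123456789abcdef";
    std::vector<std::string> result;
    result.reserve(bytes.size());                // one pair per raw byte
    for (unsigned char b : bytes) {
        // High nibble first, then low nibble.
        const char pair[2] = { digits[b >> 4], digits[b & 0x0f] };
        result.emplace_back(pair, 2);
    }
    return result;
}
```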
-
@aha_1980 Thanks, I missed that. Still, what I said holds. I corrected my post.
-
Hi.
That's a pretty hefty difference between regex and string ref. Very interesting.
-
@mrjj I consider regexps good for validating short input data like login forms and the like. For processing large amounts of data, a handcrafted solution, even if you need to add a couple of ifs or switches to match the regex, is always gonna be a lot faster. They are just too generic to have good performance.
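As an illustration of the "couple of ifs" point, here is a minimal sketch in plain C++17 of a hand-written check that accepts the same strings as the pattern ([0-9a-f]{2})* matched against the whole input. The function name isHexPairs is made up for this example, and no timing claim is attached to it.

```cpp
#include <string_view>

// Accept only strings made of an even number of lowercase hex digits.
// The empty string passes, just as it would with ([0-9a-f]{2})*.
bool isHexPairs(std::string_view s)
{
    if (s.size() % 2 != 0)
        return false;                            // pairs need an even length
    for (char c : s) {
        const bool digit = (c >= '0' && c <= '9');
        const bool lower = (c >= 'a' && c <= 'f');
        if (!digit && !lower)
            return false;                        // not a lowercase hex digit
    }
    return true;
}
```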
-
I guess that's the normal trade-off between generality/flexibility and a hand-made specific solution.
It also explains why Qt syntax highlighting gets very heavy with huge files. :)