Splitting a text file according to a delimiter
-
I'm building an app that has an autocomplete text functionality.
What I'm trying to implement, but am having trouble finding a solution for anywhere, is reading a block of text up to a character set as a delimiter, i.e. reading blocks of text of variable size and iterating through them.
If we take "$" as the delimiter character, the example would be:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum imperdiet massa leo, id auctor metus placerat sit amet. Nam semper nisl in diam feugiat laoreet. $ Etiam ut quam dignissim, scelerisque urna sed, imperdiet lacus. Nulla facilisi. Phasellus fringilla augue a ex tristique, gravida suscipit nibh auctor. Interdum et malesuada fames ac ante ipsum primis in faucibus. Curabitur lacinia ac metus eget facilisis. Praesent in lacinia leo. Suspendisse a nunc at enim tempus pharetra ac in libero. Sed lacinia lobortis erat et pellentesque. Curabitur scelerisque magna sit amet nisl scelerisque, non volutpat mi molestie. Vestibulum bibendum et magna sed dictum. Lorem ipsum dolor sit amet, consectetur adipiscing elit. $ Vestibulum congue, dolor quis faucibus volutpat $
~~~~ end of text file ~~~~
My question is: is there any way to parse this text file so that the text in between the delimiter characters is read, no matter the text block size?
I'm working in Qt C++.
Thanks in advance and sorry for my bad English.
-
@Mihic Simply read the text into a QString variable and then split it using your delimiter (https://doc.qt.io/qt-6/qstring.html#split-1).
-
@Mihic said:
My question is, is there any way to parse this text file in a way that the text in between the delimiter characters is read, no matter the text block size.
@jsulm proposes a simple solution. If, say, your text file is 4GB+ big, with frequent $ lines dividing it into a lot of small "chunks", you will use an awful lot of memory reading the whole file into a QString and splitting it. Is that acceptable to you as a solution, or do you want/need something with a much smaller memory footprint?
-
The solution is acceptable since the file is no bigger than 100KB. There's only one thing I'm still having trouble with, and that is defining the regular expression itself. Would you mind writing the example I'm looking for with the "$" sign? When I split the string, what should I put in the QRegularExpression brackets so that the string is split at the "$" sign?
Thank you again and sorry for bothering you.
-
@Mihic
We do not know your exact rule for how you want to split "with the $ sign" you show. It might be any $ anywhere (doubtful), it might be $ at the start or end of a line, $ on a line of its own, $ preceded and followed by a blank line, etc.
Initially I would suggest not bothering with a regular expression; what about
stringInput.split("$\n");
Does that do you?