Read CSV and Parse, but Ignore Comma inside Double Quotes
wrote on 25 Feb 2022, 14:58 last edited by Phamy1289
I'm reading a CSV file, and my parser splits on the comma inside the double quotes. For example:
The CSV: 1,"test\\test2\\Hello, World",4,0,2
The output:
1
"test\\test2\\Hello
World
4
0
2
Expected output:
1
"test\\test2\\Hello, World"
4
0
2
Also, when I use a regular expression it crashes.
loadFile Function:
string CsvFileImporter::loadFile(QString filename, vector<TrajectoryData *> & trajectoryList)
{
    QFile csvFile(filename);
    string errorString = "";
    QStringList lines;
    if(csvFile.open(QIODevice::ReadOnly | QIODevice::Text))
    {
        QTextStream stream(&csvFile);
        while(!stream.atEnd())
        {
            QString line = stream.readLine();
            lines.append(line);
        }
        TrajectoryCSVImportDialog* dialog = new TrajectoryCSVImportDialog();
        dialog->setText(lines);
        bool importFlag = dialog->exec();
        if(importFlag)
        {
            int startLine = dialog->getLinesToSkip();
            QChar separator = dialog->getSeparator();
            double timeStep = dialog->getTimeStep();
            // QStringList fieldNames = dialog->getFields();
            QList<int> fieldMetrics = dialog->getMetrics();
            QList<int> fieldColumns = dialog->getColumnNumber();
            QList<double> fieldFactors = dialog->getFactors();
            parseFile(lines, startLine, separator, timeStep, fieldMetrics, fieldColumns, fieldFactors, trajectoryList);
        }
        else
        {
            errorString = "Canceled by user";
            QApplication::restoreOverrideCursor();
        }
    }
    return errorString;
}
parseFile Function:
void CsvFileImporter::parseFile(QStringList lines, int startLine, QChar separator, double timeStep, QList<int> fieldMetrics, QList<int> fieldColumns, QList<double> fieldFactors, vector<TrajectoryData *> & trajectoryList)
{
    //Some code
    for(int i=startLine;i<(int)lines.size();i++)
    {
        QStringList data = lines[i].split(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");
        //Some More Code
    }
}
wrote on 25 Feb 2022, 15:43
@Phamy1289
Yes, you need to parse properly to deal correctly with commas inside quotes. I have not looked at them in detail, but what about these?
https://stackoverflow.com/a/38954889
https://stackoverflow.com/a/30100233 (and the qtcsv library it references)
@JonB Thank you so much for the links. Trying to use regular expressions was not working for me. Making another parse function based on the first link worked and was much easier to follow.
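The parse function itself is not shown in the thread. As a rough sketch of the kind of quote-aware splitter described here, the following uses standard C++ strings rather than QString so that it stands alone. The name splitCsvLine and its separator parameter are hypothetical. Unlike the expected output in the question, this version drops the surrounding quotes and reads a doubled quote ("") inside a quoted field as one literal quote.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split one CSV line into fields. A separator that
// appears between double quotes is treated as an ordinary character.
std::vector<std::string> splitCsvLine(const std::string& line, char separator = ',')
{
    std::vector<std::string> fields;
    std::string current;      // field being accumulated
    bool inQuotes = false;    // true while between an opening and closing quote

    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (inQuotes) {
            if (c == '"') {
                if (i + 1 < line.size() && line[i + 1] == '"') {
                    current += '"';   // doubled quote -> one literal quote
                    ++i;              // skip the second quote of the pair
                } else {
                    inQuotes = false; // closing quote, not stored
                }
            } else {
                current += c;         // includes separators inside quotes
            }
        } else if (c == '"') {
            inQuotes = true;          // opening quote, not stored
        } else if (c == separator) {
            fields.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    fields.push_back(current);        // last field has no trailing separator
    return fields;
}
```

Called on the sample line from the question, this should return five fields, with the second one holding the whole quoted text including its comma. The same loop could be written over a QString by stepping through its QChar values.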
wrote on 25 Feb 2022, 18:08
@Phamy1289
Regexes are great, but they are no substitute for parsing. When the rules get too complex, you are better off writing parsing code than trying to find a robust regex.
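One possible reason the regular expression in the question did not behave as hoped (an assumption, since the crash itself is not shown): in parseFile the pattern is passed to QString::split as a plain string, and that overload splits on the literal text. A pattern is only treated as a regular expression when it is passed as a QRegularExpression. With a literal separator that never occurs, the line would come back as a single field, and reading later columns from it might then fail. The sketch below uses std::regex instead of Qt so that it is self-contained, with a hypothetical function name splitWithRegex, to show what the same pattern does once it is actually compiled as a regex.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: split on every comma that is followed by an even
// number of double quotes up to the end of the line, i.e. a comma that is
// not inside a quoted field. Quotes are left in the returned fields.
std::vector<std::string> splitWithRegex(const std::string& line)
{
    static const std::regex separator(R"re(,(?=(?:[^"]*"[^"]*")*[^"]*$))re");

    // Submatch index -1 asks the iterator for the text between matches.
    return std::vector<std::string>(
        std::sregex_token_iterator(line.begin(), line.end(), separator, -1),
        std::sregex_token_iterator());
}
```

This splits the sample line into five fields and keeps the quotes, matching the expected output in the question. It does not unescape doubled quotes and does not report an unbalanced quote, which is the kind of gap that hand-written parsing code can close.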