Qt 5.14.1 QSqlModel and French Characters.
-
How did you insert the data into the sqlite database? Which encoding did you use?
-
@Christian-Ehrlicher The data was inserted with the supplied Qt SQL libraries, using conventional SQLite SQL statements built from a QString. This should ensure UTF-8 string data going in, I think.
As far as I know, neither SQLite nor QSqlDatabase and the QSqlDriver for SQLITE3 offer any other function to change the default of UTF-8 (the system default) when the database is created.
-
And how did you insert the data? Where did it come from? Since QSqlQueryModel handles this correctly, I would guess the insertion was wrong - please show us some code.
-
@Christian-Ehrlicher OK here is a code snippet from the database handler that does data insertions.
bool SQLiteProxy::addNewQSO(QString f)
{
    // QString f is a string with field data separated by '|'.
    //
    // Data values are inserted with bound values. This was one attempt to
    // normalize the data.
    bool sqlOK = false;
    QString in_flds = f;
    QString strDML = "";
    QStringList flds = QStringList();
    QSqlQuery qry(sqd->db);

    flds.clear();
    flds = in_flds.split("|", QString::KeepEmptyParts);
    if ( flds[4].trimmed().length() <= 0 ) return false;
    if ( flds.size() == 45 ) flds.append("1");

    strDML  = "INSERT INTO log ";
    strDML += "(";
    strDML += "id,lognumber,qso_date,time_on,time_off,";
    strDML += "call,contacted_op,freq,bandid,";
    strDML += "modeid,tx_pwr,pfx,qth,";
    strDML += "state,gridsquare,cnty,cont,";
    strDML += "lat,lon,dxcc,qsl_sent,";
    strDML += "qsl_rcvd,qslsdate,qslrdate,country,";
    strDML += "rst_sent, rst_rcvd,qsl_sent_via, qsl_rcvd_via,";
    strDML += "eqsl_qsl_rcvd,eqsl_qsl_sent,eqsl_qslrdate, eqsl_qslsdate,";
    strDML += "iota,ituz,iaruz,cqz,";
    strDML += "ten_ten,region,srx,stx,";
    strDML += "qslmsg,bearing,distance,notes,";
    strDML += "comment) ";
    strDML += "VALUES (";
    strDML += ":id,:logno,:qsodate,:timeon,:timeoff,";
    strDML += ":callsign,:opname,:frequency,:bandval,";
    strDML += ":modeval,:power,:prefix,:city,";
    strDML += ":st,:grid,:county,:continent,";
    strDML += ":latval,:lonval,:dxccval,:qslsent,";
    strDML += ":qslrcvd,:qslsdate,:qslrdate,:ctry,";
    strDML += ":rstsent,:rstrcvd,:sent_via,:rcvd_via,";
    strDML += ":eqslrcvd,:eqslsent,:eqslrdate,:eqslsdate,";
    strDML += ":iotano,:itu,:iaru,:cq,";
    strDML += ":tenten,:arrlregion,:srxval,:stxval,";
    strDML += ":qslmsgstr,:bearingval,:distanceval,:notesstr,:cmtstr)";
    qry.prepare(strDML);

    qry.bindValue(":qsodate", flds[1].trimmed());
    qry.bindValue(":timeon", flds[2].trimmed());
    qry.bindValue(":timeoff", flds[3].trimmed());
    qry.bindValue(":callsign", flds[4].trimmed());
    qry.bindValue(":opname", flds[5].trimmed());
    qry.bindValue(":frequency", flds[6].trimmed());
    qry.bindValue(":bandval", flds[7].trimmed().toInt());
    qry.bindValue(":modeval", flds[8].trimmed().toInt());
    qry.bindValue(":power", flds[9].trimmed().toInt());
    qry.bindValue(":prefix", flds[10].trimmed());
    qry.bindValue(":city", flds[11].trimmed());
    qry.bindValue(":st", flds[12].trimmed());
    qry.bindValue(":grid", flds[13].trimmed());
    qry.bindValue(":county", flds[14].trimmed());
    qry.bindValue(":continent", flds[15].trimmed());
    qry.bindValue(":latval", flds[16].trimmed());
    qry.bindValue(":lonval", flds[17].trimmed());
    qry.bindValue(":dxccval", flds[18].trimmed().toInt());
    qry.bindValue(":qslsent", flds[19].trimmed());
    qry.bindValue(":qslrcvd", flds[20].trimmed());
    qry.bindValue(":qslsdate", flds[21].trimmed());
    qry.bindValue(":qslrdate", flds[22].trimmed());
    qry.bindValue(":ctry", flds[23].trimmed());
    qry.bindValue(":rstsent", flds[24].trimmed());
    qry.bindValue(":rstrcvd", flds[25].trimmed());
    qry.bindValue(":sent_via", flds[26].trimmed());
    qry.bindValue(":rcvd_via", flds[27].trimmed());
    qry.bindValue(":eqslrcvd", flds[28].trimmed());
    qry.bindValue(":eqslsent", flds[29].trimmed());
    qry.bindValue(":eqslrdate", flds[30].trimmed());
    qry.bindValue(":eqslsdate", flds[31].trimmed());
    qry.bindValue(":iotano", flds[32].trimmed());
    qry.bindValue(":itu", flds[33].trimmed().toInt());
    qry.bindValue(":iaru", flds[34].trimmed());
    qry.bindValue(":cq", flds[35].trimmed().toInt());
    qry.bindValue(":tenten", flds[36].trimmed().toInt());
    qry.bindValue(":arrlregion", flds[37].trimmed());
    qry.bindValue(":srxval", flds[38].trimmed());
    qry.bindValue(":stxval", flds[39].trimmed());
    qry.bindValue(":qslmsgstr", flds[40].trimmed());
    qry.bindValue(":bearingval", flds[41].trimmed().toInt());
    qry.bindValue(":distanceval", flds[42].trimmed().toInt());
    qry.bindValue(":notesstr", flds[43].trimmed());
    qry.bindValue(":cmtstr", flds[44].trimmed());
    if ( flds[45].toInt() <= 0 )
        qry.bindValue(":logno", 1);
    else
        qry.bindValue(":logno", flds[45].toInt());
    // The placeholder in the statement above is :id (the post bound ":idno").
    qry.bindValue(":id", flds[0].trimmed().toInt());

    sqlOK = qry.exec();
    if ( sqlOK ) {
        errNo = QSqlError::NoError;
        errStr = "";
    } else {
        errNo = qry.lastError().type();
        errStr = qry.lastError().text();
        qCritical() << "DBERROR ON ADD - " << Q_FUNC_INFO << ":" << __LINE__
                    << "\nErr: " << errNo
                    << "\nMsg: " << errStr
                    << "\nSQL:" << strDML
                    << "\nData:" << flds;
        strDML = "ROLLBACK"; // to protect DB
        qry.exec(strDML);
        return false;
    }
    return true;
}
-
And you're sure that QString f is correct? Where do you read it from?
-
@Christian-Ehrlicher I must be missing something.
To get to this function one must create a QString with the field values separated by a '|'.
I think I stated that. If not I do apologize.
But it does seem logical, given the code, that the data is unpacked from the supplied string. Each value has to be a QString segment in order to be concatenated into the string supplied to this function.
Besides, the origin should not matter. Even if it came from some other source, the use of the QTranslator, QString, and UTF-8 character set at the system-level should not have this problem.
The letters with diacritical marks are in the UTF-8 character set. Not all fonts display them correctly, but I have tried quite a few different fonts and none display the French or Scandinavian characters correctly.
-
@ad5xj said in Qt 5.14.1 QSqlModel and French Characters.:
But it does seem logical, given the code, that the data is unpacked from the supplied string
Besides, the origin should not matter.
It does matter where this string comes from - do you read it from a file or from, e.g., a QLineEdit? If the encoding there is wrong, then it will be stored wrongly in the database.
-
@Christian-Ehrlicher I am obviously not communicating well.
Regardless of whether the data comes from a file or a QLineEdit, it is assigned to a QString in order to be a segment of the QString f. In this case the offending data is part of data loaded from an external source in text format.
Are you saying that assignment will not change the data to UTF-8?
-
@ad5xj said in Qt 5.14.1 QSqlModel and French Characters.:
Are you saying that assignment will not change the data to UTF-8?
If you convert it wrong then QString will be wrong, yes.
This is working fine for me:
#include <QApplication>
#include <QLabel>
#include <QSqlDatabase>
#include <QSqlQuery>

int main(int argc, char **argv)
{
    QApplication app(argc, argv);
    QSqlDatabase db(QSqlDatabase::addDatabase("QSQLITE"));
    db.setDatabaseName(":memory:");
    db.open();
    QSqlQuery q;
    q.exec("CREATE TABLE tmp (name varchar);");
    q.exec("INSERT INTO tmp (name) VALUES ('Ren\u00e9');");
    q.exec("SELECT * FROM tmp;");
    q.next();
    QString str = q.value(0).toString();
    QLabel lbl(str);
    lbl.show();
    return app.exec();
}
-
@Christian-Ehrlicher Not sure how to answer your last reply.
The data is read into a QByteArray from a text file through a QIODevice (a QFile object opened as text), using the readAll() method. The resulting QByteArray is appended to a QString. That allows the data to be split into lines on the "\n" at the end of each record.
Since the incoming data is delimited text, each line of text is parsed for each data field. The data in each field becomes the correct data type as detected by the parser.
So unless there is something being overlooked at the time the data is read from the input, I am not sure what else to tell you.
-
@ad5xj said in Qt 5.14.1 QSqlModel and French Characters.:
The resulting QByteArray is appended to a QString.
Again: how and what's the encoding of the file? Please show us the code!
-
@Christian-Ehrlicher OK, but I don't think you will learn any more than I have already posted.
void ReadFile(QString fil)
{
    // Note: data, lines, msg, strLineSeg, linestr, pos, y and cnt are
    // declared elsewhere and are not shown in this snippet.
    QByteArray allRecs;
    fil = "textfile.txt";
    QStringList fields;

    // get the file to parse
    QFile infile(fil);
    if ( !infile.open(QIODevice::ReadOnly | QIODevice::Text) ) {
        QMessageBox mb;
        mb.setWindowTitle(QApplication::tr("Text Import Failure"));
        msg = __FILE__;
        msg += __FUNCTION__;
        msg += QString("%1 ").arg(__LINE__);
        msg += QApplication::tr("Cannot read file ");
        msg += QString("%1:").arg(fil);
        msg += QString("\n%2 ").arg(infile.errorString());
        mb.setText(msg);
        mb.exec();
        return;
    }
    data = "";
    lines.clear();      // init the lines var
    strLineSeg = "";
    // First read all the text file records into memory
    allRecs = infile.readAll();   // read everything into the file buffer
    infile.close();               // close the input file
    data.append(allRecs);         // change from required QByteArray to QString
    lines = data.split("\n", QString::KeepEmptyParts); // split file records into lines on line break
    fields.clear();
    for (int p = pos; p < y; ++p) {
        linestr = lines[p].trimmed().toLocal8Bit();
        if ( linestr.length() > 0 ) { // not a blank line
            // make a list of all fields in the line
            fields = linestr.split("<", QString::KeepEmptyParts);
            fields << linestr.trimmed().split("<", QString::SkipEmptyParts);
            cnt = fields.count();
            parse(fields);
        }
        fields.clear();
    }
}

void parse(QStringList flds)
{
    if (fldname == "Name") {
        QString datastr = flds[1];
        QString name = datastr.trimmed();
        // All fields read, now make the new record from the data
        QString strRec = "";
        strRec = name + "|";
        . . .
        addNewQSO(strRec);
    }
}
The rest of the database save code you have already.
BTW the database was created with the SQLite PRAGMA encoding = 'UTF-8';
-
@ad5xj UPDATE:
Some progress... Using various text editors, I have been able to determine that the incoming data is text encoded as Windows-1258. Using QTextCodec to convert the QByteArray to a QString fixed the problem. This is only a short-term solution, since I will not always know what the encoding is. So now, how do I detect the character encoding of the raw text file in order to use the right conversion codec?
A search of this forum is inconclusive.
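
Editor's note: in Qt 5, appending a QByteArray to a QString decodes the bytes as UTF-8, which is why bytes from a legacy single-byte file come out wrong unless they are decoded with the matching codec first (for example with QTextCodec::codecForName() and toUnicode()). The sketch below illustrates the same transcoding step without Qt. It uses ISO-8859-1 (Latin-1) as a stand-in because its mapping to Unicode is trivial; Windows-1258 needs its own mapping table, which is not reproduced here. The function name latin1ToUtf8 is hypothetical.

```cpp
#include <string>

// Transcode ISO-8859-1 (Latin-1) bytes to UTF-8.
// Assumption: the input really is Latin-1; other single-byte
// encodings (e.g. Windows-1258) need their own mapping table.
std::string latin1ToUtf8(const std::string& in)
{
    std::string out;
    out.reserve(in.size() * 2);  // worst case: every byte becomes two
    for (unsigned char b : in) {
        if (b < 0x80) {
            // ASCII range: same single byte in both encodings
            out += static_cast<char>(b);
        } else {
            // U+0080..U+00FF: two-byte UTF-8 sequence
            out += static_cast<char>(0xC0 | (b >> 6));
            out += static_cast<char>(0x80 | (b & 0x3F));
        }
    }
    return out;
}
```

For example, the Latin-1 bytes for "René" end in the single byte 0xE9, while the UTF-8 form ends in the two bytes 0xC3 0xA9.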
-
@ad5xj There is no complete solution for detecting the encoding. For example, UTF-8 and Latin1 are identical for the first 128 characters (the ASCII range). See https://forum.qt.io/topic/12319/how-to-automatically-detect-the-codec-text-file
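
Editor's note: a common partial heuristic is to test whether the bytes are well-formed UTF-8 and, if they are not, fall back to a legacy codec chosen by the caller. This is only a sketch under that assumption: a pure-ASCII file passes the check but is equally valid Latin1, and a failed check says "not UTF-8" without saying which legacy encoding was used. The function name isValidUtf8 is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Returns true if `bytes` is well-formed UTF-8. Rejects truncated
// sequences, stray continuation bytes, overlong forms, surrogates,
// and code points above U+10FFFF.
bool isValidUtf8(const std::string& bytes)
{
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        std::size_t len = 0;
        unsigned long cp = 0;
        if (b0 < 0x80)                { len = 1; cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        else return false;              // invalid lead byte
        if (i + len > n) return false;  // sequence runs past the end
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        // Overlong encodings use more bytes than the code point needs.
        if ((len == 2 && cp < 0x80) ||
            (len == 3 && cp < 0x800) ||
            (len == 4 && cp < 0x10000)) return false;
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return false;
        i += len;
    }
    return true;
}
```

With this check, "René" stored as Latin1 (ending in the byte 0xE9) is rejected, because 0xE9 announces a three-byte sequence that never arrives, while the UTF-8 form (ending in 0xC3 0xA9) is accepted.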
-
With some research, I have come upon a command-line utility called UTRAC, which does at least a partial job of text encoding detection. Its list of encodings is not complete and it is not 100% effective, but it does raise the question: why have QFile and QTextStream not at least attempted this? If the UTRAC guys have come this far, it is at least feasible. If it requires a feature request for Qt 5.15, then I will put one in.
-
@ad5xj said in Qt 5.14.1 QSqlModel and French Characters.:
why QFile and QTextStream have not at least attempted this
Why should they (especially QFile)? QFile is simply a file handle with some basic file operations like reading and writing. Most of the time you do not have to guess, as you know what you're writing and reading. Also, as you already discovered, there is no 100% solution for this, and the check is time-consuming (you need to read the whole file and check each character), so it is an expensive operation.
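
Editor's note: one check that stays cheap is looking for a byte-order mark (BOM), since it only inspects the first few bytes. It settles the question only for files that actually carry a BOM; legacy single-byte files such as the Windows-1258 one discussed above have none. Qt 5 offers QTextCodec::codecForUtfText() for this kind of BOM-based lookup. The sketch below shows the idea without Qt; the function name encodingFromBom and the returned labels are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Returns the encoding implied by a leading byte-order mark,
// or an empty string if the data does not start with one.
std::string encodingFromBom(const std::string& bytes)
{
    auto startsWith = [&bytes](const char* bom, std::size_t len) {
        return bytes.size() >= len && bytes.compare(0, len, bom, len) == 0;
    };
    if (startsWith("\xEF\xBB\xBF", 3))     return "UTF-8";
    // UTF-32 must be tested before UTF-16: the UTF-32LE mark
    // begins with the same two bytes as the UTF-16LE mark.
    if (startsWith("\xFF\xFE\x00\x00", 4)) return "UTF-32LE";
    if (startsWith("\x00\x00\xFE\xFF", 4)) return "UTF-32BE";
    if (startsWith("\xFF\xFE", 2))         return "UTF-16LE";
    if (startsWith("\xFE\xFF", 2))         return "UTF-16BE";
    return "";
}
```

An empty result means "no BOM found", not "not Unicode": UTF-8 files are frequently written without a BOM, so a validity check or an explicit user choice is still needed afterwards.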