SQLite and UTF-8 encoding problem
-
Hi,
I had a database in MySQL encoded in UTF-8.
I converted it to SQLite with the mysql2sqlite tool and wrote it to the file db.sqlite.
But when I try to read data from that file, I see a question mark ("?") wherever a UTF-8 encoded letter should be. I see it with qDebug(), and I also see it when I call setText() on a QTextEdit. What am I doing wrong? Maybe it's the conversion, maybe it's Qt. I don't know. Please help.
This is how I get the data:
@
QListWidget *list = new QListWidget(); // added to a layout later
QSqlDatabase db = QSqlDatabase::addDatabase("QSQLITE"); // Qt's SQLite driver is named "QSQLITE"
QString filename = QApplication::applicationDirPath() + "/db.sqlite";
db.setDatabaseName(filename);
if (db.open()) {
    QSqlQuery selectQuery("SELECT human FROM emp ORDER BY human ASC");
    if (selectQuery.exec()) {
        while (selectQuery.next()) {
            list->addItem(selectQuery.value(0).toString());
            cout << selectQuery.value(0).toString().toStdString();
        }
    } else {
        cout << "is not active";
    }
} else { // open() failed; no need to call it a second time
    QMessageBox::critical(0, QObject::tr("Database Error"),
                          db.lastError().text());
    exit(1);
}
@
cout (or qDebug(), if you prefer) gives me "?" for special letters. How can I fix that? Where did I make a mistake? -
QString internally uses Unicode, encoded as UTF-16.
You are using QString::toStdString() to convert the string data, which explains your problem:
std::string QString::toStdString () const
Returns a std::string object with the data contained in this QString. The Unicode data is converted into 8-bit characters using the toAscii() function.
Converting a string that contains non-ASCII Unicode characters this way will probably give you a "?" for every character the 8-bit encoding cannot represent!
Instead, if you need an 8-bit encoding, you may try something like:
cout << myStr.toUtf8().constData() << endl;
But then the console might not be able to deal with UTF-8 characters, depending on what platform you are on...
(On Windows you have to set the correct code page with SetConsoleOutputCP() first. And even then cout might mangle your string! For me, using "_setmode(_fileno(stdout), _O_U8TEXT)" resolved this issue under MSVC.)
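To make the difference concrete without depending on Qt, here is a minimal sketch in plain C++. The helper names toLossyAscii() and toUtf8() are made up for this illustration and are not Qt functions. The first only mimics a conversion that cannot represent a character, and the second does by hand roughly what QString::toUtf8() does for you.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: mimics a lossy 8-bit conversion by replacing every
// UTF-16 code unit outside the ASCII range with '?'.
std::string toLossyAscii(const std::u16string& in) {
    std::string out;
    for (char16_t unit : in) {
        out += (unit < 0x80) ? static_cast<char>(unit) : '?';
    }
    return out;
}

// Hypothetical helper: encodes a UTF-16 string as UTF-8 bytes.
std::string toUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()
            && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Combine a surrogate pair into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD; // unpaired surrogate: use the replacement character
        }
        assert(cp <= 0x10FFFF);
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With u"\u017C" (the letter ż) as input, toLossyAscii() yields "?" while toUtf8() yields the two bytes 0xC5 0xBC, so the character survives only in the second case.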
-
So, what you are saying is that if I do:
@QTextEdit *textEdit = new QTextEdit(this);
textEdit->setText(selectQuery.value(0).toString().toUtf8()); // toString() converts to QString@
then it will work? I tried it and it didn't work...
I also tried your line with cout and it still gives me a "?". myStr is a QString, right? If so, it didn't work for me. -
No.
Given that selectQuery.value(0).toString() returns a QString (UTF-16 internally), you should be able to pass that QString "as-is" to textEdit->setText(), because textEdit->setText() takes a QString as its argument. No conversion needed!
If, however, you want to print your QString to STDOUT using cout or printf(), then you have to convert your QString to a char array, because these functions come from the C/C++ standard library and have no idea about QStrings. QString::toStdString() won't work, however, because its 8-bit conversion can't preserve all of Unicode. Instead you can get a UTF-8 encoded char array from a QString by using myStr.toUtf8().constData().
Note: Printing a UTF-8 encoded string with printf() or cout may NOT work, for the reasons explained before...
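Since the console itself may be what mangles the text, one way to take it out of the picture is to print the bytes as hex digits, which are plain ASCII and survive any code page. The following is only a sketch; hexDump() is a hypothetical helper, not part of Qt or the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: renders each byte as two lowercase hex digits,
// separated by single spaces.
std::string hexDump(const std::string& bytes) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (i != 0) {
            out += ' ';
        }
        out += digits[b >> 4];
        out += digits[b & 0x0F];
    }
    // Two digits per byte plus one space between neighbouring bytes.
    assert(out.size() == (bytes.empty() ? 0 : bytes.size() * 3 - 1));
    return out;
}
```

A correctly encoded ż shows up as "c5 bc" in such a dump, whereas a literal question mark is "3f". If the dump of myStr.toUtf8() already contains 3f where the letter should be, the character was lost before it ever reached the output.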
I would recommend checking whether the string comes out of the QSqlQuery correctly via:
@QMessageBox::information(0, "test", selectQuery.value(0).toString());@
If that already shows a distorted string, then you know the problem is in the DB itself (or in how it is read). -
By default qDebug() prints to the console. Unfortunately, printing Unicode characters on the console under Windows is a big pain! First of all, the Windows console expects strings to be encoded in the system's legacy (OEM) code page by default. In order to print Unicode on the Windows console, you have to call SetConsoleOutputCP() and set the code page to UTF-8. But that's not enough! Of course you now have to convert your Unicode strings to UTF-8 before you print them on the console. That conversion is easy with Qt, thanks to QString::toUtf8(). But: UTF-8 encoded strings get mangled by printf(...) or cout << str. You can print them with WriteFile(GetStdHandle(STD_OUTPUT_HANDLE), ...) to bypass the C++ Standard Library. Or you can, at least with the M$ compiler, use _setmode(_fileno(stdout), _O_BINARY) in order to prevent the CRT from mangling your UTF-8 strings...
After all, I think that using a message box is far easier for debugging Unicode strings, as it avoids all the other things that may go wrong with printing Unicode strings ;)
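If you do want console output, the WriteFile() route could look roughly like the sketch below. writeUtf8ToStdout() is a name made up for this example; the non-Windows branch is only a fallback I added so the snippet compiles everywhere, and I have not tested the result against every console font or Windows version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: writes already UTF-8 encoded bytes to stdout.
// Returns true if all bytes were accepted.
bool writeUtf8ToStdout(const std::string& utf8) {
    assert(utf8.size() <= 0xFFFFFFFFu); // WriteFile() takes a 32-bit length
    std::fflush(stdout);                // keep ordering with earlier stdio output
#ifdef _WIN32
    SetConsoleOutputCP(CP_UTF8);        // tell the console to expect UTF-8
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    if (out == INVALID_HANDLE_VALUE || out == NULL) {
        return false;
    }
    DWORD written = 0;
    // WriteFile() bypasses the CRT, so no text-mode translation happens.
    BOOL ok = WriteFile(out, utf8.data(), static_cast<DWORD>(utf8.size()),
                        &written, NULL);
    return ok && written == utf8.size();
#else
    std::size_t written = std::fwrite(utf8.data(), 1, utf8.size(), stdout);
    std::fflush(stdout);
    return written == utf8.size();
#endif
}
```

With a QString, the bytes would come from myStr.toUtf8(), for example by building a std::string from the QByteArray's constData() and size().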
-
For simple situations like this, you can use the "Browser for QDebug Log Output":/wiki/Browser_for_QDebug_output as presented in the wiki. It basically opens a [[Doc:QTextBrowser]] to display the qDebug() output. I made it for my own project because of the poor debugging facilities of the console on Windows :-) If you need something more sophisticated, there are full-featured logging frameworks ready to use out there.