[SOLVED] How to force QSqlTableModel to use UTF-8 encoding?



  • Hi

    I'm using QSqlTableModel / QTableView to represent tables of an existing MySQL database. Everything works good else than unicode strings. They are interpreted as ASCII characters and not shown correctly. When fetching records from database with QSqlRecord, I use UTF-8 decoder to obtain UTF-8 encoded QString like this:
    @
    // Open MySQL database
    QSqlQuery query("SELECT * from sellers");
    QTextDecoder *decoder = QTextCodec::codecForName("UTF-8")->makeDecoder();
    QString uid = decoder->toUnicode(query.value(i_uid).toByteArray());@
    How do I do same thing for QSqlTableModel?
    Also I'm not sure my above code is good enough. Is there a better way (probably application-wide) to force application to use a specific encoding for strings?



  • Two questions:

    What version of MySql are you using?

    Did you set QTextCodec::codecForLocale()?



    1. 5.1.54
    2. No! what should I do?
      Update: I tried
      @QTextCodec::setCodecForLocale(QTextCodec::codecForName("UTF-8"));@
      before instantiating my MainWindow but doesn't help.


  • Of what MySQL type is the field (are the fields) you are trying to read?



  • They are VARCHAR(50) and MEDIUMTEXT.
    Here is the create script of a table:
    @
    CREATE TABLE customers (
    id smallint(5) unsigned NOT NULL AUTO_INCREMENT,
    name varchar(50) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
    last varchar(50) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
    phone bigint(20) unsigned zerofill NOT NULL,
    address mediumtext CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
    PRIMARY KEY (id)
    ) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci@



  • Ok. what does query->data().toString() give you? (Sorry for the amount of question, I was going through the MySQL driver source, and it seems, that it automatically stores it as Unicode in QString).



  • query->data().toString() gives me a wrongly decoded string. For example for a Persian name like شرکت نرم‌افزاری مرویٰ query->data().toString() returns a QString containing:
    شرکت نرÙ
    ‌افزاری Ù
    رویٰ
    instead of correct string.



  • May I ask what tool you used to insert your data into the DB?



  • The charset can normally can be set at table, db or server. I think.. you might also need to create your database with the charset. I remember having a similar issue in my web app..

    @ CREATE DATABASE mydb DEFAULT CHARACTER SET utf8 @

    Some info in this "link":http://www.phpwact.org/php/i18n/utf-8/mysql..



  • I actually tested it and everything works fine as long as one uses Qt to write the table.



  • I would check the MySQL Qt plugin source code -- conversion between QString and VARCHAR is supposed to be 100% automatic (see [[doc:sql-types]] in the docs).



  • That's what I did. And it is (well as long as you don't somehow get the binary flag involved).

    EDIT: Seems as if the binary flag is also set if you use the uft8_bin collation.



  • bq. May I ask what tool you used to insert your data into the DB?

    MySQL admin and QueryBrowser. But does it matter?

    bq. I actually tested it and everything works fine as long as one uses Qt to write the table.

    Yes. I think it's not a database issue, database just stores strings with given encoding. It's all about QVariant. there should be some way to tell QVariant how to interpret binary data to create strings. and actually there is a way (QTextCodec) but I can't put things together to work!

    I'm not sure but It seems to work if a subclass of QSqlTableModel used and text edit delegate (or something like that) replaced with one uses encoding...

    I saw some examples of QTableView / Model that cells of tables are replaced with customized or completely new widgets.



  • [quote author="soroush" date="1308747079"]bq. May I ask what tool you used to insert your data into the DB?

    MySQL admin and QueryBrowser. But does it matter?

    bq. I actually tested it and everything works fine as long as one uses Qt to write the table.

    Yes. I think it's not a database issue, database just stores strings with given encoding. It's all about QVariant. there should be some way to tell QVariant how to interpret binary data to create strings. and actually there is a way (QTextCodec) but I can't put things together to work![/quote]

    It's NOT QVariant. If the QVariant returned by a query is wrapping a QString, the encoding conversion was already done for you by the SQL driver (or by whatever put the QString there). If you're 100% sure that the data stored in the DB is correct, you should try to debug the plugin itself.

    [quote]I'm not sure but It seems to work if a subclass of QSqlTableModel used and text edit delegate (or something like that) replaced with one uses encoding...
    [/quote]

    This may actually be a workaround (that is: a subclass which overrides data/setData and performs another encoding conversion there).



  • The problem is that you use utf8_bin which set the BINARY_FLAG, which disables the automatic conversion to Unicode.



  • [quote author="loladiro" date="1308747478"]The problem is that you use utf8_bin which set the BINARY_FLAG, which disables the automatic conversion to Unicode.[/quote]

    I tried everything, I promise! with and without binary flag there is no difference. for example this table is shown wrong :
    @CREATE TABLE customers (
    id smallint(5) unsigned NOT NULL AUTO_INCREMENT,
    name varchar(50) CHARACTER SET utf8 NOT NULL,
    last varchar(50) CHARACTER SET utf8 NOT NULL,
    phone bigint(20) unsigned zerofill NOT NULL,
    address mediumtext CHARACTER SET utf8 NOT NULL,
    PRIMARY KEY (id)
    ) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8 @



  • I used the following test program and it worked (your table):
    @
    QSqlQuery query("SELECT * from test2");
    QSqlQuery query2(QString("INSERT INTO test2 VALUES("%1","%2","%2","%1","%2")").arg(QString::number(0),QString::fromUtf8("شرکت نرم‌افزاری مروی")));

    query2.exec();
    
    query.exec();
    
    while (query.next()) {
        qDebug() << query.value(1).typeName();
             qDebug() << query.value(1).toString();
         }
    
    return 0;
    

    @



  • Ok, I do a test to understand behavior of QVariant with strings
    @int main()
    {
    QString s = "فارسی";
    QByteArray b = s.toAscii();
    for (int i = 0; i < b.size(); ++i) {
    qDebug()<<(int)b[i];
    }
    qDebug()<<b.size();
    QVariant v(b);
    QString s2 = v.toString();
    qDebug() << s2;
    }@
    Above code prints:
    @-39
    -127
    -40
    -89
    -40
    -79
    -40
    -77
    -37
    -116
    10
    "فارسی" @
    Which is correct. It seems that QVariant knows how to convert UTF-8 encoded strings of bytes to QStrings (is not?), so I'm going to play around database.



  • True, but what is really happening (inside the driver) is more like:
    @
    int main()
    {
    QString s = "فارسی";
    QByteArray b = s.toUtf8();
    for (int i = 0; i < b.size(); ++i) {
    qDebug()<<(int)b[i];
    }
    qDebug()<<b.size();
    QVariant v(b);
    QString s2 = v.toString();
    qDebug() << s2;
    }
    @



  • Ok I tried this code with same database:
    @int main()
    {
    QSqlDatabase db = QSqlDatabase::addDatabase("QMYSQL");
    db.setDatabaseName("shop");
    db.setHostName("localhost");
    db.setUserName("root");
    db.setPassword(<password>);
    qDebug()<<db.open();
    QSqlQuery q("select * from customers");
    q.exec();
    while(q.next())
    qDebug()<<q.value(1).toString();
    }@

    works well. but there is no hope with QSqlTableView.



  • That's also absolutely no problem:
    !http://dl.dropbox.com/u/32965023/sqltest.png(proof)!

    @ QSqlTableModel *model = new QSqlTableModel(0,db);
    model->setTable("test2");
    model->select();
    QTableView v;
    v.setModel(model);

    v.show();
    
    return a.exec&#40;&#41;;@




  • I think it's something about the tool you used to insert the data. At first I used the mysql command line tool which by default uses they system charset, not UTF-8.



  • [quote author="loladiro" date="1308751303"]I think it's something about the tool you used to insert the data. At first I used the mysql command line tool which by default uses they system charset, not UTF-8.[/quote]

    dropping table and creating it again didn't help. also I tested insert commend and that is wrong too.
    The only differences between command-line tests (working correctly) and the application itself (wrong encoding), is QtGUI module declared in .pro file and QTableView



  • Did you remove the utf8_bin?



  • I found problem! consider the code:
    @ QSqlDatabase db = QSqlDatabase::addDatabase("QMYSQL");
    db.setDatabaseName("shop");
    db.setHostName("localhost");
    db.setUserName("root");
    db.open();
    QSqlQuery q("select * from sellers");
    q.exec();
    while(q.next())
    qDebug()<<q.value(3).toString();@
    This code In a qt application with no GUI prints correct output:
    @
    "سروش"
    "حسام"
    حامد" "
    @
    but in a Qt widget project with a .pro file containing
    @QT += gui@
    prints this:
    @"سروش"
    "حساÙ
    "
    "حاÙ
    د" @



  • bq. Did you remove the utf8_bin?

    Yes i removed utf8_bin and binary flags of fields.
    utf8_bin is just colliding algorithm that used to compare values inside database. I think this problem is not related to database. I used same database in php and everything was good.



  • Could you please create a minimal compilable example that still exhibits the behavior you just described?



  • [quote author="soroush" date="1308752734"]bq. Did you remove the utf8_bin?

    Yes. utf8_bin is just colliding algorithm that is used to compare values inside database. I think it's not related to database. I used this database in php and everything was good.[/quote]

    I know, but, as I said,

    [quote]
    The problem is that you use utf8_bin which sets the BINARY_FLAG, which disables the automatic conversion to Unicode.
    [/quote]



  • [quote author="loladiro" date="1308749245"]True, but what is really happening (inside the driver) is more like:
    @
    QString s = "فارسی";
    @
    [/quote]

    BEWARE OF ENCODING ISSUES if you do that!



  • @#include <QtGui/QApplication>
    #include <QMainWindow>
    #include <QSqlDatabase>
    #include <QSqlQuery>
    #include <QSqlTableModel>
    #include <QTableView>
    #include <QDebug>

    int main(int argc, char argv[])
    {
    QApplication a(argc, argv);
    QMainWindow w;
    QSqlDatabase db = QSqlDatabase::addDatabase("QMYSQL");
    db.setDatabaseName("shop");
    db.setHostName("localhost");
    db.setUserName("root");
    db.setPassword(password);
    qDebug() << db.open();
    QSqlTableModel
    model = new QSqlTableModel(0,db);
    model->setTable("sellers");
    model->select();
    QTableView* view = new QTableView();
    view->setModel(model);
    view->show();
    w.setCentralWidget(view);
    QSqlQuery q1("select * from sellers");
    q1.exec();
    w.show();
    return a.exec();
    }@
    The result is:
    !http://s1.picofile.com/file/6845918508/screenshot8.png()!
    but this code:
    @
    #include <QApplication>
    #include <QVariant>
    #include <QSqlDatabase>
    #include <QSqlQuery>
    #include <QDebug>

    int main(int argc, char *argv[])
    {
    QApplication a(argc, argv);
    QSqlDatabase db = QSqlDatabase::addDatabase("QMYSQL");
    db.setDatabaseName("shop");
    db.setHostName("localhost");
    db.setUserName("root");
    db.setPassword(password);
    qDebug()<<db.open();
    QSqlQuery q("select * from sellers");

    while(q.next())
        qDebug()<<q.value(3).toString();
    

    return a.exec();
    }@
    results correct output:



  • Ok, that's really weird, could you do a mysql dump of your table, so that I can test it.

    [quote author="peppe" date="1308753285"]
    BEWARE OF ENCODING ISSUES if you do that!
    [/quote]

    Yeah, I know, it was just to illustrate a point.



  • bq. Ok, that’s really weird, could you do a mysql dump of your table, so that I can test it.

    @
    -- MySQL dump 10.13 Distrib 5.1.54, for debian-linux-gnu (x86_64)

    -- Host: localhost Database: shop


    -- Server version 5.1.54-1ubuntu4

    /*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT /;
    /
    !40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS /;
    /
    !40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION /;
    /
    !40101 SET NAMES utf8 /;
    /
    !40103 SET @OLD_TIME_ZONE=@@TIME_ZONE /;
    /
    !40103 SET TIME_ZONE='+00:00' /;
    /
    !40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 /;
    /
    !40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 /;
    /
    !40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' /;
    /
    !40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

    --
    -- Table structure for table customers

    DROP TABLE IF EXISTS customers;
    /*!40101 SET @saved_cs_client = @@character_set_client /;
    /
    !40101 SET character_set_client = utf8 /;
    CREATE TABLE customers (
    id smallint(5) unsigned NOT NULL AUTO_INCREMENT,
    name varchar(50) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
    last varchar(50) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
    phone bigint(20) unsigned zerofill NOT NULL,
    address mediumtext CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
    PRIMARY KEY (id)
    ) ENGINE=InnoDB AUTO_INCREMENT=181 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
    /
    !40101 SET character_set_client = @saved_cs_client */;

    --
    -- Dumping data for table customers

    LOCK TABLES customers WRITE;
    /*!40000 ALTER TABLE customers DISABLE KEYS /;
    INSERT INTO customers VALUES (1,'xx','xx',00000000000003661195,'xx'),(3,'x','xx',00000000004294967295,'xx'),(4,'xxx','xx',00000000000002241578,'xx'),(5,'سهراب','سپهری',00000000000000000000,'xx'),(12,'نتسنیتب','نتسنیتب',00000000000000000012,'نتسنیتب'),(180,'ش','ش',00000000000000000180,'ش');
    /
    !40000 ALTER TABLE customers ENABLE KEYS */;
    UNLOCK TABLES;

    --
    -- Table structure for table factors

    DROP TABLE IF EXISTS factors;
    /*!40101 SET @saved_cs_client = @@character_set_client /;
    /
    !40101 SET character_set_client = utf8 /;
    CREATE TABLE factors (
    id smallint(5) unsigned NOT NULL,
    customer_id smallint(5) unsigned NOT NULL,
    stuff_id smallint(5) unsigned NOT NULL,
    seller_id smallint(5) unsigned NOT NULL,
    count tinyint(3) unsigned NOT NULL,
    PRIMARY KEY (id),
    KEY construct_cid (customer_id),
    KEY construct_sid (stuff_id),
    KEY construct_seid (seller_id),
    CONSTRAINT construct_cid FOREIGN KEY (customer_id) REFERENCES customers (id) ON DELETE CASCADE ON UPDATE CASCADE,
    CONSTRAINT construct_seid FOREIGN KEY (seller_id) REFERENCES sellers (id) ON DELETE CASCADE ON UPDATE CASCADE,
    CONSTRAINT construct_sid FOREIGN KEY (stuff_id) REFERENCES stuffs (id) ON DELETE CASCADE ON UPDATE CASCADE
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
    /
    !40101 SET character_set_client = @saved_cs_client */;

    --
    -- Dumping data for table factors

    LOCK TABLES factors WRITE;
    /*!40000 ALTER TABLE factors DISABLE KEYS /;
    INSERT INTO factors VALUES (1,4,6,1,5),(2,3,3,2,3),(3,3,6,3,2);
    /
    !40000 ALTER TABLE factors ENABLE KEYS */;
    UNLOCK TABLES;

    --
    -- Table structure for table sellers

    DROP TABLE IF EXISTS sellers;
    /*!40101 SET @saved_cs_client = @@character_set_client /;
    /
    !40101 SET character_set_client = utf8 /;
    CREATE TABLE sellers (
    id smallint(5) unsigned NOT NULL,
    uid varchar(20) COLLATE utf8_bin NOT NULL,
    password varchar(20) COLLATE utf8_bin NOT NULL,
    name varchar(20) COLLATE utf8_bin NOT NULL,
    last varchar(20) COLLATE utf8_bin NOT NULL,
    PRIMARY KEY (id)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
    /
    !40101 SET character_set_client = @saved_cs_client */;

    --
    -- Dumping data for table sellers

    LOCK TABLES sellers WRITE;
    /*!40000 ALTER TABLE sellers DISABLE KEYS /;
    INSERT INTO sellers VALUES (1,'admin','1370','x','x'),(2,'hesam','123','x','x'),(3,'x','123','x','x');
    /
    !40000 ALTER TABLE sellers ENABLE KEYS */;
    UNLOCK TABLES;

    --
    -- Table structure for table stuffs

    DROP TABLE IF EXISTS stuffs;
    /*!40101 SET @saved_cs_client = @@character_set_client /;
    /
    !40101 SET character_set_client = utf8 /;
    CREATE TABLE stuffs (
    id smallint(5) unsigned NOT NULL,
    weight float unsigned NOT NULL,
    fee float unsigned NOT NULL,
    p_date date NOT NULL,
    e_date date NOT NULL,
    manufactorer tinytext COLLATE utf8_bin NOT NULL,
    PRIMARY KEY (id)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
    /
    !40101 SET character_set_client = @saved_cs_client */;

    --
    -- Dumping data for table stuffs

    LOCK TABLES stuffs WRITE;
    /*!40000 ALTER TABLE stuffs DISABLE KEYS /;
    INSERT INTO stuffs VALUES (1,1,285,'2011-06-18','2012-06-18','IBM'),(2,1.35,300,'2009-01-10','2010-05-21','Microsoft'),(3,5.5,25,'2001-07-25','2011-07-25','x x x'),(6,3.5,950,'2001-07-25','2021-07-25','x x x');
    /
    !40000 ALTER TABLE stuffs ENABLE KEYS /;
    UNLOCK TABLES;
    /
    !40103 SET TIME_ZONE=@OLD_TIME_ZONE */;

    /*!40101 SET SQL_MODE=@OLD_SQL_MODE /;
    /
    !40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS /;
    /
    !40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS /;
    /
    !40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT /;
    /
    !40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS /;
    /
    !40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION /;
    /
    !40111 SET SQL_NOTES=@OLD_SQL_NOTES */;

    -- Dump completed on 2011-06-22 19:18:39
    @



  • Very funny.
    [quote]
    utf8_bin
    utf8_bin
    utf8_bin
    utf8_bin
    utf8_bin
    utf8_bin
    utf8_bin
    utf8_bin
    utf8_bin
    utf8_bin
    utf8_bin
    [/quote]
    If you remove that:
    !http://dl.dropbox.com/u/32965023/result.png(none)!

    Edit: BTW, I added the last two lines to test.

    Edit2: I adjusted the uft8_bin's in the quote to exactly match the number in your dump



  • I removed uft8_bin from table option tab in MySQL Admin. but each time I insert data using MySQL Query Browser, colliding method is added to all tables of database.

    I'm going to create re-database with mysql cli.

    Thank you very much :-)

    UPDATED:
    I do everything from beginning, and remove every colliding method. Now everything works fine:
    !http://s1.picofile.com/file/6846154924/screenshot9.png()!
    Thanks again.


Log in to reply
 

Looks like your connection to Qt Forum was lost, please wait while we try to reconnect.