Qt MS Access umlauts

    mki2 wrote (#1):

    Hi

    I have an MS Access database with two tables, test and Test_utf8 (see images below).

    The data in the column example of table test was entered directly in MS Access. The data in Test_utf8 was imported from a text file with utf-8 encoding.

    b972ecab-66c5-4c2f-8fe6-6a30bf3c5e6f-grafik.png

    40b64097-40fd-4dd7-b64a-e959a7f19095-grafik.png

    The code is as follows:

    QSqlDatabase db = QSqlDatabase::addDatabase("QODBC3");
    QString connString = "DBQ=umlaute.mdb;Driver={Microsoft Access Driver (*.mdb,*.accdb)};FIL={MS Access};";
    QSqlQuery query;
    QString sql_statement = "SELECT example FROM test;"; // or "SELECT example_utf8 FROM Test_utf8"

    db.setDatabaseName(connString);
    if (db.open()) {
        QFile file("output.txt");
        if (file.open(QFile::WriteOnly)) {
            QTextStream out(&file);
            out.setCodec("windows-1252"); // or "UTF-8" if Test_utf8
            query.prepare(sql_statement);
            query.exec();
            while (query.next()) {
                qDebug() << query.value(0).toByteArray();
                // raw bytes | decoded as UTF-8 | decoded as Latin-1 | hex dump
                out << query.value(0).toByteArray() << " | " << QString::fromUtf8(query.value(0).toByteArray()) << " | ";
                out << QString::fromLatin1(query.value(0).toByteArray()) << " | ";
                out << query.value(0).toByteArray().toHex(' ') << Qt::endl;
            }
        }
    }
    else {
        qDebug() << "could not open database";
        qDebug() << db.lastError();
    }

    db.close();
    

    When I try to query the data, I get different results depending on which table I am querying.

    Example for the word "hellö":
    When querying table test I get "hell\xEF\xBF\xBD" (78b959f4-7ec6-42ca-bf93-33c5592c95ff-grafik.png), which ends in a replacement character; my guess is that this indicates an encoding problem.

    With table Test_utf8 I get "hell\xC3\xB6" (396d9811-e0b7-48f1-9fee-29d9047ad58f-grafik.png), which corresponds to an "ö".

    I am, however, at a loss as to how to get the "ö" when querying table test. Any help, suggestions or ideas would be much appreciated.

    Cheers

    PS: I am using Qt 5.15.0 MinGW 64-bit and the 64-bit Microsoft Access Driver (*.mdb, *.accdb); the database was created with 64-bit MS Access.

        ChrisW67 wrote (#2):

        @mki2 said in Qt MS Access umlauts:

        When querying table test I get: "hell\xEF\xBF\xBD" which is a replacement character and my guess is it indicates an encoding problem.

        Those three bytes are the UTF-8 encoding of the Unicode code point U+FFFD (REPLACEMENT CHARACTER), which is substituted for an incoming character whose value is unknown or unrepresentable in Unicode. My guess is that the data in the column is encoded in Windows-1252, that is, a bare 0xF6 byte represents the 'ö' character.

        A single 0xF6 byte does, indeed, make no sense if you try to interpret it as UTF-8.
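To make that concrete, here is a minimal sketch in plain standard C++ (no Qt) of how a lenient UTF-8 decoder can turn a lone 0xF6 byte into U+FFFD, which then re-encodes as EF BF BD. The helper names `encodeUtf8`, `decodeUtf8Lenient` and `reencode` are made up for this illustration, and it is only an assumption that the ODBC driver or Qt performs the substitution in exactly this way.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Encode one Unicode code point as UTF-8.
// Sketch only: surrogate code points are not rejected.
std::string encodeUtf8(char32_t cp)
{
    assert(cp <= 0x10FFFF);
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Decode bytes as UTF-8, emitting U+FFFD for each byte that cannot start
// or complete a sequence. Simplified: some overlong forms are accepted.
std::u32string decodeUtf8Lenient(const std::string &bytes)
{
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;  // continuation bytes expected after the lead byte
        char32_t cp = 0;
        bool ok = true;
        if (b < 0x80)                    { cp = b; }
        else if (b >= 0xC2 && b <= 0xDF) { cp = b & 0x1F; extra = 1; }
        else if (b >= 0xE0 && b <= 0xEF) { cp = b & 0x0F; extra = 2; }
        else if (b >= 0xF0 && b <= 0xF4) { cp = b & 0x07; extra = 3; }
        else                             { ok = false; }  // e.g. a bare 0xF6

        if (ok && i + extra >= bytes.size())
            ok = false;  // the sequence is cut off by the end of the input
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80)
                ok = false;  // not a continuation byte
            else
                cp = (cp << 6) | (c & 0x3F);
        }
        if (ok && cp > 0x10FFFF)
            ok = false;  // beyond the Unicode range

        if (ok) {
            out += cp;
            i += extra + 1;
        } else {
            out += char32_t(0xFFFD);  // substitute and resynchronise
            i += 1;
        }
    }
    return out;
}

// Decode leniently, then encode again as UTF-8.
std::string reencode(const std::string &bytes)
{
    std::string out;
    for (const char32_t cp : decodeUtf8Lenient(bytes))
        out += encodeUtf8(cp);
    return out;
}
```

With these helpers, `reencode("hell\xF6")` yields the bytes 68 65 6c 6c ef bf bd, while `reencode("hell\xC3\xB6")` returns its input unchanged. Once the substitution has happened, the original 0xF6 cannot be recovered from the result.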

        With table Test_utf8 I get: "hell\xC3\xB6" which corresponds to an "ö"

        This is the UTF-8 encoding of the Unicode code point U+00F6 (LATIN SMALL LETTER O WITH DIAERESIS).

        In your screenshot from Access of the UTF-8 column you see these two bytes as "Ã" and "¶", as that is what they represent in the Windows-1252 code page.
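The single-byte reading can be sketched the same way. The hypothetical helper `latin1ToUtf8` below (plain standard C++, not part of Qt) treats every byte as one Latin-1 character, which is what `QString::fromLatin1` does, and re-encodes the result as UTF-8. Latin-1 and Windows-1252 differ only in the range 0x80 to 0x9F, so for the bytes discussed here they agree.

```cpp
#include <cassert>
#include <string>

// Interpret every byte as one Latin-1 character (code point == byte value)
// and return the text re-encoded as UTF-8.
std::string latin1ToUtf8(const std::string &bytes)
{
    std::string out;
    for (const char ch : bytes) {
        const unsigned char b = static_cast<unsigned char>(ch);
        if (b < 0x80) {
            out += static_cast<char>(b);  // ASCII is unchanged
        } else {
            // Code points 0x80..0xFF always need two bytes in UTF-8.
            out += static_cast<char>(0xC0 | (b >> 6));
            out += static_cast<char>(0x80 | (b & 0x3F));
        }
    }
    return out;
}

// The two cases from this thread.
void checkLatin1Examples()
{
    // A bare 0xF6 read as Latin-1 is U+00F6 ("ö"), i.e. C3 B6 in UTF-8.
    assert(latin1ToUtf8("\xF6") == "\xC3\xB6");
    // UTF-8 "ö" (C3 B6) read as Latin-1 is U+00C3 U+00B6, shown as "Ã¶".
    assert(latin1ToUtf8("\xC3\xB6") == "\xC3\x83\xC2\xB6");
}
```

The second check is the effect visible in the Access screenshot: UTF-8 bytes displayed through a single-byte code page appear as two characters.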

          hskoglund wrote (#3):

          To add to @ChrisW67: you could try converting from UTF-8, say like this:

          ...
          query.exec();
          while (query.next()) {
              qDebug() << QString::fromUtf8(query.value(0).toByteArray());
          }
          ...
          
            mki2 wrote (#4):

            Thanks for the replies!
            So it is an encoding issue.
            I tried @hskoglund's suggestion, but it doesn't change the output. I still get: 2813417c-792c-4896-a0da-4f8d3a13e964-grafik.png

            Adding the following to the while loop:

            qDebug() << "output variant: " << query.value(0);

            already yields: 05105fa2-727f-4be1-b859-c32f2bb15650-grafik.png

            So would I need to specify the encoding before the query is executed? How could I do this?
            As far as I am aware, I can't specify an encoding (e.g. charset="Windows-1252") in the connection string for ODBC and MS Access.

            The actual MS Access database I will have to work with will unfortunately be like table test and not like table Test_utf8. I just wanted to see what result I get when I am sure the data in the field is UTF-8 encoded.

              hskoglund wrote (#5):

              Hi, you could also try converting from Windows-1252 (fromLatin1 gives the same result for characters such as 'ö'):
              qDebug() << QString::fromLatin1(query.value(0).toByteArray());

                mki2 wrote (#6):

                @hskoglund I tried that, but it doesn't seem to solve the problem. I modified my code above to write the output to a text file. Here is what I get when querying table test:

                546faeee-7f6a-44ae-9d38-2d6ab924d6a9-grafik.png

                And this is what I get when I query table Test_utf8:

                ab564963-906d-4865-aa29-2933f993e8c2-grafik.png

                I think the output for Test_utf8 is as expected.
                However, for table test, is the "information" "lost" when it is read from the database?

                  hskoglund wrote (#7):

                  If you try a hex dump, say like:
                  qDebug() << query.value(0).toByteArray().toHex(' ');
                  can you see the Windows-1252 characters (i.e. the ones bigger than 0x7f)?
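For readers without Qt at hand, the check suggested here can be approximated in plain standard C++. The helpers `toHexSpaced` and `hasNonAscii` are hypothetical names for this sketch; the output format is meant to resemble what `QByteArray::toHex(' ')` prints (lowercase hex pairs separated by spaces), not to reproduce Qt's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Format bytes as lowercase hex pairs separated by single spaces.
std::string toHexSpaced(const std::string &bytes)
{
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (i != 0)
            out += ' ';
        out += digits[b >> 4];    // high nibble
        out += digits[b & 0x0F];  // low nibble
    }
    return out;
}

// True if any byte is above 0x7f, i.e. outside plain ASCII.
bool hasNonAscii(const std::string &bytes)
{
    for (const char ch : bytes) {
        if (static_cast<unsigned char>(ch) > 0x7F)
            return true;
    }
    return false;
}

// What the dump looks like for the three byte sequences in this thread.
void checkHexExamples()
{
    assert(toHexSpaced("hell\xF6") == "68 65 6c 6c f6");              // Windows-1252 "ö"
    assert(toHexSpaced("hell\xC3\xB6") == "68 65 6c 6c c3 b6");       // UTF-8 "ö"
    assert(toHexSpaced("hell\xEF\xBF\xBD") == "68 65 6c 6c ef bf bd"); // U+FFFD
}
```

A trailing `f6` would mean the Windows-1252 byte survived the trip through the driver. A trailing `ef bf bd` means it had already been replaced before the application saw it.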

                    mki2 wrote (#8):

                    @hskoglund

                    This is what I get:

                    e1d184e1-a6af-4ee6-9cfb-0f66ee2c8eeb-grafik.png

                    a769092f-56ad-4f9b-8d00-a2d317b139ae-grafik.png

                      hskoglund wrote (#9):

                      Aha, now I recognize that � (ef bf bd). I remember, for example, this post, which had more or less the same problem. I used a workaround (bypassing Qt's ODBC driver and talking directly to ODBC). I have the code somewhere, but I ditched GitHub when Microsoft bought it, so the link doesn't work.

                      Here's some stuff you could try:
                      Try using ODBC2 instead of 3:
                      QSqlDatabase::addDatabase("QODBC");

                      Try changing the Unicode settings in your Windows PC
                      https://stackoverflow.com/a/69268839/233402

                        mki2 wrote (#10):

                        @hskoglund Using ODBC2 did not make a difference. Technically, switching the Unicode settings on my PC worked, but a whole lot of problems with other software arose, so I switched it back.

                        I did find this in the documentation (https://doc.qt.io/qt-6/qsqlquery.html):
                        d52de0e0-14c5-45e7-8490-6087d665f701-grafik.png

                        So I moved the creation of the QSqlQuery to after if (db.open()), as shown below:

                        QSqlDatabase db = QSqlDatabase::addDatabase("QODBC3");
                        QString connString = "DBQ=umlaute.mdb;Driver={Microsoft Access Driver (*.mdb,*.accdb)};FIL={MS Access};";
                        QString sql_statement = "SELECT example FROM test;"; // or "SELECT example_utf8 FROM Test_utf8"

                        db.setDatabaseName(connString);
                        if (db.open()) {
                            QSqlQuery query; // moved here: created only after the connection is open
                            QFile file("output.txt");
                            if (file.open(QFile::WriteOnly)) {
                                QTextStream out(&file);
                                out.setCodec("windows-1252"); // or "UTF-8" if Test_utf8
                                query.prepare(sql_statement);
                                query.exec();
                                while (query.next()) {
                                    qDebug() << query.value(0).toByteArray();
                                    // raw bytes | decoded as UTF-8 | decoded as Latin-1 | hex dump
                                    out << query.value(0).toByteArray() << " | " << QString::fromUtf8(query.value(0).toByteArray()) << " | ";
                                    out << QString::fromLatin1(query.value(0).toByteArray()) << " | ";
                                    out << query.value(0).toByteArray().toHex(' ') << Qt::endl;
                                }
                            }
                        }
                        else {
                            qDebug() << "could not open database";
                            qDebug() << db.lastError();
                        }

                        db.close();
                        

                        Now I get 4a5368f4-f9fc-4684-b4a6-15f8e3dd9d4a-grafik.png when querying table test, which is correct.

                        Thank you for your time and help!
