Problem with Japanese encoding/display

20 Posts 4 Posters 10.6k Views
    Franzk
    wrote on 19 Aug 2011, 06:52 last edited by
    #4

    I wouldn't say it's obvious to me. I would expect Qt to handle the encoding bit from the db correctly. To be sure however, I think you could try the following:

    @QString theString = QString::fromUtf8(query.value(x).toByteArray());@

    And see if that yields the desired results.

    Of course this kind of hard coding will disqualify any possibility of changing encoding in the future (as if you would want to move away from unicode).

    "Horse sense is the thing a horse has which keeps it from betting on people." -- W.C. Fields

    http://www.catb.org/~esr/faqs/smart-questions.html

      BonRouge
      wrote on 19 Aug 2011, 06:58 last edited by
      #5

      Thanks. That doesn't seem to do it though.

      Here's what I get from that: 佐�?��??��?子

        Franzk
        wrote on 19 Aug 2011, 07:20 last edited by
        #6

        Did you output the actual unicode code points (numbers) and compare them with the expected value?

        "Horse sense is the thing a horse has which keeps it from betting on people." -- W.C. Fields

        http://www.catb.org/~esr/faqs/smart-questions.html

          BonRouge
          wrote on 19 Aug 2011, 07:34 last edited by
          #7

          Well, I've just searched around to find out how I would do what you suggest, but I'm not finding it, so...
          How would I do that?

            Franzk
            wrote on 19 Aug 2011, 07:50 last edited by
            #8

            Uhm, I would try storing some known characters in the database. Then I would read it out with the above method, using both toString() and toByteArray(). Then see what the actual data is and then try to match it to the unicode table. I'd probably put the same known characters into a QString and see what the contents are:

            @QString str = QString::fromUtf8("whatever\u03c0");@

            str = whateverπ (That's lower case pi)

            "Horse sense is the thing a horse has which keeps it from betting on people." -- W.C. Fields

            http://www.catb.org/~esr/faqs/smart-questions.html

              BonRouge
              wrote on 20 Aug 2011, 14:27 last edited by
              #9

              I've been trying all sorts of things but I haven't found an answer yet.

              The character 藤 is \u85E4 in Unicode.
              I put that character into the database through my normal HTML/PHP web page. When I look at the database contents in phpMyAdmin, it looks like this - è—¤. It also looks like that when I read it in the Qt application I'm building.

              I did this: @ QString st = snamej_t.toUtf8().toHex();@
              and got this: c3a8e28094c2a4
              I put that number into this page here - http://www.string-functions.com/hex-string.aspx - and got this - è—¤.

              I tried putting the same character (藤) into the database with my Qt interface and directly from the .cpp file. Both times, when I retrieved the data, I got something more strange - something like this - �?��.

              I was wondering again about Qt Creator and the encoding of the files. I changed the encoding of all files to UTF-8, but when I re-opened them in Qt Creator, they seemed to have changed back to 'System'. As far as I can work out, the system encoding for this Windows PC I'm using should be Unicode, because it's a Japanese OS.

              I hope you can help me find some kind of answer to this. It's driving me nuts.

              Thanks a lot.

                Franzk
                wrote on 22 Aug 2011, 10:25 last edited by
                #10

                Try that page's "Character Encoding Errors Analyzer":http://www.string-functions.com/encodingerror.aspx.

                I also think that you should look into "QTextCodec::setCodecForCStrings()":http://doc.trolltech.com/latest/qtextcodec.html#setCodecForCStrings. The results look like latin-1 versions of utf-8 encoded text.

                "Horse sense is the thing a horse has which keeps it from betting on people." -- W.C. Fields

                http://www.catb.org/~esr/faqs/smart-questions.html

                  goetz
                  wrote on 22 Aug 2011, 12:42 last edited by
                  #11

                  Maybe "this older thread":http://developer.qt.nokia.com/forums/viewthread/7048 is of help for you.

                  http://www.catb.org/~esr/faqs/smart-questions.html

                    BonRouge
                    wrote on 22 Aug 2011, 15:56 last edited by
                    #12

                    Thank you both.

                    I put that one character and the strange output into that error-analyzer page and got this:

                    Displaying 4 results
                    utf-8 (65001, Unicode (UTF-8)) -> Windows-1252 (1252, Western European (Windows))
                    utf-8 (65001, Unicode (UTF-8)) -> windows-1254 (1254, Turkish (Windows))
                    utf-8 (65001, Unicode (UTF-8)) -> windows-1256 (1256, Arabic (Windows))
                    utf-8 (65001, Unicode (UTF-8)) -> windows-1258 (1258, Vietnamese (Windows))

                    I tried this:
                    @QTextCodec::setCodecForCStrings(QTextCodec::codecForName("UTF-8"));@
                    but it didn't seem to do anything.

                    The thread Volker pointed me to seemed very promising, but... I tried removing the collation of the MySQL database through phpMyAdmin, but it wouldn't let me. When I changed it to utf8_general_ci (from utf8_unicode_ci) I was able to put the character into the database via my Qt UI and read it in phpMyAdmin, but when I looked at my web page (which uses a UTF-8 character set) I just got a question mark.

                    Thanks for any more help. Sorry if this is just getting boring now...

                      Franzk
                      wrote on 22 Aug 2011, 20:12 last edited by
                      #13

                      Could you write up an example with a database dump we can test?

                      "Horse sense is the thing a horse has which keeps it from betting on people." -- W.C. Fields

                      http://www.catb.org/~esr/faqs/smart-questions.html

                        BonRouge
                        wrote on 23 Aug 2011, 00:48 last edited by
                        #14

                        Here's a minimal case. (Is this enough?)

                        @
                        SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";

                        /*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
                        /*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
                        /*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
                        /*!40101 SET NAMES utf8 */;

                        CREATE TABLE IF NOT EXISTS students (
                        id smallint(3) NOT NULL auto_increment,
                        sname varchar(30) collate utf8_unicode_ci default NULL,
                        snamej mediumtext collate utf8_unicode_ci NOT NULL,
                        email varchar(60) collate utf8_unicode_ci default NULL,
                        email2 varchar(50) collate utf8_unicode_ci NOT NULL,
                        phone varchar(20) collate utf8_unicode_ci default NULL,
                        mobile varchar(15) collate utf8_unicode_ci default NULL,
                        dob date default NULL,
                        dobY year(4) NOT NULL default '0000',
                        dobM smallint(2) default NULL,
                        dobD smallint(2) default NULL,
                        uclass varchar(20) collate utf8_unicode_ci default NULL,
                        info longtext collate utf8_unicode_ci,
                        intro varchar(30) collate utf8_unicode_ci default NULL,
                        lessons decimal(2,1) NOT NULL,
                        freect smallint(2) NOT NULL,
                        level mediumtext collate utf8_unicode_ci NOT NULL,
                        type varchar(20) collate utf8_unicode_ci default NULL,
                        uctype varchar(20) collate utf8_unicode_ci default NULL,
                        old tinytext collate utf8_unicode_ci NOT NULL,
                        ssdiscount tinytext collate utf8_unicode_ci,
                        paidforby mediumtext collate utf8_unicode_ci,
                        paidforby_id int(11) NOT NULL,
                        paysfor mediumtext collate utf8_unicode_ci NOT NULL,
                        paysfor_id int(11) NOT NULL,
                        intschool tinytext collate utf8_unicode_ci,
                        booked varchar(1) collate utf8_unicode_ci NOT NULL,
                        startdate varchar(10) collate utf8_unicode_ci NOT NULL,
                        notcomenotes longtext collate utf8_unicode_ci NOT NULL,
                        paysfor2 varchar(30) collate utf8_unicode_ci NOT NULL,
                        pass varchar(8) collate utf8_unicode_ci NOT NULL,
                        onapack tinytext collate utf8_unicode_ci NOT NULL,
                        joint tinytext collate utf8_unicode_ci NOT NULL,
                        e1onlist tinyint(1) NOT NULL,
                        e2onlist tinyint(1) NOT NULL,
                        address varchar(200) collate utf8_unicode_ci NOT NULL,
                        PRIMARY KEY (id)
                        ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1428 ;

                        INSERT INTO students (id, sname, snamej, email, email2, phone, mobile, dob, dobY, dobM, dobD, uclass, info, intro, lessons, freect, level, type, uctype, old, ssdiscount, paidforby, paidforby_id, paysfor, paysfor_id, intschool, booked, startdate, notcomenotes, paysfor2, pass, onapack, joint, e1onlist, e2onlist, address) VALUES
                        (1007, 'Noriko Sato', '佐藤 紀子', 'noriko@phonecompany.jp', '', '022-333-9999', '090-2222-0000', '1971-12-10', 1971, 12, 10, '', '', '', '0.0', 0, '', '', 'Korean 50', '', '', '', 0, '', 0, '', 'y', '1226732681', '', '', '26ndjokmh4', '', '', 1, 0, '');

                        @

                        The one character I keep referring to is here: è—¤ (in the 'snamej' field).

                          BonRouge
                          wrote on 29 Aug 2011, 05:12 last edited by
                          #15

                          Sorry, but... bump.

                          I'm getting nowhere with this encoding issue and still hoping for help.

                          Thanks a lot.

                            BonRouge
                            wrote on 10 Sept 2011, 13:45 last edited by
                            #16

                            Hi. I'm not really sure what to do anymore but bump this again. Any ideas on this?

                              vsorokin
                              wrote on 10 Sept 2011, 14:06 last edited by
                              #17

                              Can you run the SHOW VARIABLES; command on your MySQL server and show the output?

                              --
                              Vasiliy

                                BonRouge
                                wrote on 10 Sept 2011, 14:21 last edited by
                                #18

                                Thanks for the response. Here's what I got:

                                Variable_name Value
                                auto_increment_increment 1
                                auto_increment_offset 1
                                automatic_sp_privileges ON
                                back_log 50
                                basedir /
                                binlog_cache_size 32768
                                bulk_insert_buffer_size 8388608
                                character_set_client utf8
                                character_set_connection utf8
                                character_set_database latin1
                                character_set_filesystem binary
                                character_set_results utf8
                                character_set_server latin1
                                character_set_system utf8
                                character_sets_dir /usr/share/mysql/charsets/
                                collation_connection utf8_general_ci
                                collation_database latin1_swedish_ci
                                collation_server latin1_swedish_ci
                                completion_type 0
                                concurrent_insert 1
                                connect_timeout 10
                                datadir /var/lib/mysql/
                                date_format %Y-%m-%d
                                datetime_format %Y-%m-%d %H:%i:%s
                                default_week_format 0
                                delay_key_write ON
                                delayed_insert_limit 100
                                delayed_insert_timeout 300
                                delayed_queue_size 1000
                                div_precision_increment 4
                                keep_files_on_create OFF
                                engine_condition_pushdown OFF
                                expire_logs_days 0
                                flush OFF
                                flush_time 0
                                ft_boolean_syntax + -><()~*:""&|
                                ft_max_word_len 84
                                ft_min_word_len 4
                                ft_query_expansion_limit 20
                                ft_stopword_file (built-in)
                                group_concat_max_len 1024
                                have_archive YES
                                have_bdb NO
                                have_blackhole_engine YES
                                have_compress YES
                                have_community_features NO
                                have_profiling NO
                                have_crypt YES
                                have_csv YES
                                have_dynamic_loading YES
                                have_example_engine YES
                                have_federated_engine YES
                                have_geometry YES
                                have_innodb YES
                                have_isam NO
                                have_merge_engine YES
                                have_ndbcluster NO
                                have_openssl NO
                                have_ssl NO
                                have_query_cache YES
                                have_raid NO
                                have_rtree_keys YES
                                have_symlink YES
                                hostname biz107.inmotionhosting.com
                                init_connect
                                init_file
                                init_slave
                                innodb_additional_mem_pool_size 1048576
                                innodb_autoextend_increment 8
                                innodb_buffer_pool_awe_mem_mb 0
                                innodb_buffer_pool_size 134217728
                                innodb_checksums ON
                                innodb_commit_concurrency 0
                                innodb_concurrency_tickets 500
                                innodb_data_file_path ibdata1:10M:autoextend
                                innodb_data_home_dir
                                innodb_adaptive_hash_index ON
                                innodb_doublewrite ON
                                innodb_fast_shutdown 1
                                innodb_file_io_threads 4
                                innodb_file_per_table OFF
                                innodb_flush_log_at_trx_commit 1
                                innodb_flush_method
                                innodb_force_recovery 0
                                innodb_lock_wait_timeout 50
                                innodb_locks_unsafe_for_binlog OFF
                                innodb_log_arch_dir
                                innodb_log_archive OFF
                                innodb_log_buffer_size 1048576
                                innodb_log_file_size 5242880
                                innodb_log_files_in_group 2
                                innodb_log_group_home_dir ./
                                innodb_max_dirty_pages_pct 90
                                innodb_max_purge_lag 0
                                innodb_mirrored_log_groups 1
                                innodb_open_files 300
                                innodb_rollback_on_timeout OFF
                                innodb_support_xa ON
                                innodb_sync_spin_loops 20
                                innodb_table_locks ON
                                innodb_thread_concurrency 8
                                innodb_thread_sleep_delay 10000
                                innodb_use_legacy_cardinality_algorithm ON
                                interactive_timeout 30
                                join_buffer_size 131072
                                key_buffer_size 805306368
                                key_cache_age_threshold 300
                                key_cache_block_size 1024
                                key_cache_division_limit 100
                                language /usr/share/mysql/english/
                                large_files_support ON
                                large_page_size 0
                                large_pages OFF
                                lc_time_names en_US
                                license GPL
                                local_infile ON
                                locked_in_memory OFF
                                log ON
                                log_bin OFF
                                log_bin_trust_function_creators OFF
                                log_error
                                log_queries_not_using_indexes OFF
                                log_slave_updates OFF
                                log_slow_queries ON
                                log_warnings 1
                                long_query_time 3
                                low_priority_updates OFF
                                lower_case_file_system OFF
                                lower_case_table_names 0
                                max_allowed_packet 5242880
                                max_binlog_cache_size 18446744073709547520
                                max_binlog_size 1073741824
                                max_connect_errors 10
                                max_connections 500
                                max_delayed_threads 20
                                max_error_count 64
                                max_heap_table_size 16777216
                                max_insert_delayed_threads 20
                                max_join_size 18446744073709551615
                                max_length_for_sort_data 1024
                                max_prepared_stmt_count 16382
                                max_relay_log_size 0
                                max_seeks_for_key 18446744073709551615
                                max_sort_length 1024
                                max_sp_recursion_depth 0
                                max_tmp_tables 32
                                max_user_connections 30
                                max_write_lock_count 18446744073709551615
                                multi_range_count 256
                                myisam_data_pointer_size 6
                                myisam_max_sort_file_size 9223372036853727232
                                myisam_mmap_size 18446744073709551615
                                myisam_recover_options OFF
                                myisam_repair_threads 1
                                myisam_sort_buffer_size 8388608
                                myisam_stats_method nulls_unequal
                                net_buffer_length 16384
                                net_read_timeout 30
                                net_retry_count 10
                                net_write_timeout 60
                                new OFF
                                old_passwords OFF
                                open_files_limit 8702
                                optimizer_prune_level 1
                                optimizer_search_depth 62
                                pid_file /var/lib/mysql/biz107.inmotionhosting.com.pid
                                plugin_dir
                                port 3306
                                preload_buffer_size 32768
                                protocol_version 10
                                query_alloc_block_size 8192
                                query_cache_limit 1048576
                                query_cache_min_res_unit 4096
                                query_cache_size 536870912
                                query_cache_type ON
                                query_cache_wlock_invalidate OFF
                                query_prealloc_size 8192
                                range_alloc_block_size 4096
                                read_buffer_size 268435456
                                read_only OFF
                                read_rnd_buffer_size 16777216
                                relay_log
                                relay_log_index
                                relay_log_info_file relay-log.info
                                relay_log_purge ON
                                relay_log_space_limit 0
                                rpl_recovery_rank 0
                                secure_auth OFF
                                secure_file_priv
                                server_id 0
                                skip_external_locking ON
                                skip_networking OFF
                                skip_show_database OFF
                                slave_compressed_protocol OFF
                                slave_load_tmpdir /tmp/
                                slave_net_timeout 3600
                                slave_skip_errors OFF
                                slave_transaction_retries 10
                                slow_launch_time 2
                                socket /var/lib/mysql/mysql.sock
                                sort_buffer_size 268435456
                                sql_big_selects ON
                                sql_mode
                                sql_notes ON
                                sql_warnings OFF
                                ssl_ca
                                ssl_capath
                                ssl_cert
                                ssl_cipher
                                ssl_key
                                storage_engine MyISAM
                                sync_binlog 0
                                sync_frm ON
                                system_time_zone PDT
                                table_cache 4096
                                table_lock_wait_timeout 50
                                table_type MyISAM
                                thread_cache_size 384
                                thread_stack 262144
                                time_format %H:%i:%s
                                time_zone SYSTEM
                                timed_mutexes OFF
                                tmp_table_size 33554432
                                tmpdir /tmp/
                                transaction_alloc_block_size 8192
                                transaction_prealloc_size 4096
                                tx_isolation REPEATABLE-READ
                                updatable_views_with_limit YES
                                version 5.0.92-community-log
                                version_comment MySQL Community Edition (GPL)
                                version_compile_machine x86_64
                                version_compile_os unknown-linux-gnu
                                wait_timeout 30

                                  vsorokin
                                  wrote on 10 Sept 2011, 14:40 last edited by
                                  #19

                                  The server settings look fine; they match my MySQL server (I'm using UTF-8 for Russian characters).

                                  Well...
                                  Just as a test, try executing this query in your program right after connecting:

                                  bq. SET CHARACTER SET utf8;

                                  and then run the program.

                                  --
                                  Vasiliy

                                    BonRouge
                                    wrote on 13 Sept 2011, 14:34 last edited by
                                    #20

                                    OK. I tried that, but it didn't help.

                                    Here's the frustrating situation now.

                                    I want to use the same database with two different interfaces.

                                    Right now, the web interface is in UTF-8. I can input, retrieve and display Japanese characters this way, but they seem to be stored as nonsense-looking characters.

                                    The Qt interface that I'm working on can input, retrieve and display Japanese characters, and they seem to be stored as Japanese characters too, rather than as nonsense-style stuff.

                                    Unfortunately, if I use the Qt interface to retrieve data that was inputted with the web interface, it displays as nonsense. If I use the web interface to retrieve data that was stored with the Qt interface, it displays as question marks.

                                    The Qt data at least looks correct in the database (viewed with phpMyAdmin). I thought maybe if I could convert all the data to be like that, it would be good, but firstly, I don't know how to do that, and secondly, as I said, when I try to display that data in a browser, it comes out as question marks, so...

                                    Any ideas??? [frustrated or confused smilie goes here]
