Problem with Japanese encoding/display

    Franzk
    wrote on last edited by
    #6

    Did you output the actual Unicode code points (numbers) and compare them with the expected values?

    "Horse sense is the thing a horse has which keeps it from betting on people." -- W.C. Fields

    http://www.catb.org/~esr/faqs/smart-questions.html

      BonRouge
      wrote on last edited by
      #7

      Well, I've just searched around to find out how I would do what you suggest, but I'm not finding it, so...
      How would I do that?

        Franzk
        wrote on last edited by
        #8

        Uhm, I would try storing some known characters in the database. Then I would read them out with the above method, using both toString() and toByteArray(), see what the actual data is, and try to match it against the Unicode table. I'd probably also put the same known characters into a QString and see what the contents are:

        @QString str = QString::fromUtf8("whatever\u03c0");@

        str = whateverπ (That's lower case pi)

        "Horse sense is the thing a horse has which keeps it from betting on people." -- W.C. Fields

        http://www.catb.org/~esr/faqs/smart-questions.html
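One way to do the comparison described above without relying on how a terminal or widget renders the text is to decode the UTF-8 bytes by hand and print each code point. The following is a minimal sketch using only the C++ standard library. The helper names `nextCodePoint`, `decodeUtf8` and `formatCodePoints` are hypothetical (they are not Qt API), and the decoder assumes mostly well-formed input, reporting anything else as U+FFFD. In Qt itself, iterating over the QString and printing QChar::unicode() for each element should give the same information.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Read one code point from UTF-8 text starting at index i and advance i.
// Sketch only: malformed sequences are reported as U+FFFD.
std::uint32_t nextCodePoint(const std::string& s, std::size_t& i) {
    assert(i < s.size());
    const unsigned char lead = static_cast<unsigned char>(s[i++]);
    std::uint32_t cp = lead;
    int extra = 0;
    if (lead < 0x80)              { extra = 0; }
    else if ((lead >> 5) == 0x06) { cp = lead & 0x1F; extra = 1; }
    else if ((lead >> 4) == 0x0E) { cp = lead & 0x0F; extra = 2; }
    else if ((lead >> 3) == 0x1E) { cp = lead & 0x07; extra = 3; }
    else return 0xFFFD;  // stray continuation byte or invalid lead byte
    for (int k = 0; k < extra; ++k) {
        if (i >= s.size() || (static_cast<unsigned char>(s[i]) & 0xC0) != 0x80)
            return 0xFFFD;  // truncated or broken sequence
        cp = (cp << 6) | (static_cast<unsigned char>(s[i++]) & 0x3F);
    }
    return cp;
}

// Decode a whole UTF-8 byte string into a list of code points.
std::vector<std::uint32_t> decodeUtf8(const std::string& s) {
    std::vector<std::uint32_t> out;
    for (std::size_t i = 0; i < s.size();) out.push_back(nextCodePoint(s, i));
    return out;
}

// Render the code points as "U+XXXX U+XXXX ..." for comparison with a table.
std::string formatCodePoints(const std::string& utf8) {
    std::string out;
    for (std::uint32_t cp : decodeUtf8(utf8)) {
        char buf[16];
        std::snprintf(buf, sizeof buf, "U+%04X", static_cast<unsigned>(cp));
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}
```

Calling `formatCodePoints("\xcf\x80")` (the UTF-8 bytes of lower case pi) returns "U+03C0". If the value read back from the database decodes to something other than the code points that were stored, the text was altered somewhere between the application and the table.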

          BonRouge
          wrote on last edited by
          #9

          I've been trying all sorts of things but I haven't found an answer yet.

          This character, 藤, is U+85E4 in Unicode (\u85E4).
          I put that character into the database through my normal HTML/PHP web page. When I look at the database stuff in PHPMyAdmin, it looks like this - è—¤. It also looks like that when I call it in the Qt thing I'm building.

          I did this: @ QString st = snamej_t.toUtf8().toHex();@
          and got this: c3a8e28094c2a4
          I put that number into this page here - http://www.string-functions.com/hex-string.aspx - and got this - è—¤.

          I tried putting the same character (藤) into the database with my Qt interface and directly from the .cpp file. Both times, when I retrieved the data, I got something more strange - something like this - �?��.

          I was wondering again about Qt Creator and the encoding of the files. I changed the encoding of all files to UTF-8, but when I re-opened them in Qt Creator, they seemed to have changed back to 'System'. As far as I can work out, the system encoding for this Windows PC I'm using should be Unicode, because it's a Japanese OS.

          I hope you can help me find some kind of answer to this. It's driving me nuts.

          Thanks a lot.
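The hex string reported in this post is consistent with double encoding: the UTF-8 bytes of 藤 are e8 97 a4, and if each of those bytes is misread as a Windows-1252 character and then encoded as UTF-8 again, the result is c3 a8 e2 80 94 c2 a4. The sketch below reproduces that arithmetic with the C++ standard library only. The helper names are hypothetical, and mapping the five unassigned Windows-1252 bytes to the same-valued code points is an assumption about how the conversion was done.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Windows-1252 code points for bytes 0x80..0x9F (0 marks an unassigned byte).
// Bytes 0x00..0x7F and 0xA0..0xFF map to the code point with the same value.
static const std::uint16_t kCp1252High[32] = {
    0x20AC, 0,      0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0,      0x017D, 0,
    0,      0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0,      0x017E, 0x0178
};

// Append one code point below U+10000 (all this table can produce) as UTF-8.
void appendUtf8(std::string& out, std::uint32_t cp) {
    assert(cp < 0x10000);
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Misread every byte as a Windows-1252 character and re-encode it as UTF-8.
std::string misreadAsCp1252(const std::string& utf8Bytes) {
    std::string out;
    for (unsigned char b : utf8Bytes) {
        std::uint32_t cp = b;  // assumption: unassigned bytes keep their value
        if (b >= 0x80 && b <= 0x9F && kCp1252High[b - 0x80] != 0)
            cp = kCp1252High[b - 0x80];
        appendUtf8(out, cp);
    }
    return out;
}

// Lower-case hex dump, comparable to QByteArray::toHex() output.
std::string toHex(const std::string& bytes) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (unsigned char b : bytes) {
        out += digits[b >> 4];
        out += digits[b & 0x0F];
    }
    return out;
}
```

With these helpers, `toHex(misreadAsCp1252("\xe8\x97\xa4"))` yields "c3a8e28094c2a4", the same value as the toHex() output quoted above. That suggests the stored value was already double-encoded before the Qt program read it, although the thread does not confirm where the extra conversion happens.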

            Franzk
            wrote on last edited by
            #10

            Try that page's "Character Encoding Errors Analyzer":http://www.string-functions.com/encodingerror.aspx.

            I also think that you should look into "QTextCodec::setCodecForCStrings()":http://doc.trolltech.com/latest/qtextcodec.html#setCodecForCStrings. The results look like UTF-8 encoded text that has been interpreted as Latin-1.

            "Horse sense is the thing a horse has which keeps it from betting on people." -- W.C. Fields

            http://www.catb.org/~esr/faqs/smart-questions.html

              goetz
              wrote on last edited by
              #11

              Maybe "this older thread":http://developer.qt.nokia.com/forums/viewthread/7048 is of help for you.

              http://www.catb.org/~esr/faqs/smart-questions.html

                BonRouge
                wrote on last edited by
                #12

                Thank you both.

                I put that one character and the strange output into that error-analyzer page and got this:

                Displaying 4 results
                utf-8 (65001, Unicode (UTF-8)) -> Windows-1252 (1252, Western European (Windows))
                utf-8 (65001, Unicode (UTF-8)) -> windows-1254 (1254, Turkish (Windows))
                utf-8 (65001, Unicode (UTF-8)) -> windows-1256 (1256, Arabic (Windows))
                utf-8 (65001, Unicode (UTF-8)) -> windows-1258 (1258, Vietnamese (Windows))

                I tried this:
                @QTextCodec::setCodecForCStrings(QTextCodec::codecForName("UTF-8"));@
                but it didn't seem to do anything.

                The thread Volker pointed me to seemed very promising, but... I tried removing the collation of the MySQL database through phpMyAdmin, but it wouldn't let me. When I changed it to utf8_general_ci (from utf8_unicode_ci), I was able to put the character into the database via my Qt UI and read it in phpMyAdmin, but when I looked at my web page (which uses a UTF-8 character set) I just got a question mark.

                Thanks for any more help. Sorry if this is just getting boring now...

                  Franzk
                  wrote on last edited by
                  #13

                  Could you write up an example with a database dump we can test?

                  "Horse sense is the thing a horse has which keeps it from betting on people." -- W.C. Fields

                  http://www.catb.org/~esr/faqs/smart-questions.html

                    BonRouge
                    wrote on last edited by
                    #14

                    Here's a minimal case. (Is this enough?)

                    @
                    SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";

                    /*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
                    /*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
                    /*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
                    /*!40101 SET NAMES utf8 */;

                    CREATE TABLE IF NOT EXISTS students (
                    id smallint(3) NOT NULL auto_increment,
                    sname varchar(30) collate utf8_unicode_ci default NULL,
                    snamej mediumtext collate utf8_unicode_ci NOT NULL,
                    email varchar(60) collate utf8_unicode_ci default NULL,
                    email2 varchar(50) collate utf8_unicode_ci NOT NULL,
                    phone varchar(20) collate utf8_unicode_ci default NULL,
                    mobile varchar(15) collate utf8_unicode_ci default NULL,
                    dob date default NULL,
                    dobY year(4) NOT NULL default '0000',
                    dobM smallint(2) default NULL,
                    dobD smallint(2) default NULL,
                    uclass varchar(20) collate utf8_unicode_ci default NULL,
                    info longtext collate utf8_unicode_ci,
                    intro varchar(30) collate utf8_unicode_ci default NULL,
                    lessons decimal(2,1) NOT NULL,
                    freect smallint(2) NOT NULL,
                    level mediumtext collate utf8_unicode_ci NOT NULL,
                    type varchar(20) collate utf8_unicode_ci default NULL,
                    uctype varchar(20) collate utf8_unicode_ci default NULL,
                    old tinytext collate utf8_unicode_ci NOT NULL,
                    ssdiscount tinytext collate utf8_unicode_ci,
                    paidforby mediumtext collate utf8_unicode_ci,
                    paidforby_id int(11) NOT NULL,
                    paysfor mediumtext collate utf8_unicode_ci NOT NULL,
                    paysfor_id int(11) NOT NULL,
                    intschool tinytext collate utf8_unicode_ci,
                    booked varchar(1) collate utf8_unicode_ci NOT NULL,
                    startdate varchar(10) collate utf8_unicode_ci NOT NULL,
                    notcomenotes longtext collate utf8_unicode_ci NOT NULL,
                    paysfor2 varchar(30) collate utf8_unicode_ci NOT NULL,
                    pass varchar(8) collate utf8_unicode_ci NOT NULL,
                    onapack tinytext collate utf8_unicode_ci NOT NULL,
                    joint tinytext collate utf8_unicode_ci NOT NULL,
                    e1onlist tinyint(1) NOT NULL,
                    e2onlist tinyint(1) NOT NULL,
                    address varchar(200) collate utf8_unicode_ci NOT NULL,
                    PRIMARY KEY (id)
                    ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1428 ;

                    INSERT INTO students (id, sname, snamej, email, email2, phone, mobile, dob, dobY, dobM, dobD, uclass, info, intro, lessons, freect, level, type, uctype, old, ssdiscount, paidforby, paidforby_id, paysfor, paysfor_id, intschool, booked, startdate, notcomenotes, paysfor2, pass, onapack, joint, e1onlist, e2onlist, address) VALUES
                    (1007, 'Noriko Sato', '佐藤 紀子', 'noriko@phonecompany.jp', '', '022-333-9999', '090-2222-0000', '1971-12-10', 1971, 12, 10, '', '', '', '0.0', 0, '', '', 'Korean 50', '', '', '', 0, '', 0, '', 'y', '1226732681', '', '', '26ndjokmh4', '', '', 1, 0, '');

                    @

                    The one character I keep referring to is here: è—¤ (in the 'snamej' field).

                      BonRouge
                      wrote on last edited by
                      #15

                      Sorry, but... bump.

                      I'm getting nowhere with this encoding issue and still hoping for help.

                      Thanks a lot.

                        BonRouge
                        wrote on last edited by
                        #16

                        Hi. I'm not really sure what to do anymore but bump this again. Any ideas on this?

                          vsorokin
                          wrote on last edited by
                          #17

                          Can you run the SHOW VARIABLES; command on your MySQL server and show the output?

                          --
                          Vasiliy

                            BonRouge
                            wrote on last edited by
                            #18

                            Thanks for the response. Here's what I got:

                            Variable_name Value
                            auto_increment_increment 1
                            auto_increment_offset 1
                            automatic_sp_privileges ON
                            back_log 50
                            basedir /
                            binlog_cache_size 32768
                            bulk_insert_buffer_size 8388608
                            character_set_client utf8
                            character_set_connection utf8
                            character_set_database latin1
                            character_set_filesystem binary
                            character_set_results utf8
                            character_set_server latin1
                            character_set_system utf8
                            character_sets_dir /usr/share/mysql/charsets/
                            collation_connection utf8_general_ci
                            collation_database latin1_swedish_ci
                            collation_server latin1_swedish_ci
                            completion_type 0
                            concurrent_insert 1
                            connect_timeout 10
                            datadir /var/lib/mysql/
                            date_format %Y-%m-%d
                            datetime_format %Y-%m-%d %H:%i:%s
                            default_week_format 0
                            delay_key_write ON
                            delayed_insert_limit 100
                            delayed_insert_timeout 300
                            delayed_queue_size 1000
                            div_precision_increment 4
                            keep_files_on_create OFF
                            engine_condition_pushdown OFF
                            expire_logs_days 0
                            flush OFF
                            flush_time 0
                            ft_boolean_syntax + -><()~*:""&|
                            ft_max_word_len 84
                            ft_min_word_len 4
                            ft_query_expansion_limit 20
                            ft_stopword_file (built-in)
                            group_concat_max_len 1024
                            have_archive YES
                            have_bdb NO
                            have_blackhole_engine YES
                            have_compress YES
                            have_community_features NO
                            have_profiling NO
                            have_crypt YES
                            have_csv YES
                            have_dynamic_loading YES
                            have_example_engine YES
                            have_federated_engine YES
                            have_geometry YES
                            have_innodb YES
                            have_isam NO
                            have_merge_engine YES
                            have_ndbcluster NO
                            have_openssl NO
                            have_ssl NO
                            have_query_cache YES
                            have_raid NO
                            have_rtree_keys YES
                            have_symlink YES
                            hostname biz107.inmotionhosting.com
                            init_connect
                            init_file
                            init_slave
                            innodb_additional_mem_pool_size 1048576
                            innodb_autoextend_increment 8
                            innodb_buffer_pool_awe_mem_mb 0
                            innodb_buffer_pool_size 134217728
                            innodb_checksums ON
                            innodb_commit_concurrency 0
                            innodb_concurrency_tickets 500
                            innodb_data_file_path ibdata1:10M:autoextend
                            innodb_data_home_dir
                            innodb_adaptive_hash_index ON
                            innodb_doublewrite ON
                            innodb_fast_shutdown 1
                            innodb_file_io_threads 4
                            innodb_file_per_table OFF
                            innodb_flush_log_at_trx_commit 1
                            innodb_flush_method
                            innodb_force_recovery 0
                            innodb_lock_wait_timeout 50
                            innodb_locks_unsafe_for_binlog OFF
                            innodb_log_arch_dir
                            innodb_log_archive OFF
                            innodb_log_buffer_size 1048576
                            innodb_log_file_size 5242880
                            innodb_log_files_in_group 2
                            innodb_log_group_home_dir ./
                            innodb_max_dirty_pages_pct 90
                            innodb_max_purge_lag 0
                            innodb_mirrored_log_groups 1
                            innodb_open_files 300
                            innodb_rollback_on_timeout OFF
                            innodb_support_xa ON
                            innodb_sync_spin_loops 20
                            innodb_table_locks ON
                            innodb_thread_concurrency 8
                            innodb_thread_sleep_delay 10000
                            innodb_use_legacy_cardinality_algorithm ON
                            interactive_timeout 30
                            join_buffer_size 131072
                            key_buffer_size 805306368
                            key_cache_age_threshold 300
                            key_cache_block_size 1024
                            key_cache_division_limit 100
                            language /usr/share/mysql/english/
                            large_files_support ON
                            large_page_size 0
                            large_pages OFF
                            lc_time_names en_US
                            license GPL
                            local_infile ON
                            locked_in_memory OFF
                            log ON
                            log_bin OFF
                            log_bin_trust_function_creators OFF
                            log_error
                            log_queries_not_using_indexes OFF
                            log_slave_updates OFF
                            log_slow_queries ON
                            log_warnings 1
                            long_query_time 3
                            low_priority_updates OFF
                            lower_case_file_system OFF
                            lower_case_table_names 0
                            max_allowed_packet 5242880
                            max_binlog_cache_size 18446744073709547520
                            max_binlog_size 1073741824
                            max_connect_errors 10
                            max_connections 500
                            max_delayed_threads 20
                            max_error_count 64
                            max_heap_table_size 16777216
                            max_insert_delayed_threads 20
                            max_join_size 18446744073709551615
                            max_length_for_sort_data 1024
                            max_prepared_stmt_count 16382
                            max_relay_log_size 0
                            max_seeks_for_key 18446744073709551615
                            max_sort_length 1024
                            max_sp_recursion_depth 0
                            max_tmp_tables 32
                            max_user_connections 30
                            max_write_lock_count 18446744073709551615
                            multi_range_count 256
                            myisam_data_pointer_size 6
                            myisam_max_sort_file_size 9223372036853727232
                            myisam_mmap_size 18446744073709551615
                            myisam_recover_options OFF
                            myisam_repair_threads 1
                            myisam_sort_buffer_size 8388608
                            myisam_stats_method nulls_unequal
                            net_buffer_length 16384
                            net_read_timeout 30
                            net_retry_count 10
                            net_write_timeout 60
                            new OFF
                            old_passwords OFF
                            open_files_limit 8702
                            optimizer_prune_level 1
                            optimizer_search_depth 62
                            pid_file /var/lib/mysql/biz107.inmotionhosting.com.pid
                            plugin_dir
                            port 3306
                            preload_buffer_size 32768
                            protocol_version 10
                            query_alloc_block_size 8192
                            query_cache_limit 1048576
                            query_cache_min_res_unit 4096
                            query_cache_size 536870912
                            query_cache_type ON
                            query_cache_wlock_invalidate OFF
                            query_prealloc_size 8192
                            range_alloc_block_size 4096
                            read_buffer_size 268435456
                            read_only OFF
                            read_rnd_buffer_size 16777216
                            relay_log
                            relay_log_index
                            relay_log_info_file relay-log.info
                            relay_log_purge ON
                            relay_log_space_limit 0
                            rpl_recovery_rank 0
                            secure_auth OFF
                            secure_file_priv
                            server_id 0
                            skip_external_locking ON
                            skip_networking OFF
                            skip_show_database OFF
                            slave_compressed_protocol OFF
                            slave_load_tmpdir /tmp/
                            slave_net_timeout 3600
                            slave_skip_errors OFF
                            slave_transaction_retries 10
                            slow_launch_time 2
                            socket /var/lib/mysql/mysql.sock
                            sort_buffer_size 268435456
                            sql_big_selects ON
                            sql_mode
                            sql_notes ON
                            sql_warnings OFF
                            ssl_ca
                            ssl_capath
                            ssl_cert
                            ssl_cipher
                            ssl_key
                            storage_engine MyISAM
                            sync_binlog 0
                            sync_frm ON
                            system_time_zone PDT
                            table_cache 4096
                            table_lock_wait_timeout 50
                            table_type MyISAM
                            thread_cache_size 384
                            thread_stack 262144
                            time_format %H:%i:%s
                            time_zone SYSTEM
                            timed_mutexes OFF
                            tmp_table_size 33554432
                            tmpdir /tmp/
                            transaction_alloc_block_size 8192
                            transaction_prealloc_size 4096
                            tx_isolation REPEATABLE-READ
                            updatable_views_with_limit YES
                            version 5.0.92-community-log
                            version_comment MySQL Community Edition (GPL)
                            version_compile_machine x86_64
                            version_compile_os unknown-linux-gnu
                            wait_timeout 30

                              vsorokin
                              wrote on last edited by
                              #19

                              The server settings seem fine; they are the same as on my MySQL server (I'm using UTF-8 for Russian characters).

                              Well...
                              Just as a test, try executing this query in your program right after connecting:

                              bq. SET CHARACTER SET utf8;

                              and then run the program.

                              --
                              Vasiliy

                                BonRouge
                                wrote on last edited by
                                #20

                                OK. I tried that, but it didn't help.

                                Here's the frustrating situation now.

                                I want to use the same database with two different interfaces.

                                Right now, the web interface is in UTF-8. I can input, retrieve and display Japanese characters this way, but they seem to be stored as nonsense-looking characters.

                                The Qt interface that I'm working on can input, retrieve and display Japanese characters, and they seem to be stored as Japanese characters too, rather than as nonsense-style stuff.

                                Unfortunately, if I use the Qt interface to retrieve data that was inputted with the web interface, it displays as nonsense. If I use the web interface to retrieve data that was stored with the Qt interface, it displays as question marks.

                                The Qt data at least looks correct in the database (viewed with phpMyAdmin). I thought maybe if I could convert all the data to be like that, it would be good, but firstly, I don't know how to do that, and secondly, as I said, when I try to display that data in a browser, it comes out as question marks, so...

                                Any ideas??? [frustrated or confused smilie goes here]
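If the rows written by the web interface are double-encoded (UTF-8 bytes misread as Windows-1252 and then encoded as UTF-8 again, which is what the hex value c3a8e28094c2a4 reported earlier in the thread suggests), the conversion can in principle be undone value by value. The following is a minimal sketch of that idea with the C++ standard library only, not a confirmed fix for this thread. The helper names are hypothetical, and the handling of the five unassigned Windows-1252 bytes (kept as same-valued code points) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Windows-1252 code points for bytes 0x80..0x9F (0 marks an unassigned byte).
static const std::uint16_t kCp1252High[32] = {
    0x20AC, 0,      0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0,      0x017D, 0,
    0,      0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0,      0x017E, 0x0178
};

// Read one code point from UTF-8 text at index i and advance i.
// Sketch only: malformed sequences are reported as U+FFFD.
std::uint32_t nextCodePoint(const std::string& s, std::size_t& i) {
    assert(i < s.size());
    const unsigned char lead = static_cast<unsigned char>(s[i++]);
    std::uint32_t cp = lead;
    int extra = 0;
    if (lead < 0x80)              { extra = 0; }
    else if ((lead >> 5) == 0x06) { cp = lead & 0x1F; extra = 1; }
    else if ((lead >> 4) == 0x0E) { cp = lead & 0x0F; extra = 2; }
    else if ((lead >> 3) == 0x1E) { cp = lead & 0x07; extra = 3; }
    else return 0xFFFD;
    for (int k = 0; k < extra; ++k) {
        if (i >= s.size() || (static_cast<unsigned char>(s[i]) & 0xC0) != 0x80)
            return 0xFFFD;
        cp = (cp << 6) | (static_cast<unsigned char>(s[i++]) & 0x3F);
    }
    return cp;
}

// Map a code point back to its Windows-1252 byte, or -1 if it has none.
int cp1252ByteFor(std::uint32_t cp) {
    if (cp < 0x80 || (cp >= 0xA0 && cp <= 0xFF)) return static_cast<int>(cp);
    if (cp >= 0x80 && cp <= 0x9F)  // only valid where the byte is unassigned
        return kCp1252High[cp - 0x80] == 0 ? static_cast<int>(cp) : -1;
    for (int i = 0; i < 32; ++i)
        if (kCp1252High[i] == cp) return 0x80 + i;
    return -1;
}

// Undo one round of "UTF-8 misread as Windows-1252". Returns false, leaving
// `repaired` untouched, if any character has no Windows-1252 byte.
bool undoCp1252Misread(const std::string& mojibake, std::string& repaired) {
    std::string out;
    for (std::size_t i = 0; i < mojibake.size();) {
        const int b = cp1252ByteFor(nextCodePoint(mojibake, i));
        if (b < 0) return false;
        out += static_cast<char>(b);
    }
    repaired = out;
    return true;
}
```

Applied to the bytes c3 a8 e2 80 94 c2 a4, the function returns true and produces e8 97 a4, which is 藤 in UTF-8. Text that is already stored correctly as Japanese contains characters with no Windows-1252 byte, so the function returns false and the value can be left alone. Before writing anything back, it would be prudent to back up the table and to check that each repaired value is itself valid UTF-8, because correctly stored Western European text (for example an accented Latin letter) would also pass through this function and come out damaged.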
