How can I convert Unicode string into Shift-JIS?

    Zain
    wrote on 7 May 2013, 10:00 last edited by
    #1

    Hello all,

    I have a QString in Unicode that comes from the page source of a Japanese website.
    I need to convert that Unicode string into Japanese characters.
    I tried the code below, but it crashes my program.

    @
    QString htmlString = "Amazonベーシック ハイスピードHDMIケーブル 2.0m (タイプAオス- タイプAオス、イーサネット、3D、オーディオリターン対応)";

    QTextCodec *codec = QTextCodec::codecForName("Shift-JIS");
    QByteArray encodedString = codec->fromUnicode(htmlString);

    QString htmlString1 = encodedString.data();
    @

    Please help me find out what I am doing wrong, or suggest another way to achieve this.

    Thanks in advance.

      SGaist
      Lifetime Qt Champion
      wrote on 7 May 2013, 10:58 last edited by
      #2

      Hi,

      You don't check whether codec is null. Are you sure you have a codec that supports "Shift-JIS"?

      Interested in AI ? www.idiap.ch
      Please read the Qt Code of Conduct - https://forum.qt.io/topic/113070/qt-code-of-conduct

        Zain
        wrote on 7 May 2013, 11:25 last edited by
        #3

        Thanks, SGaist, for the quick reply.

        Yes, I need to convert into Shift-JIS format.
        No content type is set in the head tag of the main page source. The page comes from a Japanese website that supports Shift-JIS, so it will work for me.

          raven-worx
          Moderators
          wrote on 7 May 2013, 11:50 last edited by
          #4

          What SGaist meant was whether the codec is available on your system.
          QTextCodec::codecForName() returns a null pointer if it can't find the codec, which leads to a crash because you dereference the pointer on the very next line.

          --- SUPPORT REQUESTS VIA CHAT WILL BE IGNORED ---
          If you have a question please use the forum so others can benefit from the solution in the future

            SGaist
            Lifetime Qt Champion
            wrote on 7 May 2013, 11:55 last edited by
            #5

            I understood you right. My question is: are you sure the text codec is available? In my list of text codecs I have Shift_JIS (I don't think the - or _ is really relevant; I tried with both and didn't get a crash).

            Take a look at the output of "QTextCodec::availableCodecs()":http://qt-project.org/doc/qt-4.8/qtextcodec.html#availableCodecs

            Also, try running the application in a debugger to see what happens.

              Zain
              wrote on 7 May 2013, 13:05 last edited by
              #6

              Here is the output of @ QTextCodec::availableCodecs(); @:

              ("UTF-8", "ISO-8859-1", "latin1", "CP819", "IBM819", "iso-ir-100", "csISOLatin1", "ISO-8859-15", "latin9", "UTF-32LE", "UTF-32BE", "UTF-32", "UTF-16LE", "UTF-16BE", "UTF-16", "System", "roman8", "hp-roman8", "csHPRoman8", "TIS-620", "ISO 8859-11", "WINSAMI2", "WS2", "Apple Roman", "macintosh", "MacRoman", "windows-1258", "CP1258", "windows-1257", "CP1257", "windows-1256", "CP1256", "windows-1255", "CP1255", "windows-1254", "CP1254", "windows-1253", "CP1253", "windows-1252", "CP1252", "windows-1251", "CP1251", "windows-1250", "CP1250", "IBM866", "CP866", "csIBM866", "IBM874", "CP874", "IBM850", "CP850", "csPC850Multilingual", "ISO-8859-16", "iso-ir-226", "latin10", "ISO-8859-14", "iso-ir-199", "latin8", "iso-celtic", "ISO-8859-13", "ISO-8859-10", "iso-ir-157", "latin6", "ISO-8859-10:1992", "csISOLatin6", "ISO-8859-9", "iso-ir-148", "latin5", "csISOLatin5", "ISO-8859-8", "ISO 8859-8-I", "iso-ir-138", "hebrew", "csISOLatinHebrew", "ISO-8859-7", "ECMA-118", "greek", "iso-ir-126", "csISOLatinGreek", "ISO-8859-6", "ISO-8859-6-I", "ECMA-114", "ASMO-708", "arabic", "iso-ir-127", "csISOLatinArabic", "ISO-8859-5", "cyrillic", "iso-ir-144", "csISOLatinCyrillic", "ISO-8859-4", "latin4", "iso-ir-110", "csISOLatin4", "ISO-8859-3", "latin3", "iso-ir-109", "csISOLatin3", "ISO-8859-2", "latin2", "iso-ir-101", "csISOLatin2", "KOI8-U", "KOI8-RU", "KOI8-R", "csKOI8R", "Iscii-Mlm", "Iscii-Knd", "Iscii-Tlg", "Iscii-Tml", "Iscii-Ori", "Iscii-Gjr", "Iscii-Pnj", "Iscii-Bng", "Iscii-Dev", "TSCII", "GB18030", "GBK", "GB2312", "CP936", "MS936", "windows-936", "EUC-JP", "ISO-2022-JP", "Shift_JIS", "JIS7", "SJIS", "MS_Kanji", "EUC-KR", "cp949", "Big5", "Big5-HKSCS", "Big5-ETen", "CP950")

              The crash is resolved.
              But when I append my final string to the text box as follows:

              @ui->txtPageSource->appendPlainText(htmlString1);@

              I get the same string back, not converted into Japanese.

              The expected result is:

              Amazonベーシック ハイスピードHDMIケーブル 2.0m (タイプAオス- タイプAオス、イーサネット、3D、オーディオリターン対応)

              I noticed that my string contains HTML numeric character codes, which might need to be converted to Unicode.
              I am confused here.

                SGaist
                Lifetime Qt Champion
                wrote on 7 May 2013, 13:20 last edited by
                #7

                Wait, before going any further... Why not use:

                @ ui->txtPageSource->appendHtml(htmlString);@

                ?

                No need to do any conversion

                  tobias.hunger
                  wrote on 7 May 2013, 13:49 last edited by
                  #8

                  htmlString contains no Japanese characters at all from what I see. There is just plain Latin1 text with HTML escape sequences that should be ignored by QString.
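
To make that concrete: a reference such as `&#12505;` sits in the string as eight ASCII characters, not as the character ベ, and those characters (`&`, `#`, digits, `;`) have the same byte values in Shift_JIS as in ASCII, so the codec has nothing to convert. Below is a minimal sketch in plain C++ without Qt that checks whether a byte string is 7-bit ASCII only. The helper name `isPureAscii` is made up for illustration and is not part of any library.

```cpp
#include <string>

// Hypothetical helper: returns true if every byte of s is 7-bit ASCII,
// i.e. the string holds no encoded non-Latin characters at all.
bool isPureAscii(const std::string &s)
{
    for (unsigned char c : s) {
        if (c >= 0x80)
            return false;   // a byte outside ASCII: real encoded text
    }
    return true;
}
```

A string holding only escape sequences passes this check, while the UTF-8 bytes of an actual Japanese character do not.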

                    Zain
                    wrote on 7 May 2013, 13:55 last edited by
                    #9

                    Hey SGaist, thanks, now it's working for the text box.
                    But I need to show the same result in a QTableWidgetItem.

                    I am trying, but I still get the same old string format.

                      Zain
                      wrote on 8 May 2013, 06:40 last edited by
                      #10

                      Thanks, Tobias Hunger, for the reply.

                      Can you please suggest how I can convert this Latin1 text to the corresponding Unicode or HTML format, so that I get the right Japanese characters when adding the string to a QTableWidgetItem?

                        Nagata
                        wrote on 9 May 2013, 13:21 last edited by
                        #11

                        Below is sample code that converts an HTML-format string to a QString.

                        @QString htmlString="Amazonベーシック ...";
                        ui->textEdit->setHtml(htmlString);

                        // Replace each decimal reference "&#xxxxx;" with the character it names.
                        QString str;
                        QRegExp rx("&#(\\d+);"); // the backslash must be doubled inside a C++ string literal
                        int pos1 = 0, pos2 = 0;
                        while ((pos2 = rx.indexIn(htmlString, pos2)) != -1) {
                            str.append(htmlString.mid(pos1, pos2 - pos1)); // text before the match
                            str.append(QChar(rx.cap(1).toInt())); // "&#xxxxx;" -> QChar(xxxxx)
                            pos2 += rx.matchedLength();
                            pos1 = pos2;
                        }
                        str.append(htmlString.mid(pos1)); // remaining tail after the last match
                        ui->textEdit_2->setText(str);
                        @
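
For reference, here is a rough Qt-free sketch of the same idea in standard C++, assuming UTF-8 output is acceptable. The function names `appendUtf8`, `digitValue` and `decodeNumericRefs` are made up for illustration. Unlike the loop above, it also accepts hexadecimal references (`&#x30D9;`) and code points beyond U+FFFF, and it leaves anything that is not a well-formed reference untouched.

```cpp
#include <cstddef>
#include <string>

// Append the UTF-8 encoding of code point cp (assumed <= 0x10FFFF) to out.
void appendUtf8(std::string &out, unsigned long cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Value of digit c in the given base (10 or 16), or -1 if c is not a digit.
int digitValue(char c, int base)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (base == 16 && c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (base == 16 && c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Replace "&#NNNN;" and "&#xHHHH;" with UTF-8; copy everything else as-is.
std::string decodeNumericRefs(const std::string &in)
{
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in.compare(i, 2, "&#") == 0) {
            std::size_t j = i + 2;
            int base = 10;
            if (j < in.size() && (in[j] == 'x' || in[j] == 'X')) {
                base = 16;
                ++j;
            }
            unsigned long cp = 0;
            std::size_t digits = 0;
            // At most 8 digits are read, so cp cannot overflow.
            while (j < in.size() && digits < 8) {
                int d = digitValue(in[j], base);
                if (d < 0)
                    break;
                cp = cp * base + static_cast<unsigned long>(d);
                ++j;
                ++digits;
            }
            bool valid = digits > 0 && j < in.size() && in[j] == ';'
                         && cp != 0 && cp <= 0x10FFFF
                         && !(cp >= 0xD800 && cp <= 0xDFFF); // no surrogates
            if (valid) {
                appendUtf8(out, cp);
                i = j + 1;      // continue after the ';'
                continue;
            }
        }
        out += in[i++];         // not a reference: copy the byte unchanged
    }
    return out;
}
```

In a Qt program the result could then be turned back into a QString with `QString::fromUtf8(result.c_str())` before passing it to `QTableWidgetItem::setText()`.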

                        Hope it helps.
