How can I convert UTF-8 encoding to normal text?

  • Hi,

    I want to display the original text from UTF-8 encoded data. How can I do it? I have tried several methods without success.
    Is there a way to do this?

    QFile uni("bengalifile.txt");
    uni.open(QIODevice::ReadOnly | QIODevice::Text); // open the file before reading from it
    QTextStream out(&uni);
    QString str = out.readAll();
    QByteArray utf8 = str.toUtf8(); // keep the bytes alive while buff points into them
    const char *buff = utf8.constData();
    qDebug()<<"BUFF DATA IS"<<buff;
    QString s = QString::fromUtf8(buff,strlen(buff));
    qDebug()<<"STR DATA IS"<<str;
    qDebug()<<"S DATA IS"<<s;
    QTextCodec *codec = QTextCodec::codecForName("UTF-8");
    QByteArray decode = codec->fromUnicode(str);
    qDebug()<<"The Decoded String is"<<decode;
    qDebug()<<"The Decoded String EDIT IS"<<decode;

    I only get the UTF-8 encoded output.

    But if I put the encoded characters directly into QString::fromUtf8("encoding"), I am able to see the original text:

    QString s = QString::fromUtf8("Encoding");
    O/P:- S DATA IS "Bengali Text"

    Please Guide me,

    Thanks in advance,

  • Moderators

    @Rohith said:

    I want to display the original text from UTF-8 encoded data. How can I do it? I have tried several methods without success.

    What exactly does that mean? What do you expect as the original text?

    Also regarding "you only get UTF-8 text", the Qt docs for QTextCodec::fromUnicode(QString) say:

    Converts str from Unicode to the encoding of this codec, and returns the result in a QByteArray.

    So when you create a text codec for UTF-8 and call fromUnicode(), you get UTF-8 bytes back.

  • Try making a GUI application and output your text to a text edit; it will display properly.

    QByteArray string = "হ্যালো  ওয়ার্ল্ড ";
    QTextCodec *codec = QTextCodec::codecForName("UTF-8");
    QString encodedString = codec->toUnicode(string);
    ui->textEdit->setText(encodedString);

  • @raven-worx

    "I want to display the original text from UTF-8 encoded data. How can I do it? I have tried several methods without success."

    This means: I receive data from the server, and that data is UTF-8 encoded. I store the received data in a file. Assume that Bengali text was encoded as UTF-8 by the server and sent to me. Whenever the end user opts to see the data, I want to show him the Bengali text, using the UTF-8 encoded data present in the file.

    Server -> sends UTF-8 encoding of some Bengali text -> the UTF-8 encoded data is stored in a file -> user wants to see the data in the file -> it should be converted from UTF-8 to Bengali and shown to the user.

    I hope you understand now.

    Thanks in advance,

  • Moderators

    Unfortunately I do not understand yet.
    Unicode should contain the Bengali characters, no? So it should be displayed correctly?!
    Where do you not get your desired output displayed? Is it only about the qDebug output?

  • @raven-worx

    Let me try one more time

    I am converting Bengali letters into **UTF-8 encoding** and storing the **converted UTF-8 encoding of the Bengali letters in a file**. I want to read the UTF-8 encoded text from the file, convert it back into Bengali letters, and then display that Bengali text.

  • Moderators

    @Rohith said:

    convert encoding i.e Utf-8 encoded data into bengali letters

    The question is: what encoding do you expect for the "Bengali letters"?!
    The Bengali characters can already be stored using UTF-8 encoding!

    You still didn't tell me where you have problems displaying it...

  • @raven-worx
    Here I am trying to convert Bengali Unicode to Bengali text and print that.
    The Bengali Unicode is present in a file, and I am trying to read that Unicode from the file and convert it to the original Bengali text.

    This is Original Bengali Text :- "সবাইকে শুভ সকাল"
    The unicode reading from file :- \u09b8\u09ac\u09be\u0987\u0995\u09c7 \u09b6\u09c1\u09ad \u09b8\u0995\u09be\u09b2

    O/P to be shown after reading unicode from file :- সবাইকে শুভ সকাল

    The problem occurs whenever we read the Unicode from the file dynamically and try to convert it to the original text.

    When we pass the Unicode text statically, it works fine.

  • Moderators

    @Rohith said:

    The unicode reading from file :- \u09b8\u09ac\u09be\u0987\u0995\u09c7 \u09b6\u09c1\u09ad \u09b8\u0995\u09be\u09b2

    You mean this is the literal content of the file, as a text representation?
    Can you please upload an example file?

  • @raven-worx

    Yes, that is what I mean.

    \u09b8\u09ac\u09be\u0987\u0995\u09c7 \u09b6\u09c1\u09ad \u09b8\u0995\u09be\u09b2

    Save this content into a .txt file, then try to read the content from the file and present it as normal text.

    I can't figure out how to upload a file in this forum.

  • Moderators

    Finally we understand each other, since this is an important piece of information.
    You then have to "parse" the text data into the desired Unicode representation.

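    QString str = ...; // read from the file
    int idx = -1;
    // Replace each "\uXXXX" escape with the character it stands for.
    while ( ( idx = str.indexOf("\\u") ) != -1 ) {
          bool ok = false;
          ushort uc = str.mid(idx+2, 4).toUShort(&ok, 16); // parse the 4 hex digits
          if ( !ok )
                break; // malformed escape: stop rather than insert a wrong character
          str.replace(idx, 6, QChar(uc));
    }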

    The reason it worked "statically" for you is that the compiler already interprets \uXXXX escapes in string literals during compilation. Text read from a file at run time is not interpreted this way.
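
For readers without Qt at hand, the same parsing idea can be sketched in standard C++. This is only an illustration, not the code used in the thread: the function names `decodeUnicodeEscapes`, `appendUtf8` and `hexValue` are made up for this example, the output is UTF-8 in a `std::string`, and only four-digit escapes in the Basic Multilingual Plane are handled (surrogate pairs are not combined).

```cpp
#include <cstddef>
#include <string>

// Append the UTF-8 encoding of a BMP code point (U+0000..U+FFFF) to out.
// Surrogates are not combined here; that is a simplification of this sketch.
static void appendUtf8(std::string &out, unsigned cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Return the value of a hex digit, or -1 if c is not one.
static int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Replace every well-formed "\uXXXX" in input with the UTF-8 bytes of that
// code point; everything else is copied through unchanged.
std::string decodeUnicodeEscapes(const std::string &input)
{
    std::string out;
    std::size_t i = 0;
    while (i < input.size()) {
        // An escape needs 6 characters: backslash, 'u', and 4 hex digits.
        if (input[i] == '\\' && i + 5 < input.size() && input[i + 1] == 'u') {
            unsigned cp = 0;
            bool ok = true;
            for (std::size_t k = 0; k < 4; ++k) {
                int v = hexValue(input[i + 2 + k]);
                if (v < 0) { ok = false; break; }
                cp = cp * 16 + static_cast<unsigned>(v);
            }
            if (ok) {
                appendUtf8(out, cp);
                i += 6;
                continue;
            }
        }
        out += input[i++];
    }
    return out;
}
```

Calling `decodeUnicodeEscapes` on the file content shown earlier in the thread should give back the Bengali text as UTF-8 bytes, while malformed escapes are left in the output untouched.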

  • @raven-worx

    Thanks it worked....!

  • Moderators

    Just to clarify:
    Transferring the information this way is rather wasteful.
    You create 6 ASCII characters for each Unicode character, while a Bengali character takes only 3 bytes in UTF-8. This means (roughly, not exactly) a factor of 2. Additionally you have to do the parsing.

    If possible I would change it so that the Unicode text is transferred as "raw" Unicode in binary format.
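
To put rough numbers on that overhead, here is a small worked calculation in standard C++. The helper names `escapedSize` and `utf8Size` are hypothetical, and the sketch assumes every escaped code point lies in U+0800..U+FFFF (which holds for the Bengali block), so each one takes 3 bytes in UTF-8. The sample string from this thread has 13 Bengali code points and 2 spaces.

```cpp
#include <cstddef>

// Size in bytes of a text sent as "\uXXXX" escapes: 6 ASCII characters per
// escaped code point plus 1 byte per character left as plain ASCII.
std::size_t escapedSize(std::size_t escapedCodePoints, std::size_t plainAscii)
{
    return escapedCodePoints * 6 + plainAscii;
}

// Size in bytes of the same text as raw UTF-8, assuming every escaped code
// point lies in U+0800..U+FFFF (true for the Bengali block), i.e. 3 bytes each.
std::size_t utf8Size(std::size_t threeByteCodePoints, std::size_t plainAscii)
{
    return threeByteCodePoints * 3 + plainAscii;
}
```

For the sample string, `escapedSize(13, 2)` gives 80 bytes and `utf8Size(13, 2)` gives 41 bytes, so the escaped form is about twice as large before counting the parsing work.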

  • @raven-worx

    If possible, please show me how to do that too!


  • Moderators

    Instead of converting the Unicode string to escaped characters, send it directly in binary form. When you already have a QString, you can call QString::toUtf8() and send the returned QByteArray directly. On the client it's enough to call QString::fromUtf8( receivedUtf8ByteArray ).

    It depends on how you implemented the transfer.
    But in theory it should be enough to replace your Unicode escaping code on the server with this approach.
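
As a Qt-free illustration of what the receiving side does with raw UTF-8 bytes, the sketch below turns a byte string into code points in standard C++. It is not Qt's implementation: `decodeUtf8` is a hypothetical name, and the function assumes well-formed input, whereas a real decoder such as QString::fromUtf8() also has to deal with invalid sequences.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Decode UTF-8 bytes into code points. Assumes well-formed input; a real
// decoder must also reject truncated, overlong, and otherwise invalid sequences.
std::vector<char32_t> decodeUtf8(const std::string &bytes)
{
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t len = 1;   // total bytes in this sequence
        char32_t cp = lead;    // payload bits collected so far
        if (lead >= 0xF0)      { len = 4; cp = lead & 0x07; }
        else if (lead >= 0xE0) { len = 3; cp = lead & 0x0F; }
        else if (lead >= 0xC0) { len = 2; cp = lead & 0x1F; }
        // Each continuation byte contributes its low 6 bits.
        for (std::size_t k = 1; k < len && i + k < bytes.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i + k]) & 0x3F);
        out.push_back(cp);
        i += len;
    }
    return out;
}
```

The point of the comparison is that the bytes carry the characters directly: there is no text-level escape syntax to search for and replace, only the fixed bit layout of UTF-8.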

