Encoding and locale problem
-
Hello everyone.
I have a problem reading .docx and .txt files. Sounds foolish? I think so too, but it's a real problem in my case, and I have not found anything specific on Google.
The main difficulty is that I can't read .docx files at all. Is there a solution?
The secondary problem is encoding. When I read .txt files in English, everything works perfectly. But Russian text shows up in the wrong encoding, even when I set the default encoding to CP-1251. Any ideas?
Many thanks in advance.
Here is the code:
@compare::compare()
{
    exampleOldVersion = new QDir("C:/Users/House15/Desktop/exampleOldVersion");
    exampleNewVersions = new QDir("C:/Users/House15/Desktop/exampleNewVersions");
    exampleOfResults = new QDir("C:/Users/House15/Desktop/exampleOfResult");
    ListOfOldFiles = exampleOldVersion->entryList(QDir::Files, QDir::NoSort);
    ListOfNewFiles = exampleNewVersions->entryList(QDir::Files, QDir::NoSort);
    stringInOldFile = new char[80];
    stringInNewFile = new char[80];
    oldVersion = new QFile(exampleOldVersion->absolutePath() + "/" + ListOfOldFiles.first());
    QFile *newVersionOfOldFile;
    for (int i = 0; i < ListOfNewFiles.count(); i++)
    {
        newVersionOfOldFile = new QFile(exampleNewVersions->absolutePath() + "/" + ListOfNewFiles.at(i));
        if (newVersionOfOldFile)
            newVersions.append(newVersionOfOldFile);
        else
        {
            ListOfNewFiles.removeAt(i);
            i--;
        }
        newVersionOfOldFile = NULL;
    }
    oldVersion->open(QIODevice::ReadWrite);
    QTextStream in(&(*oldVersion));
    QString testString = in.device()->readAll();
    in.device()->seek(0);
    in >> stringInOldFile;
    qDebug() << testString + " \n\n " + stringInOldFile;
    oldVersion->close();
}@
-
A .docx file is a compressed (ZIP) archive of several XML files. You may want to extract it first and then read it file by file.
In my experience, qDebug() supports only ANSI characters. Try printing the text in a text edit widget (the QTextEdit class).
You can also add
@CODECFORSRC = UTF-8@
in your project file.
Note: char also supports only ANSI characters. Consider replacing the char buffers with QString.
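To illustrate what "decoding CP-1251" actually means, here is a minimal sketch in plain standard C++ (no Qt), under stated assumptions. The function names `appendUtf8` and `decodeCp1251` are made up for this example, and only ASCII, the letters А..я and Ё/ё are mapped; every other byte becomes '?'. In a Qt 4 program you would more likely leave this to the library, for example by calling `QTextStream::setCodec("Windows-1251")` on the stream before reading into a QString.

```cpp
#include <string>

// Append one Unicode code point below U+0800 to a UTF-8 string.
static void appendUtf8(std::string &out, unsigned int cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else {
        // The two-byte form covers U+0080..U+07FF, which includes Cyrillic.
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: decode Windows-1251 bytes into UTF-8.
// Only ASCII, the letters 0xC0..0xFF (U+0410..U+044F) and the two
// bytes for Ё/ё are handled; anything else is replaced by '?'.
std::string decodeCp1251(const std::string &bytes)
{
    std::string out;
    for (unsigned char b : bytes) {
        if (b < 0x80)
            appendUtf8(out, b);
        else if (b >= 0xC0)
            appendUtf8(out, 0x0410 + (b - 0xC0));
        else if (b == 0xA8)
            appendUtf8(out, 0x0401); // Ё
        else if (b == 0xB8)
            appendUtf8(out, 0x0451); // ё
        else
            appendUtf8(out, '?');
    }
    return out;
}
```

The point of the sketch is that the bytes in the file only turn into Russian letters once something knows they are CP-1251. Reading them into a plain char buffer skips that step.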
-
[quote author="Jake007" date="1330792854"]A .docx file is a compressed (ZIP) archive of several XML files. You may want to extract it first and then read it file by file.
In my experience, qDebug() supports only ANSI characters. Try printing the text in a text edit widget (the QTextEdit class).
You can also add
@CODECFORSRC = UTF-8@
in your project file.
Note: char also supports only ANSI characters. Consider replacing the char buffers with QString.[/quote]
Do you mean that .docx documents need the QXml classes? Is extracting text from XML a standard procedure, or do I need to write a parser for this purpose?
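For reference, the visible text of a .docx lives in `word/document.xml` inside the archive, in `<w:t>` elements. The sketch below is only a rough illustration in plain standard C++, assuming the archive has already been unpacked and the XML loaded into a string. The name `extractDocxText` is hypothetical, and this is a simple string scan rather than a real XML parser: it ignores paragraph breaks and leaves entities such as `&amp;` undecoded. A proper reader such as Qt's QXmlStreamReader would be the more robust choice.

```cpp
#include <string>

// Hypothetical helper: concatenate the text of every <w:t> element found
// in the contents of word/document.xml. Not a real XML parser.
std::string extractDocxText(const std::string &xml)
{
    std::string text;
    std::string::size_type pos = 0;
    while ((pos = xml.find("<w:t", pos)) != std::string::npos) {
        const std::string::size_type nameEnd = pos + 4;
        if (nameEnd >= xml.size())
            break;
        // Accept <w:t> and <w:t attr="...">, but skip <w:tab/>, <w:tbl>, etc.
        const char c = xml[nameEnd];
        if (c != '>' && c != ' ') {
            pos = nameEnd;
            continue;
        }
        const std::string::size_type open = xml.find('>', nameEnd);
        if (open == std::string::npos)
            break;
        if (xml[open - 1] == '/') { // self-closing element, no text inside
            pos = open + 1;
            continue;
        }
        const std::string::size_type close = xml.find("</w:t>", open + 1);
        if (close == std::string::npos)
            break;
        text.append(xml, open + 1, close - open - 1);
        pos = close + 6; // continue after "</w:t>"
    }
    return text;
}
```

So a full hand-written parser should not be needed: once the archive is unpacked, collecting the `<w:t>` contents is enough to get the raw text.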
-
Another problem. I've noticed strange behavior with ASCII codes. When I use Russian text, the code of every character is always 63 (only '\r' shows up as 13). I use QTextCodec("CP1251"). Could that be the cause of this problem? When I use English text, everything displays normally.
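A note on the number 63: it is the code of '?', which is what many conversions substitute when a character does not exist in the target 8-bit encoding. This is a guess, since the conversion code is not shown here, but it is the behavior you would get if the Russian characters were converted to ASCII or Latin-1 instead of CP-1251. The sketch below reproduces the effect in plain standard C++; `toLatin1Codes` and `toCp1251Codes` are hypothetical names, and the CP-1251 version only maps the letters А..я.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: encode code points as Latin-1 byte values.
// Anything above 0xFF has no Latin-1 byte, so it becomes '?' (63).
std::vector<int> toLatin1Codes(const std::u32string &text)
{
    std::vector<int> codes;
    for (char32_t cp : text)
        codes.push_back(cp <= 0xFF ? static_cast<int>(cp) : '?');
    return codes;
}

// Hypothetical helper: encode code points as CP-1251 byte values.
// Only ASCII and the letters U+0410..U+044F are mapped in this sketch.
std::vector<int> toCp1251Codes(const std::u32string &text)
{
    std::vector<int> codes;
    for (char32_t cp : text) {
        if (cp < 0x80)
            codes.push_back(static_cast<int>(cp));
        else if (cp >= 0x0410 && cp <= 0x044F)
            codes.push_back(0xC0 + static_cast<int>(cp - 0x0410));
        else
            codes.push_back('?');
    }
    return codes;
}
```

With the Latin-1 conversion, a Russian letter followed by '\r' comes out as 63 and 13, which matches what is described above. With the CP-1251 conversion, the same letter gets its own byte value.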