Cedilla characters (é,è,ë,ĉ,ç) written to disk with QTextStream appear as ?
-
Hi. Some background on my situation: I am trying to build a data entry application using WebSockets, with both a mobile and a desktop version.
The WebSocket server runs on Windows and is built with Qt. It receives QStrings and processes my data without any problems. However, there seems to be a bug when the server runs on a Windows machine and interprets QStrings sent from a macOS or iOS client.
Characters such as:
À é è ê ę ç
appear as ? when parsed by the Windows server. I believe this is because macOS and iOS represent such characters with combining characters (for example the combining cedilla).
However, strings with such characters sent from an Android or Linux PC client are not lost in parsing.
I am using TextMessages to pass information. Any help would be greatly appreciated :D
Edit:
Please see post 6. The problem occurs when writing cedilla characters to disk using QTextStream.
Fixed. The solution was quite simple, but I still believe this may be a bug: cedilla characters received from a macOS or iOS environment fail to be auto-detected as UTF-8, and the fallback to the system codec on Windows writes them to disk as question marks.
Solution:
    bool FileIO::writeFile(const QString& data, const QString& filename)
    {
        qDebug() << data;
        qDebug() << filename;

        if (filename.isEmpty())
            return false;

        QFile file(cDir.path() + cDir.separator() + filename);
        if (!file.open(QFile::WriteOnly | QFile::Truncate))
            return false;

        QTextStream out(&file);
        out.setCodec("UTF-8"); // Makes sure to always force UTF-8
        out << data;
        file.close();
        return true;
    }
-
- are all clients built by you using Qt?
- how exactly are you encoding the data you send?
- how exactly do you receive (decode) the data on the Windows server?
- where do you see the '?'?
For example, you should send UTF-8 binary data over the WebSocket and decode it properly from UTF-8 on the other side.
-
@raven-worx Yes, all clients are built with Qt.
I am receiving a stringified JSON object, where a string is received like this:
{ "messageIndex":"0", "Mautent":{"mess":"djhkgs","ax":"568hfduy4"}, "data": "{'name':'TéÈç', ...}" }
This data comes from a TextInput and is then sent from any client on the different OSes through sendTextMessage().
From there, I receive the object on a Windows machine running a Qt WebSocket server, search the string to verify authentication, and either parse it or not based on the result.
However, text from a macOS or iOS client ends up with a question mark wherever a cedilla character was.
I believe that, for example, the ç character, which is encoded in UTF-8 as 0xc3 0xa7, gets passed as 0x63 0xcc 0xa7 (a plain c followed by a combining cedilla).
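To make those byte sequences concrete, here is a minimal sketch in plain C++ (no Qt) that encodes Unicode code points as UTF-8. The helper names encodeUtf8 and toHex are made up for this illustration. It only shows that the precomposed ç (U+00E7) and the decomposed pair c + combining cedilla (U+0063 U+0327) produce different bytes; it does not prove what the macOS or iOS client actually sends.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: encode a sequence of Unicode code points as UTF-8.
std::string encodeUtf8(const std::vector<std::uint32_t>& codePoints) {
    std::string out;
    for (std::uint32_t cp : codePoints) {
        // Reject values outside Unicode and UTF-16 surrogates.
        assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
        if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Hypothetical helper: render bytes as space-separated lowercase hex.
std::string toHex(const std::string& bytes) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (unsigned char b : bytes) {
        if (!out.empty())
            out += ' ';
        out += "0x";
        out += digits[b >> 4];
        out += digits[b & 0x0F];
    }
    return out;
}
```

With these helpers, toHex(encodeUtf8({0x00E7})) gives 0xc3 0xa7, while toHex(encodeUtf8({0x0063, 0x0327})) gives 0x63 0xcc 0xa7.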
-
@arozon said in WebSocket Server on windows misrepresenting strings with cedilla characters (é,è,ë,ĉ,ç) from macOS:
"data": "{'name':'TéÈç', ...}"
So you have a JSON object which contains another JSON object, but the nested JSON data is transferred as a string?? o.O
Why?! Also, you didn't show how you are composing your JSON data or how you are decomposing it.
-
@raven-worx Not quite, but yeah. It's a JSON object where one of the members holds a string representing the data, which can easily be parsed directly. I didn't make it overcomplicated, as the strings are very small and the data entry will not be excessive. I perform a handshake beforehand, but that's irrelevant.
Setting the data is quite simple:
    // SENDING THE DATA TO THE SERVER (SEND AS TEXT MESSAGE)
    function getDataString() {
        var mess = JSON.parse('{"messageIndex":"","Mautent":{"mess":"","ax":""}, "data": ""}');
        var dataobj = JSON.parse('{"name":"", ...}');
        dataobj.name = ftextname.text; // TéÈç
        ...
        mess.data = JSON.stringify(dataobj);
        ...
        return JSON.stringify(mess);
    }

    // PARSING THE RECEIVED STRING
    function parseMessage(message) {
        try {
            var parsed = JSON.parse(message);
            // authenticate
            if (auth) {
                file.writeFile(parsed.data, getRandomName());
            }
        } catch (e) {
            // Oh well, but whatever...
        }
    }

    // ON THE C++ SIDE (file.writeFile) (QML PLUGIN)
    bool FileIO::writeFile(const QString& data, const QString& filename)
    {
        if (filename.isEmpty())
            return false;

        QFile file(cDir.path() + cDir.separator() + filename);
        if (!file.open(QFile::WriteOnly | QFile::Truncate))
            return false;

        QTextStream out(&file);
        out << data;
        file.close();
        return true;
    }
-
After more debugging, the culprit seems to be the step that saves the file. Up to that point my string remains intact.
Changing the main topic to QTextStream encoding. When writing the stream to disk, cedilla characters are lost.

    bool FileIO::writeFile(const QString& data, const QString& filename)
    {
        if (filename.isEmpty())
            return false;

        QFile file(cDir.path() + cDir.separator() + filename);
        if (!file.open(QFile::WriteOnly | QFile::Truncate))
            return false;

        QTextStream out(&file);
        out << data;
        file.close();
        return true;
    }
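A possible explanation for why only the macOS/iOS strings break, offered as an assumption rather than a confirmed diagnosis: if the text ends up being converted with a single-byte system codec, a precomposed ç still has a slot in Western code pages (0xE7 in both Latin-1 and Windows-1252), but the combining cedilla U+0327 does not, so it would be replaced. The sketch below is plain C++ with a hypothetical toLatin1Lossy helper; Latin-1 stands in for whatever code page the Windows server actually uses.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a lossy single-byte codec (Latin-1 here).
// Code points that have no slot in the target encoding become '?'.
std::string toLatin1Lossy(const std::vector<std::uint32_t>& codePoints) {
    std::string out;
    for (std::uint32_t cp : codePoints) {
        assert(cp <= 0x10FFFF); // must be a valid Unicode code point value
        out += (cp <= 0xFF) ? static_cast<char>(cp) : '?';
    }
    return out;
}
```

Under that assumption, toLatin1Lossy({0x00E7}) keeps the single byte 0xE7, whereas toLatin1Lossy({0x0063, 0x0327}) yields "c?", which would match the question marks seen on disk.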
-
Fixed. The solution was quite simple, but I still believe this may be a bug: cedilla characters received from a macOS or iOS environment fail to be auto-detected as UTF-8, and the fallback to the system codec on Windows writes them to disk as question marks.
Solution:
    bool FileIO::writeFile(const QString& data, const QString& filename)
    {
        qDebug() << data;
        qDebug() << filename;

        if (filename.isEmpty())
            return false;

        QFile file(cDir.path() + cDir.separator() + filename);
        if (!file.open(QFile::WriteOnly | QFile::Truncate))
            return false;

        QTextStream out(&file);
        out.setCodec("UTF-8"); // Makes sure to always force UTF-8
        out << data;
        file.close();
        return true;
    }