Compare strings with characters like "šđčćž"



  • Hi all,

    What is the best way to compare two strings that contain characters like "šđčćž"?

    Thanks,
    Zgembo



  • Do you mean comparing for equality?

        QString str1("šđčćž");
        QString str2("šđčćž");
        QString str3("sdccz");
    
        qInfo() << bool(str1 == str2);
        qInfo() << bool(str1 == str3);
    


  • @fcarney Thanks for the suggestion, but it does not work in my case.



  • @Zgembo said in Compare strings with characters like "šđčćž":

    does not work in my case

    Then can you show us some code for the case where it does not work? My crystal ball is busted today.



  • @Zgembo I think the best option is to use QString::localeAwareCompare().



  • @fcarney
    This is the part of the code that I use and it does not work.

    for (int i = 0; i < data->rowCount(); i++) {
    		record = data->record(i);
    		courseName = record.value("naziv").toString();
    		if (bool(courseName == QString("Uvod u informacione sisteme")))
    			uisEnrolled.append(record.value("broj").toInt());
    		else if (bool(courseName == QString("Tehnologije i sistemi za podršku korisnicima")))
    			tiszpEnrolled.append(record.value("broj").toInt());
    		else if(bool(courseName == QString("Računovodstveni informacioni sistemi").toUtf8()))
    			risEnrolled.append(record.value("broj").toInt());
    	}
    

    The first if clause works because it does not contain any special characters. The second and third do not work. The data is read from the database.


  • Qt Champions 2018

    @Zgembo said in Compare strings with characters like "šđčćž":

    if (bool(courseName == QString("Uvod u informacione sisteme")))

    Why do you cast a bool to a bool?!

    Did you check whether courseName contains the correct string?



  • @Zgembo Can you try this?

    for (int i = 0; i < data->rowCount(); i++) {
    		record = data->record(i);
    		courseName = record.value("naziv").toString();
    		if (courseName.localeAwareCompare(QStringLiteral("Uvod u informacione sisteme")) == 0)
    			uisEnrolled.append(record.value("broj").toInt());
    		else if (courseName.localeAwareCompare(QStringLiteral("Tehnologije i sistemi za podršku korisnicima")) == 0)
    			tiszpEnrolled.append(record.value("broj").toInt());
    		else if(courseName.localeAwareCompare(QStringLiteral("Računovodstveni informacioni sistemi")) == 0)
    			risEnrolled.append(record.value("broj").toInt());
    	}
    

    QStringLiteral() will ensure the string is encoded in UTF-8, and localeAwareCompare() should do the job!



  • @KroMignon said in Compare strings with characters like "šđčćž":

    @Zgembo Can you try this?

    for (int i = 0; i < data->rowCount(); i++) {
    		record = data->record(i);
    		courseName = record.value("naziv").toString();
    		if (courseName.localeAwareCompare(QStringLiteral("Uvod u informacione sisteme")) == 0)
    			uisEnrolled.append(record.value("broj").toInt());
    		else if (courseName.localeAwareCompare(QStringLiteral("Tehnologije i sistemi za podršku korisnicima")) == 0)
    			tiszpEnrolled.append(record.value("broj").toInt());
    		else if(courseName.localeAwareCompare(QStringLiteral("Računovodstveni informacioni sistemi")) == 0)
    			risEnrolled.append(record.value("broj").toInt());
    	}
    

    QStringLiteral() will ensure the string is encoded in UTF-8, and localeAwareCompare() should do the job!

    I have modified the code to use QStringLiteral() and it works. Thank you. This is the code snippet.

    for (int i = 0; i < data->rowCount(); i++) {
    		record = data->record(i);
    		courseName = record.value("naziv").toString();
    		if (courseName == QStringLiteral("Uvod u informacione sisteme"))
    			uisEnrolled.append(record.value("broj").toInt());
    		else if (courseName == QStringLiteral("Tehnologije i sistemi za podršku korisnicima"))
    			tiszpEnrolled.append(record.value("broj").toInt());
    		else if(courseName == QStringLiteral("Računovodstveni informacioni sistemi"))
    			risEnrolled.append(record.value("broj").toInt());
    	}
    


  • @Zgembo You're welcome ;)

    The problem with your previous code is that QString("Tehnologije i sistemi za podršku korisnicima") will translate your string as UNICODE string not UTF-8.
    QString::fromUtf8("Tehnologije i sistemi za podršku korisnicima") would also work, but QStringLiteral() generates the string once, at compile time, so you get better performance.
    ==> Take a look at "QStringLiteral explained" for more details.



  • @KroMignon said in Compare strings with characters like "šđčćž":

    @Zgembo You're welcome ;)

    The problem with your previous code is that QString("Tehnologije i sistemi za podršku korisnicima") will translate your string as UNICODE string not UTF-8.
    QString::fromUtf8("Tehnologije i sistemi za podršku korisnicima") would also work, but QStringLiteral() generates the string once, at compile time, so you get better performance.
    ==> Take a look at "QStringLiteral explained" for more details.

    @KroMignon thank you for your time and detailed explanation.


  • Qt Champions 2018

    @KroMignon said in Compare strings with characters like "šđčćž":

    will translate your string as UNICODE string not UTF-8.

    What do you mean with unicode here?
    Since Qt 5, QString(const char*) treats the char array as UTF-8. Whether your compiler correctly parses the source as UTF-8 is another question (MSVC has problems with it). My recommendation is therefore to not use anything but Latin-1 in the source code and do a proper translation.
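
A minimal sketch of the failure mode described above, written in plain C++ without Qt so that it compiles on its own. It assumes the compiler stored "š" in a legacy code page (Windows-1250 is used as the example, where š is the single byte 0x9A) instead of as the UTF-8 bytes 0xC5 0xA1. The helper decodeUtf8Simple() is hypothetical: it handles only one- and two-byte sequences and merely stands in for the UTF-8 decoding step that QString(const char*) performs. It is not Qt's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not Qt code: decodes UTF-8 into UTF-16 code units.
// Only 1- and 2-byte sequences are handled, which covers "šđčćž".
// Any byte that does not fit is replaced by U+FFFD (replacement character).
std::u16string decodeUtf8Simple(const std::string& bytes)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        if (b0 < 0x80) {                         // plain ASCII
            out.push_back(static_cast<char16_t>(b0));
            i += 1;
            continue;
        }
        if (b0 >= 0xC2 && b0 <= 0xDF && i + 1 < bytes.size()) {
            const unsigned char b1 = static_cast<unsigned char>(bytes[i + 1]);
            if ((b1 & 0xC0) == 0x80) {           // valid continuation byte
                out.push_back(static_cast<char16_t>(((b0 & 0x1F) << 6) | (b1 & 0x3F)));
                i += 2;
                continue;
            }
        }
        out.push_back(static_cast<char16_t>(0xFFFD)); // invalid at this position
        i += 1;
    }
    return out;
}

// "š" as a compiler using UTF-8 for narrow literals would store it.
std::u16string fromUtf8Bytes() { return decodeUtf8Simple("\xC5\xA1"); }

// "š" as Windows-1250 would store it (assumed legacy code page).
std::u16string fromCp1250Bytes() { return decodeUtf8Simple("\x9A"); }

// Usage: the two results differ, so an equality test between them fails.
void demonstrateMismatch()
{
    assert(fromUtf8Bytes() != fromCp1250Bytes());
}
```

Under these assumptions, the UTF-8 bytes decode to U+0161 while the legacy byte decodes to U+FFFD, so a string read correctly from the database never equals the literal from the source file.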



  • @Christian-Ehrlicher said in Compare strings with characters like "šđčćž":

    What do you mean with unicode here?

    From the QString documentation:

    QString::QString(const char *str)
    Constructs a string initialized with the 8-bit string str. The given const char pointer is converted to Unicode using the fromUtf8() function.
    You can disable this constructor by defining QT_NO_CAST_FROM_ASCII when you compile your applications. This can be useful if you want to ensure that all user-visible strings go through QObject::tr(), for example.


  • Qt Champions 2018

    @KroMignon said in Compare strings with characters like "šđčćž":

    to Unicode using the fromUtf8() function.

    So the documentation states exactly what I wrote: QStringLiteral("foo") and QString("foo") both interpret the string as UTF-8 and convert it to QString's internal representation (which is UTF-16).
    The only difference is that QStringLiteral() does it at compile time and QString(const char*) at run time.



  • @Christian-Ehrlicher In my experience with QString, QStringLiteral(const char*) != QString(const char*) but QStringLiteral(const char*) == QString::fromUtf8(const char*).


  • Qt Champions 2018

    @KroMignon @Christian-Ehrlicher is right, it all depends on the compiler. Source code is 8-bit, and your compiler can treat it with any encoding it likes. If you need non-ASCII characters, you should encode them with the C++11 Unicode literals.
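
A short sketch of what the C++11 Unicode literals mentioned above can look like, in plain C++ so that it compiles without Qt. The function names are hypothetical. The assumption is that such a char16_t literal would then be handed to Qt, for example through QString::fromUtf16(); that step is not shown here.

```cpp
#include <cassert>
#include <string>

// \u escapes name code points directly, so the compiler's guess about the
// source file encoding does not affect these characters.
// U+0161 š, U+0111 đ, U+010D č, U+0107 ć, U+017E ž
std::u16string specialLetters()
{
    return u"\u0161\u0111\u010D\u0107\u017E";
}

// One of the course names from the thread, with "č" written as \u010D.
std::u16string accountingCourseName()
{
    return u"Ra\u010Dunovodstveni informacioni sistemi";
}

// Usage: each of the five letters is exactly one UTF-16 code unit.
void checkSpecialLetters()
{
    const std::u16string s = specialLetters();
    assert(s.size() == 5);
    assert(s[0] == u'\u0161');
}
```

The source file then contains only ASCII, at the cost of readability, which is one reason tr() with a translation file may be the more comfortable route for user-visible text.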



  • @aha_1980 said in Compare strings with characters like "šđčćž":

    it all depends on the compiler.

    @aha_1980 @Christian-Ehrlicher, I agree with you, but as I wrote before, when I use QStringLiteral() or QString::fromUtf8() it always works as I expect. I do multi-platform development (Windows XP/7, Linux ARM/x86 and Android), so there are many different compilers I have to use.
    For me, this was the working solution.
    I am just sharing my experience.



  • @Christian-Ehrlicher
    Didn't we see this same issue with Chinese symbols in the source code? How would you solve this by "doing a proper translation"? How would you store the comparison strings for checking data from the database? Using tr()?

    Is the problem here with the actual character set used to write the source code? I cannot reproduce the error, but I would like to know how to reproduce this.
    @Zgembo
    What compiler, editor, OS, and version of Qt are you running?
    Do you know what the character encoding is for the source files?

    Edit:
    A QStringLiteral compiles a read only object in memory that stores the string. A QString would just grab the stored string that the compiler stored at compile time. Somehow that stored string is different than the string object that is generated by QStringLiteral. This is why it fails.

    I noticed my version of Qt Creator sets its encoding to "System". I checked the encoding of the source files I have and they are UTF-8. So I would guess that other people's systems have different encodings and that is where the issue is. Is this correct?

    Also, in this case, what is the database character encoding?


  • Qt Champions 2018

    @fcarney: when you have UTF-8 encoded text in your source code, you have to make sure the compiler knows this. On Linux this is no problem, since the default locale is normally UTF-8. On Windows you have to pass /utf-8 to the MSVC compiler to be really sure.
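
One way to notice a wrong compiler setting early is a compile-time probe. The following is only a sketch with hypothetical names (kProbe, narrowLiteralsAreUtf8). It checks which encoding the compiler uses for narrow string literals. It does not check how the compiler decodes the source file itself, so it covers only part of what /utf-8 controls.

```cpp
// "\u0161" names the code point for "š" directly, so this probe does not
// depend on how the compiler decodes the source file. It only reveals which
// encoding the compiler uses for narrow (char) string literals.
constexpr char kProbe[] = "\u0161";

// True when narrow literals are encoded as UTF-8 (0xC5 0xA1 plus the NUL).
constexpr bool narrowLiteralsAreUtf8()
{
    return sizeof(kProbe) == 3
        && static_cast<unsigned char>(kProbe[0]) == 0xC5
        && static_cast<unsigned char>(kProbe[1]) == 0xA1;
}

// Turns a wrong setting into a build error instead of a silent comparison
// failure at run time.
static_assert(narrowLiteralsAreUtf8(),
              "narrow string literals are not UTF-8; check the compiler's charset options");
```

If the static_assert fires, QString(const char*) would be decoding bytes that are not UTF-8, which matches the symptom reported at the top of the thread.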

    With 'proper translation' I mean tr(), yes.

    'A QStringLiteral compiles a read only object in memory that stores the string.'

    What do you mean by 'string' here? It's stored as UTF-16, so QString can access it without doing a conversion first, which is faster than first creating it from a UTF-8 encoded char* array.



  • @Christian-Ehrlicher said in Compare strings with characters like "šđčćž":

    What do you mean by 'string' here?

    I was trying to grasp whether QStringLiteral stores something different from the string literal the compiler stores when creating the temporary QString object. That would explain why QStringLiteral("šđčćž") != QString("šđčćž") on some systems. So by string I mean "šđčćž" as interpreted by the compiler as a literal. I hope I am using the right words.

