[SOLVED] How can I manipulate big files?



  • Hi everyone,
    I'm sorry if this is a double post,
    but I cannot find this problem in other topics.

    I'm trying to join two strings while reading from two different files.
    Each file contains many lines, and each line holds only 4 characters.
    Each file is 2.2 MB in total.

    I thought I knew how to read from and write to a file and how to join two strings.
    But when I run my code, nothing seems to happen, yet it uses 100% of the user CPU time.

    Here is some of my code:

    @// this will read from a file
    QFile file1(ui->file1->text());
    if (!file1.open(QIODevice::ReadOnly | QIODevice::Text))
        return;

    QString line = file1.readLine();
    while (!line.isNull())
    {
        ui->textEdit->append(line);
        line = file1.readLine();
    }
    file1.close();
    }@

    @// with this I try to write to a file:
    QFile uitbest(ui->uitvoerfile->text());
    if (!uitbest.open(QIODevice::WriteOnly | QIODevice::Text))
        return;
    QTextStream ts(&uitbest);
    ts << str << "\n";
    uitbest.close();@

    @// and with this I join the 2 strings.

    QString line = file1.readLine();
    while (!line.isNull())
    {
        if (!file2.open(QIODevice::ReadOnly | QIODevice::Text))
            return;
        line.append(file2.readLine());
        writeline(line);
        line = file1.readLine();
    }
    file1.close();
    file2.close();
    }@

    The problem seems to be that the program does everything in its "head" instead of writing the newly formed string to the text file or text box.
    And when I get results after a very long time, it turns out that the job has not been done properly: I get unrealistic results.

    So:
    - Does anybody know what the problem is?
    - Does Qt run the program in RAM first and only then print the output?
    - Is the file too large to process? It is 2.2 MB.



  • Hi,

    2.2 MB of text is not really much; there are other problems.

    As far as I can see from your code, you do everything in the main thread, so if you have long-running operations, you block your UI.

    Your first code block, which reads from a file, appends each line separately to the text edit. This is not the most performant way to read files. Each time, the text edit has to append the string, calculate the newly visible strings and scroll areas, emit update events, recalculate size hints if needed, and so on. If you read the complete file at once and set it on the UI only once, it would be much faster. Above all, the UI will not update unless you return to the event loop: update is an asynchronous operation.

    The code you use to write to a file is generally OK, but do you call it once or more often? You don't append to the file (opening with WriteOnly alone truncates it each time), so its only content would be the content of str.
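
    The difference between truncating and appending can be sketched as follows. This is a minimal illustration written with the C++ standard library rather than Qt, so that it runs without a Qt installation: std::ios::trunc stands in for opening with QIODevice::WriteOnly, and std::ios::app for QIODevice::Append. The helper names writeLine and readAll are hypothetical and do not appear in the code above.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Writes one line to `path`. With append == false the file is truncated on
// every open, so only the most recently written line survives.
void writeLine(const std::string& path, const std::string& text, bool append) {
    std::ofstream out(path, append ? std::ios::app : std::ios::trunc);
    out << text << '\n';
}   // the stream is flushed and closed when `out` goes out of scope

// Reads the whole file back as a single string.
std::string readAll(const std::string& path) {
    std::ifstream in(path);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

    Calling writeLine(path, ..., false) once per line leaves only the last line in the file. Either open the file once before the loop and keep the stream alive, or open it in append mode for every write.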

    Your code for appending the two strings always reads the first line of the second file; is that what you wanted to do? You try to reopen file2 in every iteration (for each line of file1) and never close it in between. Why? AFAIK open fails if it is called on an already open file.

    Some general hints:

    It would help more if you posted full functions, so we can also see which subfunctions are called. I'm not sure whether your writeline(line); calls the second code block or not.



  • Hi Gerolf:

    Thanks for the reply!
    But of course I don't seem to understand a few of your suggestions.

    [quote author="Gerolf" date="1298194691"]
    As far as I see from your code, you do everything in the main thread
    [/quote]
    I have never done anything with threads, so I won't ask you stupid questions about them. I have to read more about them before I can implement anything. But I understand that the program will then behave 'nicely' toward the system.

    [quote]your first code that reads from a file appends each line separately to the text edit. This is not the most performant way to read files
    [/quote]
    I know, but writing to the text edit here is just for testing;
    the output has to be the same in a file, text box or stdout.

    [quote]Especially, the UI will not update if you don't return to the event loop, update is an asynchronous operation [/quote]
    Do you mean that I have to update the UI within the while loop?
    [quote]
    you call it once or more often? You don't append to the file, so the only content would be the content of str. [/quote]
    I feel stupid, but thanks!!

    [quote]your code of appending the two strings always reads the first line of the second file, is that what you wanted to do? You try to reopen file 2 in each loop (each line of file1) and never close it in between, why? Afaik open fails if it is called on an already open file.[/quote]
    I think I understand what you mean, so following your advice I moved the open call for the second file out of the loop.
    And it seems to work.

    BUT!!!
    With your help it got a little better, but I still get some strange output.

    You asked for the whole code:
    (I started all over again, so the naming is different, and I put everything in one function so it is easier to read, also for me.)

    @void MainWindow::joinfiles()
    {
        // first set the file variables
        QFile file1(ui->in_file1->text());
        QFile file2(ui->in_file2->text());
        QFile outf(ui->out_file->text());
        QTextStream txtstr(&outf);
        // then try to open the files
        if (!file1.open(QIODevice::ReadOnly | QIODevice::Text))
            return;
        if (!file2.open(QIODevice::ReadOnly | QIODevice::Text))
            return;
        if (!outf.open(QIODevice::Append | QIODevice::Text))
            return;
        // QIODevice::WriteOnly |

        QString line = "";
        int nr_of_chars = 5;       // this is a funny one and I think this is the problem
        while (!file1.atEnd())
        {
            txtstr.reset();
            line.append(file1.readLine(nr_of_chars));
            line.append(file2.readLine(nr_of_chars));
            ui->testtxtedit->setText(line);
            txtstr << line << "\n";
            txtstr.flush();       // also tried .reset(), but it doesn't seem to do anything
        }
        file1.close();
        file2.close();
        outf.close();
    }@

    I will give you some output as well:

    @
    OUTCOME:
    NR_OF_CHARS = 5
    File        Textedit
    qwerWWWW    qwerWWWW
    qwerWWWW    <space>
    <space>     tyuiWWWW
    <space>     <space>
    qwerWWWW    asdfWWWW
    <space>     <space>
    tyuiWWWW    sdfgWWWW
    qwerWWWW    <space>
    <space>     fghjWWWW
    tyuiWWWW
    <space>
    <space>
    qwerWWWW
    <space>
    tyuiWWWW
    <space>
    asdfWWWW
    qwerWWWW
    <space>
    tyuiWWWW
    <space>
    asdfWWWW
    <space>
    <space>
    qwerWWWW
    <space>
    tyuiWWWW
    <space>
    asdfWWWW
    <space>
    sdfgWWWW

    @ And another 20 or so lines under File, which I don't post because they contain nothing.

    In the text edit everything seems to go OK, except that somehow it gives me an empty line between every string. Can you help me with that?

    But the output in the file makes no sense to me. I tried to flush and/or reset the stream. I tried different values for nr_of_chars in the readLine() calls. But I'm doing something terribly wrong when writing to the file.
    Do you know why the outcome is so strange?

    Edit: the WWWW after each word is correct; it was in the file. I checked it with different words.



  • Hi,

    First, two things: txtstr.reset(); is not needed, as you don't manipulate the output behavior (see the docs of "QTextStream::reset":http://doc.qt.nokia.com/latest/qtextstream.html#reset). flush is not needed either: everything is flushed when you close the stream. flush is only needed if you want the data to be readable from the file at any moment.

    If you open the output file only once, Append is also not needed unless you want to keep old content in the file. In your current example, it is not needed.

    Why do you use readLine(5)? Do you want to read a line of 5 characters?

    The empty lines are a result of your reading logic. Look at "QIODevice::readLine":http://doc.qt.nokia.com/latest/qiodevice.html#readLine, which is a function of QFile's base class. If your file looks like this:
    @
    qwert
    qwert
    qwert
    tyuit

    @

    you would run the loop 8 times (4 times readLine(5) returns 5 characters and 4 times it returns just a '\n' character). Especially on Windows, line breaks can be \r\n, which means that the following file would also result in 2 readLine(5) calls for each line:

    @
    qwer
    qwer
    qwer
    tyui

    @



  • You duplicate your output because you constantly append to the variable line and then write that variable to the output file:

    first loop:

    • line is ""
    • you append "abc"
    • line is "abc"
    • you append "abc" to the empty file
    • file contains "abc"

    next loop

    • line is "abc"
    • you append "xyz"
    • line is "abcxyz"
    • you append "abcxyz" to the file
    • file contains "abcabcxyz"
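
    The two loops traced above can be reproduced with a small sketch. It uses plain std::string and std::vector instead of QString and QFile (an assumption made so that it runs without Qt), and the function names joinAccumulating and joinPerLine are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Mimics the loop in joinfiles(): `line` is declared once outside the loop
// and only ever appended to, so every written line repeats all earlier ones.
std::vector<std::string> joinAccumulating(const std::vector<std::string>& a,
                                          const std::vector<std::string>& b) {
    std::vector<std::string> written;
    std::string line;
    for (std::size_t i = 0; i < a.size() && i < b.size(); ++i) {
        line.append(a[i]);
        line.append(b[i]);
        written.push_back(line);   // stands in for txtstr << line << "\n"
    }
    return written;
}

// Same loop, but with a fresh string in every iteration.
std::vector<std::string> joinPerLine(const std::vector<std::string>& a,
                                     const std::vector<std::string>& b) {
    std::vector<std::string> written;
    for (std::size_t i = 0; i < a.size() && i < b.size(); ++i) {
        const std::string line = a[i] + b[i];
        written.push_back(line);
    }
    return written;
}
```

    With inputs {"qwer", "tyui"} and {"asdf", "sdfg"}, the first version writes "qwerasdf" followed by "qwerasdftyuisdfg", while the second writes "qwerasdf" followed by "tyuisdfg". In the Qt code, declaring line inside the loop or calling line.clear() at the top of each iteration has the same effect.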

    Second, if you call readLine(5), you do not trim the lines to 5 chars. If a line is longer, it is split into chunks of at most 5 chars and you just get more "virtual lines" than actual lines in the file. Read the "docs":http://doc.qt.nokia.com/4.7/qiodevice.html#readLine-2. If you need only the first 5 chars of a line, read the whole line and cut afterwards (QString::left() is your friend).
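
    A rough model of the "virtual lines" effect, and of the read-then-cut alternative, is sketched below. readChunk is a hypothetical stand-in written with the C++ standard library; it only imitates a size-limited line read as described above and is not Qt's implementation. readLinesCut plays the role of readLine() followed by QString::left().

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Model of a size-limited line read: returns at most maxSize bytes and
// stops early right after a '\n'.
std::string readChunk(std::istream& in, std::size_t maxSize) {
    std::string chunk;
    char c;
    while (chunk.size() < maxSize && in.get(c)) {
        chunk.push_back(c);
        if (c == '\n') break;
    }
    return chunk;
}

// Collects every chunk that repeated size-limited reads would produce.
std::vector<std::string> readAllChunks(const std::string& text, std::size_t maxSize) {
    std::istringstream in(text);
    std::vector<std::string> chunks;
    for (std::string c = readChunk(in, maxSize); !c.empty(); c = readChunk(in, maxSize))
        chunks.push_back(c);
    return chunks;
}

// Reads whole lines first and only then keeps the first `width` characters.
std::vector<std::string> readLinesCut(const std::string& text, std::size_t width) {
    std::istringstream in(text);
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line))
        lines.push_back(line.substr(0, width));
    return lines;
}
```

    For the text "qwerWWWW\ntyuiWWWW\n", readAllChunks(text, 5) yields four chunks ("qwerW", "WWW\n", "tyuiW", "WWW\n") although the text has only two lines, whereas readLinesCut(text, 4) yields exactly "qwer" and "tyui".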

    Also: if you deal with text files, why don't you read the source files using QTextStream too? It handles the CR, LF and CRLF line endings of the various operating systems for you, and it handles the decoding as well. If you have Unicode characters consisting of more than one byte (UTF-8 and non-ASCII characters!), you will not get those correctly, because QIODevice's readLine() counts bytes, not characters.
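
    The gap between bytes and characters can be shown with a short sketch. It assumes valid UTF-8 input and uses std::string, whose size() counts bytes; the helper name utf8CodePoints is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-8 string by skipping continuation
// bytes (bit pattern 10xxxxxx). Assumes the input is valid UTF-8.
std::size_t utf8CodePoints(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) ++count;
    }
    return count;
}
```

    For the UTF-8 string "caf\xC3\xA9" (the word "café"), size() reports 5 bytes while utf8CodePoints() reports 4 characters. A byte-limited read of 4 would therefore cut the last character in half.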



  • Both of you,
    Thank you VERY much!!!

    With your help I managed to work it out!
    Here's the code:

    @void MainWindow::joinfiles()
    {
        // first set the variables
        QFile f1(ui->in_file1->text());
        QFile f2(ui->in_file2->text());
        QFile outf(ui->out_file->text());
        QTextStream txtstr(&outf);
        QTextStream file1(&f1);
        QTextStream file2(&f2);
        QString line1;
        QString line2;

        // then try to open the files
        if (!f1.open(QIODevice::ReadOnly | QIODevice::Text))
            return;
        if (!f2.open(QIODevice::ReadOnly | QIODevice::Text))
            return;
        if (!outf.open(QIODevice::WriteOnly | QIODevice::Text))
            return;

        while (!file1.atEnd())
        {
            // read both lines
            line1 = file1.readLine();
            line2 = file2.readLine();
            // and print them, cut with .left()
            ui->testtxtedit->append(line1.left(4) + line2.left(4));
            txtstr << line1.left(4) + line2.left(4) << "\n";
        }
        f1.close();
        f2.close();
        outf.close();
    }@

    @Volker:
    As you suggested, I now use text streams for reading as well.

    @Gerolf:
    I will look into threads now;
    they look very interesting so far.

    Thanks a lot again to both of you!!



  • The code works. Be aware that you will miss content from file2 if that file has more lines than file1: you leave the loop once file1 has been completely read.
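
    One possible way to avoid losing the extra lines is to keep looping until both inputs are exhausted. The sketch below uses the C++ standard library instead of QTextStream so that it runs without Qt, and the function names joinUntilBothEnd and joinText are hypothetical. Treating a missing line as an empty string is an assumption; stopping with an error would be an equally valid choice.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Joins line i of `first` with line i of `second` and keeps going until
// both inputs are exhausted; a missing line contributes an empty string.
std::string joinUntilBothEnd(std::istream& first, std::istream& second) {
    std::string out, a, b;
    while (true) {
        const bool gotA = static_cast<bool>(std::getline(first, a));
        const bool gotB = static_cast<bool>(std::getline(second, b));
        if (!gotA && !gotB) break;   // nothing left in either input
        if (!gotA) a.clear();        // first input ended earlier
        if (!gotB) b.clear();        // second input ended earlier
        out += a + b + '\n';
    }
    return out;
}

// Convenience wrapper that works on in-memory text.
std::string joinText(const std::string& text1, const std::string& text2) {
    std::istringstream f1(text1), f2(text2);
    return joinUntilBothEnd(f1, f2);
}
```

    With "qwer\ntyui\n" and "asdf\nsdfg\nfghj\n" as inputs, joinText returns "qwerasdf\ntyuisdfg\nfghj\n", so the third line of the second input is kept.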

    Regarding threads: Use these only if you absolutely must. Be sure to read the "wiki article on threading":http://developer.qt.nokia.com/wiki/Threads_Events_QObjects before you start.



  • @Volker,
    It sounds like you would do it differently? Can you tell me where to look?
    Yes, I know that I might miss content, but that does not apply in my case: I know the files are exactly the same size. But thank you for the warning!



  • That depends on the size of the files and the time it takes to process them. That was Gerolf's point: as long as you are in the loop, the UI is blocked. If that is a few seconds, it might be acceptable; if not, you might consider moving the work to a background thread (take care that no other thread is writing to the same file in the meantime!).



  • OK, I understand. And yes, then I do need to work with threads.

