Opening a large CSV file in a QTableWidget is really slow.

    pepita
    wrote on 4 Nov 2014, 15:49 last edited by
    #1

    Hello everybody;

    I have a Qt application fully working. One of the tasks that the app does is opening and displaying CSV files in a QTableWidget. Although the app doesn't crash with large files, it becomes really slow.

    I've also tried opening these files with OpenOffice, and it takes its time too (though not as long as my app).

    I'm wondering whether this can be optimized, or whether large files simply require more time to be processed.

    The file I'm experiencing this lag with has 10749 rows and 265 columns, and is 4.2 MB.

    I leave here my code snippet to manage the file:

    @
    void CsvViewer::openCsvFile ()
    {
    QMessageBox msgBox ( this );
    msgBox.setStandardButtons( QMessageBox::NoButton);
    msgBox.setWindowTitle( "Loading..." );
    msgBox.setText( "Loading CSV file..." ); // This text doesn't appear when the app freezes...

    QObject *signalEmitter = sender();
    if(signalEmitter == ui->actionOpen )
    {
        _openedFile = QFileDialog::getOpenFileName (0, "Open CSV file",QDir::currentPath(),"CSV Files(*.csv)");
    
        if (!_openedFile.endsWith (".csv"))
        {
            QMessageBox::warning (0,"Error","Selected file is not a CSV File","Ok");
            return;
        }
    }
    
    QFile file (_openedFile);
    
    msgBox.show ();
    
    if (file.open(QIODevice::ReadOnly | QIODevice::Text))
    {
        QString data = file.readAll();
        file.close ();
        data.remove( QRegExp("\r") );
        QChar character;
        QTextStream textStream(&data);
    
        while (!textStream.atEnd())
        {
            textStream >> character;
    
            if (character == ';')
            {
                _cellData<<_readBuffer;               
                _readBuffer.clear();                        
                _countCells++;
            }
            else if (character == '\n')
            {
                _cellData<<_readBuffer;              
                _readBuffer.clear();                        
                _countCells++;
                _countRows++;
            }
            else if (textStream.atEnd())
            {
                _readBuffer.append(character);       
            }
            else
                _readBuffer.append(character);      
        }
    }
    else
        return;
    
    
    ui->tableWidget->setRowCount          ( _countRows );
    ui->tableWidget->setColumnCount    ( (_countCells/_countRows) );
    
    
    for(unsigned int r = 0; r < _countRows; r++)
        for(unsigned int c = 0; c < (_countCells/_countRows); c++)
            ui->tableWidget->setItem(r, c, new QTableWidgetItem(_cellData[c + (_countCells/_countRows) * r]));
    
    msgBox.setAttribute( Qt::WA_DeleteOnClose ); 
    msgBox.close ();
    

    }
    @

    If there's no solution to this, I'd at least like to show a QMessageBox asking the user to wait until the file is loaded (I tried QProgressDialog without success; it made the app even slower). But the box shows no text, because the whole app freezes while the file loads. I only get the window title.

    Thanks for help, really appreciated.

      JKSH
      Moderators
      wrote on 4 Nov 2014, 16:35 last edited by
      #2

      Hi,

      I haven't done any benchmarking, but I believe that reading 1 QChar at a time is inefficient.

      Try QString::split() (it also makes your code a lot simpler):

      @
      QString data = file.readAll();
      file.close ();

      QStringList rowData = data.split('\n');
      _countRows = rowData.count();

      for (const QString& row : rowData)
      {
          // Use ';' here instead if that is the separator in your file
          QStringList cells = row.split(',');
          _countCells += cells.count();

          // TODO: Store these strings in the table
      }
      @

      Note: data.remove( QRegExp("\r") ); is not necessary. Since you opened the QFile with QIODevice::Text, Qt automatically removes all \r characters when you read the file.

      Qt Doc Search for browsers: forum.qt.io/topic/35616/web-browser-extension-for-improved-doc-searches

        ambershark
        wrote on 5 Nov 2014, 01:02 last edited by
        #3

        What JKSH says is absolutely true. Reading char by char (or byte by byte) is the slowest you can be. You should read in buffer chunks that match your filesystem for best performance or at least read line by line.

        Doing file.readAll() into a QString could be equally inefficient if you had a really large file. You have no idea how large the file could be: what if someone gave your program a 1 GB CSV file? file.readAll() would then have to hold the whole file in memory, and when you added the data to your view it would essentially double the memory used by your app.

        Also doing a QString::split() on a large QString would be super slow as well.

        Your best bet is reading in line by line. It is not the most optimized for speed but is a good balance between performance and memory usage.

        Something like:

        @
        while (!file.atEnd())
        {
            QString line = file.readLine();
            // process csv line here, as JKSH says using split is great for this
            QStringList tokens = line.split(',');
            // now you have a list of each item in the csv, you will need to handle
        }
        @

        In the example above make sure to handle commas in the data. They are usually quoted or escaped. That is beyond the scope of the question though.
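To illustrate the point about quoted commas, here is a minimal sketch of a quote-aware splitter. It is written in plain standard C++ rather than with QString so it can be compiled on its own, and the function name `splitCsvLine` is made up for this example. It assumes the common convention that a field may be wrapped in double quotes and that `""` inside a quoted field means one literal quote; other escaping styles would need different handling.

```cpp
#include <string>
#include <vector>

// Splits one CSV line on `sep`, honouring double-quoted fields.
// Inside quotes, `sep` is taken literally and "" stands for one quote.
// Hypothetical helper for illustration; not part of Qt.
std::vector<std::string> splitCsvLine(const std::string& line, char sep = ',')
{
    std::vector<std::string> fields;
    std::string current;
    bool inQuotes = false;

    for (std::size_t i = 0; i < line.size(); ++i) {
        const char ch = line[i];
        if (inQuotes) {
            if (ch == '"' && i + 1 < line.size() && line[i + 1] == '"') {
                current += '"';    // escaped quote inside a quoted field
                ++i;
            } else if (ch == '"') {
                inQuotes = false;  // closing quote
            } else {
                current += ch;     // separators are literal here
            }
        } else if (ch == '"') {
            inQuotes = true;       // opening quote
        } else if (ch == sep) {
            fields.push_back(current);
            current.clear();
        } else {
            current += ch;
        }
    }
    fields.push_back(current);     // last field, possibly empty
    return fields;
}
```

A plain split(',') would cut a field such as "b,c" in two, while this version keeps it as one cell. Passing ';' as the second argument covers semicolon-separated files like the one in the original question.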

        Finally, it could be the view that is being slow. There are ways to deal with views of large datasets that are highly optimized. Adding a ton of data to a single GUI view will make things crawl. You can add "windows" into the data, having just the data in the viewport rendered and part of the object. This will make it a ton faster. It can get complicated though.
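One possible way to get such a "window" is to remember only where each line starts and to split rows into cells when they are actually needed. The sketch below is Qt-free standard C++ and every name in it (`LineIndex`, `row`, `window`) is hypothetical. It assumes the text is already in memory, that rows end with '\n', and that fields contain no quoted separators.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: records the start offset of every line once,
// then splits only the rows that are requested.
class LineIndex
{
public:
    explicit LineIndex(std::string text) : text_(std::move(text))
    {
        std::size_t start = 0;
        while (start < text_.size()) {
            starts_.push_back(start);
            const std::size_t nl = text_.find('\n', start);
            if (nl == std::string::npos)
                break;
            start = nl + 1;
        }
    }

    std::size_t rowCount() const { return starts_.size(); }

    // Returns row `r` without its trailing newline.
    std::string row(std::size_t r) const
    {
        const std::size_t begin = starts_.at(r);
        const std::size_t nl = text_.find('\n', begin);
        return text_.substr(begin, nl == std::string::npos ? nl : nl - begin);
    }

    // Splits only rows [first, first + count) on `sep`; clamps at the end.
    std::vector<std::vector<std::string>> window(std::size_t first,
                                                 std::size_t count,
                                                 char sep = ';') const
    {
        std::vector<std::vector<std::string>> rows;
        for (std::size_t r = first; r < starts_.size() && r - first < count; ++r) {
            const std::string line = row(r);
            std::vector<std::string> cells;
            std::size_t pos = 0;
            while (true) {
                const std::size_t next = line.find(sep, pos);
                if (next == std::string::npos) {
                    cells.push_back(line.substr(pos));
                    break;
                }
                cells.push_back(line.substr(pos, next - pos));
                pos = next + 1;
            }
            rows.push_back(std::move(cells));
        }
        return rows;
    }

private:
    std::string text_;
    std::vector<std::size_t> starts_;
};
```

With something like this, a view showing 40 rows would only ever ask for 40 rows' worth of cells, however many rows the file has.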

        My L-GPL'd C++ Logger github.com/ambershark-mike/sharklog

          JKSH
          Moderators
          wrote on 5 Nov 2014, 01:24 last edited by
          #4

          Very good points. Thanks, ambershark!

            andrep
            wrote on 5 Nov 2014, 19:52 last edited by
            #5

            Use QTableView and a custom model, not QTableWidget for large data sets.
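As a rough illustration of what a custom model buys you: a QAbstractTableModel subclass answers rowCount(), columnCount() and data() on demand, so the cells can live in one flat container instead of one heap-allocated item per cell. The sketch below is a Qt-free stand-in for that idea so it compiles on its own. The class name `FlatCsvStore` and its `appendRow` method are invented for this example; in a real application the three query functions would become the overrides of the Qt model.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Qt-free stand-in for the storage behind a custom table model.
// All cells sit in one contiguous vector, addressed as row * columns + column.
class FlatCsvStore
{
public:
    explicit FlatCsvStore(std::size_t columns) : columns_(columns) {}

    // Appends one row, padding short rows with empty cells and
    // ignoring cells beyond the fixed column count.
    void appendRow(const std::vector<std::string>& cells)
    {
        for (std::size_t c = 0; c < columns_; ++c)
            cells_.push_back(c < cells.size() ? cells[c] : std::string());
    }

    std::size_t rowCount() const
    {
        return columns_ == 0 ? 0 : cells_.size() / columns_;
    }

    std::size_t columnCount() const { return columns_; }

    // Looks up a single cell; throws std::out_of_range on a bad index.
    const std::string& data(std::size_t row, std::size_t column) const
    {
        if (column >= columns_)
            throw std::out_of_range("column index out of range");
        return cells_.at(row * columns_ + column);
    }

private:
    std::size_t columns_;
    std::vector<std::string> cells_;
};
```

The view only calls data() for the cells it is about to paint, so filling the store costs no per-cell widget or item objects.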

              pepita
              wrote on 6 Nov 2014, 07:20 last edited by
              #6

              Ok, thank you all.

              I will make all the changes that you suggest, and let you know if I get better results...

                pepita
                wrote on 11 Nov 2014, 11:26 last edited by
                #7

                Hi again;

                I've made all the changes without any success. Well, reading line by line and splitting makes the code much cleaner and more elegant than before, that's for sure. But in terms of efficiency, I'm less than a second faster, which is hardly an improvement.

                Concerning the model-based table, this 5.7 MB CSV file takes 11 seconds with the QTableWidget, and 34 with the QTableView and QStandardItemModel, so I definitely don't see how this can solve the problem.

                And I agree with ambershark: it is populating the view that is slow (commenting out the for-loop eliminates the delay). I do not see how to optimize it.

                Also, I still have this issue with the QMessageBox, which doesn't show its text, just the window title.

                I'm leaving the updated code here for you to have a look. Thanks again for your help and tips:

                In the constructor:
                @
                model = new QStandardItemModel();
                ui->tableView->setModel(model);
                @

                The function:
                @
                void CsvViewer::openCsvFile ()
                {
                QMessageBox msgBox ( this );
                msgBox.setStandardButtons( QMessageBox::NoButton);
                msgBox.setWindowTitle( "Loading..." );
                msgBox.setText( "Loading CSV file..." ); // This text doesn't appear when the app freezes...

                    QObject *signalEmitter = sender();
                    if(signalEmitter == ui->actionOpen )
                    {
                        _openedFile = QFileDialog::getOpenFileName (0, "Open CSV file",QDir::currentPath(),"CSV Files(*.csv)");
                 
                    if (!_openedFile.endsWith (".csv"))
                        {
                            QMessageBox::warning (0,"Error","Selected file is not a CSV File","Ok");
                            return;
                        }
                    }
                 
                    QFile file (_openedFile);
                 
                    msgBox.show ();
                 
                    // Open and read through the QFile object, not the file-name string
                    if (file.open(QIODevice::ReadOnly | QIODevice::Text))
                    {
                        while (!file.atEnd())
                        {
                            QString line = file.readLine();
                            line.remove( QRegExp("\n") );
                            _cellData.append (line.split(';'));
                            _countRows++;
                        }
                        _countCols =  _cellData.count () / _countRows;
                    }
                    else
                        return;
                 
                 
                    model->setRowCount ( _countRows );
                    model->setColumnCount ( _countCols );
                 
                 
                    for(unsigned int r = 0; r < _countRows; r++)
                        for(unsigned int c = 0; c < _countCols; c++)
                            model->setItem(r, c, new QStandardItem(_cellData[c + (_countCols) * r]));
                 
                    msgBox.setAttribute( Qt::WA_DeleteOnClose );
                    msgBox.close ();
                }
                

                @

                Thanks. Best regards.

                  godfather82gh
                  wrote on 17 Dec 2015, 18:27 last edited by
                  #8

                  @pepita how were _cellData, _countRows and _countCols declared?

                    kshegunov
                    Moderators
                    wrote on 17 Dec 2015, 18:42 last edited by kshegunov
                    #9

                    @pepita
                    Hello, I'll pitch in with some basic suggestions:
                    Firstly, do not use QStandardItemModel. Instead, subclass QAbstractItemModel (or QAbstractTableModel for plain tabular data), do the loading in a method of your own, and wrap the insertions in beginInsertRows()/endInsertRows() (and the column equivalents) so that the view is notified once per batch and the process is better optimized.

                    for(unsigned int r = 0; r < _countRows; r++)
                        for(unsigned int c = 0; c < _countCols; c++)
                            model->setItem(r, c, new QStandardItem(_cellData[c + (_countCols) * r]));
                    

                    I'm pretty sure that these two loops take most of the time, not the string splitting. Imagine the number of allocations you're doing for such a dataset! On each new allocation the OS has to find free memory to put your object on the heap, and the sheer number of objects makes this slow.

                    Additionally, if you are really after speed, you can consider threading the processing. For example, you could start a single thread that reads the file and puts the lines in a thread-safe queue, and have 2-3 threads (depending on the number of cores) each process a chunk of, say, 100 rows at a time. If the order is important you will still need a barrier so that the worker threads deliver their output in the order of the input, but you will get better performance.

                    Additionally, and probably most importantly, do not go through the data multiple times. You in fact don't need to know the number of rows beforehand, do you? Just go through the data set once and put all the data in the model.
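The single-pass idea could look roughly like the sketch below. It uses standard C++ streams instead of QFile so it can be compiled without Qt, and the names `LoadedCsv` and `loadCsvOnce` are invented for this example. It assumes ';' as separator, no quoted fields, and takes the column count from the first line; the row count is simply whatever has been collected when the stream ends.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>
#include <utility>
#include <vector>

// Hypothetical result type: rows are collected as they are read.
struct LoadedCsv {
    std::size_t columns = 0;                      // taken from the first line
    std::vector<std::vector<std::string>> rows;   // one entry per line
};

// Reads the stream exactly once; no counting pass beforehand.
LoadedCsv loadCsvOnce(std::istream& in, char sep = ';')
{
    LoadedCsv result;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r')
            line.pop_back();                      // tolerate Windows line ends

        std::vector<std::string> cells;
        std::size_t pos = 0;
        while (true) {
            const std::size_t next = line.find(sep, pos);
            if (next == std::string::npos) {
                cells.push_back(line.substr(pos));
                break;
            }
            cells.push_back(line.substr(pos, next - pos));
            pos = next + 1;
        }

        if (result.rows.empty())
            result.columns = cells.size();
        result.rows.push_back(std::move(cells));
    }
    return result;
}
```

After the loop, `rows.size()` and `columns` give the dimensions that the earlier code computed in a separate step.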

                    I hope these pointers help.
                    Kind regards.
