Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • Users
  • Groups
  • Search
  • Get Qt Extensions
  • Unsolved
Collapse
Brand Logo
  1. Home
  2. Qt Development
  3. General and Desktop
  4. Big SQLITE query select crashes
QtWS25 Last Chance

Big SQLITE query select crashes

Scheduled Pinned Locked Moved Unsolved General and Desktop
sqlite3libqt5sql
9 Posts 3 Posters 4.4k Views
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • S Offline
    S Offline
    stormracer
    wrote on 20 Jul 2016, 10:24 last edited by
    #1

    Hello,
    i'm very new to QT. I develop an application to display scientific data. I need to select only a subset of this data to create an image. For example i need to select all values which are over a threshold.
    My idea is to write all this data to an sql database, because it's easy so select the interesting values. I choose an sqlite database for first.

    With small amount of data, all is fine. But test with huge data are not successful.

    For example i insert 167351352 rows.
    But if i try to get this back it crashes with an:

    segfault at 4 ip 00007f892445e1dd sp 00007ffd37b0cb50 error 6 in libQt5Sql.so.5.7.0[7f892443c000+43000]
    

    after selecting 67106384 rows.

    If i try to read this data with python it works. Therefor the database file is not corrupt.

    This code is for reading the data back:

    query.prepare("SELECT TOF FROM Neutrons WHERE X BETWEEN :xl AND :xr AND Y BETWEEN :yu AND :yd;");
    
    query.bindValue(":xl", this->roi_xl);
    query.bindValue(":xr", this->roi_xr);
    query.bindValue(":yd", this->roi_yd);
    query.bindValue(":yu", this->roi_yu);
    
    bool ok = query.exec();
    qDebug() <<  "Status: " << ok;
    int c = 0;
    
    while (query.next()) {
        c++;
    }
    

    The variable c is only for counting and the debugger shows the value 67106384 for c when it crashes.

    Is there something wrong? This program crashes on two different pc at exactly the same place. The Database file has a size from 2,9 GB (2.914.419.712 Bytes). One special point i noticed is, when i divide the number of selected rows with 2^16 the result is nearly 1024, which is a magic number (pagesize or so).

    My platform is Ubuntu 14.04 x64
    Qt Creator 4.0.2 Based on Qt 5.7.0 (GCC 4.9.1 20140922 (Red Hat 4.9.1-10), 64 bit)

    In python is a function called fetchmany(10000000) which querys a block of data. Is there something similar in QT?

    Greetings
    Christian

    1 Reply Last reply
    0
    • S Offline
      S Offline
      SGaist
      Lifetime Qt Champion
      wrote on 20 Jul 2016, 21:10 last edited by
      #2

      Hi and welcome to devnet,

      Can you check the memory consumption you have when running your application ?

      Do you mean you call fetchmany repeatedly to get all the data ?

      Interested in AI ? www.idiap.ch
      Please read the Qt Code of Conduct - https://forum.qt.io/topic/113070/qt-code-of-conduct

      1 Reply Last reply
      1
      • S Offline
        S Offline
        stormracer
        wrote on 22 Jul 2016, 07:22 last edited by
        #3

        Hi,

        if i run my program with the Valgrind Analyser from QT Creator i get the problem "Invalid write of size 4" reported.
        I've uploaded the reported messages as an image here Klick to Image.

        The program need about 1.4GB memory in the Ubuntu task manager.

        With Python i call the fetchmany function in a loop. It seems to be better for python to use this function. It returns a list with the given number of elements.

        Christian

        K 1 Reply Last reply 22 Jul 2016, 13:04
        0
        • S stormracer
          22 Jul 2016, 07:22

          Hi,

          if i run my program with the Valgrind Analyser from QT Creator i get the problem "Invalid write of size 4" reported.
          I've uploaded the reported messages as an image here Klick to Image.

          The program need about 1.4GB memory in the Ubuntu task manager.

          With Python i call the fetchmany function in a loop. It seems to be better for python to use this function. It returns a list with the given number of elements.

          Christian

          K Offline
          K Offline
          kshegunov
          Moderators
          wrote on 22 Jul 2016, 13:04 last edited by
          #4

          @stormracer
          Hi,

          get the problem "Invalid write of size 4" reported

          We got that from:

          segfault at 4 ip 00007f892445e1dd sp 00007ffd37b0cb50 error 6 in libQt5Sql.so.5.7.0[7f892443c000+43000]
          

          The question @SGaist asked was rather to ascertain if you have enough memory to actually hold the dataset.

          The program need about 1.4GB memory in the Ubuntu task manager.

          How much do you have in total?

          On a related note I'm not convinced you've chosen a very good way of storing your data. I don't expect querying a, or actually the data retrieval from a, 180 million rows table to be very fast.

          Kind regards.

          Read and abide by the Qt Code of Conduct

          1 Reply Last reply
          0
          • S Offline
            S Offline
            stormracer
            wrote on 22 Jul 2016, 19:13 last edited by
            #5

            Hello,

            sorry for the missing information. My PC has 12GB Memory and i tested it with a new booted system and about 9GB of free memory. More than enough i think. During the execution of the query there is no significant increase in memory consumption.

            For a very small test i use c code and the system sqlite3 library. With this program i'm able to get all data out of the database. It took about 40sec.

            Do you know a better way to manage this data? For example i want to save graphic pixel with a color value. Therefor i created a table with x,y,color columns. Now i need all colors which are in a specific range and display them in a histogram. With sql it's easy to select only the necessary values.

            Greetings,
            Christian

            K 1 Reply Last reply 23 Jul 2016, 12:44
            0
            • S Offline
              S Offline
              SGaist
              Lifetime Qt Champion
              wrote on 22 Jul 2016, 22:18 last edited by
              #6

              Do you mean you are trying to get the histogram from an image ?

              Interested in AI ? www.idiap.ch
              Please read the Qt Code of Conduct - https://forum.qt.io/topic/113070/qt-code-of-conduct

              S 1 Reply Last reply 24 Jul 2016, 14:19
              0
              • S stormracer
                22 Jul 2016, 19:13

                Hello,

                sorry for the missing information. My PC has 12GB Memory and i tested it with a new booted system and about 9GB of free memory. More than enough i think. During the execution of the query there is no significant increase in memory consumption.

                For a very small test i use c code and the system sqlite3 library. With this program i'm able to get all data out of the database. It took about 40sec.

                Do you know a better way to manage this data? For example i want to save graphic pixel with a color value. Therefor i created a table with x,y,color columns. Now i need all colors which are in a specific range and display them in a histogram. With sql it's easy to select only the necessary values.

                Greetings,
                Christian

                K Offline
                K Offline
                kshegunov
                Moderators
                wrote on 23 Jul 2016, 12:44 last edited by
                #7

                @stormracer
                Hello,

                Do you know a better way to manage this data? For example i want to save graphic pixel with a color value. Therefor i created a table with x,y,color columns. Now i need all colors which are in a specific range and display them in a histogram. With sql it's easy to select only the necessary values.

                Even with the most naive vector of structures and full fledged sorting/searching in it you'd most probably get better performance. You have plenty of memory, and the standard library provides a bunch of numerical algorithms, so I don't see a good reason to put all the data in a database, unless it's for permanent storage. Plus with QVector (or the other containers) you can easily have multithreaded code working on your dataset (the QtConcurrent framework).

                Kind regards.

                Read and abide by the Qt Code of Conduct

                1 Reply Last reply
                0
                • S SGaist
                  22 Jul 2016, 22:18

                  Do you mean you are trying to get the histogram from an image ?

                  S Offline
                  S Offline
                  stormracer
                  wrote on 24 Jul 2016, 14:19 last edited by
                  #8

                  @SGaist said:

                  Do you mean you are trying to get the histogram from an image ?

                  It's not a real image. It like data from a sensor which serves Coordinates in x and y and additional a specific value.

                  One part of the app is to display an 2D histogram with the coordinates. Than i select an region of interest in the 2D histogram and plot a 1D histogram with the third value which are in the limits of the selected coordinates.

                  @kshegunov, my development pc has this amount of memory. But for other pc i can't rely on this and i will get larger data files later. Therefor i search for a solution for managing large amount of data. My idea for sql was that database systems are made for efficient data storing and searching. I choose sqlite because it doesn't need a server in the background.

                  Kind regards
                  Christian

                  K 1 Reply Last reply 24 Jul 2016, 14:33
                  0
                  • S stormracer
                    24 Jul 2016, 14:19

                    @SGaist said:

                    Do you mean you are trying to get the histogram from an image ?

                    It's not a real image. It like data from a sensor which serves Coordinates in x and y and additional a specific value.

                    One part of the app is to display an 2D histogram with the coordinates. Than i select an region of interest in the 2D histogram and plot a 1D histogram with the third value which are in the limits of the selected coordinates.

                    @kshegunov, my development pc has this amount of memory. But for other pc i can't rely on this and i will get larger data files later. Therefor i search for a solution for managing large amount of data. My idea for sql was that database systems are made for efficient data storing and searching. I choose sqlite because it doesn't need a server in the background.

                    Kind regards
                    Christian

                    K Offline
                    K Offline
                    kshegunov
                    Moderators
                    wrote on 24 Jul 2016, 14:33 last edited by kshegunov
                    #9

                    @stormracer
                    Hi,

                    my development pc has this amount of memory. But for other pc i can't rely on this and i will get larger data files later. Therefor i search for a solution for managing large amount of data. My idea for sql was that database systems are made for efficient data storing and searching. I choose sqlite because it doesn't need a server in the background.

                    Yes, somewhat, for general purpose data. Seeing you're searching for a set of particles in space (2D space?), and assuming you're wanting to mostly search and extract from the dataset not modifying it, I'd consider some sort of HDD swapped (or database backed) k-d tree. From my experience SQL databases behave rather poorly for scientific data, especially when the datasets are big. Also you could try getting a stripped down b-tree implementation (basically a database without the bells and whistles) that you can use for storage of the data and do the indexing/searching by yourself. Or the most basic approach is to work with the data as you'd do normally, but swap the dataset instead of holding it in memory (e.g. QCache comes to mind here).

                    PS.
                    Depending on how exactly you want to build the histogram you could also simply discretize the space (if I'm correct that you're working with floating point numbers for coordinates).

                    Read and abide by the Qt Code of Conduct

                    1 Reply Last reply
                    1

                    1/9

                    20 Jul 2016, 10:24

                    • Login

                    • Login or register to search.
                    1 out of 9
                    • First post
                      1/9
                      Last post
                    0
                    • Categories
                    • Recent
                    • Tags
                    • Popular
                    • Users
                    • Groups
                    • Search
                    • Get Qt Extensions
                    • Unsolved