Which Container is best suited for holding 10 millions of records (String, String) ?

21 Posts 12 Posters 6.7k Views 6 Watching
  • mrjj (Lifetime Qt Champion) wrote (#5):

    Hi,
    It would be good to know what you need 10 million records for.

    • vpn9507 said:

      Why don't you use SQL? It's not a good idea to keep that much data in memory.

      ksranjith786 wrote (#6):

      @vpn9507 The response time must be within 5 milliseconds. Using SQL would degrade the response time, and deploying SQL is not recommended for our application.

      • Chris Kawa said:

        One more consideration against storing it all in memory: let's say each string is about 100 characters. A QString needs 2 bytes for each character, and you have two of them per element. A very rough estimate: 10 000 000 * ((100 * 2) * 2) = 4 GB of data, and that's not counting the weight of the structure itself (i.e. all the pointers the map needs to create its nodes) or all the other memory your app needs. If you use that much, the OS is gonna swap your memory like there's no tomorrow, slowing your app down significantly.

        ksranjith786 wrote (#7):

        @Chris-Kawa Could you please explain how you arrived at the memory estimate "A very rough estimate: 10 000 000 * ((100 *2 ) * 2) = 4Gb"? I did not understand why you multiplied by 2 at the end.

          artwaw wrote (#8):

          @ksranjith786
          As for container classes, there is a quite handy overview of them here. The actual choice depends on many factors, such as how the data will be accessed and managed, but in the end it is your choice.
          As for "deployment of SQL", you do not actually have to deploy anything besides your app's runtime environment; SQLite is well integrated.
          As for the actual storage of that amount of data: you can keep it all in memory using container classes (assuming you have enough memory, of course), you can put it in a memory-based SQLite db, or you can use SQLite with a regular file-based db and write yourself a model that handles the data the way you need.
          A well-written model would be able to cache some of the data in memory, making it available instantly, and perform some cache optimization in the background (read in advance / write when idle).
          I do not know your use case though, so this is just speculation.
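
To make the "all in memory using container classes" option concrete, here is a minimal sketch of a hash-based lookup table. It uses std::unordered_map with std::string as a standard-library stand-in for something like QHash<QString, QString>; the names OfferTable, build_table and find_offer, and the synthetic "item-N" / "offer-N" records, are made up for illustration and are not from the original question.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <unordered_map>

// Stand-in for a Qt hash container; key and value naming is hypothetical.
using OfferTable = std::unordered_map<std::string, std::string>;

// Fill a table with n synthetic records ("item-0" -> "offer-0", ...).
OfferTable build_table(std::size_t n) {
    OfferTable table;
    table.reserve(n);  // size the bucket array once instead of rehashing while loading
    for (std::size_t i = 0; i < n; ++i) {
        table.emplace("item-" + std::to_string(i), "offer-" + std::to_string(i));
    }
    return table;
}

// Average-case O(1) lookup; returns std::nullopt when the key is absent.
std::optional<std::string> find_offer(const OfferTable& table, const std::string& key) {
    const auto it = table.find(key);
    if (it == table.end()) {
        return std::nullopt;
    }
    return it->second;
}
```

Whether this fits a 5 ms budget with 10 million real records depends on the machine and on how much RAM the strings take, so it would have to be measured rather than assumed.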

          For more information please re-read.

          Kind Regards,
          Artur

            koahnig wrote (#9):

            @ksranjith786

            I think this is the part that is throwing you off:
            (100 * 2) = 100 characters at 2 bytes each.
            The multiplication by 2 at the end is therefore because there are 2 strings per entry.

            @Chris-Kawa said in Which Container is best suited for holding 10 millions of records (String, String) ?:

            let's say each string is about 100 characters, A QString needs 2 bytes for each character
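
Spelled out as code, the estimate looks roughly like this. It is only a sketch of the arithmetic from the quoted post: the function name payload_bytes is made up, and it counts raw character data only, not container nodes or per-string overhead.

```cpp
#include <cstdint>

// Bytes of raw character data for `records` (key, value) pairs, assuming
// every string holds `chars_per_string` characters of 2 bytes each.
constexpr std::uint64_t payload_bytes(std::uint64_t records,
                                      std::uint64_t chars_per_string) {
    constexpr std::uint64_t bytes_per_char = 2;      // 2 bytes per QString character
    constexpr std::uint64_t strings_per_record = 2;  // the final "* 2": key + value
    return records * chars_per_string * bytes_per_char * strings_per_record;
}

// 10 000 000 * ((100 * 2) * 2) = 4 000 000 000 bytes, i.e. about 4 GB.
static_assert(payload_bytes(10'000'000, 100) == 4'000'000'000ULL,
              "matches the rough estimate from the thread");
```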

            Vote the answer(s) that helped you to solve your issue(s)

              J.Hilk (Moderators) wrote (#10):

              @ksranjith786

              10 000 000 entries, each a pair of 2 strings, with 100 characters per string and 2 bytes of memory per character


              Be aware of the Qt Code of Conduct, when posting : https://forum.qt.io/topic/113070/qt-code-of-conduct


              Q: What's that?
              A: It's blue light.
              Q: What does it do?
              A: It turns blue.

                ksranjith786 wrote (#11):

                @ksranjith786
                As per the link http://doc.qt.io/qt-5/containers.html#algorithmic-complexity

                The values above may seem a bit strange, but here are the guiding principles:
                • QString allocates 4 characters at a time until it reaches size 20.
                • From 20 to 4084, it advances by doubling the size each time. More precisely, it advances to the next power of two, minus 12. (Some memory allocators perform worst when requested exact powers of two, because they use a few bytes per block for book-keeping.)
                • From 4084 on, it advances by blocks of 2048 characters (4096 bytes). This makes sense because modern operating systems don't copy the entire data when reallocating a buffer; the physical memory pages are simply reordered, and only the data on the first and last pages actually needs to be copied.
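
As a rough illustration of the three rules quoted above, the capacity reserved for a string of a given length could be computed like this. This is my reading of the quoted text, not Qt's actual source: the function name grown_capacity and the exact handling of the boundaries at 20 and 4084 are assumptions, and it says nothing about how newer Qt versions behave.

```cpp
#include <cstddef>

// Capacity (in characters) that the quoted growth rules would reserve for a
// string of `size` characters. Illustrative only; boundaries are assumed.
constexpr std::size_t grown_capacity(std::size_t size) {
    if (size <= 20) {
        return (size + 3) / 4 * 4;        // grows 4 characters at a time
    }
    if (size <= 4084) {
        std::size_t pow2 = 32;
        while (pow2 - 12 < size) {
            pow2 *= 2;                    // advance to the next power of two ...
        }
        return pow2 - 12;                 // ... minus 12
    }
    const std::size_t blocks = (size - 4084 + 2047) / 2048;
    return 4084 + blocks * 2048;          // grows in blocks of 2048 characters
}
```

Under these assumptions a 100-character string that was built up by appending would sit in a 116-character buffer, so the real footprint can be somewhat above the 2-bytes-per-character estimate.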

                • koahnig wrote (#12):

                  WOW 4 responses within a minute. Never saw this before.

                  Vote the answer(s) that helped you to solve your issue(s)

                    ksranjith786 wrote (#13):
                    This post is deleted!

                      mrjj (Lifetime Qt Champion) wrote (#14):

                      @ksranjith786
                      Hi
                      You should really test with SQLite if your use case is to select a subset from among 10 million rows and display it.

                        VRonin wrote (#15):

                        @ksranjith786 said in Which Container is best suited for holding 10 millions of records (String, String) ?:

                        Around 5 msec for lookup

                        You probably need a proper database (not just SQLite but maybe PostgreSQL), index it properly and maybe even give it standalone-resources (i.e. run it on a dedicated machine) to achieve that performance

                        "La mort n'est rien, mais vivre vaincu et sans gloire, c'est mourir tous les jours"
                        ~Napoleon Bonaparte

                        On a crusade to banish setIndexWidget() from the holy land of Qt

                          kshegunov (Moderators) wrote (#16):

                          @VRonin said in Which Container is best suited for holding 10 millions of records (String, String) ?:

                          You probably need a proper database (not just SQLite but maybe PostgreSQL), index it properly and maybe even give it standalone-resources (i.e. run it on a dedicated machine) to achieve that performance

                          Even this may not be viable. On a typical 10/100 network you'd get about 1 ms of latency from the TCP/IP round trip, which shrinks that 5 ms window considerably.

                          @ksranjith786
                          How are you going to use that dataset?

                          Our use case is that, our application need to fetch offers associated for an item during item scan.

                          Elaborate on that: break it down step by step for us, say what "offers" and "items" are in this context, and most importantly what an "item scan" is.

                          Read and abide by the Qt Code of Conduct

                          • Eduardo12l wrote (#17):

                            My MySQL db with 3000 records takes at least 1.5 seconds to respond.

                              kshegunov (Moderators) wrote (#18):

                              That's simply too long. You should inspect your database and how you use it.

                              @VRonin
                              It just occurred to me that this problem is a prime candidate for usage and testing of your big hash lib. :)

                              Read and abide by the Qt Code of Conduct

                                VRonin wrote (#19):

                                @kshegunov Lol, thanks, but 5 ms is not achievable even in my wildest dreams. I also suspect that 10 million QString keys (which are not dumped to the hard drive; only the values are) are enough to exhaust most machines' memory.

                                "La mort n'est rien, mais vivre vaincu et sans gloire, c'est mourir tous les jours"
                                ~Napoleon Bonaparte

                                On a crusade to banish setIndexWidget() from the holy land of Qt

                                • joeQ wrote (#20):

                                  oh, I think Redis may help you!

                                  Just do it!

                                    VRonin wrote (#21):

                                    @joeQ You need to have spent quite a bit of money on RAM for your PC for that to be an option

                                    "La mort n'est rien, mais vivre vaincu et sans gloire, c'est mourir tous les jours"
                                    ~Napoleon Bonaparte

                                    On a crusade to banish setIndexWidget() from the holy land of Qt
