
General question about arrays and I/o



  • I'm a newbie at C++ programming.
    I'm trying to write and read binary data. I can do this with no errors if I write a single number, but if I write an array I cannot correctly read the written data back. I have attached a picture to the post for these two cases.
    Could you please explain what @0x2dfd78 in the data variable means (see the pictures in debug mode) and how to read and display array data in the console?

    #include <iostream>
    #include <cstdio>
    #include <cstring>
    #include <string>
    #include <vector>
    #include <QDebug>
    
    int main()
    {
        std::string str_file = "C:\\Users\\Tasik\\Documents\\Qt_prj\\proba.bin";
        int n = str_file.length();
        char char_file[100];
    
        std::strcpy(char_file, str_file.c_str());
    
        FILE *pFile;
        pFile = fopen(char_file,"w+");
    
        int data[5] = {1, 2, 3, 4, 5};
        const int m = std::size(data);
    
        fwrite(&data, sizeof(int), sizeof(data), pFile);
        fclose (pFile);
    
        pFile = fopen(char_file,"r");
        int result;
        fread(&result, sizeof(int), m, pFile);
    
        std::cout << data << std::endl;
        std::cout << result;
    
        fclose (pFile);
    
        return 0;
    }
    

    Arrays.jpg


  • Lifetime Qt Champion

    Hi
    Here is a more c++ / Qt version

    #include <QDataStream>
    #include <QDebug>
    #include <QFile>
    #include <QString>
    #include <QVector>

    int main()
    {
        QVector<int> data = {1, 2, 3, 4, 5}; // using a container and not a C array
        QString filename{"e:/test.bin"}; // CHANGE ME! :)

        QFile file(filename);

        if (file.open(QFile::WriteOnly)) {
            QDataStream out(&file);
            out << data; // << is an operator which you can define for your own types. QVector has one already
            file.close();
        }

        data.clear(); // clear the container so we can see it works

        if (file.open(QFile::ReadOnly)) {
            QDataStream in(&file);
            in >> data; // >> is an operator which you can define for your own types. QVector has one already
        }

        qDebug() << data; // qDebug understands QVector and can just print it

        return 0;
    }
    

    As you can hopefully see, it's far less low level than the FILE interface.
    Moreover, the << and >> we use to save the data can be used with all normal types
    (int, float, etc.) and also with Qt classes, so you can just stream them with no extra code.
    The << and >> handle the byte sizes for you, and all you then need to take care of is doing the save and load in the same
    order, or it fails.

    It also allows saving and loading objects the exact same way if you give them the
    QDataStream operators.
    Like:

    class budget
    {
        float transportation, grocery, food, stationery;
        QString key;

    public:
        budget() {}

        // Write the members in a fixed order.
        friend QDataStream &operator<<(QDataStream &stream, const budget &myclass)
        {
            stream << myclass.food;
            stream << myclass.grocery;
            stream << myclass.key;
            stream << myclass.stationery;
            stream << myclass.transportation;
            return stream;
        }

        // Read the members back in exactly the same order.
        friend QDataStream &operator>>(QDataStream &stream, budget &myclass)
        {
            stream >> myclass.food;
            stream >> myclass.grocery;
            stream >> myclass.key;
            stream >> myclass.stationery;
            stream >> myclass.transportation;
            return stream;
        }
    };

    // This allows you to stream your own class 100% like a vector:
    budget myBuget;
    out << myBuget;
    
    

    Overall this makes the program easier to read and also far less error prone than handling it at byte level with stuff like
    fwrite(&data, sizeof(int), sizeof(data), pFile);

    I hope I sold it to you :)


  • Lifetime Qt Champion

    Hi
    The @0x2dfd78 is the address of the first element of the C array. In most expressions a C array
    decays to a pointer to its first element, and hence the debugger shows the starting address
    and then the elements.

    Next thing to note is that you are using the FILE interface. It's a bit old school, and normally iostreams are used for C++ programs.

    Anyway, the reason it is not working for the array is that you read it back into one int and not into the array:
    int result; // this is your buffer, but the file has 5 ints
    fread(&result, sizeof(int), m, pFile);
    What you mean is actually
    fread(&data, sizeof(int), m, pFile); // use the data array as target buffer

    To print it to the console, you need to take each index (or use a for loop)
    std::cout << data[0] << std::endl;

    as otherwise it sees the pointer and just writes the address (of the first element).
    It does not know it's an array or its size.
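
    A minimal sketch of the full round trip with the FILE interface, under a few assumptions: the helper names writeInts, readInts and printInts are made up for illustration, and a std::vector<int> stands in for the C array. It also addresses two details of the original program. The third argument of fwrite is an element count, so passing sizeof(data) (a byte count, 20 where int is 4 bytes) asks for more elements than the 5-element array holds. And the "w+" and "r" modes open the file in text mode, which on Windows can translate some bytes, so "wb" and "rb" are used here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Write `values` to `path` as raw ints. Returns true when every element was written.
bool writeInts(const std::string &path, const std::vector<int> &values)
{
    // "wb" = write in binary mode, so no byte is translated on Windows.
    std::FILE *file = std::fopen(path.c_str(), "wb");
    if (file == nullptr)
        return false;
    // The third argument is the number of ELEMENTS, not the number of bytes.
    const std::size_t written =
        std::fwrite(values.data(), sizeof(int), values.size(), file);
    std::fclose(file);
    return written == values.size();
}

// Read up to `count` raw ints from `path`. The result holds what was actually read.
std::vector<int> readInts(const std::string &path, std::size_t count)
{
    std::FILE *file = std::fopen(path.c_str(), "rb");
    if (file == nullptr)
        return {};
    std::vector<int> values(count); // the buffer is as large as the request
    const std::size_t read = std::fread(values.data(), sizeof(int), count, file);
    std::fclose(file);
    assert(read <= count);
    values.resize(read);
    return values;
}

// Print every element; streaming a C array by name would print an address instead.
void printInts(const std::vector<int> &values)
{
    for (int value : values)
        std::printf("%d ", value);
    std::printf("\n");
}
```

    Calling writeInts(path, values) followed by printInts(readInts(path, values.size())) should then print the same numbers that were written.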

    One other note: since you are using C++ and Qt, why not use the features Qt provides to save the data?
    It's OK if you want to use pure C++, but Qt does offer many benefits for saving, say, QStrings and other Qt data types.



  • @mrjj ahaha, not sold yet, but many thanks for such a detailed answer :)
    Tomorrow I will try your code (it is already deep night in Saint Petersburg).
    The main questions that affect my decision are the speed of reading/writing binary files that weigh about N*10 gigabytes and the possibility of using threads... What do you think, should I use the Qt interface or low-level functions (such as fwrite/fread)?


  • Lifetime Qt Champion

    @Please_Help_me_D

    I/O is always slow, so using Qt classes should not hurt much. However, holding such large datasets in memory efficiently can be complicated.

    Can you elaborate a bit more about your problem?

    Regards



  • @Please_Help_me_D
    I'll throw in a couple of (very slightly) controversial suggestions, just for your consideration, since you are talking about such large levels of I/O.

    • The Qt, C++ and even C stdio I/O functions like fread/fwrite() use an underlying extra buffer level between the disk data and your code access. If performance is critical, and if what you are doing is very simple (e.g. just sequential access), lower level functions are named read() & write() (or, depending on OS/compiler, _read/_write()). Further OS-specific calls are also available for asynchronous I/O, which again might improve your particular situation. You would have to time on platforms to see how much difference these make.

    • I must admit I have never used this, though I have often wanted to: there is the mmap() (#include <sys/mman.h>) family of calls. This "maps" (areas of) disk files directly into memory, so you do not actually do any I/O, the data appears and is accessed just like an array of bytes in memory. Again, you would have to time.

    I don't know what the experts here think about my two points.

    Also, be aware that reading from disk is a lot faster than writing to it, especially (as I understand it) if you have an SSD, though I guess that would not apply to your large data, but again you could check timings.

    First thing is to address @aha_1980's request for more information on what you are trying to achieve.


  • Lifetime Qt Champion

    @Please_Help_me_D
    Hi
    The overhead from QDataStream etc. will be very minor if you are just streaming a big 10 GB memory block the same way you would with
    the FILE interface.
    So it's hard to suggest what will be the best solution before we know how you have the data structured etc.

    Also, what is the data? 10 GB is massive :)



  • @mrjj I tried the example you gave with QVector and QFile and I liked it.
    @aha_1980 @JonB So I'll try to describe the problem a little more.
    I know Matlab pretty well and I started to learn C++ and Qt to write a program that performs mathematical operations on data. I'm going to read a raw binary data file and store it in the scientific HDF5 format (primarily as a 2-dimensional array). Those files may weigh from N*100 megabytes to N*100 gigabytes. Once I have read the data and stored it in HDF5 format, I only need access to portions of that data. I never need to load all the data into RAM at the same time.
    In Matlab I worked with the memory mapping technique (memmap function), but now I want to use the HDF5 format, which can replace the need for memory mapping. I'm afraid that on Windows there are some difficulties with mmap.

    Is there a way in Qt to generate a sequence of numbers without loops? For example, if I want an array with the numbers {1, 2, 3, 4, 5} I write:

    int data[] = {1, 2, 3, 4, 5};
    

    But if I want to generate integer numbers from n to N with step dn I would get:

    int data = {n, n+dn, n+2*dn, ... , N};
    

    And also is there a way to get access to several elements of an array. For example:

    int data[] = {1, 2, 3, 4, 5};
    

    How can I get the 2nd, 3rd and 4th elements of data without a loop?


  • Lifetime Qt Champion

    Hi
    Good to hear. :)
    OK, so it's the HDF5 format.
    Do note there exist libraries for using that format from C++.
    Whether they provide benefits for your use case or not is hard to say.
    But since you want to read in the data, you will need to use mem mapped files or similar
    and you might be able to get that out of the box with a library.

    But if I want to generate integer numbers from n to N with step dn I would get:

    int data = {n, n+dn, n+2*dn, ... , N};
    Hmm. Nothing really springs to mind. Why are you against a loop ?
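
    For the sequence from n to N with step dn, a plain loop wrapped in a small helper is probably the simplest route. The sketch below assumes a positive step and uses a made-up function name, makeRange; it is not a Qt or standard library function.

```cpp
#include <cassert>
#include <vector>

// Build {first, first + step, first + 2 * step, ...} without going past `last`.
std::vector<int> makeRange(int first, int last, int step)
{
    assert(step > 0); // a zero or negative step would never reach `last`
    std::vector<int> values;
    for (int value = first; value <= last; value += step)
        values.push_back(value);
    return values;
}
```

    With this helper, makeRange(1, 5, 1) yields the five values 1 to 5, and the call site itself contains no loop.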

    And also is there a way to get access to several elements of an array. For example:
    int data[] = {1, 2, 3, 4, 5};
    How can I get the 2nd, 3rd and 4th elements of data without a loop?

    data[index] gives access. If you need to modify the value, then

    int &val = data[index];

    val = 100; // will change the table value to 100

    Do note that QFile also supports memory-mapped files (QFile::map).



  • @mrjj Hi))
    I installed the official libraries from HDFGroup with the C++ libraries checked while running CMake. There is also an HDF5 cpp project, but I don't understand what this project does. Does it just provide a simple interface to use HDF5...?

    But since you want to read in the data, you will need to use mem mapped files or similar
    and you might be able to get that out of the box with a library.

    I didn't understand that... Do you mean that the HDF5 libraries use memory mapping, or do I need to read big data with a memory map? I read this stuff (HDF5 or memory mapping), and since HDF5 is well known and actively used I decided to use HDF5 instead of memory mapping.

    int data = {n, n+dn, n+2*dn, ... , N};

    Hmm. Nothing really springs to mind. Why are you against a loop ?

    Well, I'm from Matlab and it taught me to avoid loops (because they are slow in Matlab), and I'm slightly uncomfortable now when I use loops where I could avoid them :)

    I need to get to know memory mapping in Qt a little better. Do you know whether Qt memory mapping works on Windows? A week ago I was trying to install MPICH (for cluster computation, just to try) and I could not because of some error connected with the lack of memmap on Windows or something...


  • Lifetime Qt Champion

    @Please_Help_me_D
    Hi

    • Didn't understand that..
      I meant that whatever HDF5 uses to allow reading those large files might just work out of the box, and then maybe there is no need for your own memory-mapped file or similar. I was just saying you need something extra to handle such large files, and it seems HDF5 does give that via its chunked file design.

    (loops)
    Ahh, that way. Well, there is a thing with loops in C++/Qt.
    If you fill a very large array in the main thread, it will lag your program's interface.
    But besides that, loops are fast in C++ (generally speaking).

    (QFile map)
    The QFile map function should also work on Windows as far as I know.
    Windows does support it natively and I think QFile map uses that.

    The https://github.com/ess-dmsc/h5cpp
    provides a C++ wrapper for a C library.
    This is often done to allow for object-oriented programming with the
    C library and maybe hide details behind classes that are easier to use than raw C code.
    You don't need to use the wrapper if you feel good with C code.

    However, I don't have experience with the HDF5 format, but looking over the docs, it really seems the way to go, as it should provide you with anything you need to make a C++ program that can consume and produce such giga files.

    And the C API doesn't really look that bad:
    https://support.hdfgroup.org/ftp/HDF5/examples/examples-by-api/hdf5-examples/1_10/C/H5D/h5ex_d_chunk.c



  • @mrjj Thank you for information
    I am working on I/O with information you provided :)



  • @mrjj one more question. I get an error: array subscript is not an integer:

    int data[5] = {1, 2, 3, 4, 5};
    int ind[3] = {0, 1, 2};
    int data2 = data[ind]; // here is that error . **ind** is highlighted by red
    

    I declared ind as integer but still can't get access to those elements of the array...

    And here is a similar problem, "expression is not determined by a constant. Failure caused by reading a variable beyond its lifetime":

        std::string str_file = "C:\\Users\\Tasik\\Documents\\Qt_prj\\proba.bin";
        int n = str_file.length();
        char char_file[n]; // here is that error. It appears only when I launch the application
    

  • Lifetime Qt Champion

    Hi,

    ind is not an integer, it's an array of 3 integers.

    Depending on your compiler you will have to allocate your char_file array on the heap using new and then delete when done with that array.



  • @SGaist Hello
    Thank you for answer
    So is there a way to extract few elements from an array at the same time without loop?

    Depending on your compiler you will have to allocate your char_file array on the heap using new and then delete when done with that array.

    My compiler is MSVC 2017. Could you write an example of this?


  • Lifetime Qt Champion

    @Please_Help_me_D said in General question about arrays and I/o:

    So is there a way to extract few elements from an array at the same time without loop?

    Not with plain C arrays.
    But you can do this with QVector: https://doc.qt.io/qt-5/qvector.html#mid
    There is something you can do without copying anything: an array decays to a pointer to its first element, so:

    int data[5] = {1, 2, 3, 4, 5};
    int *data2 = &data[2]; // data2 is now [3, 4, 5].
    

    Do you really need to copy to data2? You can simply have a variable "length" containing the length of the sub-array in data.

    "My compiler is MSVC 2017. Could you write an example of this?":

    char *char_file = new char[n]; // Allocate on the heap
    ...
    delete[] char_file; // Delete when not needed anymore
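
    As an alternative to manual new[] and delete[], a std::vector<char> can own the buffer and release it automatically. The sketch below uses a made-up helper name, toCharBuffer, and reserves one extra char because strcpy also copies the terminating '\0' (so a buffer of exactly n chars is one too small for a string of length n). Note that fopen accepts str_file.c_str() directly, so for the original program the copy may not be needed at all.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Copy `text` into a writable, null-terminated buffer that frees itself.
std::vector<char> toCharBuffer(const std::string &text)
{
    // +1 leaves room for the terminating '\0' that strcpy also copies.
    std::vector<char> buffer(text.size() + 1);
    std::strcpy(buffer.data(), text.c_str());
    assert(buffer.back() == '\0');
    return buffer;
}
```

    The returned vector's data() pointer can be passed wherever a char * is expected, and the memory is released when the vector goes out of scope.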
    


  • @jsulm thank you for the answer
    The problem is that usually the indexes I have are something like:

    int data[5] = {1, 2, 3, 4, 5};
    int ind[3] = {4, 2, 3};
    

    and then I need to get access to those elements like:

    int data2[3] = data[ind];
    

    Now I read about QVector, I hope it is able to do that.

    You know both examples that you wrote seems to me don't work properly.

        int data[5] = {1, 2, 3, 4, 5};
        int *data2 = &data[2]; // data2 is now [3].
    

    And:

        int n = 5;
        char char_file = new char[n]; // error: cannot initialize a variable of type 'char' with an rvalue of type 'char *'
        delete[] char_file; // Delete when not needed anymore
    

    Where can I read about '*' and '&' signs when using in such ways? What it gives?


  • Lifetime Qt Champion

    @Please_Help_me_D said in General question about arrays and I/o:

    seems to me don't work properly

    In what way? &data[2] points to 3 in data, so data2[0] == 3, data2[1] == 4 and data2[2] == 5

    Please read about pointers in C/C++:

    // It must be *char_file not just char_file
    char *char_file = new char[n];
    

    I edited my previous post as I forgot *



  • @jsulm

    In what way? &data[2] points to 3 in data, so data2[0] == 3, data2[1] == 4 and data2[2] == 5

    I attach the picture below. data2 is now equal to 3 and that is it. Is that correct? data.jpg
    After I added the * pointer the program works, but it seems to me that the length of char_file doesn't depend on n. If n=4 then the length of char_file=32, if n=5 then char_file=32. Is that OK?
    char.jpg


  • Lifetime Qt Champion

    @Please_Help_me_D said in General question about arrays and I/o:

    data2 is now is equal to 3 and that is it. Is it correct?

    Yes it is. You can treat a pointer as an array (in C/C++ an array decays to a pointer to its first element). So, data2[0] == 3, data2[1] == 4...
    Just do

    qDebug() << data2[1];
    

    and see.

    Regarding the second question: this is the debugger view. Your array is for sure 4 chars in size. To verify, do

    char_file[4] = 1;
    

    your app may crash (writing past the end is undefined behavior, so a crash is not guaranteed).



  • @jsulm Yes that works:

        int data[5] = {1, 2, 3, 4, 5};
        int *data2 = &data[2]; // data2 is now [3, 4, 5].
        qDebug() << data2[2];
    

    But what's the magic behind that? :) In debug mode I can see that data2 has only a single number.

    But this doesn't crash, and I can see some output in the terminal (it is not 100 but some letters or signs, as I think it is a char) even if I launch the program outside debug mode:

        int n = 2;
        char *char_file = new char[n]; // Allocate on the heap
        delete[] char_file; // Delete when not needed anymore
    
        char_file[5] = 100;
    
        std::cout << char_file[5];
    

    Is it possible in Qt to use a command line while the program is stopped in debug mode? For example, if it's stopped and I want to do something in real time (while the program is stopped)? Like the Matlab command line in debug mode.



  • @Please_Help_me_D said in General question about arrays and I/o:

    Is it possible in Qt to use a command line while the program is stopped in debug mode? For example, if it's stopped and I want to do something in real time (while the program is stopped)? Like the Matlab command line in debug mode.

    No, this is a C++ compiled program (nothing to do with Qt), not Matlab/an interpreted language! You can print out values, and even at a pinch poke a value into a variable, but you can't start "telling" the debugger/program to go perform actions :)


  • Lifetime Qt Champion

    @Please_Help_me_D said in General question about arrays and I/o:

    But what's the magic behind that?:) I debug I can see that data2 has only a single number.

    Pointer magic :-) data2 is defined as a pointer to int, that's why the debugger only shows one value. But you as the developer know that it's actually pointing into an array of int. Writing data2[1] is the same as *(data2 + 1).
    *(data2 + 1) means: give me the value in memory at the position (data2 + 1) is pointing to.
    Keep in mind that in this case (data2 + 1) increments the pointer by 4 bytes, as sizeof(int) == 4 here.
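
    The equivalence of the two spellings can be checked with a small sketch. The helper names elementAt and bytesToNextElement are made up for illustration; the second one measures the step between neighbouring elements in bytes, which equals sizeof(int) whatever that is on the platform.

```cpp
#include <cassert>
#include <cstddef>

// Return the element `offset` positions after `start`, spelled both ways.
int elementAt(const int *start, std::size_t offset)
{
    const int viaIndex = start[offset];       // array-style access
    const int viaPointer = *(start + offset); // pointer arithmetic, then dereference
    assert(viaIndex == viaPointer);           // the two spellings mean the same thing
    return viaPointer;
}

// Distance in bytes between `start` and the element right after it.
std::size_t bytesToNextElement(const int *start)
{
    const char *here = reinterpret_cast<const char *>(start);
    const char *next = reinterpret_cast<const char *>(start + 1);
    return static_cast<std::size_t>(next - here);
}
```

    With int data[5] = {1, 2, 3, 4, 5} and int *data2 = &data[2], elementAt(data2, 1) returns 4, the same value as data[3].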

    This is undefined behavior and could well crash, as you're accessing memory which was already freed:

    delete[] char_file; // Delete when not needed anymore
    char_file[5] = 100;
    std::cout << char_file[5];
    


  • @JonB thank you! Now I know that:)
    @jsulm ok that is interesting. I need to read about pointers
    I read about QVector, and I can't solve the situation where I need to extract the {4, 2, 3} elements from data (without a loop) so that:
    data2[0] = data[4] = 5,
    data2[1] = data[2] = 3,
    data2[2] = data[3] = 4,
    where data = {1, 2, 3, 4, 5};
    It seems to me that QVector::mid(int pos, int length = ...) can't do that. But of course:

        QVector<int> data = {1, 2, 3, 4, 5}; 
        QVector<int> data2 = data.mid(1,2); // data2 = {2, 3}
    

    this is also good for me to know.
    I'm trying to avoid loops here because this is like the main standard operation for me, and I feel there should be a way to do that.



  • @Please_Help_me_D
    I will say one thing about "loops": although you may try to avoid them, it is likely that even if there is a library call, it will do a loop internally. (Not always true: copying adjacent elements may be implemented by a memmove or similar.) Even if your Matlab or whatever presents an operation as "non-loop" as far as you are concerned, under the hood that is what it will have to do. C++ is "lower-level" and more "literal" about what is going on/has to go on than a higher level language like Matlab may suggest to you. Not saying you shouldn't ask, or try to avoid them, but be aware it may be inevitable.

    In your example, btw, technically there is no loop. It's just what it looks like: 3 separate statements (which cannot be optimized for non-adjacent data). And someone here will tell you these will take about 3 nanoseconds to perform.



  • @JonB Yes, I understand that many functions that don't explicitly use loops still use them under the hood. But I believe that in computer science there are some tricks that I don't know and that provide high performance, and the specialists who write Qt functions (or Matlab functions) implement those tricks to achieve well-optimized code.
    I'm just starting to learn Qt; I think I just need experience in C++ :)


  • Lifetime Qt Champion

    @Please_Help_me_D said in General question about arrays and I/o:

    I need to extract {4, 2, 3}

    You will need to do it yourself. There is nothing for that in either Qt or the C++ stdlib. Even if there were, it would do nothing other than iterate over {4, 2, 3} in a loop. So, I doubt you can optimise much here.
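
    A hand-written version might look like the sketch below. The function name gather is made up for illustration, and std::vector is used instead of plain C arrays so that the sizes travel with the data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Collect data[indices[0]], data[indices[1]], ... into a new vector.
std::vector<int> gather(const std::vector<int> &data,
                        const std::vector<std::size_t> &indices)
{
    std::vector<int> picked;
    picked.reserve(indices.size());
    for (std::size_t index : indices) {
        assert(index < data.size()); // every index must refer to an existing element
        picked.push_back(data[index]);
    }
    return picked;
}
```

    For data = {1, 2, 3, 4, 5} and indices = {4, 2, 3}, the result is {5, 3, 4}, which matches the data2 values asked for earlier in the thread.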

