Huge performance hit when using ifstream vs fopen()?

    Question

  • I compiled my program with VC++ EE and found that reading large files (over 700 MB) with ifstream is dramatically slower than doing the same thing with fopen()/fread() calls instead of streams.

    These are the numbers I am getting:

    Using ifstream: 29 secs, 26.27 MB/sec

    Using fopen()/fread(): 19.37 secs, 36.11 MB/sec

    In both cases, I am reading in 32K at a time.

    Is this normal?  Are streams really that slow?

    From what I can find in the forums, it seems to be a by-product of compiling against the multi-threaded runtime instead of the single-threaded one, but since there is no single-threaded mode anymore, how can I speed streams up?

    Also, is there a performance tips/hints chart that shows the differences between these two modes and how they affect different functions?



    Wednesday, June 21, 2006 9:48 PM

Answers

  • My first thought was: hmmm, what exactly are you timing? By that I mean, are you sure the sections of code are slowed down by the read operations and not by some other code?
    And are you timing a release build?

    Given that thought, I decided to run some tests of my own using the following code:

    Win32 console app, main.cpp

    #include <windows.h>  // for timeGetTime
    #include <cstdio>     // for fopen/fread/feof/fclose
    #include <fstream>
    #include <iostream>
    #pragma comment(lib,"winmm.lib")  // link winmm for timeGetTime

    class SimpleTimer
    {
    private:
        DWORD f_start;
        DWORD f_finish;
    public:
        void start() {
            f_start = f_finish = timeGetTime();
        }
        void stop() {
            f_finish = timeGetTime();
        }
        DWORD elapsed() {
            return f_finish - f_start;
        }
    };

    int main()
    {   
        const int bufsize = 32 * 1024;
        char buff[bufsize];
       
        FILE* fp = 0;
        std::ifstream ifs;
        SimpleTimer timer;
        DWORD t1,t2;
               
        //test fstream.read
        timer.start();
        ifs.open("test.dat",std::ios::binary);
        while(ifs)
        {
            ifs.read(buff,bufsize);
        }
        ifs.close();
        timer.stop();
        t1 = timer.elapsed();

        //test fread
        timer.start();
        fp = ::fopen("test.dat","rb");
        while(!::feof(fp))
        {
            ::fread(buff,1,bufsize,fp);
        }   
        ::fclose(fp);
        timer.stop();
        t2 = timer.elapsed();

        std::cout << "tests complete" << std::endl;
        std::cout << "fstream\t" << t1 << " " << (float)t1 / 1000 << "s" << std::endl;
        std::cout << "fread\t" << t2 << " " << (float)t2 / 1000 << "s" << std::endl;
        std::cin.get();
        return 0;
    }

    Before running this, I created a binary data file exactly 1 GB in size and filled it with random data.
    As you can see, I am measuring only the time it takes to open the file, read the data, and close the file, plus a small loop overhead. All tests were carried out on a release build with default project settings.
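
    For anyone who wants to reproduce this, a rough sketch of generating a test file like that (the file name, block size and use of rand() are just arbitrary choices on my part):

    #include <cstdlib>
    #include <fstream>
    #include <vector>

    int main()
    {
        // write 1 GB of pseudo-random bytes in 1 MB blocks
        const std::size_t blockSize  = 1024 * 1024;
        const std::size_t blockCount = 1024;          // 1024 x 1 MB = 1 GB

        std::vector<char> block(blockSize);
        std::ofstream ofs("test.dat", std::ios::binary);

        for (std::size_t i = 0; i < blockCount; ++i)
        {
            for (std::size_t j = 0; j < blockSize; ++j)
                block[j] = static_cast<char>(std::rand() & 0xFF);
            ofs.write(&block[0], static_cast<std::streamsize>(block.size()));
        }
        return 0;
    }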

    My results were completely different from the speeds you have quoted. I tried four different buffer sizes (manually editing bufsize each time) just to see if that was a factor; my results were as follows:

    32k buffer reads, timing open/close as well
        fstream.read: 23.687s
        fread: 22.609s
    64k buffer reads, timing open/close as well
        fstream.read: 23.984s
        fread: 57.719s
    128k buffer reads, timing open/close as well
        fstream.read: 23.297s
        fread: 41.469s
    256k buffer reads, timing open/close as well
        fstream.read: 23.546s
        fread: 41.969s

    I ran a second set of tests, modifying the code so that I didn't include the time to open/close the files. My results were:
    32k buffer reads
        fstream.read: 23.343s
        fread: 22.625s
    64k buffer reads
        fstream.read: 22.625s
        fread: 58.594s
    128k buffer reads
        fstream.read: 21.565s
        fread: 41.25s
    256k buffer reads
        fstream.read: 23.578s
        fread: 41.188s

    As you can see from these results (and can hopefully replicate using the code above), fstream.read was marginally slower with a small 32k buffer, but was nearly twice as fast as fread with the larger buffers.
    NOTE: This example could have been made more representative if I had performed the reads several times and averaged the times.

    Given the results I found, I can only assume that your speed loss is not directly down to fstream.read versus fread: the difference is marginal with a 32k buffer, and fstream appears to hugely outperform fread with larger buffer sizes.
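
    As the NOTE above says, repeating each test and averaging would make the numbers more representative. A rough sketch of that change for the fstream case, reusing the SimpleTimer class from the code above (the run count of 5 is an arbitrary choice, and after the first pass the file is likely to be served from the OS file cache):

    const int runs = 5;
    DWORD total = 0;
    for (int i = 0; i < runs; ++i)
    {
        SimpleTimer timer;
        timer.start();
        std::ifstream ifs("test.dat", std::ios::binary);
        char buff[32 * 1024];
        while (ifs)
            ifs.read(buff, sizeof(buff));
        timer.stop();
        total += timer.elapsed();   // accumulate elapsed milliseconds
    }
    std::cout << "fstream average\t" << (float)total / runs / 1000 << "s" << std::endl;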
    Sunday, June 25, 2006 10:16 AM

All replies

  • In most cases standard library streams are slower than direct access to the CRT. As you can see in the implementation, istreams are very often a wrapper around the CRT routines, and this takes time.
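
    One thing that sometimes reduces that wrapper overhead is to give the stream a larger buffer via rdbuf()->pubsetbuf() before opening the file. A rough sketch with an arbitrary 1 MB buffer; whether it actually helps depends on the library implementation:

    #include <fstream>

    int main()
    {
        static char streambuffer[1024 * 1024];   // arbitrary 1 MB buffer

        std::ifstream ifs;
        // many implementations only honour this if it is done before open()
        ifs.rdbuf()->pubsetbuf(streambuffer, sizeof(streambuffer));
        ifs.open("test.dat", std::ios::binary);

        char block[32 * 1024];
        while (ifs)
        {
            ifs.read(block, sizeof(block));
            // ifs.gcount() bytes of block are valid here
        }
        return 0;
    }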

    But can you post a small sample so that we can have a look at it?

    Thursday, June 22, 2006 7:24 AM
  • I didn't post a code sample before, since I thought it was trivial.. but here it is anyway. :)


    SrcFile1.open(File1, ios::binary);
    while (SrcFile1)
    {
        SrcFile1.read(buf1, sizeof(buf1));
        processData(buf1, bytesRead);
    }
    SrcFile1.close();
    -----
    filehandle1 = fopen(File1, "rb");
    while (!feof(filehandle1))
    {
        fread(buf1, 1, 1024*32, filehandle1);
        processData(buf1, bytesRead);
    }
    fclose(filehandle1);

    I removed error checking, misc stuff, and timer routines.  So it is pretty basic, just to show I wasn't doing anything strange.
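
    In case it matters, bytesRead just comes from the read calls themselves; roughly like this, assuming buf1 is a 32K char array and processData() takes a buffer plus a byte count:

    // ifstream: gcount() reports how many bytes the last read() actually got
    SrcFile1.read(buf1, sizeof(buf1));
    std::streamsize bytesRead = SrcFile1.gcount();
    processData(buf1, bytesRead);

    // fread: the return value is the number of items read (= bytes here, since size is 1)
    size_t bytesRead2 = fread(buf1, 1, sizeof(buf1), filehandle1);
    processData(buf1, bytesRead2);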

    I would have thought that streams would have been more optimized, since that is what I assumed everyone is using now.


    For what it is worth, I have a buddy with VS2003, and it does seem that the multi-threaded CRT takes a pretty big performance hit compared to the single-threaded one.

    Friday, June 23, 2006 12:25 AM