What is the upper bound on the maximum file size that an application can process?

    Question

  • I'm trying to find this limit on the MSDN site but so far haven't stumbled on it, although I did find an analysis at this link:
    http://www.discinterchange.com/TechTalk_large_files_.html

    My tests on an XP system, using VC++ 8 to compile this snippet of code, confirm the limit mentioned in the referenced analysis.

    #include <cstdlib>
    #include <fstream>
    #include <iostream>
    using namespace std;

    int main() {
      // Open at the end (ios::ate) so the size can later be queried with tellg().
      const char *filename = "C:\\inFile";
      ifstream file(filename, ios::in | ios::binary | ios::ate);
      if (!file) {
        cout << "cannot open file " << filename << endl;
        return -1;
      }
      cout << "opened file " << filename << endl;
      return 0;
    }

    When I test with a file that is 3.9 GB, the file opens just fine; when I use a test file that is 4.4 GB, I can't open it.

    My application's spec eventually calls for "seeking" around to various places in some very large (up to 145 GB) files, and I suspect that I will not be able to do this on a Windows OS.

    Has anyone successfully developed any applications that process (non-sequentially) very large files?  If so, what technique did you use?


    Friday, December 14, 2007 9:21 PM

Answers

  • Hi,
    Try it without ios::ate.   ifstream + ios::ate probably does not make sense.  You are basically opening a file for reading but you have moved the read pointer to the end of the file.
    Friday, January 11, 2008 9:57 PM

All replies

  • Hi,

    Sounds like the file system is formatted as FAT32?

    Here are the maximum file sizes depending on the file system type:

    • NTFS - Maximum file size 16 terabytes minus 64 KB (2^44 bytes minus 64 KB)
    • FAT16 - Maximum file size 4 GB
    • FAT32 - Maximum file size 4 GB

    http://technet2.microsoft.com/WindowsVista/en/library/5025760b-0433-4ba1-a2f4-9338915fdb4b1033.mspx?mfr=true

    Sunday, December 23, 2007 1:19 PM
  • Thank you for your information.  However, based on my results, there must be something else I have to do to successfully open files over 4 GB with my test program.

    I checked, and according to the properties of my C: drive, it is NTFS.  I'm running an XP Pro system and compiling with Visual C++ 2005 Express.  Although the test program was NOT built on my C: drive, I moved the resulting .exe onto my C: drive into the same directory where inFile resides.  The test program is still unable to open the 4 GB+ file.

    Is something else needed? 

    Would you know if any special compiler/linker options are needed for the larger files?  I'm using a makefile to build the .exe and setting the following options: 
    CL /Zi /Yd /MD /TP /nologo -D "WIN32"  /EHsc /Zc:forScope-,wchar_t- -D "__STDC__"

    Cheers!



    Thursday, December 27, 2007 2:13 PM
  • Does anyone from Microsoft monitor these forums, or are they only read by other users?  Has anyone else managed to try the test code that I supplied and have it open a file larger than 4 GB on NTFS?

    I am doing something a bit unusual in that I'm opening the file at the end rather than the beginning.  Could that be what XP can't handle?

    Friday, January 11, 2008 8:18 PM
  • Hi,
    Try it without ios::ate.   ifstream + ios::ate probably does not make sense.  You are basically opening a file for reading but you have moved the read pointer to the end of the file.
    Friday, January 11, 2008 9:57 PM
  • Micky,

    Thank you for responding.  Opening at the end of the file is a technique for getting the size of the file.

    Once the file is opened at the end, a tellg() call will get the number of bytes in the file, after which the file pointer is set to the beginning of the file or wherever.  So it does "make sense" for me to open it at the end: the app needs to know the file size so that it can skip around according to the whims of a user (and we know just how whimsical users can be).
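
    For reference, a minimal sketch of that technique (the path is hypothetical; note that on VC8 the std::streamoff behind tellg() may be only 32 bits wide, which would make the reported size wrap for files past 4 GB):

    #include <fstream>
    #include <iostream>
    using namespace std;

    int main() {
      // Hypothetical path; substitute a real test file.
      const char *filename = "C:\\inFile";

      // Open at the end so tellg() reports the byte count.
      ifstream file(filename, ios::in | ios::binary | ios::ate);
      if (!file) {
        cout << "cannot open file " << filename << endl;
        return -1;
      }

      // Caution: can overflow a 32-bit streamoff for files > 4 GB.
      streamoff size = file.tellg();
      cout << "size: " << size << " bytes" << endl;

      // Rewind before reading records.
      file.seekg(0, ios::beg);
      return 0;
    }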

    Have you tried this in your environment?  And does it work on files greater than 4GB?

    Cheers!
    Tuesday, January 15, 2008 9:09 PM
  • Hi,
    That's a good point.
    No, sadly; I'm on holiday at the moment and am using VS on my Vista laptop.

    Have you tried opening it without the ios::ate to verify you can open the file?  Once opened, try a seekg().

    Off topic: if you are just after the file size, it's faster to call GetFileAttributesEx() (or GetFileSize() on an already-open handle), since GetFileAttributesEx() does not require a file open operation.  Opening a file can also take longer if certain anti-virus software is running.
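
    A minimal Win32 sketch of the GetFileAttributesEx() approach (assuming an ANSI path; the two 32-bit size halves are combined into one 64-bit value):

    #include <windows.h>
    #include <iostream>

    int main() {
      // Hypothetical path; substitute a real file.
      const char *filename = "C:\\inFile";

      WIN32_FILE_ATTRIBUTE_DATA attr;
      if (!GetFileAttributesExA(filename, GetFileExInfoStandard, &attr)) {
        std::cout << "GetFileAttributesEx failed, error "
                  << GetLastError() << std::endl;
        return -1;
      }

      // nFileSizeHigh/nFileSizeLow together form a 64-bit size.
      unsigned __int64 size =
          (static_cast<unsigned __int64>(attr.nFileSizeHigh) << 32) |
          attr.nFileSizeLow;
      std::cout << "size: " << size << " bytes" << std::endl;
      return 0;
    }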
    Tuesday, January 15, 2008 11:53 PM
  • You can get the file size without opening the file, and as for reading it, you can use different constructors and methods to open the file and seek to a location.  The idea is to read in blocks instead of reading the whole file, and you can define the block size when opening the file.

    I don't know how to do that in VC, as I have never used it, but in C# and the .NET Framework there are many constructors and methods (see the FileStream I/O classes) that can do exactly that.

    How big a file the disk can hold depends on the file system used.  If you want to go deeper, I can tell you why that is the limiting factor.

    Sometimes reading a file also depends on the way the classes are designed.  For example, if you look at the notes on XmlTextReader, you will find that it cannot read files larger than 2 GB.  I don't know why this is so in the MS implementation, but in Mono it is because of an integer that is used while opening the file.
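
    As a rough C++ illustration of block-wise reading (hypothetical path and block size; whether sequential reads cross the 4 GB boundary cleanly on VC8 is exactly what would need testing):

    #include <fstream>
    #include <iostream>
    #include <vector>
    using namespace std;

    int main() {
      // Hypothetical path; pick any block size that suits the workload.
      const char *filename = "C:\\inFile";
      const size_t blockSize = 64 * 1024;  // 64 KB per read

      ifstream file(filename, ios::in | ios::binary);
      if (!file) {
        cout << "cannot open file " << filename << endl;
        return -1;
      }

      vector<char> block(blockSize);
      unsigned long long total = 0;
      // read() returns the stream; a short final read still leaves
      // gcount() bytes to process before the loop exits.
      while (file.read(&block[0], block.size()) || file.gcount() > 0) {
        total += static_cast<unsigned long long>(file.gcount());
        // ... process file.gcount() bytes of block here ...
      }
      cout << "read " << total << " bytes" << endl;
      return 0;
    }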

     

    Thursday, January 17, 2008 8:01 AM
  • Hello,

    Thank you both for your responses.  I have tried opening the file at the beginning.  The large (> 4 GB) test file does open, but then the program blows up trying to read the first record.  My guess is that the open and read failures are related, since my tests work fine when my test file is under the 4 GB limit.

    Getting the file size programmatically is a side issue, not the main event.  My application spec requires being able to reposition the file pointer and read starting from any record (they are binary) in the file that a user may select.  I may also be reading the records in reverse order.  I have to be sure that I am able to read and jump around in files up to 145 GB on the platform.

    To make things even more interesting (can you stand it?), my code must be designed to run on many different platforms, so I will not be using any platform-specific techniques for processing the files, just C++ code.
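
    One portable pattern worth considering is a thin wrapper over the 64-bit variants of the C stdio seek/tell calls, which most platforms provide under different names.  A sketch, with illustrative (not standard) wrapper names:

    #include <cstdio>

    #ifdef _MSC_VER
    // VC8 CRT provides 64-bit seek/tell.
    typedef __int64 offset64_t;
    inline int seek64(FILE *f, offset64_t off, int whence)
      { return _fseeki64(f, off, whence); }
    inline offset64_t tell64(FILE *f) { return _ftelli64(f); }
    #else
    // POSIX fseeko/ftello; build with -D_FILE_OFFSET_BITS=64 on 32-bit systems.
    typedef long long offset64_t;
    inline int seek64(FILE *f, offset64_t off, int whence)
      { return fseeko(f, off, whence); }
    inline offset64_t tell64(FILE *f) { return ftello(f); }
    #endif

    With that in place, jumping to an arbitrary record is just seek64(f, recordIndex * recordSize, SEEK_SET), forward or backward, on any platform.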

    Cheers!
    Thursday, January 17, 2008 9:17 PM
  • See, I think portability is not an issue with .NET.  Operations like file reads (System.IO) are very well implemented in Mono, which has been successfully ported to both Linux and Solaris, so I don't see any basis for the claim that C++/C is the only way today to write apps that run on all of these.  In fact, we are now writing a lot of utilities, and some full-fledged apps as well, that run successfully on ported versions of Mono on Linux (Debian) and Solaris 10.

    I did write some programs based on your problem to check this, but the thing is I don't have > 4 GB files to test with.

    Reading the binary records backward is just a matter of moving the file pointer in reverse.

    I can send you the code in C#; perhaps you can test it with your file and see the result.

    Friday, January 18, 2008 5:19 AM
  • Thank you for your offer.  I have no experience with .NET, C#, or Mono, so using them is not an option for me at this time.

    Based on the feedback and tests I've done so far, I am beginning to suspect that it is the VC8 C++ compiler's implementation of the file classes that is the limiting factor.  Likely the file-position data type is only 32 bits, rather than the 64 needed to handle files larger than 4 GB.

    Can someone from Microsoft confirm this?  Do the straight C file function calls suffer from the same 32-bit limit?
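
    For reference, the VC8 C run-time exposes 64-bit variants of the plain C file calls, _fseeki64() and _ftelli64(), which use __int64 offsets.  A minimal sketch, again with a hypothetical path:

    #include <stdio.h>

    int main(void) {
      /* Hypothetical path; substitute a real > 4 GB file. */
      const char *filename = "C:\\inFile";

      FILE *f = fopen(filename, "rb");
      if (!f) {
        printf("cannot open file %s\n", filename);
        return -1;
      }

      /* 64-bit seek/tell from the VC8 CRT. */
      if (_fseeki64(f, 0, SEEK_END) != 0) {
        printf("seek failed\n");
        fclose(f);
        return -1;
      }
      __int64 size = _ftelli64(f);
      printf("size: %I64d bytes\n", size);

      fclose(f);
      return 0;
    }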




    Monday, January 21, 2008 9:15 PM
  • Hello j666,

    Since you use SQL Server 2005 Express edition, it will not be possible to update/modify/open databases with sizes > 4 GB.  It's a restriction from Microsoft.  You'll need at least SQL Server 2005 Workgroup to use databases > 4 GB.

    See this URL: http://www.microsoft.com/sql/downloads/trial-software.mspx#EQD

    Hope this will explain it...

    Monday, February 18, 2008 9:11 AM
  • 7j4rd4n wrote: "Since you use SQL Server 2005 Express edition, it will not be possible to update/modify/open databases with sizes > 4 GB. [...]"

    The question has nothing to do with databases, let alone SQL Server.

    Monday, February 18, 2008 10:06 AM