Large files in 32-bit OS - definitive answer sought!

    Question

  • I've hunted high and low for an answer to this, but the documentation isn't specific or is confusing on this point.

    I am about to embark on a project for work where we need to access very large files well in excess of 2^32 bytes.  The in-house OS is WinXP 32-bit so I need to compile 32-bit code in VC++.

    Normally for heavy file-processing, I'd be looking at using iostreams but I can't find a definitive answer as to whether or not they'd work in the context of these file sizes.  Some sources suggest that the 'tell' functions are the only ones likely to be non-functional but I need a definitive answer that ideally doesn't involve direct API access.

    Can anyone clear this up for me?

    Happy to answer any questions to explain what I'm trying to do in case I've not been clear.

    Cheers,

    Fred

    Saturday, November 06, 2010 3:59 PM

Answers

  • Consider using CAtlFile.

     


    «_Superman_»
    Microsoft MVP (Visual C++)
    Saturday, November 06, 2010 8:24 PM
  • [Tim Roberts]

    > The 32-bit STL in Visual C++ uses longs for its file position type.

    We fixed this in VC10.  Now the STL unconditionally supports large files.

    C:\Temp>type purr.cpp
    #include <ios>
    #include <iostream>
    #include <ostream>
    using namespace std;
    
    int main() {
      cout << sizeof(streamoff) << endl;
    }
    
    C:\Temp>cl
    Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.30319.01 for 80x86
    Copyright (C) Microsoft Corporation. All rights reserved.
    
    usage: cl [ option... ] filename... [ /link linkoption... ]
    
    C:\Temp>cl /EHsc /nologo /W4 purr.cpp
    purr.cpp
    
    C:\Temp>purr
    8
    
    Friday, November 12, 2010 11:07 PM

All replies

  • There are functions like _ftelli64, _fseeki64 etc. that deal with 64 bit integer values.

     


    «_Superman_»
    Microsoft MVP (Visual C++)
    Saturday, November 06, 2010 4:09 PM
  • > The in-house OS is WinXP 32-bit so I need to compile 32-bit code in VC++.

    Also make sure the volume is NTFS, as FAT32 cannot store a file of 4 GB or larger.


    Microsoft Test - http://tester.poleyland.com/
    Saturday, November 06, 2010 5:18 PM
  • Yeah, I've checked that it's NTFS - the files are generated on the same system by a different third-party program, so it will work!

    I appreciate the comment about _ftelli64 etc. but I was hoping for something that would fit in with iostream.  If not, I'll have to reacquaint myself with the old fopen...fclose procedures for binary files, a task I don't envy!

    Fred

    Saturday, November 06, 2010 7:30 PM
  • Or just go directly to CreateFile/WriteFile(Ex)
    Microsoft Test - http://tester.poleyland.com/
    Saturday, November 06, 2010 10:10 PM
  • Fritzpoll wrote:
    >
    >I am about to embark on a project for work where we need to access very
    >large files well in excess of 2^32 bytes. The in-house OS is WinXP
    >32-bit so I need to compile 32-bit code in VC++.
    >
    >Normally for heavy file-processing, I'd be looking at using iostreams...
     
    Really? When I need heavy file processing, I'm always trying to eliminate
    as many layers as I can. iostreams has both a layer of C++ crud and a
    layer of C run-time crud before it gets to the APIs. That's especially
    true when random accessing files. I never know what extra buffering I am
    adding.
     
    I would use CAtlFile, as another poster suggested.
     
    >...but I can't find a definitive answer as to whether or not they'd
    >work in the context of these file sizes. Some sources suggest that
    >the 'tell' functions are the only ones likely to be non-functional
    >but I need a definitive answer that ideally doesn't involve direct
    >API access.
     
    Are you bouncing around the file? For sequential access, it doesn't
    matter. You can read and write as much as you need. For random access,
    when you need to worry about byte offsets, that's where you need to worry.
    The 32-bit STL in Visual C++ uses longs for its file position type.
    --
    Tim Roberts, timr@probo.com
    Providenza & Boekelheide, Inc.
     

    Tim Roberts, DDK MVP
    Saturday, November 06, 2010 10:43 PM
  • "Fritzpoll" wrote in message news:e21279bc-1c0e-4706-8969-a2122694e2e1...
    > I've hunted high and low for an answer to this, but the documentation
    > isn't specific or is confusing on this point.
    >
    > I am about to embark on a project for work where we need to access very
    > large files well in excess of 2^32 bytes. The in-house OS is WinXP 32-bit
    > so I need to compile 32-bit code in VC++.
    >
    > Normally for heavy file-processing, I'd be looking at using iostreams but
    > I can't find a definitive answer as to whether or not they'd work in the
    > context of these file sizes. Some sources suggest that the 'tell'
    > functions are the only ones likely to be non-functional but I need a
    > definitive answer that ideally doesn't involve direct API access.
    >
    > Can anyone clear this up for me?
     
    I'm joining this question.
     
     
    In the meantime, I have solved this kind of problem by using
    Boost.IOStreams.
     
    #include <boost/iostreams/device/file_descriptor.hpp>
    #include <boost/iostreams/stream.hpp>

    // reading
    typedef boost::iostreams::stream<
        boost::iostreams::file_descriptor_source> bio_istream;
    bio_istream bigifs("big.las"); // a big file, >10 GB

    // writing
    typedef boost::iostreams::stream<
        boost::iostreams::file_descriptor_sink> bio_ostream;
    bio_ostream bigofs("big.las"); // big output file
     
    I use this approach for processing large files of LiDAR data:
     
    http://trac.liblas.org/ticket/147#comment:15
     
    Best regards,
    --
    Mateusz Loskot, http://mateusz.loskot.net
    Charter Member of OSGeo, http://osgeo.org
     
     
    Friday, November 12, 2010 2:05 PM
  • Stephan,

    This is excellent news. Thank you!

    Mateusz

    Friday, November 12, 2010 11:43 PM
  • I tried sizeof(std::streamoff) on VS2010 and it really is 8 bytes, but when I try this on a large file (8,458,512,045 bytes):

     

    std::ifstream file;
    file.open(TEXT("c:\\LargeFile.dat"), std::ios::binary | std::ios::in, _SH_DENYWR);
    file.seekg(0, std::ios_base::end);
    std::streamoff nFileSize = file.tellg();
    unsigned __int64 nFileSize2 = file.tellg();
    
    int nStreamoffSize= sizeof(std::streamoff);
    

    The result is:

    nFileSize == -131422546
    nFileSize2 == 18446744073578129070
    nStreamoffSize == 8

    so it internally uses a 32 bit value which overflows.

    What am I doing wrong?

    Wednesday, May 11, 2011 7:32 AM