Can I write to a FileStream and then read from it WITHOUT waiting on the backing disk?

  • Question

  • My application makes many rapid web service calls to its server. For each call, it serializes an object that it passes as an argument to the web service call. Often that object serializes to something pretty small... say 5KB or less... in which case, I'd rather just serialize to a MemoryStream and then send the bytes from that MemoryStream. But sometimes that object serializes to something pretty huge... say 50MB (50,000KB)... in which case, serializing to a MemoryStream would severely fragment the Large Object Heap... and so it would be better serialized to a FileStream.

    It occurs to me that if a FileStream has a large enough buffer size, then I should be able to write to it, seek back to the beginning, then read from it, and never actually get out of the memory buffer... such that my thread should (theoretically) perform at roughly the same speed as memory up to that buffer size. True?

    (Yes, I am going to do some measurements... but I could easily miss cases where it would behave differently... or related issues that could negate the results... things the collective intelligence here will often tell me. Thanks!!)

    Is there a reliable high-performance approach to having a file-backed memory stream that only uses the file if it needs to grow larger? (I am hoping the answer is "Yeah, just use FileStream with buffer size set to the size of the memory stream you want.")
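
    For concreteness, here is a minimal sketch of the pattern I have in mind: give the FileStream a buffer as large as the MemoryStream I would otherwise use, write, seek back to the start, and read it out again. The 5MB buffer size, the temp-file handling, and the 100KB payload are purely illustrative assumptions, not measurements.

    using System;
    using System.IO;

    class FileStreamBufferSketch
    {
        static void Main()
        {
            const int bufferSize = 5 * 1024 * 1024;       // assumed in-memory limit
            string path = Path.GetTempFileName();         // illustrative temp file

            using (var fs = new FileStream(path, FileMode.Create, FileAccess.ReadWrite,
                                           FileShare.None, bufferSize, FileOptions.DeleteOnClose))
            {
                byte[] payload = new byte[100 * 1024];    // stand-in for serializer output
                fs.Write(payload, 0, payload.Length);

                fs.Seek(0, SeekOrigin.Begin);             // rewind without closing

                byte[] readBack = new byte[payload.Length];
                int total = 0, n;
                while (total < readBack.Length &&
                       (n = fs.Read(readBack, total, readBack.Length - total)) > 0)
                {
                    total += n;
                }
                Console.WriteLine("Read back {0:N0} bytes", total);
            }
        }
    }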

    Thursday, March 21, 2013 7:31 PM

All replies

  • Have you cosidered using the memorystream class.  When you construct a memory stream with a buffer size of zero it creates an expandable stream that uses an allocate function to obtain memory for the buffer.  I think the memory stream can be setup as a hash table, but not sure.


    jdweng

    Friday, March 22, 2013 9:48 AM
  • Have you cosidered using the memorystream class.  When you construct a memory stream with a buffer size of zero it creates an expandable stream that uses an allocate function to obtain memory for the buffer.  I think the memory stream can be setup as a hash table, but not sure.

    If I use a MemoryStream, then as the serializer continues to add data beyond 85KB, each growth of the MemoryStream allocates another byte array, twice as big as the prior one, on the Large Object Heap (LOH), resulting in hideous fragmentation of the LOH... and fragmentation of the LOH often leads to premature Out of Memory exceptions.  Not to mention it re-copies the existing bytes on each of those log(N) growths, which might end up no faster than just writing the O(N) bytes to disk, if it doesn't prematurely run out of memory.  (The growth pattern is easy to watch; see the sketch below.)

    That is actually what I am working on getting away from.
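
    A quick way to observe the growth I'm describing: write in chunks and log MemoryStream.Capacity each time it changes. The chunk size and total size below are arbitrary; the point is the capacity doubling, with every backing array past the ~85,000-byte threshold coming from the LOH.

    using System;
    using System.IO;

    class MemoryStreamGrowthSketch
    {
        static void Main()
        {
            var chunk = new byte[16 * 1024];          // arbitrary write size
            long lastCapacity = -1;

            using (var ms = new MemoryStream())       // expandable stream
            {
                for (int i = 0; i < 1024; i++)        // ~16MB total
                {
                    ms.Write(chunk, 0, chunk.Length);
                    if (ms.Capacity != lastCapacity)  // log each reallocation
                    {
                        Console.WriteLine("Length {0,12:N0}  Capacity {1,12:N0}",
                                          ms.Length, ms.Capacity);
                        lastCapacity = ms.Capacity;
                    }
                }
            }
        }
    }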

    Friday, March 22, 2013 10:09 AM
  • If you use a List<byte> I think that would meet your requirement.  When the memory grows large it will end up in swap space, which is effectively a temporary file on the hard disk.
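
    For illustration, a minimal sketch of accumulating serialized output into a List<byte> and getting a contiguous byte[] back out for sending; the chunk size and iteration count are arbitrary assumptions.

    using System;
    using System.Collections.Generic;

    class ListBufferSketch
    {
        static void Main()
        {
            var buffer = new List<byte>();
            var chunk = new byte[8 * 1024];           // stand-in for serializer output

            for (int i = 0; i < 100; i++)             // ~800KB total
                buffer.AddRange(chunk);

            byte[] payload = buffer.ToArray();        // contiguous copy for sending
            Console.WriteLine("Accumulated {0:N0} bytes", payload.Length);
        }
    }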

    jdweng

    Friday, March 22, 2013 10:22 AM
  • Hi Tcc,

    >>Can I write to a FileStream and then read from it WITHOUT waiting on the backing disk?

    Yes, I think you can.


    Ghost,
    Call me Ghost for short. Thanks.
    To get a better answer, it should be a better question.

    Sunday, March 24, 2013 9:26 AM
  • Part of your requirements kind of contradicts itself:

    ***

    ...if a FileStream has a large enough buffer size, then I should be able to write to it, seek back to the beginning, then read from it, and never actually get out of the memory buffer... such that my thread should (theoretically) perform at roughly the same speed as memory up to that buffer size. True?

    ***

    When what you are streaming goes over your max in-memory size and the stream falls back to file-based mode, how can a file-based stream "perform at roughly the same speed as memory"?

    Maybe you can try this with an SSD.

    It's a compromise/trade-off you have to accept when your class goes into file-based mode.

    I suggest you write your own custom stream class and encapsulate that behaviour in your implementation (automatically switching to file-based storage and so on), instead of directly using a .NET stream class.
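
    To make the suggestion concrete, here is a rough sketch of such a wrapper: a Stream that starts out backed by a MemoryStream and spills to a temp-file FileStream once it grows past a threshold. The class name, the 1MB default threshold, and the temp-file handling are assumptions for illustration; a real implementation would also want to cover the async methods and error cases.

    using System;
    using System.IO;

    public class SpillToDiskStream : Stream
    {
        private Stream _inner;                 // MemoryStream first, FileStream after spilling
        private readonly int _threshold;

        public SpillToDiskStream(int thresholdBytes = 1024 * 1024)
        {
            _threshold = thresholdBytes;
            _inner = new MemoryStream();
        }

        public override void Write(byte[] buffer, int offset, int count)
        {
            // Spill to a temp file the first time the total size would exceed the threshold.
            if (_inner is MemoryStream && _inner.Length + count > _threshold)
            {
                var file = new FileStream(Path.GetTempFileName(), FileMode.Create,
                                          FileAccess.ReadWrite, FileShare.None,
                                          _threshold, FileOptions.DeleteOnClose);
                _inner.Position = 0;
                _inner.CopyTo(file);           // move what has been written so far
                _inner.Dispose();
                _inner = file;
            }
            _inner.Write(buffer, offset, count);
        }

        public override int Read(byte[] buffer, int offset, int count)
        {
            return _inner.Read(buffer, offset, count);
        }

        public override long Seek(long offset, SeekOrigin origin) { return _inner.Seek(offset, origin); }
        public override void Flush() { _inner.Flush(); }
        public override void SetLength(long value) { _inner.SetLength(value); }
        public override bool CanRead { get { return _inner.CanRead; } }
        public override bool CanSeek { get { return _inner.CanSeek; } }
        public override bool CanWrite { get { return _inner.CanWrite; } }
        public override long Length { get { return _inner.Length; } }
        public override long Position
        {
            get { return _inner.Position; }
            set { _inner.Position = value; }
        }

        protected override void Dispose(bool disposing)
        {
            if (disposing) _inner.Dispose();
            base.Dispose(disposing);
        }
    }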

    Monday, March 25, 2013 4:41 AM
  • Part of your requirements kind of contradicts itself:

    ***

    ...if a FileStream has a large enough buffer size, then I should be able to write to it, seek back to the beginning, then read from it, and never actually get out of the memory buffer... such that my thread should (theoretically) perform at roughly the same speed as memory up to that buffer size. True?

    ***

    When what you are streaming goes over your max in-memory size and the stream falls back to file-based mode, how can a file-based stream "perform at roughly the same speed as memory"?

    I said "up to that buffer size".  My hope was that if I stay under the buffer size, then the speed would be as fast as memory.  If I go over the buffer size, then clearly it'll be operating at file speed... and that is fine... because the alternative (memory fragmentation raising the risk of premature Out of Memory exceptions) is not okay.

    My limited testing so far has been a little inconclusive... some cases seem pretty close to memory speed, but other cases seem dramatically slower.  I need to look for sources of the variation before drawing conclusions.
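
    In case it's useful to anyone following along, this is roughly how I'm timing the two cases. The payload size, buffer size, and iteration count are arbitrary, and the numbers obviously depend on hardware, OS caching, antivirus/indexing activity, and so on.

    using System;
    using System.Diagnostics;
    using System.IO;

    class StreamTimingSketch
    {
        static void Main()
        {
            byte[] payload = new byte[4 * 1024 * 1024];    // 4MB stand-in payload
            new Random(1).NextBytes(payload);
            const int iterations = 20;

            Console.WriteLine("MemoryStream: {0} ms", Time(iterations, delegate
            {
                using (var ms = new MemoryStream())
                    RoundTrip(ms, payload);
            }));

            Console.WriteLine("FileStream  : {0} ms", Time(iterations, delegate
            {
                using (var fs = new FileStream(Path.GetTempFileName(), FileMode.Create,
                                               FileAccess.ReadWrite, FileShare.None,
                                               payload.Length + 1024, FileOptions.DeleteOnClose))
                    RoundTrip(fs, payload);
            }));
        }

        static void RoundTrip(Stream s, byte[] payload)
        {
            s.Write(payload, 0, payload.Length);
            s.Seek(0, SeekOrigin.Begin);
            var sink = new byte[payload.Length];
            int total = 0, n;
            while (total < sink.Length && (n = s.Read(sink, total, sink.Length - total)) > 0)
                total += n;
        }

        static long Time(int iterations, Action body)
        {
            var sw = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++) body();
            sw.Stop();
            return sw.ElapsedMilliseconds;
        }
    }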

    Wednesday, April 3, 2013 5:01 PM
  • It's very hard to measure memory vs. file.

    In file-based mode you are subject to how the OS decides to schedule its I/O, how fragmented your hard disk is, whether a file-indexing service happens to be running, and so on.

    There are many factors affecting I/O.

    Even if you work out the measurement/performance testing, those results only reflect the hardware/configuration you tested on.

    The final deployment hardware can have a different setup and yield different results.

    My point is, the measurement is really redundant. You just have to accept that it will definitely be slower.

    Thursday, April 4, 2013 4:34 AM