Parsing raw request data in an HttpModule

  • Question

  • User-1049946619 posted
    For those who may not be aware, there is an EXCELLENT thread on handling large file uploads in .NET located here: http://www.asp.net/Forums/ShowPost.aspx?tabindex=1&PostID=55127

    The thread works through the entire process of reading raw request data *without* loading the request all at once (as the normal process via IIS does). This allows uploading (almost) arbitrarily large files without loading the entire file (or files!) into memory and without using the Request.Files object.

    For the record, I personally beat myself up trying to find just such a mechanism for over a year before finding this excellent thread. I saw all the same ideas I had suggested, then thrown out, until someone finally 'cracked the nut.' Everyone involved in that effort -- I owe you a beer. Or two. Or a case. :-)

    Not being as advanced an ASP.NET programmer as these folks, the problem I'm now dealing with -- and I know there are others with this issue -- is how to go about parsing the raw request data. I don't want a 'compile-and-go' solution. I have the basic building blocks, and I know it. I can load a 'chunk' of the request into a buffer. I have identified the boundary, and loaded that into another buffer. But... now what?

    What I need is an algorithm, pseudocode, you know... Something to give me an idea of a good, logical breakdown of the parsing / scanning task. I've searched the Internet quite a bit without coming up with the high-level guidance I need. So, I'm throwing myself on the mercy of all of you experts out there... Please help!

    Again, I appreciate that I need to write my own code. What I need is an idea of how these things are 'normally' done -- especially in the .NET environment (I found MANY Java parsers, but my Java is a lot like my Hungarian... And, no, I don't speak Hungarian).

    Here's the idea I have at this point.

    1 - Convert the byte array with the current 'chunk' of request data, wholesale, into a string. Append it to the string that still exists from the last 'chunk' if there is one.

    2 - Convert the byte array containing the boundary into a string.

    3 - Use the built-in string parsing stuff (e.g., String.IndexOf(...), etc.) to identify where the boundaries occur. When a boundary is found, check the following couple of lines (delineated by newlines) to see if it is a file, what its name is, etc. and store for later use if it is a file. If not a file, do step 3 again starting from here.

    4 - If it's a file, find the next boundary *or* the end of the chunk.

    5 - Get a substring containing only what *should be* file content, and somehow convert that back into a byte array (I think I can do both conversions with the System.Text.Encoding classes), and cache that byte array in a global variable. However, if we reached the end of the chunk before we reached a boundary, cache from the first boundary to the end of the chunk, minus the length of the boundary (so we don't walk on a boundary that might exist between this chunk and the next). Remove whatever is cached from the string I'm working with.

    6 - Write whatever is in the cache byte array out to a file in my 'spool' directory (basically a standard location on disk, e.g., C:\Spool\, a directory which is a GUID representing this upload, followed by the file name). Increment a position indicator to show how far we've processed into this chunk.

    7 - If any part of the chunk is still 'unprocessed,' go back to step 3 and start from the current position. If not, go back to step 1 (reading in the next 'chunk', converting to a string, appending to whatever I still have, etc., until the end).

    Not perfect, but you get the idea. Am I on the right track? I tend to think there should be a better / more efficient way to parse the byte arrays without converting them to strings. It seems like extra effort for the sake of ease -- not that I've gotten it to work -- yet. Someone, PLEASE, tell me if I'm on the right track, or way off-base.

    Thanks in advance. You guys rock!
    - Don
    Sunday, January 25, 2004 8:24 PM
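The chunk-by-chunk plan in steps 3 to 7 can be sketched without any string conversion. The following is a minimal, language-neutral illustration in Python rather than .NET code, and it is not the approach from the linked thread. The function name `split_on_boundary` and the sample boundary are hypothetical. It covers only the "hold back a tail so a boundary split across two chunks is still found" idea. A real multipart parser would also have to read each part's headers, account for the CRLF and leading `--` around the boundary on the wire, and handle the closing delimiter.

```python
def split_on_boundary(chunks, boundary):
    """Yield (data, ends_part) pairs from an iterable of byte chunks.

    Hypothetical helper: 'ends_part' is True when 'data' is immediately
    followed by a boundary, so the caller can close the current file.
    """
    pending = b""
    keep = len(boundary) - 1  # longest boundary prefix that may end a chunk
    for chunk in chunks:
        pending += chunk
        while True:
            hit = pending.find(boundary)
            if hit < 0:
                break
            yield pending[:hit], True
            pending = pending[hit + len(boundary):]
        # No complete boundary left: release everything except a tail
        # that might be the start of a boundary continued in the next chunk.
        if len(pending) > keep:
            cut = len(pending) - keep
            yield pending[:cut], False
            pending = pending[cut:]
    if pending:
        yield pending, False  # trailing bytes after the last boundary


# Usage: the boundary "--XYZ" straddles the first two chunks.
parts, current = [], b""
for data, ends_part in split_on_boundary([b"AAA--", b"XYZBB", b"B--XYZCC"], b"--XYZ"):
    current += data          # in practice: append to the spool file
    if ends_part:
        parts.append(current)
        current = b""
```

Each yielded piece can be written straight to the spool file, so memory use stays near one chunk plus one boundary length. Staying in bytes also avoids the risk that a lossy encoding round trip alters binary file content.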

All replies

  • User1220193301 posted
    If you are good with C++, then you might want to look at: http://www.codeguru.com/isapi/ISAPIUpload.html I hope this helps. del
    Monday, January 26, 2004 10:45 AM
  • User-1049946619 posted
    Thanks del! That does help, in that it provides an example of the same rough kind of task. I think the basic question I have is this: Given the environment I'm working in, do I have to perform type conversions in order to parse the incoming byte stream? Or is there a good way to work with the bytes directly, as I receive them? Thanks, - Don
    Monday, January 26, 2004 9:34 PM
  • User1220193301 posted
    You are more than welcome. That example helped me too. As for comparing the bytes, I don't think that .NET provides a way to do that. You can convert the bytes to a string and do your comparison, or you can write your own method(s) that compare bytes directly. del
    Tuesday, January 27, 2004 9:05 AM
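A "compare bytes directly" method of the kind suggested above might look like the sketch below. It is a plain nested-loop search, written in Python only as readable pseudocode. The name `index_of` is hypothetical, and the loops are spelled out (instead of using Python's built-in `bytes.find`) so the logic carries over to a byte array in another language.

```python
def index_of(buffer, pattern, start=0):
    """Return the first index >= start where pattern occurs in buffer, or -1."""
    last = len(buffer) - len(pattern)  # last position where pattern still fits
    for i in range(start, last + 1):
        j = 0
        # Compare byte by byte; stop at the first mismatch.
        while j < len(pattern) and buffer[i + j] == pattern[j]:
            j += 1
        if j == len(pattern):
            return i
    return -1


# Usage: locate a boundary inside a buffer of raw request bytes.
position = index_of(b"ab--XYZcd", b"--XYZ")
```

This naive scan is usually adequate for a short boundary. A faster search algorithm could be swapped in later without changing the callers.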