Prevent Duplicate Message Submission
- So I've been asked to prevent business partners from submitting the same message (normally a file) twice. Any good ideas on how to do this? I am thinking of making a BizTalk solution that will receive all files and use a pipeline that takes an MD5 of each one and logs it to a database somewhere. Then on the send side (to send the file I've just received on to another BizTalk solution) I would have a pipeline check this database.
I don't like accessing a DB in a pipeline, but the only other thing I could think of would be an orchestration (in which I would receive the message as an XmlDocument and just not touch it). Which sounds better? Will the orchestration option increase my memory footprint even if I never do any operation on the received message?
Answers
"the resource article is showing use of streams that will negatively impact memory footprint"
Do you mean loading the entire message into memory? Use ReadOnlySeekableStream or VirtualStream as wrappers to avoid this.
Here is the MSDN article on the subject (http://msdn.microsoft.com/en-us/library/ee377071%28BTS.10%29.aspx):
Use streaming to minimize the memory footprint required when loading messages in pipelines
The following techniques describe how to minimize the memory footprint of a message when loading the message into a pipeline.
Use ReadOnlySeekableStream and VirtualStream to process a message from a pipeline component
It is considered a best practice to avoid loading the entire message into memory inside pipeline components. A preferable approach is to wrap the inbound stream with a custom stream implementation, and then as read requests are made, the custom stream implementation reads the underlying, wrapped stream and processes the data as it is read (in a pure streaming manner). This can be very hard to implement and may not be possible, depending on what needs to be done with the stream. In this case, use the ReadOnlySeekableStream and VirtualStream classes exposed by the Microsoft.BizTalk.Streaming.dll. An implementation of these is also provided in Arbitrary XPath Property Handler (BizTalk Server Sample) (http://go.microsoft.com/fwlink/?LinkId=160069) in the BizTalk SDK.
ReadOnlySeekableStream ensures that the cursor can be repositioned to the beginning of the stream. The VirtualStream will use a MemoryStream internally, unless the size is over a specified threshold, in which case it will write the stream to the file system. Use of these two streams in combination (using VirtualStream as persistent storage for the ReadOnlySeekableStream) provides both “seekability” and “overflow to file system” capabilities. This accommodates the processing of large messages without loading the entire message into memory.
[The article continues with sample code for a pipeline component; it is not reproduced in this post.]
Kiryl Kavalenka
- Marked as answer by Dan Rosanova (MVP), Monday, November 09, 2009 4:12 AM
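As a language-neutral illustration of the "seekable, with overflow to the file system" behaviour described in the quoted article, here is a minimal Python sketch. It is not the MSDN sample and does not use the BizTalk classes. It relies on Python's standard tempfile.SpooledTemporaryFile, which stays in memory up to a size limit and then moves to a temporary file. The function name buffer_seekable and the threshold values are assumptions made for the example, and the sketch copies the input eagerly for simplicity rather than trying to mirror how ReadOnlySeekableStream and VirtualStream work internally.

```python
import io
import tempfile


def buffer_seekable(inbound, threshold=1024 * 1024, chunk_size=64 * 1024):
    """Copy a stream into a seekable buffer (hypothetical helper).

    The buffer is held in memory while it is at most `threshold` bytes
    and is moved to a temporary file once it grows beyond that.
    """
    spool = tempfile.SpooledTemporaryFile(max_size=threshold)
    while True:
        chunk = inbound.read(chunk_size)  # only one chunk in memory at a time
        if not chunk:
            break
        spool.write(chunk)
    spool.seek(0)  # hand the buffer back positioned at the start
    return spool


# Usage: the buffered copy can be read more than once.
source = io.BytesIO(b"x" * 5000)
buffered = buffer_seekable(source, threshold=1024)
first_pass = buffered.read()
buffered.seek(0)
second_pass = buffered.read()
```

The point of the design is that the caller gets a stream it can rewind without ever holding a large message in memory in one piece.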
All Replies
- Hi,
One other option is to take the message and compare it with your database in your receive pipeline. If it is a duplicate, log it somewhere (for auditing) and do not pass it to the MessageBox. If it is a new message and is not found in the DB, take an MD5, log it to the database, and pass the message to the MessageBox.
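The check described above could be sketched as follows. This is a hedged illustration in Python with an in-memory SQLite database standing in for "your database". The table names seen_messages and duplicate_log and the function names are made up for the example, and nothing here is BizTalk API. The hash is the primary key, so the "have we seen it?" test and the "record it" step happen in one statement rather than as a separate lookup followed by an insert.

```python
import hashlib
import sqlite3


def open_store(path=":memory:"):
    """Create the (hypothetical) tables used to track message hashes."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen_messages ("
        " md5 TEXT PRIMARY KEY,"
        " first_seen TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS duplicate_log ("
        " md5 TEXT NOT NULL,"
        " seen_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def accept_message(conn, body: bytes) -> bool:
    """Return True for a new message (and record its hash),
    False for a duplicate (and write an audit row)."""
    digest = hashlib.md5(body).hexdigest()
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO seen_messages (md5) VALUES (?)", (digest,)
        )
        if cur.rowcount == 1:
            return True  # hash was not present: pass the message on
        conn.execute("INSERT INTO duplicate_log (md5) VALUES (?)", (digest,))
        return False  # hash already present: do not pass the message on


# Usage
store = open_store()
results = [accept_message(store, b) for b in (b"order-1", b"order-1", b"order-2")]
```

For brevity the sketch hashes a bytes object that is already in memory; for large files the hash would be computed from the stream in chunks instead.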
Regards,
Tariq Majeed
Please mark it as answer if it helps
- That's pretty good! Sounds like what I want. I do a lot of debatching and wanted to be able to get the MD5 before the messages start getting dispatched, but I think the stream in the pipeline is forward-only and read-only with events, which I think would stop me from getting the MD5 before the debatching.
-Dan
- Debatching takes place during the Disassemble stage of the pipeline. You can put all of your duplicate-checking logic (i.e. read the whole file, compute the MD5, compare it with the DB, log it if it is not a duplicate, throw an exception if it is a duplicate) into a custom Decode component.
"I think the stream in the Pipeline is forward only read only with events, which I think would stop me from getting the MD5 before the debatching."
No, it's not. You can do whatever you want with the stream during the Decode stage. A stream wrapper may be required in some cases, but not in yours.
Here is a decent resource on the subject:
- So this is actually working (although I would point out that the resource article is showing use of streams that will negatively impact memory footprint and do some things I thought you weren't supposed to do in a pipeline).
I got it working, but my messages were always empty, so after debugging I decided I would just set the position on the stream back to zero, and now it works great. My only other issue now: suppose I find a duplicate message, what do I do then?
Effectively I want the message to simply not come out of the receive pipeline. I guess I could throw an exception and just deal with the suspended messages. Can I set any of the properties in the ErrorReport so that I can catch these from any application?
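The two points in this post, rewinding the stream after hashing and rejecting a duplicate by throwing, can be illustrated with a small Python sketch. It is a generic model, not BizTalk code. decode_stage stands in for a custom Decode component, DuplicateMessageError and the in-memory seen_hashes set are hypothetical, and the input is assumed to be a seekable stream. Computing the hash reads the stream to its end, so a later reader gets nothing unless the position is reset, which matches the "empty messages" symptom.

```python
import hashlib
import io


class DuplicateMessageError(Exception):
    """Raised when a message body has been seen before (hypothetical name)."""


def md5_of_stream(stream, chunk_size=64 * 1024):
    """Hash a seekable stream in chunks, then rewind it."""
    md5 = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        md5.update(chunk)
    stream.seek(0)  # without this, the next stage would read an empty stream
    return md5.hexdigest()


def decode_stage(stream, seen_hashes):
    """Stand-in for a Decode step that runs before any debatching."""
    digest = md5_of_stream(stream)
    if digest in seen_hashes:
        # Throwing here stops the message from leaving the stage.
        raise DuplicateMessageError(digest)
    seen_hashes.add(digest)
    return stream  # rewound, ready for the next stage


# Usage
seen = set()
body = decode_stage(io.BytesIO(b"<Order id='1'/>"), seen)
passed_on = body.read()
```

Submitting the same bytes a second time raises DuplicateMessageError instead of returning the stream.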
Kind Regards,
-Dan
- You are not using ESB exception handling, are you?
Kiryl Kavalenka

