Large Flat File Debatching

  • Question

  • I have a flat file batch of around 250 MB to process. Every individual message needs to be processed, and I am weighing the de-batching options I have, viz. an envelope or de-batching inside the orchestration. Has anyone processed a message of that size, and what processing time can I expect?
    Monday, February 8, 2010 3:37 PM

Answers

  • Processing a 250 MB file uses a lot of CPU and also causes throttling. It also depends on how busy your BizTalk server will be with other message processing. I would advise you to write a small C# component that breaks up the flat file and sends smaller files to BizTalk.
    KiranMP
    Monday, February 8, 2010 3:45 PM
  • Hi,

    Do you also need to convert the messages to Xml?

    The best place to disassemble large messages is in a pipeline. You can use the out-of-the-box flat file disassembler, or create your own disassembler if that does not fit. Make sure you take a streaming approach to keep pressure on resources (CPU, memory, etc.) as low as possible.

    This link has some information about large messages: http://msdn.microsoft.com/en-us/library/aa560481(BTS.20).aspx

    The good thing about the pipeline approach is that individual disassembled messages are put in the MessageBox as soon as they are completely read. You don't have to wait until the whole batch completes.


    HTH,

    Randal van Splunteren - MVP, MCTS BizTalk Server
    http://biztalkmessages.vansplunteren.net

    Please mark as answered if this answers your question.

    Check out the PowerShell provider for BizTalk: http://psbiztalk.codeplex.com
    Monday, February 8, 2010 3:46 PM
    Moderator
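The streaming point in the reply above can be illustrated outside BizTalk. The following is a sketch in Python, not the flat file disassembler's actual implementation: it reads a stream in fixed-size chunks and hands out each record as soon as its delimiter has been seen, so a consumer can start working before the rest of the batch is read. The function name `iter_records`, the `\r\n` delimiter, and the 64 KB chunk size are assumptions.

```python
from typing import BinaryIO, Iterator


def iter_records(stream: BinaryIO, delimiter: bytes = b"\r\n",
                 chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield delimiter-terminated records one at a time from a stream.

    Only the current chunk plus any incomplete trailing record is held
    in memory, never the whole batch.
    """
    if not delimiter:
        raise ValueError("delimiter must not be empty")
    buffer = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # Everything before the last delimiter is complete; the final
        # piece may be a partial record and stays in the buffer.
        *complete, buffer = buffer.split(delimiter)
        for record in complete:
            yield record
    if buffer:
        # Last record in the stream had no trailing delimiter.
        yield buffer
```

Because this is a generator, a loop such as `for record in iter_records(open(path, "rb")): ...` processes the first record before the second is even parsed, which mirrors the behaviour described for the pipeline.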
  • Hi Dan,

    The only reason to do it from an orchestration would be if you needed to deal with the flat file as a batch (e.g. to know the record count or the start/stop time of debatching), but that would cause what Kiran is saying. I suggest following Randal's and Kiran's advice, though. There are sample implementations such as: http://www.codeproject.com/KB/biztalk/DebatchingFlatfile.aspx or http://www.richardhallgren.com/efficient-grouping-and-debatching-of-big-files-using-biztalk-2006/ (if you need grouping).

    Regards,

    Steef-Jan Wiggers
    MCTS BizTalk Server
    http://soa-thoughts.blogspot.com/
    If this answers your question please mark it accordingly


    BizTalk Server
    Monday, February 8, 2010 5:18 PM
    Moderator
  • In case someone comes across a similar situation, here is the solution I followed.
    The batch file I started with had two-tier batching: Invoices, with the InvoiceDetails batched inside each Invoice.
    I used the out-of-the-box Flat File Disassembler in a pipeline to de-batch the file into Invoices, and then used a NodeList and looping with the enumerator to de-batch the invoice details in an orchestration. The process seems to work as expected, and performance is not bad considering the initial flat file size of 250 to 300 MB.
    • Marked as answer by DPS Bali Monday, March 1, 2010 1:47 PM
    Monday, February 15, 2010 6:11 PM
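The second tier described above, looping over the detail nodes of one already de-batched invoice, can be sketched as follows. This is a Python illustration of the idea, not the orchestration code from the post: the element name `InvoiceDetail` is taken loosely from the description, and the actual schema, the sample XML, and the function name `iter_invoice_details` are assumptions.

```python
import xml.etree.ElementTree as ET
from typing import Iterator


def iter_invoice_details(invoice_xml: str,
                         detail_tag: str = "InvoiceDetail") -> Iterator[str]:
    """Yield each detail element of a single invoice as its own XML string."""
    root = ET.fromstring(invoice_xml)
    # iter() visits matching elements in document order, comparable to
    # enumerating a node list selected from the invoice message.
    for detail in root.iter(detail_tag):
        # Drop whitespace that followed the element in the source so it
        # does not leak into the serialized fragment.
        detail.tail = None
        yield ET.tostring(detail, encoding="unicode")


if __name__ == "__main__":
    sample = (
        "<Invoice><Number>42</Number>"
        "<InvoiceDetail><Sku>A</Sku></InvoiceDetail>"
        "<InvoiceDetail><Sku>B</Sku></InvoiceDetail>"
        "</Invoice>"
    )
    for fragment in iter_invoice_details(sample):
        print(fragment)
```

Each invoice is small compared with the 250 to 300 MB batch, so loading one invoice into memory at this stage is cheap. The expensive split has already happened in the streaming pipeline.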
