Segregating flat files based on a promoted property

  • Question

  • I have a requirement for splitting the contents of a flat file based on one field. The input file structure is as follows:

    Header|some id for the batch|date for the batch CRLF

    Detail|batch id|Company1|192.00 CRLF

    Detail|batch id|Company2|200.00 CRLF

    Detail|batch id |Company1|180.00 CRLF

    Detail|batch id|Company2|89.02 CRLF

    I want the output to be two flat files, each containing only one company's data, as follows:


    Header|some id for the batch|date for the batch CRLF

    Detail|batch id|Company1|192.00 CRLF

    Detail|batch id |Company1|180.00 CRLF


    Header|some id for the batch|date for the batch CRLF

    Detail|batch id|Company2|200.00 CRLF

    Detail|batch id|Company2|89.02 CRLF
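
    Outside BizTalk, the intended transformation can be sketched as below. This only illustrates the desired result (header repeated, detail lines grouped by the third field); it is not the orchestration itself. The function name and the sample batch id and date are hypothetical placeholders.

```python
def split_by_company(text, field_index=2, delimiter="|"):
    """Group Detail lines by company and repeat the Header in every group."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header, details = lines[0], lines[1:]
    groups = {}
    for line in details:
        # The company name is assumed to be the third pipe-delimited field.
        company = line.split(delimiter)[field_index].strip()
        groups.setdefault(company, []).append(line)
    # One output "file" per company, each line terminated by CRLF.
    return {
        company: "\r\n".join([header] + rows) + "\r\n"
        for company, rows in groups.items()
    }


# Hypothetical sample values standing in for the batch id and date.
sample = "\r\n".join([
    "Header|B1|2018-05-14",
    "Detail|B1|Company1|192.00",
    "Detail|B1|Company2|200.00",
    "Detail|B1|Company1|180.00",
    "Detail|B1|Company2|89.02",
]) + "\r\n"

files = split_by_company(sample)
```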

    The solution I tried was:

    1. Debatch this file at the receive port, which has a flat file disassembler configured with the header and detail schemas.

    2. The debatched messages are picked up by a receive shape (message type of details) in the orchestration, which initializes a correlation on Company Name.

    3. This is followed by a loop shape that loops until a timeout variable is set to true.

    4. Within the loop is a listen shape. The left branch has a receive shape (message type of details) that follows the correlation on Company Name; my logic for aggregating the messages sits in this branch. The right branch has a delay of 30 seconds, and in this branch the timeout variable is set to true.
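
    The loop in steps 3 and 4 behaves roughly like the sketch below, where a plain in-memory queue stands in for the correlated subscription. This is a simplified model and not BizTalk code; the function name, the sample rows and the short timeout are assumptions for illustration. It also shows why an idle timeout is fragile: a message that arrives after the loop has exited is never picked up by that instance.

```python
import queue


def aggregate_until_idle(inbox, idle_timeout):
    """Collect messages until none arrives within idle_timeout seconds."""
    collected = []
    timed_out = False
    while not timed_out:
        try:
            # "Left branch": the next correlated message arrives in time.
            collected.append(inbox.get(timeout=idle_timeout))
        except queue.Empty:
            # "Right branch": the delay fired first, so stop looping.
            timed_out = True
    return collected


inbox = queue.Queue()
for row in ["Detail|B1|Company1|192.00", "Detail|B1|Company1|180.00"]:
    inbox.put(row)

batch = aggregate_until_idle(inbox, idle_timeout=0.05)

# A message arriving after the loop has exited stays unconsumed.
inbox.put("Detail|B1|Company1|75.00")
leftover = inbox.qsize()
```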

    But this results in "The instance completed without consuming all of its messages. The instance and its unconsumed messages have been suspended."

    I tried moving the listen-receive-delay inside a decide shape with the condition BTS.LastInterchangeMessage exists Msg_Detail. If this branch evaluates to true, I set the timeout variable to true.
    Even this did not work. I tried this based on the suggestion made here:

    Monday, May 14, 2018 3:59 AM


All replies

  • What do you do after you split the files? Is that it?


    Monday, May 14, 2018 5:04 AM
  • Yes, that's it. I just segregate it, club all files of the same company name, and send them to a send port where a send pipeline with the FF assembler assembles them back into FF format.

    The basic problem is: I am unable to decide when to stop waiting for the subsequent debatched messages in the orchestration. Right now the delay time is arbitrary. I tried BTS.LastInterchangeMessage exists Msg_Detail, but that evaluates to true only once and for only one company name. Basically, I would need to know the last message among each company's details.
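
    A small model of why a single "last message" flag is not enough: if only the final debatched detail carries the flag, only the company on that line ever sees it. The helper below is hypothetical and merely imitates the flag; it does not call any BizTalk API.

```python
def tag_last_in_interchange(details):
    """Pair each detail line with a flag that is True only for the last one."""
    last = len(details) - 1
    return [(row, index == last) for index, row in enumerate(details)]


# Hypothetical sample rows in the same order as the input file.
details = [
    "Detail|B1|Company1|192.00",
    "Detail|B1|Company2|200.00",
    "Detail|B1|Company1|180.00",
    "Detail|B1|Company2|89.02",
]

# Companies whose convoy would ever see the "last message" flag.
flagged_companies = {
    row.split("|")[2].strip()
    for row, is_last in tag_last_in_interchange(details)
    if is_last
}
all_companies = {row.split("|")[2].strip() for row in details}
never_flagged = all_companies - flagged_companies
```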

    • Edited by a.k.4.7 Monday, May 14, 2018 5:27 AM
    Monday, May 14, 2018 5:26 AM
  • In your scenario BTS.LastInterchangeMessage might not work when more than one company is involved in the same file. That said, IMHO the problem you're facing is not in the termination of the loop but in writing the aggregate (that is why the orchestration completes, but since there is no "send" the messages are not consumed).

    What is the size of the file you're processing? What happens to the processed files? Are they fed into another orchestration to process?


    Monday, May 14, 2018 8:26 AM
  • BTS.LastInterchangeMessage will not work. 

    Actually, the orchestration has to wait until all messages are in. As each message comes in, it is added to a variable of type Microsoft.XLANGs.Pipeline.SendPipelineInputMessages, which is then executed through a pipeline with an XML assembler component containing the envelope and document schemas, so as to aggregate the messages, each aggregate containing only one company's data.

    Right now I have solved the problem by increasing the delay time to 30 minutes, but I am pretty sure it is an ad hoc solution that may not work with larger files. With small files there is no problem.

    But I am looking for a foolproof way of knowing that I have received all messages for all company names, so that I can safely exit the receive loop. Is there any way to do that?

    • Edited by a.k.4.7 Monday, May 14, 2018 9:45 AM
    Monday, May 14, 2018 9:44 AM
  • A delay in a sequential convoy implies that no more messages are available. If your file contains a very large set, then your debatching is basically flooding the MessageBox, and on a test server that is probably what is causing the issues with the delay setting.

    What you've described is nothing more than splitting a file into "x" files based on a field value, which might not be the best use of the framework. What happens to the files after this? Do they go into different systems, or are they further manipulated?


    Monday, May 14, 2018 10:25 AM
  • What happens after this is: some downstream app processes these different sets of files. Files for Company1 are processed by App1, files for Company2 by App2, and so on.

    Yes, this is being tested in a development environment. Are you suggesting that this problem might not occur in the PROD environment because of its higher configuration?

    Monday, May 14, 2018 10:40 AM
  • Are the Company values fixed, like just Contoso and Fabrikam, or can they change file to file?
    Monday, May 14, 2018 11:02 AM
  • As of now they are fixed, but in future we may need to segregate files for more companies, so in a way they are not fixed.
    Monday, May 14, 2018 11:44 AM
  • This type of operation, grouping/sorting/etc, is rather trivial in SQL vs XSL.

    So, here's my currently preferred option for doing such:

    BizTalk: Sorting and Grouping Flat File Data In SQL Instead of XSL

    You don't even need tables or anything. It can all be done in the Stored Procedure.
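
    To give a feel for the idea of letting SQL do the grouping, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. It is not the stored procedure from the linked article. Unlike the approach described above, it uses a scratch in-memory table, and the table name, column names and sample rows are all assumptions for illustration.

```python
import sqlite3
from itertools import groupby

# Hypothetical detail rows: (batch id, company, amount).
rows = [
    ("B1", "Company1", "192.00"),
    ("B1", "Company2", "200.00"),
    ("B1", "Company1", "180.00"),
    ("B1", "Company2", "89.02"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE detail ("
    "seq INTEGER PRIMARY KEY, batch_id TEXT, company TEXT, amount TEXT)"
)
conn.executemany(
    "INSERT INTO detail (batch_id, company, amount) VALUES (?, ?, ?)", rows
)

# SQL does the sorting: rows come back grouped by company, in input order.
ordered = conn.execute(
    "SELECT company, batch_id, amount FROM detail ORDER BY company, seq"
).fetchall()
conn.close()

# Rebuild the pipe-delimited detail lines, one list per company.
grouped = {
    company: ["Detail|{}|{}|{}".format(batch, company, amount)
              for _, batch, amount in group]
    for company, group in groupby(ordered, key=lambda r: r[0])
}
```

    Because the whole batch is grouped in one call, there is no convoy and no need to guess when the last message for a company has arrived.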

    • Marked as answer by a.k.4.7 Monday, May 14, 2018 2:12 PM
    Monday, May 14, 2018 12:58 PM
  • Big Thank you Johns-305. That looks like an awesome solution :)
    Monday, May 14, 2018 2:13 PM