complex source text file
-
Friday, April 27, 2012 10:02 AM
Hi All
we are getting text file as source and we need to parse it and load it in a table . the data inside a text file looks as below . Please suggest the best way to load the data in to table . the below contains 2 records like this we will have many records in a file .
Td.Name = MI_UMTS_FTP
Tc.Name = FTP_UMTS_5M
Tp.Name = Ftp_Rating_1.5
TestScenario.Name = Marais_MI_UMTS_FTP
TcRun.Id = 61950500
TcRun.StartDate = 2012/04/24 15:31:59
TcRun.EndDate = 2012/04/24 15:36:13
TcRun.Result = Passed
Error.Step =
Error.Protocol =
O.NeId = PORT Marais
O.Port.Code = xxxxxxxxxx
O.Sim.MsisdnVoice = 9999999999999999
Host.Address = xxxxxxxxxxxx
O.Msc.Name = Prox.SGAM1
O.Region.Name = BRUXELLES
ErrorCause = NoError
O.App.ThrputKbitRxAvg = 351
O.AttachSuccessful = Yes
O.AttachTime = 1.64
O.DetachResult = Yes
O.DetachTime = 2.688
O.PdpCntx.ActTime = 2.172
O.PdpCntx.Successful = Yes
Result = PASSEDTd.Name = Voice_qual
Tc.Name = VoicePesq_2.0_2G
Tp.Name = VoicePesq_2.0
TestScenario.Name = Wavre_Leuven_Voice_qual
TcRun.Id = 61950604
TcRun.StartDate = 2012/04/24 15:33:55
TcRun.EndDate = 2012/04/24 15:36:21
TcRun.Result = Passed
Error.Step =
Error.Protocol =
O.NeId = PORT Wavre_VQM
T.NeId = PORT Leuven_VQM
O.Port.Code = RTUPC01148-02
T.Port.Code = RTUPC01017-02
T.Region.Name = LEUVEN
T.RingNumber = 2
O.Msc.Name = Prox.WAVRE
O.Region.Name = WAVRE
ErrorCause = NoError
O.AttachSuccessful = Yes
O.AudioCodec = AMR_FR
O.DetachSuccessful = Yes
OT.AnswerTimeStamp = 90.887
OT.CallSetupSuccessful = Yes
OT.CallSetupTime = 7.781
OT.CallSetupTimeUnderThrld = Yes
OT.DialStartTimeStamp = 76.406
OT.HangupTimeStamp = 138.734
OT.PesqScoreAvg = 3.67
OT.PesqScoreMax = 3.82
OT.PesqScoreMin = 3.54
OT.PesqUnderThrldPerc = 0
OT.PlayErrorPerc = 0
OT.QoSEvaluateErrorPerc = 0
OT.RingDuration = 5.999
Result = PASSED
T.AttachSuccessful = Yes
T.AudioCodec = AMR_FR
T.Cli.CheckSuccessful = Yes
T.DetachSuccessful = Yes
T.RingSuccessful = Yes
TO.PesqScoreAvg = 3.63
TO.PesqScoreMax = 3.82
TO.PesqScoreMin = 3.51
TO.PesqUnderThrldPerc = 0
TO.PlayErrorPerc = 0
TO.QoSEvaluateErrorPerc = 0
Surendra Thota
- Edited by Surendra_Thota Friday, April 27, 2012 10:09 AM
All Replies
-
Friday, April 27, 2012 10:59 AM
I would read it all in as one column, use a script component to split the data into two columns using the "=" and to assign a recordID to the rows. After that, I would do a pivot on that newly added recordID to get a decent column structure.MCTS, MCITP - Please mark posts as answered where appropriate.

- Proposed As Answer by Nik - Shahriar Nikkhah Friday, April 27, 2012 1:06 PM
-
Friday, April 27, 2012 11:15 AMModerator
I had something simular, where I used a Script Component:http://microsoft-ssis.blogspot.com/2011/02/script-component-as-source-2.html
The difficult thing about your file is that the number of columns differ per record. Is that correct? What is the record delimiter in your case?
Please mark the post as answered if it answers your question | My SSIS Blog: http://microsoft-ssis.blogspot.com | Twitter
- Proposed As Answer by Nik - Shahriar Nikkhah Friday, April 27, 2012 1:06 PM
-
Friday, April 27, 2012 12:53 PMYes the number of colmumns are not fixed . but we know the max number of fields we get . the record delimeter is 2 new lines
Surendra Thota
- Edited by Surendra_Thota Friday, April 27, 2012 12:54 PM
-
Friday, April 27, 2012 12:56 PM
I didn't notice the dynamic number of columns - I have better things to do than count them all - so I would just script it all out with a .NET script task.
Create a variable for each possible column and give them all default values.
Take in a row, parse the text before the = and determine which column it is. Put the value in the right variable.
Read the next row and so on until you've reached a new record. I do hope Td.Name is always your first column.
When you reach the next record, put all variables in one row and add it to the output.Start again with the next record until you reach the end of the file.
MCTS, MCITP - Please mark posts as answered where appropriate.

-
Friday, April 27, 2012 1:09 PM
go with the script solution you have more control
check this link , but remeber that the link does NOT have your answer but it gives you good idea what to do.
it has 3 pages read them all.
Sincerely SH -- MCITP , MCTS -- Please kindly mark the post(s) that answered your question and/or vote for the post(s).
-
Friday, April 27, 2012 1:12 PM
That's quite a nice article, thanks for the reference.go with the script solution you have more control
check this link , but remeber that the link does NOT have your answer but it gives you good idea what to do.
it has 3 pages read them all.
MCTS, MCITP - Please mark posts as answered where appropriate.

-
Friday, April 27, 2012 2:19 PM
Hi
In one my project i have handled this type of files Colmun1:column2 as a row , but now how to identify the end of row . and how to identify the endof the file.
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
Dim strRow As String
Dim strColSeperator As String
Dim rowValues As String()
strRow = Row.line.ToString()
If strRow.Contains(";") Then
strColSeperator = (
";")
ElseIf strRow.Contains(":") Then
strColSeperator =
":"
End If
rowValues = Row.line.Split(
CChar(strColSeperator))
Row.CounterID = rowValues.GetValue(0).ToString()
Row.CounterValue = rowValues.GetValue(1).ToString()
Row.FileName = FileName.ToString()
Row.ServerName = ServerName.ToString()
Row.Year = cYear.ToString()
Row.Month = cMonth.ToString()
Row.Date = colDate.ToString()
Row.Hour = cHour.ToString()
Row.Min = cMin.ToString()
End Sub
Surendra Thota
-
Friday, April 27, 2012 2:21 PMModeratorlook for the empty row.... where the length of the array is 0 or before the split where the length of Row.line is 0.
Please mark the post as answered if it answers your question | My SSIS Blog: http://microsoft-ssis.blogspot.com | Twitter
- Edited by SSISJoostMicrosoft Community Contributor, Moderator Friday, April 27, 2012 2:22 PM
-
Monday, April 30, 2012 10:18 AMHow can we split each row in Faltfile connection manager . when we connect to the file it is getting each column as row . the problem here is we have multiple row with 2 blank rows as delimeter . please let me know if anyone handled these type of files .
Surendra Thota
-
Tuesday, May 01, 2012 12:56 PM
Yes the number of colmumns are not fixed . but we know the max number of fields we get . the record delimeter is 2 new lines
Surendra Thota
And can we say that all records have the field "Td.Name"?
and can we say that each records first field is "Td.Name"?
If so i have have a way all i need is a sample file ( if you like you can email (SNikkhah@live.ca) me a sample file, please remove confidential information)
Sincerely SH -- MCITP , MCTS -- Please kindly mark the post(s) that answered your question and/or vote for the post(s).
-
Tuesday, May 01, 2012 3:48 PM
I HAVE YOUR ANSWER
assuming you have data like ....
--------------------
Td.Name = MI_UMTS_FTP
Tc.Name = FTP_UMTS_5M
TcRun.EndDate = 2012/04/24 15:36:13
Error.Step =
Error.Protocol =
O.PdpCntx.ActTime = 2.172
O.PdpCntx.Successful = Yes
Result = PASSED
Td.Name = Voice_qual
O.PdpCntx.ActTime = 2.172
TcRun.Id = 61950604
TcRun.EndDate = 2012/04/24 15:36:21
TcRun.Result = Passed
Error.Step =
O.AudioCodec = AMR_FR
---------------------make a DFT and make the mentioned objects like ....
In the first DER you simply split the data from the data column name, and in theseconf DFT you simply RLTrim off space
FieldColumnName = FINDSTRING([Column 0],"=",1) < 3 ? "" : SUBSTRING([Column 0],1,(FINDSTRING([Column 0],"=",1) - 1))
DataValue = FINDSTRING([Column 0],"=",1) < 3 ? "" : SUBSTRING([Column 0],(FINDSTRING([Column 0],"=",1) + 1),LEN([Column 0]))
set the script to have string outputs and use the mentioned code
' Microsoft SQL Server Integration Services Script Component ' Write scripts using Microsoft Visual Basic 2008. ' ScriptMain is the entry point class of the script. Imports System Imports System.Data Imports System.Math Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper Imports Microsoft.SqlServer.Dts.Runtime.Wrapper <Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute()> _ <CLSCompliant(False)> _ Public Class ScriptMain Inherits UserComponent Dim StartFlag As Integer = 0 Dim RowCount As Integer = 0 Dim _strDataValue As String = "" Dim _strColName As String = "" Dim _TdName As String = "" Dim _TcName As String = "" Dim _TcRunEndDate As String = "" Dim _ErrorStep As String = "" Dim _ErrorProtocol As String = "" Dim _OPdpCntxActTime As String = "" Dim _OPdpCntxSuccessful As String = "" Dim _Result As String = "" Dim _TcRunId As String = "" Dim _TcRunResult As String = "" Dim _OAudioCodec As String = "" Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer) ' ' Add your code here ' RowCount = RowCount + 1 Dim strDataValue As String Dim strColName As String strColName = Row.FieldColumnName.ToString strDataValue = Row.DataValue.ToString 'If strColName = "Td.Name" And StartFlag = 1 Then If strColName = "" And strDataValue = "" And StartFlag = 1 Then Row.TdName = _TdName.ToString Row.TcName = _TcName.ToString Row.TcRunEndDate = _TcRunEndDate 'TcRun.EndDate Row.ErrorStep = _ErrorStep ' Error.Step Row.ErrorProtocol = _ErrorProtocol ' Error.Protocol Row.OPdpCntxActTime = _OPdpCntxActTime 'O.PdpCntx.ActTime Row.OPdpCntxSuccessful = _OPdpCntxSuccessful 'O.PdpCntx.Successful Row.Result = _Result ' Result Row.TcRunId = _TcRunId 'TcRun.Id Row.TcRunResult = _TcRunResult 'TcRun.Result Row.OAudioCodec = _OAudioCodec 'O.AudioCodec StartFlag = 0 _TdName = "" _TcName = "" _TcRunEndDate = "" _ErrorStep = "" _ErrorProtocol = "" _OPdpCntxActTime = "" _OPdpCntxSuccessful = "" _Result = "" _TcRunId = "" _TcRunResult = "" _OAudioCodec = "" Row.DirectRowToMainOutputRecordsBuffer() End If If strColName = "Td.Name" Then _TdName = strDataValue ElseIf strColName = "Tc.Name" Then _TcName = strDataValue ElseIf strColName = "TcRun.EndDate" Then _TcRunEndDate = strDataValue ElseIf strColName = "Error.Step" Then _ErrorStep = strDataValue ElseIf strColName = "Error.Protocol" Then _ErrorProtocol = strDataValue ElseIf strColName = "O.PdpCntx.ActTime" Then _OPdpCntxActTime = strDataValue ElseIf strColName = "O.PdpCntx.Successful" Then _OPdpCntxSuccessful = strDataValue ElseIf strColName = "Result" Then _Result = strDataValue ElseIf strColName = "TcRun.Id" Then _TcRunId = strDataValue ElseIf strColName = "TcRun.Result" Then _TcRunResult = strDataValue ElseIf strColName = "O.AudioCodec" Then _OAudioCodec = strDataValue End If StartFlag = 1 End Sub End Classyour final result will look like
Sincerely SH -- MCITP , MCTS -- Please kindly mark the post(s) that answered your question and/or vote for the post(s).
- Proposed As Answer by Nik - Shahriar Nikkhah Tuesday, May 01, 2012 3:48 PM
-
Tuesday, May 01, 2012 3:49 PM
You may have to change the IF part of the code
If strColName = "" And strDataValue = "" And StartFlag = 1 Then
depending on your needs
I can email you the package if you like
good luck
Sincerely SH -- MCITP , MCTS -- Please kindly mark the post(s) that answered your question and/or vote for the post(s).
- Edited by Nik - Shahriar Nikkhah Tuesday, May 01, 2012 3:49 PM
-
Wednesday, May 02, 2012 7:57 AM
Hi All
i came up with this script when i debug it is working fine i debug at each location and everything is giving perfect value . but when i add dataviewer after the script component it is throwing only null values . Is there any problem in assigning the Bufferedcolumns ?
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper<Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute> _
<CLSCompliant(False)> _
Public Class ScriptMain
Inherits UserComponent
Dim sr As System.IO.StreamReader
Dim FileProcessingNumber As Long
Dim TcRunEnddate As String
Dim FilePath As String
Dim test As StringPublic Overrides Sub PreExecute()
MyBase.PreExecute()
FilePath = Variables.FileName
sr = New System.IO.StreamReader(FilePath)
FileProcessingNumber = Variables.FileProgressiveNumber
TcRunEnddate = Variables.TcRunEndDate
End SubPublic Overrides Sub PostExecute()
MyBase.PostExecute()
sr.Close()
End SubPublic Overrides Sub CreateNewOutputRows()
Dim lineIndex As Integer = 0
Dim EmptyRow As Integer = 0
Output0Buffer.AddRow()
While (Not sr.EndOfStream)
'Output0Buffer.AddRow()Dim line As String = sr.ReadLine()
If (line.Length = 0) Then
EmptyRow = EmptyRow + 1
If (EmptyRow = 2) Then
Output0Buffer.AddRow()
EmptyRow = 0
'MsgBox("new row")
GoTo line1End If
End IfDim columnArray As String() = line.Split(Convert.ToChar("="))
If (columnArray(0).Length > 0) ThenOutput0Buffer.tdname = Nothing
Output0Buffer.tcname = Nothing
Output0Buffer.testscenarioname = Nothing
Output0Buffer.tcrunid = Nothing
Output0Buffer.tcrunstartdate = Nothing
Output0Buffer.tcrunresult = Nothing
Output0Buffer.errorstep = Nothing
Output0Buffer.errorprotocol = Nothing
Output0Buffer.oneid = Nothing
Output0Buffer.tneid = Nothing
Output0Buffer.oportcode = Nothing
Output0Buffer.tportcode = Nothing
Output0Buffer.oregionname = Nothing
Output0Buffer.osimmsisdnvoice = Nothing
Output0Buffer.tsimmsisdnvoice = Nothing
Output0Buffer.server = Nothing
Output0Buffer.errorcause = Nothing
Output0Buffer.errorcausefailurereason = Nothing
Output0Buffer.errorcausephasename = Nothing
Output0Buffer.errorcauserole = Nothing
Output0Buffer.errorcausesteperror = Nothing
Output0Buffer.errorcauseuserdefined = Nothing
Output0Buffer.tsmeas = Nothing
Output0Buffer.tcrunenddate = Nothing
Output0Buffer.FileProgressiveNumber = Nothing
If (columnArray(0).ToString = "Td.Name ") Then
Output0Buffer.tdname = columnArray(1)
ElseIf (columnArray(0) = "Tc.Name ") Then
Output0Buffer.tcname = columnArray(1)
ElseIf (columnArray(0) = "TestScenario.Name ") Then
Output0Buffer.testscenarioname = columnArray(1)
ElseIf (columnArray(0) = "TcRun.Id ") Then
Output0Buffer.tcrunid = columnArray(1).Replace(" ", "0")
ElseIf (columnArray(0) = "TcRun.StartDate ") Then
Output0Buffer.tcrunstartdate = columnArray(1)
ElseIf (columnArray(0) = "TcRun.Result ") Then
Output0Buffer.tcrunresult = columnArray(1)
ElseIf (columnArray(0) = "Error.Step ") Then
Output0Buffer.errorstep = columnArray(1)
ElseIf (columnArray(0) = "Error.Protocol ") Then
Output0Buffer.errorprotocol = columnArray(1)
ElseIf (columnArray(0) = "O.NeId ") Then
Output0Buffer.oneid = columnArray(1)
ElseIf (columnArray(0) = "O.Port.Code ") Then
Output0Buffer.oportcode = columnArray(1)
ElseIf (columnArray(0) = "O.Sim.MsisdnVoice ") Then
Output0Buffer.osimmsisdnvoice = columnArray(1)
ElseIf (columnArray(0) = "T.NeId ") Then
Output0Buffer.tneid = columnArray(1)
ElseIf (columnArray(0) = "T.Port.Code ") Then
Output0Buffer.tportcode = columnArray(1)
ElseIf (columnArray(0) = "O.Region.Name ") Then
Output0Buffer.oregionname = columnArray(1)
ElseIf (columnArray(0) = "T.Sim.MsisdnVoice ") Then
Output0Buffer.tsimmsisdnvoice = columnArray(1)
ElseIf (columnArray(0) = "Server ") Then
Output0Buffer.server = columnArray(1)
ElseIf (columnArray(0) = "ErrorCause ") Then
Output0Buffer.errorcause = columnArray(1)
ElseIf (columnArray(0) = "ErrorCause.FailureReason ") Then
Output0Buffer.errorcausefailurereason = columnArray(1)
ElseIf (columnArray(0) = "ErrorCause.PhaseName ") Then
Output0Buffer.errorcausephasename = columnArray(1)
ElseIf (columnArray(0) = "ErrorCause.Role ") Then
Output0Buffer.errorcauserole = columnArray(1)
ElseIf (columnArray(0) = "ErrorCause.StepError ") Then
Output0Buffer.errorcausesteperror = columnArray(1)
ElseIf (columnArray(0) = "ErrorCause.UserDefinedErrorCause ") Then
Output0Buffer.errorcauseuserdefined = columnArray(1)
ElseIf (columnArray(0) = "TcRun.EndDate ") Then
Output0Buffer.tcrunenddate = columnArray(1)
End If
'Output0Buffer.tsmeas = TcRunEnddate
Output0Buffer.FileProgressiveNumber = FileProcessingNumber
End Ifline1: lineIndex = lineIndex + 1
End WhileEnd Sub
End Class
Surendra Thota
- Proposed As Answer by Rahul Kumar (Rahul Vairagi) Wednesday, May 02, 2012 11:44 AM
- Marked As Answer by Surendra_Thota Wednesday, May 02, 2012 12:22 PM
-
Wednesday, May 02, 2012 10:45 AM
Hi All
code is working perfectly after removing the Output0Buffer.tdname = Nothing . can anybody tell why = nothing is giving no value even after we assigning a value Output0Buffer.tdname = columnArray(1)
Surendra Thota
-
Thursday, May 03, 2012 10:55 AM
Hi All
how to put default value to a column . as of now if there is no value to a column it is giving null . but i need to put some default value . when i tried assign a value before the process it is giving only that default value even it has some value . what i understand is Output0Buffer value can be assigned only once . we cant overwrite it .
Surendra Thota

