none
Split string and load data into particular fields

    Question

  • I have following data in my file:

    'EmpID: 12345  EmpName: XYZ   Designation: Programmer Analyst   Location: California  Joining On:  Sep 10, 2013 3:10 pm Link:  https://www.abc.com'

    I need to split string by delimiter, grab field name and load into that particular field. Output i m looking for is below. Not sure if this can be done using T-SQL. Any help much appreciated.

    EmpID: 12345

    EmpName: XYZ

    Designation: Programmer Analyst

    Location: California

    Joining On:  Sep 10, 2013 3:10 pm

    Link:  https://www.abc.com'

    Monday, October 28, 2013 11:37 PM

Answers

  • I have following data in my file:

    Do you mean that you have a file you want to import to the database?

    If so, then yolu can use Bulk Insert. this is a simple Bulk Insert operation. for example if you have data file (Test01.csv) like this:

    'EmpID: 1  EmpName: d   Designation: Programmer Analyst   Location: California  Joining On:  Sep 10, 2013 3:10 pm Link:  https://www.abc.com'
    'EmpID: 2  EmpName: f   Designation: Programmer Analyst   Location: California  Joining On:  Sep 10, 2013 3:10 pm Link:  https://www.abc.com'
    'EmpID: 3  EmpName: g   Designation: Programmer Analyst   Location: California  Joining On:  Sep 10, 2013 3:10 pm Link:  https://www.abc.com'
    'EmpID: 4  EmpName: sfsd   Designation: Programmer Analyst   Location: California  Joining On:  Sep 10, 2013 3:10 pm Link:  https://www.abc.com'

    and you want to import it to this table:

    create table #test (
    	EmpID          INT,
    	EmpName        NVARCHAR(100),
    	Designation    NVARCHAR(100),
    	Location       NVARCHAR(100),
    	Joining_On     DATETIME,
    	Link           NVARCHAR(100)
    )
    GO

    Then all you need to do is build a format file (XML). like this one:

    <?xml version="1.0"?>
      <BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <RECORD>
          <FIELD ID="0" xsi:type="CharFixed"                                            LENGTH="8"            />
          <FIELD ID="1" xsi:type="CharTerm"                                                               TERMINATOR="EmpName:"    />
          <FIELD ID="2" xsi:type="CharTerm"                                             MAX_LENGTH="100"  TERMINATOR='Designation: '  />
          <FIELD ID="3" xsi:type="CharTerm"   COLLATION="Hebrew_CI_AS"                  MAX_LENGTH="100"  TERMINATOR='Location: '    />
          <FIELD ID="4" xsi:type="CharTerm"   COLLATION="Hebrew_CI_AS"                  MAX_LENGTH="100"  TERMINATOR='Joining On: ' />
          <FIELD ID="5" xsi:type="CharTerm"   COLLATION="SQL_Latin1_General_CP1_CI_AS"  MAX_LENGTH="100"  TERMINATOR='Link: ' />
          <FIELD ID="6" xsi:type="CharTerm"   COLLATION="Hebrew_CI_AS"                  MAX_LENGTH="100"  TERMINATOR='&#39;\r\n' />
        </RECORD>
        <ROW>
          <COLUMN SOURCE="1" NAME="EmpID"          xsi:type="SQLINT"/>
          <COLUMN SOURCE="2" NAME="EmpName"        xsi:type="SQLNVARCHAR"/>
          <COLUMN SOURCE="3" NAME="Designation"    xsi:type="SQLNVARCHAR"/>
          <COLUMN SOURCE="4" NAME="Location"       xsi:type="SQLNVARCHAR"/>
          <COLUMN SOURCE="5" NAME="Joining_On"     xsi:type="SQLDATETIME"/>
          <COLUMN SOURCE="6" NAME="Linkn"          xsi:type="SQLNVARCHAR"/>
        </ROW>
      </BCPFORMAT>

    and use this query to import the data (the bulk insert operation will do the split according to the Format File):

    BULK INSERT #test 
    FROM 'C:\ArielyBulkInsertTesting\Test01.csv'
    WITH (
        FORMATFILE='C:\ArielyBulkInsertTesting\Test01.xml'
        , MAXERRORS = 10
        --, KEEPNULLS
        --, DATAFILETYPE = 'native'
        --, CODEPAGE='RAW'
        --, ROWTERMINATOR='\r\n'
        --, FIRSTROW = 3
    );
    GO

    If the data is not from external file then you can use any split function and split the data several times using all those values in order to get all the data split:

    "EmpID:","EmpName:","Designation:","Location:","Joining_On:","Link:"

    orjust find the string between two values to get the specific value that you need. for example if you need only the "EmpName" then just use parsing like this maybe:

    declare @Str nvarchar(1000) = 'EmpID: 4  EmpName: sfsd   Designation: Programmer Analyst   Location: California  Joining On:  Sep 10, 2013 3:10 pm Link:  https://www.abc.com'
    select SUBSTRING(@Str,CHARINDEX('EmpName:',@Str)+8,CHARINDEX('Designation:',@Str)-CHARINDEX('EmpName:',@Str)-8)

    * you can use CLR split Function for this


    [Personal Site] [Blog] [Facebook]signature

    Tuesday, October 29, 2013 7:41 AM
    Moderator
  • Hi,

    If you have a delimiter to separate the string then below is one approach to do so.

    DECLARE @str VARCHAR(500)
    SET @str='EmpID: 12345 |EmpName: XYZ  |Designation: Programmer Analyst  |Location: California |Joining On:  Sep 10, 2013 3:10 pm |Link:  https://www.abc.com'
    
    ;WITH CTE_Split AS (
    	SELECT 1 AS seq,
    		LEFT(@str,CHARINDEX('|',@str)-1) AS string, 
    		RIGHT(@str,LEN(@str) - CHARINDEX('|',@str)) AS left_string
    	UNION ALL
    	SELECT seq+1, 
    		LEFT(left_string,CHARINDEX('|',left_string)-1) AS string, 
    		RIGHT(left_string,LEN(left_string) - CHARINDEX('|',left_string)) AS left_string 
    	FROM CTE_Split WHERE CHARINDEX('|',left_string) <> 0
    	)
    SELECT * FROM CTE_split
    UNION ALL
    SELECT seq+1, left_string,'' FROM CTE_Split WHERE seq=(SELECT MAX(seq) FROM CTE_Split)


    sarat chandra sahoo

    • Marked as answer by SQL Insane Wednesday, October 30, 2013 8:42 PM
    Tuesday, October 29, 2013 7:23 AM

All replies

  • Hi there.

    there is a function to split string into rows.

    google for fnSplit.


    karepa

    Tuesday, October 29, 2013 5:18 AM
  • What is the Delimiter in the string?

    There are multiple spaces within a value (ex. Programmer Analyst), if the delimiter is single space , it would be difficult to get individual values.


    Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.

    Tuesday, October 29, 2013 5:37 AM
  • As Sarat suggested, It looks there is no distinct value to split your text. It is possible to split again, however, there are many work behind we need to do so. 

    To avoid all, why cant we introduce a logical delimitter (it can be , or ; etc) in the file generation for each feild? This would avoid lot more work in SQL Server. Please check the feasibility of the same. 


    Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.

    Tuesday, October 29, 2013 5:43 AM
  • Here is one approach.

    Split the string into a table variable (see dbo.fnSplitStringListXML) using space as delimiter and work from there on:

    http://www.sqlusa.com/bestpractices/training/scripts/userdefinedfunction/

    It appears that you can use <alpha>: as token.


    Kalman Toth Database & OLAP Architect IPAD SELECT Query Video Tutorial 3.5 Hours
    New Book / Kindle: Exam 70-461 Bootcamp: Querying Microsoft SQL Server 2012



    Tuesday, October 29, 2013 7:14 AM
    Moderator
  • Hi,

    If you have a delimiter to separate the string then below is one approach to do so.

    DECLARE @str VARCHAR(500)
    SET @str='EmpID: 12345 |EmpName: XYZ  |Designation: Programmer Analyst  |Location: California |Joining On:  Sep 10, 2013 3:10 pm |Link:  https://www.abc.com'
    
    ;WITH CTE_Split AS (
    	SELECT 1 AS seq,
    		LEFT(@str,CHARINDEX('|',@str)-1) AS string, 
    		RIGHT(@str,LEN(@str) - CHARINDEX('|',@str)) AS left_string
    	UNION ALL
    	SELECT seq+1, 
    		LEFT(left_string,CHARINDEX('|',left_string)-1) AS string, 
    		RIGHT(left_string,LEN(left_string) - CHARINDEX('|',left_string)) AS left_string 
    	FROM CTE_Split WHERE CHARINDEX('|',left_string) <> 0
    	)
    SELECT * FROM CTE_split
    UNION ALL
    SELECT seq+1, left_string,'' FROM CTE_Split WHERE seq=(SELECT MAX(seq) FROM CTE_Split)


    sarat chandra sahoo

    • Marked as answer by SQL Insane Wednesday, October 30, 2013 8:42 PM
    Tuesday, October 29, 2013 7:23 AM
  • I have following data in my file:

    Do you mean that you have a file you want to import to the database?

    If so, then yolu can use Bulk Insert. this is a simple Bulk Insert operation. for example if you have data file (Test01.csv) like this:

    'EmpID: 1  EmpName: d   Designation: Programmer Analyst   Location: California  Joining On:  Sep 10, 2013 3:10 pm Link:  https://www.abc.com'
    'EmpID: 2  EmpName: f   Designation: Programmer Analyst   Location: California  Joining On:  Sep 10, 2013 3:10 pm Link:  https://www.abc.com'
    'EmpID: 3  EmpName: g   Designation: Programmer Analyst   Location: California  Joining On:  Sep 10, 2013 3:10 pm Link:  https://www.abc.com'
    'EmpID: 4  EmpName: sfsd   Designation: Programmer Analyst   Location: California  Joining On:  Sep 10, 2013 3:10 pm Link:  https://www.abc.com'

    and you want to import it to this table:

    create table #test (
    	EmpID          INT,
    	EmpName        NVARCHAR(100),
    	Designation    NVARCHAR(100),
    	Location       NVARCHAR(100),
    	Joining_On     DATETIME,
    	Link           NVARCHAR(100)
    )
    GO

    Then all you need to do is build a format file (XML). like this one:

    <?xml version="1.0"?>
      <BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <RECORD>
          <FIELD ID="0" xsi:type="CharFixed"                                            LENGTH="8"            />
          <FIELD ID="1" xsi:type="CharTerm"                                                               TERMINATOR="EmpName:"    />
          <FIELD ID="2" xsi:type="CharTerm"                                             MAX_LENGTH="100"  TERMINATOR='Designation: '  />
          <FIELD ID="3" xsi:type="CharTerm"   COLLATION="Hebrew_CI_AS"                  MAX_LENGTH="100"  TERMINATOR='Location: '    />
          <FIELD ID="4" xsi:type="CharTerm"   COLLATION="Hebrew_CI_AS"                  MAX_LENGTH="100"  TERMINATOR='Joining On: ' />
          <FIELD ID="5" xsi:type="CharTerm"   COLLATION="SQL_Latin1_General_CP1_CI_AS"  MAX_LENGTH="100"  TERMINATOR='Link: ' />
          <FIELD ID="6" xsi:type="CharTerm"   COLLATION="Hebrew_CI_AS"                  MAX_LENGTH="100"  TERMINATOR='&#39;\r\n' />
        </RECORD>
        <ROW>
          <COLUMN SOURCE="1" NAME="EmpID"          xsi:type="SQLINT"/>
          <COLUMN SOURCE="2" NAME="EmpName"        xsi:type="SQLNVARCHAR"/>
          <COLUMN SOURCE="3" NAME="Designation"    xsi:type="SQLNVARCHAR"/>
          <COLUMN SOURCE="4" NAME="Location"       xsi:type="SQLNVARCHAR"/>
          <COLUMN SOURCE="5" NAME="Joining_On"     xsi:type="SQLDATETIME"/>
          <COLUMN SOURCE="6" NAME="Linkn"          xsi:type="SQLNVARCHAR"/>
        </ROW>
      </BCPFORMAT>

    and use this query to import the data (the bulk insert operation will do the split according to the Format File):

    BULK INSERT #test 
    FROM 'C:\ArielyBulkInsertTesting\Test01.csv'
    WITH (
        FORMATFILE='C:\ArielyBulkInsertTesting\Test01.xml'
        , MAXERRORS = 10
        --, KEEPNULLS
        --, DATAFILETYPE = 'native'
        --, CODEPAGE='RAW'
        --, ROWTERMINATOR='\r\n'
        --, FIRSTROW = 3
    );
    GO

    If the data is not from external file then you can use any split function and split the data several times using all those values in order to get all the data split:

    "EmpID:","EmpName:","Designation:","Location:","Joining_On:","Link:"

    orjust find the string between two values to get the specific value that you need. for example if you need only the "EmpName" then just use parsing like this maybe:

    declare @Str nvarchar(1000) = 'EmpID: 4  EmpName: sfsd   Designation: Programmer Analyst   Location: California  Joining On:  Sep 10, 2013 3:10 pm Link:  https://www.abc.com'
    select SUBSTRING(@Str,CHARINDEX('EmpName:',@Str)+8,CHARINDEX('Designation:',@Str)-CHARINDEX('EmpName:',@Str)-8)

    * you can use CLR split Function for this


    [Personal Site] [Blog] [Facebook]signature

    Tuesday, October 29, 2013 7:41 AM
    Moderator