none
SSIS: Change Code page 1252 ANSI to UTF-8 65001

    Question

  • Hi All;

    I have created SSIS Package to import data from OLE DB to a flat file. The code page in the flat file connection manager was set to default 1252 ANSI - Latin I. Now I have been asked to output the flat file in UTF-8 format. I have changed the code page to 65001 UTF-8 but the package will not run. I got the following message:

    the code page on input column COLUMN_NAME (184)is 1252 and is required
    to be 65001

    I have searched around and i have read that i need to change the column data type from VARCHAR to NVARCHAR!! Is this the solution?

     

    Is there is any other option? Thanks.


    • Edited by BuRaiR Wednesday, November 16, 2011 11:05 AM
    Wednesday, November 16, 2011 11:04 AM

Answers

  • Yes, you have to change all varchar/char to nvarchar/nchar

    There's no other option. You can do this in SQL Server with a view or in SSIS with Data Conversion

    • Marked as answer by BuRaiR Thursday, November 24, 2011 10:48 AM
    Wednesday, November 16, 2011 11:14 AM
  • Yes that is exactly what u need to do, as UTF-8 or UNICODE means that you are defining your file to have UNICODE characters, NVARCHAR is the UNicode equivalent on SQL side.
    So if you define your Source file to be UNICODE then the COLUMNS been read through in the CONNECTION MANAGER should also be NVARCAHR instead of VARCAHR
    Abhinav
    • Marked as answer by BuRaiR Thursday, November 24, 2011 10:50 AM
    Thursday, November 17, 2011 8:48 AM

All replies

  • Yes, you have to change all varchar/char to nvarchar/nchar

    There's no other option. You can do this in SQL Server with a view or in SSIS with Data Conversion

    • Marked as answer by BuRaiR Thursday, November 24, 2011 10:48 AM
    Wednesday, November 16, 2011 11:14 AM
  • If I understand you correctly you are trying to do what I have done in my package. I don't remember why, but I had to first convert the string fields to UNICODE and the to UTF8.

     

    You can see a picture of my Data Flow and what I have done in my "problem" case:

    http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/087b1dbf-49c1-4e18-8b9a-2e9d30f9c371

     

     

    • Proposed as answer by Shashikanta Wednesday, August 22, 2012 10:41 AM
    Wednesday, November 16, 2011 11:40 AM
  • Yes that is exactly what u need to do, as UTF-8 or UNICODE means that you are defining your file to have UNICODE characters, NVARCHAR is the UNicode equivalent on SQL side.
    So if you define your Source file to be UNICODE then the COLUMNS been read through in the CONNECTION MANAGER should also be NVARCAHR instead of VARCAHR
    Abhinav
    • Marked as answer by BuRaiR Thursday, November 24, 2011 10:50 AM
    Thursday, November 17, 2011 8:48 AM
  • Hi All,

    thank you for your replies. I have tried a lot to do this via SSIS but can't make it till now !! Don't really know where is the error. Will need your help and suggestions please.

    Mr. G. Johannessen, Thaks for link you have provided. Iam trying to follow your steps ,, but iam facing little problems with it. Hope you and other friends can assists me.

    Thanks :)

    1) The below image exaplins my data flow:

    2) In the first Data Conversion, I have changed all the Data_Type of the columns to output to new data type: Unicode String [DT_WSTR].


    Thursday, November 17, 2011 10:26 AM
  • 3) In the second Data Conversion, As you see below, i have the original column name and it gave me the output of the first Data conversion, So instead of dealing with 40 columns , now it shows me 80 columns !! But i choose the output columns from the Data Conversion.


    4) This is my Connection Manager settings.



    Please inform me if you have any more suggestions, of diferent ways in getting the file in UTF-8.

    Thank you All :)
    Thursday, November 17, 2011 10:27 AM
  • Good Morning All;

    Any one can help ?!?!?!

    Thanksssss.

    Sunday, November 20, 2011 5:15 AM
  • Helppppppppppp Plzzzzzzzz.
    Monday, November 21, 2011 5:19 AM
  • Anyone Can Help Plz?
    Monday, November 21, 2011 9:43 AM
  • Do you still get the same error as in the beginning?
    Are you sure you mapped the correct converted columns in the destination?
    MCTS, MCITP - Please mark posts as answered where appropriate.
    Answer #1: Have you tried turning it off and on again?
    Answer #2: It depends...
    Monday, November 21, 2011 10:28 AM
  • Allright, I did some testing and I can't get SSIS to export to UTF.
    All the settings are correct, I converted to Unicode and to UTF-8 and the damn thing just refuses to export to anything else than ASCII. I even created a UTF-8 file upfront, but SSIS just converts it back to ASCII.

    Maybe a bug item should be logged into Connect, because this is just plain silly.

    For the time being, you could use the workaround described here:

    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=153603

    (beware, it contains powershell :))


    MCTS, MCITP - Please mark posts as answered where appropriate.
    Answer #1: Have you tried turning it off and on again?
    Answer #2: It depends...
    Monday, November 21, 2011 11:00 AM
  • Allright, I did some testing and I can't get SSIS to export to UTF.
    All the settings are correct, I converted to Unicode and to UTF-8 and the damn thing just refuses to export to anything else than ASCII. I even created a UTF-8 file upfront, but SSIS just converts it back to ASCII.

    Maybe a bug item should be logged into Connect, because this is just plain silly.

    For the time being, you could use the workaround described here:

    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=153603

    (beware, it contains powershell :))


    MCTS, MCITP - Please mark posts as answered where appropriate.
    Answer #1: Have you tried turning it off and on again?
    Answer #2: It depends...

    Hi koen,

    Thanks for trying to help me out with this  :))

    I have gone through your post and the link you have provided. Well iam not really familiar with POWER SHELL and seems will need more time and research to know about the script provide in the site. I was hoping to get an answer withing the SSIS package it self but as u said it will not allow to produce a UTF-8 file. Most of the blogs on the net talks about changing the VARCHAR to NVARCHAR which iam trying to avoid it.

    Hope to find another way with Visual Studio SSIS packages soon :))

    Wednesday, November 23, 2011 7:21 AM
  • Finally it worked out for me !! Iam not sure if this will be helpful for others. The end-user here requested the results to be in UTF-8 format. Before changing the fields to NVARCHAR they were unable to open the file. After changing the fields to NVARCHAR, Now they are able to read the file but according to them the file is ANSI. 

    Below are the steps which i have done:

    1) Changed all the VARCHAR columns to NVARCHAR (Thanks All).

    2) Created SSIS Package as below;

    3) Here is the configuration i choose for the Flat File:

     

    Thanksssssss to everyone who tried to help me out with this :))


    • Edited by BuRaiR Thursday, December 01, 2011 10:29 AM
    Thursday, November 24, 2011 10:47 AM
  • Unlike some of the comments here, I am really sure you can get non-Unicode data exported into a UTF-8 flat file using SSIS.  The process is fairly easy but you have to understand what's going on.

    Here's a scenario:

    Your source is selecting varchar or char (i.e. not Nvarchar, ...) from a SQL Server DB.

    The Flat File connection manager should have code page 65001 (UTF-8) specified.

    You need to ensure that the columns being exported into the file are encoded using the 65001 code page.  So hook up a Derived Column transformation between the source and destination and use a cast such as this one on each string column:

    (DT_STR, 50, 65001)YourIncomingField

    Note that 50 is the field length so you need to ensure it matches the length of the incoming field.

    That are all the requirements, the resulting file will be UTF-8.

    Additional info: if you ended up here looking for a way to import UTF-8 files, check out this article: http://blog.hoegaerden.be/2012/05/03/ssis-flat-files-and-accents/


    MCITP SQL Server 2008 (DEV); MCTS SQL Server 2008 (BI, DEV & DBA)

    Friday, May 04, 2012 7:46 AM
  • Is there a possibility that the SSIS is changing the UTF-8 over to Unicode because of a change in DATE PRECISION?  My stuff is not unicode but I see this problem... The Code Page on the Output column is 1252 and is required to be 65001.

    R, J

    Thursday, September 13, 2012 3:25 PM
  • SQL Server 2005 - 2008 R2 does not support code page 65001,  but in SQL 2012 there is a support for codepage 65001 (UTF-8).

    Please check how down conversion and up conversion possible at http://social.technet.microsoft.com/wiki/contents/articles/14036.how-to-convert-unicode-data-to-ascii-and-back-in-sql-server.aspx

    Thx

    Sushild


    • Edited by Sushild Wednesday, October 17, 2012 11:32 PM
    Wednesday, October 17, 2012 11:28 PM
  • That article seems like a great deal of steps to take.  I have had success in the past with this by using a data conversion, which is what this article is proposing.  If you investigate the data type in the advanced edit, find it to be unicode for whatever reason, put in a data conversion and convert that column from unicode to utf.   I'm not sure you need to go all the extra steps that are in that article.

    R, J

    Thursday, October 18, 2012 11:25 AM
  • Hello Guys,

    i am having the same problem but none of the proposed solutions work for me. the thing is that my source DB is an SQL server DB table that contains Arabic fields the code page of the Source is 1256. 

    my destination DB is an SQL Compact Edition (with nvarchar fields) the destination page is 1252 

    if i use SSIS to import Data to the compact DB the arabic fields in the source Arabic destination appear rubbish 

    if i set AlwaysUseDefaultCodePage to true, an error occurs (cannot be processed because more than one code page (1252 and 1256) are specified for it )

    Please Help!!

    Tuesday, November 05, 2013 3:01 PM
  • You can try a force feed... basically, when SSIS reads the information, it samples a few lines and makes a determination.  If you have a first row pre-populated with a varchar(4000) that has alpha only, ansi only characters, the field won't flip over to Unicode but will pick up the ANSI.  Then you need to remove the fake line.    There is something in your data that forces the value to Unicode. 

    Trick it first, then feed the values in once the first line is established.   You may be able to do this in a script task... feed the first line from a script, form the fields, fill in the rest and put in another script to remove the first line.


    R, J

    Wednesday, November 06, 2013 2:20 AM
  • SQL Server 2005 - 2008 R2 does not support code page 65001,  but in SQL 2012 there is a support for codepage 65001 (UTF-8).

    Please check how down conversion and up conversion possible at http://social.technet.microsoft.com/wiki/contents/articles/14036.how-to-convert-unicode-data-to-ascii-and-back-in-sql-server.aspx

    Thx

    Sushild


    Hi Sushild,

    I think in 2008 its can be possible if we create that text file by use c# script task in ssis . Because  did is for myself create a text with script then normal export and its work file without  any problem .

    If need i can make a demo for it if any one need

    thanks all

    Muzahid

    Tuesday, April 15, 2014 7:20 AM