none
cannot convert between unicode and non-unicode string data types

    Question

  • I've got another one that is confusing me.  I've build a package that moves test data from a MS SQL 2008 RC0 database to an Oracle 10.2.0.3 database on a separate server.  I did this work in BIDS on my Windows XP workstation where I've installed the SQL 2088 RC0 client tools.  The MS SQL 2008 database is on a Windows 2008 server.  The Oracle database is on a Windows 2003 server.

     

    The job works when I run it in Debug mode.  I've made a package and migrated it to the MS SQL 2008 test server since I would prefer to run the package from there.  It gets validation errors when built like the following, "Error: Column "MERCHANT_ID" cannot convert between unicode and non-unicode string data types." for each column in a derived column transform.  When I attempt to run the package by connecting to Integration Services from SSMS on the server it gets the same validation errors and quits.  If I connect to Integration Services from SSMS on my desktop it will successfully run the package.

     

    What the derived column transform does is deal with what I consider bugs in both MS SQL and in Oracle.  Let me explain that in more detail.  In case you are not familier with Oracle it considers an empty string like '' as a NULL.  The columns in my tables are defined as NOT NULL.  Because of this behavior in Oracle our application simply uses a single space like ' ' instead of '' for string columns with no data in them.  I consider that a bug in Oracle since '' is not unknown.  What I found SSIS doing was converting ' ' to '' automatically when my OleDB source database was connected directly to my OleDB destination database.  Once I figured out what was going on it was easy enough to add a derived column transform with Expressions like the following MERCHANT_ID == " " ? MERCHANT_ID : " ".  As soon as I did that the package began working.  By working I mean working from my desktop before I attempted to migrate it to the server.

     

    Once I attempted the migration to the server and got the conversion errors I began to poke around the forums here and for SQL 2005 and see many people with the same error message but didn't come across any where it involved the errors cropping up after migration.  In my source database and my target database none of the datatypes are unicode.  They are VARCHAR in MS SQL and VARCHAR2 in Oracle.  Not NVARCHAR or NVARCHAR2.  In the metadata of the connector leading into the derived column transform I see the datatype is string [DT_STR] with code page 1252 (ANSI - Latin 1) as I would expect.  They are the same in the derived column transform and also in the metadata of the connector leaving the transform and into the OLEDB destination.

     

    After reading a bunch of post in the other forumns I tried an explicit conversion inside of the transform like the following

     

    MERCHANT_ID == " " ? (DT_STR,20,1252)(MERCHANT_ID) : " "

     

    but had no better luck with that.

     

    Any ideas?

     

    Tuesday, July 15, 2008 5:52 PM

Answers

  • I've finally had the chance to rebuild a portion of this job on the server instead of my desktop.  I found a work around but don't understand it yet.

     

    I received the same error regarding the ability to convert between unicode and non-unicode data.  It indicated that the error was on the Oracle OleDB side.  So, I decided to modify my Derived Column transformation to add new columns instead of replacing and it changed the data types to unicode automatically.  Then I had to remap my source an target columns.  Once it was all done I was able to kick off this portion of the job and its working.  What I don't get is why it is forcing this to pass Unicode.  The data types in the target Oracle database are not Unicode.  I'm wondering if it has something to due with differences between Windows 2008 and Windows XP.  On my XP desktop the data types match what I have in he database but on the Windows 2008 server they don't with regards to the Oracle Ole DB provider.  I have the same 11g client tools loaded on both machines.

     

    If time allows I'll experiment with the ADO.net provider to see if its behavior is consistent for Oracle between the two OS.  The question now is if this job will cancel itself before its finished but that is another thread.

    Tuesday, July 29, 2008 2:06 PM

All replies

  •  Joe Moyle wrote:

    What the derived column transform does is deal with what I consider bugs in both MS SQL and in Oracle.  Let me explain that in more detail.  In case you are not familier with Oracle it considers an empty string like '' as a NULL.  The columns in my tables are defined as NOT NULL.  Because of this behavior in Oracle our application simply uses a single space like ' ' instead of '' for string columns with no data in them.  I consider that a bug in Oracle since '' is not unknown.  What I found SSIS doing was converting ' ' to '' automatically when my OleDB source database was connected directly to my OleDB destination database.  Once I figured out what was going on it was easy enough to add a derived column transform with Expressions like the following MERCHANT_ID == " " ? MERCHANT_ID : " ".  As soon as I did that the package began working.  By working I mean working from my desktop before I attempted to migrate it to the server.

     

    Once I attempted the migration to the server and got the conversion errors I began to poke around the forums here and for SQL 2005 and see many people with the same error message but didn't come across any where it involved the errors cropping up after migration.  In my source database and my target database none of the datatypes are unicode.  They are VARCHAR in MS SQL and VARCHAR2 in Oracle.  Not NVARCHAR or NVARCHAR2.  In the metadata of the connector leading into the derived column transform I see the datatype is string [DT_STR] with code page 1252 (ANSI - Latin 1) as I would expect.  They are the same in the derived column transform and also in the metadata of the connector leaving the transform and into the OLEDB destination.

     

    After reading a bunch of post in the other forumns I tried an explicit conversion inside of the transform like the following

     

    MERCHANT_ID == " " ? (DT_STR,20,1252)(MERCHANT_ID) : " "

     

    but had no better luck with that.

     

    Any ideas?

     

     

    if possible, try to converting the MERCHANT_ID datatype to VARCHAR inside your sql statement.

     

    hth

    Friday, July 18, 2008 4:30 AM
  •  Duane Douglas wrote:
    if possible, try to converting the MERCHANT_ID datatype to VARCHAR inside your sql statement.

     

    hth

     

    In my OLE DB Source I'm using the Data access mode: of Table or view so I have no SQL statement to convert inside of.  If I open the Data Flow Path Editor of the connector between the source and the Derived Column transform and look at the metadata it correctly indicates DT_STR with a length of 20 for Merchant_Id.  Same thing in the connector leaving the transform and heading into my destination.

     

    I haven't found it necessary to write and SQL scripts at this point in this package since the source and destination tables have the exact same names, column names, column orders, and the datatypes are functionally equivalent.  In other tables with no '' stored in any of the source columns I don't have this issue.  As far as I can tell it has something to do with the Derived Column transform.

    Monday, July 21, 2008 2:02 PM
  •  Joe Moyle wrote:
     Duane Douglas wrote:
    if possible, try to converting the MERCHANT_ID datatype to VARCHAR inside your sql statement.

     

    hth

     

    In my OLE DB Source I'm using the Data access mode: of Table or view so I have no SQL statement to convert inside of.  If I open the Data Flow Path Editor of the connector between the source and the Derived Column transform and look at the metadata it correctly indicates DT_STR with a length of 20 for Merchant_Id.  Same thing in the connector leaving the transform and heading into my destination.

     

    I haven't found it necessary to write and SQL scripts at this point in this package since the source and destination tables have the exact same names, column names, column orders, and the datatypes are functionally equivalent.  In other tables with no '' stored in any of the source columns I don't have this issue.  As far as I can tell it has something to do with the Derived Column transform.

     

    i understand that the executing a sql statement shouldn't be necessary, but it might be necessary in this case.  i don't know if my suggestion will work, but i don't see how trying it would hurt.

     

    hth

    Wednesday, July 23, 2008 5:40 AM
  • I will give it a try.  First I'm going to try to rebuild the job on the server itself.  Been out of town and haven't had any time to work on this since its priority here is, "...when you have time..."  I'll keep the thread updated as time to experiment allows.

    Wednesday, July 23, 2008 2:14 PM
  • I've finally had the chance to rebuild a portion of this job on the server instead of my desktop.  I found a work around but don't understand it yet.

     

    I received the same error regarding the ability to convert between unicode and non-unicode data.  It indicated that the error was on the Oracle OleDB side.  So, I decided to modify my Derived Column transformation to add new columns instead of replacing and it changed the data types to unicode automatically.  Then I had to remap my source an target columns.  Once it was all done I was able to kick off this portion of the job and its working.  What I don't get is why it is forcing this to pass Unicode.  The data types in the target Oracle database are not Unicode.  I'm wondering if it has something to due with differences between Windows 2008 and Windows XP.  On my XP desktop the data types match what I have in he database but on the Windows 2008 server they don't with regards to the Oracle Ole DB provider.  I have the same 11g client tools loaded on both machines.

     

    If time allows I'll experiment with the ADO.net provider to see if its behavior is consistent for Oracle between the two OS.  The question now is if this job will cancel itself before its finished but that is another thread.

    Tuesday, July 29, 2008 2:06 PM
  • I suspect it is more of a bug in SQL Server that null strings are allowed but not defined as nulls. It's only common to have this because SQL Server allows it. Either data exists in a field or it doesn't, if it doesn't then the field should be null.
    Wednesday, April 08, 2009 12:14 AM
  • Are you saying that you consider an empty string to be equivalent to NULL?  I don't understand that.

    "" != NULL.

    NULL should represent "I don't know", or "unknown", shouldn't it?  An empty string is definitely a "known" value.  How else would you differentiate between the situation in a credit database where the person's middle name isn't known versus when a person doesn't have a middle name?  If I were that credit agency, and received new information about a person sharing the same first and last name, but also having a middle name, I'd think it would be extremely important to know and understand that distinction before I assumed they were the "same person".  I suppose you could add a boolean column to distinguish that...

    But even in programming languages NULL is not equivalent to an empty string.  In C#, these are different:

    string nullString = null;
    string emptyString = "";

    As they definitely should be!  So IMO, there's no "bug" in SQL Server about allowing NULLs to be treated differently than empty strings.  I'm fine with other engines allowing that - as NULLs can confuse people, despite their utility.

    I do agree with you - either data exists in a field or it doesn't.  If it doesn't, then the field should be null.  However, an empty string IS data.
    Todd McDermid's Blog
    Wednesday, April 08, 2009 4:12 PM