I'm implementing a DQS knowledge base in an ETL flow using the DQS Cleansing transform. When running about 500 records through it, I kept getting failures. The error message indicated that the value "My Name Column Value" was too long for the ZipCode column, so for some reason the component appears to be getting the columns mixed up internally. I deleted all the columns except the name column to see if it would behave differently. While it no longer threw an error, I noticed in my output that the "Corrected" rows were basically just returning the original values. These values had been mapped as synonyms to a leading value, but instead of returning the leading value as the corrected value, the transform just returns the original value.
Leading Value: FOOBAR
The DQS Cleansing transform returns FUBAR instead of FOOBAR in this case.
So to summarize:
1. Columns getting mismatched in the data flow, causing errors because the lengths do not match.
2. Leading value not getting returned as the corrected value.
Anyone else have these issues? Any help or insight would be appreciated.
I have also seen the same issue when using the SSIS DQS Cleansing component. It seems very random: sometimes it mixes up the columns for a few rows (not all rows), and sometimes it throws an error like yours, where a certain column value is too long (as a result of the mixed-up columns).
Adjusting the Data Flow Task's DefaultBufferSize and DefaultBufferMaxRows properties seems to improve things, but it is still unreliable.
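For anyone wanting to try the same workaround without editing the package in the designer, these buffer properties can also be overridden at execution time with dtexec's /Set option. A minimal sketch follows; the package path and the Data Flow Task name are placeholders you would replace with your own, and the values shown (1,000 rows, 1 MB) are just one conservative starting point, not recommended settings:

```shell
:: Run the package with smaller buffers so fewer rows reach the
:: DQS Cleansing transform per buffer. The package path and the
:: task name "Data Flow Task" are illustrative placeholders.
dtexec /File "C:\ETL\CleansePackage.dtsx" ^
  /Set "\Package\Data Flow Task.Properties[DefaultBufferMaxRows];1000" ^
  /Set "\Package\Data Flow Task.Properties[DefaultBufferSize];1048576"
```

Smaller buffers mean more round trips to the DQS server, so expect the package to run slower; this is only a way to probe whether the column mix-up is buffer-related.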
This feels like a bug with the SSIS Cleansing Component.
Using the same knowledge base and source data, there are no problems performing cleansing using the DQS client.