Need help in Eliminating the Special Char's:
-
Freitag, 13. April 2012 07:10
Hi ,
Need help in writing query:
I have table that is populating from the 3 rd party Application and the table consists of special char's like this " €‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖרÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþ".
Now the question is
1) how can i find out, how may different special char's are there in my db.
2) how can i eliminate that? i am using SQl 2005
Please help
Best Regards
Inampudi
Alle Antworten
-
Freitag, 13. April 2012 07:20
Looks like you have imported Unicode data into VARCHAR() columns. So take a look at the column definition of the involved tables and change it to NVARCHAR.
Please ensure that your working on a test copy of your database or ensure that you have a recent backup.
-
Freitag, 13. April 2012 07:42
Hi Stefan ,
Thnaks for your quick reply, i went back and checked the source column def.. they defined it as char(100).. and this column is note column(general text enter by support guys).i guess while entering/saving the text something is going wrong.Now I need to strip out all non letter characters from a series of strings in a query. i have excluded the some the Char's by using the bunch of replace function?
is this best way,please suggest..
Thanks
Inampudi
Anil Inampudi
-
Freitag, 13. April 2012 07:44
The best way? Hard to tell. I would create a table (c CHAR(1)) which holds only the valid chars. Then run a search and replace on every involved column.
You should try switching this column to NCHAR(100).
-
Freitag, 13. April 2012 07:57
Hi Anil,
You can find the special character using
SELECT * FROM TABLE WHERE COLUMN LIKE '%[^0-9a-zA-Z]%'
Other way,
There is a good article on this. Please check this.
http://www.source-code.biz/snippets/mssql/1.htm
-
Freitag, 13. April 2012 08:10
It's much simpler to write a C# program and use regular expressions than to do this in SQL. It could also be a C# function in SQL Server, but if this is a one-off, that would be over-ambitious. (Or at least I think it's simpler to do in C#. Myself, I would do it in Perl.)
Then again, with the sample you gave what would be left? Would it make sense?
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se -
Freitag, 13. April 2012 08:47
as the column uses char() datatype instead of nchar() (resp. varchar/nvarchar) you can't simply replace characters if you want that the new string is semantically identical.
Do you know if all users/application accessing this table/column do use the same codepage setting (same regional settings) because datalayer used to write data to the SQL Server may map characters set from the Unicode to 1-byte character set using codepage.
If all clients use the same regional settings and are using Unicode internal the easiest way is to read all rows and write it back to an additional row using nvarchar (etc.) and later drop the column and rename the new to the old one.
-
Freitag, 13. April 2012 08:49
I think that this is a NoGo as the meaning of the stored string will be changed and I don't think that it is acceptable - in that you case it would be probably easier just to clear to col and replace it with a blank.
And don't forget you probably don't know if the application is using MCBS and therefore the length of byte representation of a character are varying and using same escape characters which changes the interpretation of the following byte(s). The same problem occurs with some UTF encoding as well.
- Bearbeitet Daniel_Steiner Freitag, 13. April 2012 08:52
-
Freitag, 13. April 2012 09:19
a regular expression makes easy to find special characters.
SELECT * FROM TABLE WHERE COLUMN LIKE '%[^0-9a-zA-Z]%'
for more detail on wild card search. have a look into this.
http://shamas-saeed.blogspot.com/2010/07/using-wildcard-characters-in-search.html
Shamas Saeed (if Post helpful please mark as Answer) http://shamas-saeed.blogspot.com
- Bearbeitet Shamas Saeed Freitag, 13. April 2012 09:24
- Als Antwort vorgeschlagen Kalman TothMicrosoft Community Contributor, Moderator Samstag, 14. April 2012 16:59
- Als Antwort markiert Kalman TothMicrosoft Community Contributor, Moderator Donnerstag, 19. April 2012 17:34
-
Freitag, 13. April 2012 15:58Moderator
If you want to use pure T-SQL column to strip unwanted characters (everything which is not character, digit or comma, dot) then check
http://beyondrelational.com/modules/2/blogs/78/posts/11141/remove-bad-characters-from-a-string.aspx and suggestion in the comment.
Although CLR solution will be better.
For every expert, there is an equal and opposite expert. - Becker's Law
My blog- Als Antwort vorgeschlagen Kalman TothMicrosoft Community Contributor, Moderator Samstag, 14. April 2012 14:18
- Als Antwort markiert Kalman TothMicrosoft Community Contributor, Moderator Donnerstag, 19. April 2012 17:34
-
Samstag, 14. April 2012 10:01
I posted a C# function in a similar thread:
http://social.Msdn.microsoft.com/Forums/en-US/transactsql/thread/709c3d8f-a827-4c19-8d21-39875705b2ee#0b8693c4-ceee-456e-b170-84e2f88f0ce6
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se- Als Antwort markiert Kalman TothMicrosoft Community Contributor, Moderator Samstag, 14. April 2012 16:59
- Tag als Antwort aufgehoben Kalman TothMicrosoft Community Contributor, Moderator Samstag, 14. April 2012 17:00
- Als Antwort vorgeschlagen Kalman TothMicrosoft Community Contributor, Moderator Samstag, 14. April 2012 17:01
- Als Antwort markiert Kalman TothMicrosoft Community Contributor, Moderator Donnerstag, 19. April 2012 17:35
-
Samstag, 14. April 2012 16:11
You should make use pf NVARCHAR datatype. So that it can hold unicode value.
Please check link it may help you, http://stackoverflow.com/questions/6995072/remove-special-characters-in-sql
- Als Antwort markiert Kalman TothMicrosoft Community Contributor, Moderator Donnerstag, 19. April 2012 17:57

