Need help in Eliminating the Special Char's:
-
13 апреля 2012 г. 7:10
Hi ,
Need help in writing query:
I have table that is populating from the 3 rd party Application and the table consists of special char's like this " €‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖרÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþ".
Now the question is
1) how can i find out, how may different special char's are there in my db.
2) how can i eliminate that? i am using SQl 2005
Please help
Best Regards
Inampudi
Все ответы
-
13 апреля 2012 г. 7:20
Looks like you have imported Unicode data into VARCHAR() columns. So take a look at the column definition of the involved tables and change it to NVARCHAR.
Please ensure that your working on a test copy of your database or ensure that you have a recent backup.
-
13 апреля 2012 г. 7:42
Hi Stefan ,
Thnaks for your quick reply, i went back and checked the source column def.. they defined it as char(100).. and this column is note column(general text enter by support guys).i guess while entering/saving the text something is going wrong.Now I need to strip out all non letter characters from a series of strings in a query. i have excluded the some the Char's by using the bunch of replace function?
is this best way,please suggest..
Thanks
Inampudi
Anil Inampudi
-
13 апреля 2012 г. 7:44
The best way? Hard to tell. I would create a table (c CHAR(1)) which holds only the valid chars. Then run a search and replace on every involved column.
You should try switching this column to NCHAR(100).
-
13 апреля 2012 г. 7:57
Hi Anil,
You can find the special character using
SELECT * FROM TABLE WHERE COLUMN LIKE '%[^0-9a-zA-Z]%'
Other way,
There is a good article on this. Please check this.
http://www.source-code.biz/snippets/mssql/1.htm
-
13 апреля 2012 г. 8:10
It's much simpler to write a C# program and use regular expressions than to do this in SQL. It could also be a C# function in SQL Server, but if this is a one-off, that would be over-ambitious. (Or at least I think it's simpler to do in C#. Myself, I would do it in Perl.)
Then again, with the sample you gave what would be left? Would it make sense?
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se -
13 апреля 2012 г. 8:47
as the column uses char() datatype instead of nchar() (resp. varchar/nvarchar) you can't simply replace characters if you want that the new string is semantically identical.
Do you know if all users/application accessing this table/column do use the same codepage setting (same regional settings) because datalayer used to write data to the SQL Server may map characters set from the Unicode to 1-byte character set using codepage.
If all clients use the same regional settings and are using Unicode internal the easiest way is to read all rows and write it back to an additional row using nvarchar (etc.) and later drop the column and rename the new to the old one.
-
13 апреля 2012 г. 8:49
I think that this is a NoGo as the meaning of the stored string will be changed and I don't think that it is acceptable - in that you case it would be probably easier just to clear to col and replace it with a blank.
And don't forget you probably don't know if the application is using MCBS and therefore the length of byte representation of a character are varying and using same escape characters which changes the interpretation of the following byte(s). The same problem occurs with some UTF encoding as well.
- Изменено Daniel_Steiner 13 апреля 2012 г. 8:52
-
13 апреля 2012 г. 9:19
a regular expression makes easy to find special characters.
SELECT * FROM TABLE WHERE COLUMN LIKE '%[^0-9a-zA-Z]%'
for more detail on wild card search. have a look into this.
http://shamas-saeed.blogspot.com/2010/07/using-wildcard-characters-in-search.html
Shamas Saeed (if Post helpful please mark as Answer) http://shamas-saeed.blogspot.com
- Изменено Shamas Saeed 13 апреля 2012 г. 9:24
- Предложено в качестве ответа Kalman TothMicrosoft Community Contributor, Moderator 14 апреля 2012 г. 16:59
- Помечено в качестве ответа Kalman TothMicrosoft Community Contributor, Moderator 19 апреля 2012 г. 17:34
-
13 апреля 2012 г. 15:58Модератор
If you want to use pure T-SQL column to strip unwanted characters (everything which is not character, digit or comma, dot) then check
http://beyondrelational.com/modules/2/blogs/78/posts/11141/remove-bad-characters-from-a-string.aspx and suggestion in the comment.
Although CLR solution will be better.
For every expert, there is an equal and opposite expert. - Becker's Law
My blog- Предложено в качестве ответа Kalman TothMicrosoft Community Contributor, Moderator 14 апреля 2012 г. 14:18
- Помечено в качестве ответа Kalman TothMicrosoft Community Contributor, Moderator 19 апреля 2012 г. 17:34
-
14 апреля 2012 г. 10:01
I posted a C# function in a similar thread:
http://social.Msdn.microsoft.com/Forums/en-US/transactsql/thread/709c3d8f-a827-4c19-8d21-39875705b2ee#0b8693c4-ceee-456e-b170-84e2f88f0ce6
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se- Помечено в качестве ответа Kalman TothMicrosoft Community Contributor, Moderator 14 апреля 2012 г. 16:59
- Снята пометка об ответе Kalman TothMicrosoft Community Contributor, Moderator 14 апреля 2012 г. 17:00
- Предложено в качестве ответа Kalman TothMicrosoft Community Contributor, Moderator 14 апреля 2012 г. 17:01
- Помечено в качестве ответа Kalman TothMicrosoft Community Contributor, Moderator 19 апреля 2012 г. 17:35
-
14 апреля 2012 г. 16:11
You should make use pf NVARCHAR datatype. So that it can hold unicode value.
Please check link it may help you, http://stackoverflow.com/questions/6995072/remove-special-characters-in-sql
- Помечено в качестве ответа Kalman TothMicrosoft Community Contributor, Moderator 19 апреля 2012 г. 17:57

