none
SELECT LIKE alphabetic characters only - İ becomes i ?

    Question

  • Hello,
    i am trying to select all rows that contain non-alphabetic characters. 0-9, a-z and some special chars should be allowed and therefore filtered out.

    i use this regexp:

    WHERE cust LIKE '%[^A-Z0-9_.!-]%' ESCAPE '!'

    it correctly shows me values like DŽE01 and ŽC01, so the Ž is detected as non-A to Z.

    but it ignores all my rows that contain, for example, this character:

    120-I03 <-- correctly filtered out
    120-İ03 <-- incorrectly filtered out, İ should be detected as non A-Z
     

    how can i make sure that really only A to Z are checked and not slight variations (e.g. accents)?

    thanks,
    br
    Monday, February 08, 2010 10:46 AM

Answers

  • That character (the capital I with the dot) is a perfectly acceptable character in the current collation.  Try using a binary collation with the like operator as shown below.


    DECLARE @t TABLE(
    Col NVARCHAR(10)
    );
    
    INSERT INTO @t VALUES (N'DŽE01');
    INSERT INTO @t VALUES (N'ŽC01');
    INSERT INTO @t VALUES (N'ZC01');
    INSERT INTO @t VALUES (N'120-I03');
    INSERT INTO @t VALUES (N'120-İ03');
    
    SELECT Col
    FROM @t
    WHERE Col collate Latin1_General_BIN LIKE '%[^A-Z0-9_.!-]%' ESCAPE '!'

    http://jahaines.blogspot.com/
    Monday, February 08, 2010 6:03 PM
  • this might help you

    DECLARE @temp TABLE
    (
    	MyID	int IDENTITY(1,1) NOT NULL,
    	MyNText nvarchar(10) COLLATE Latin1_General_CS_AS
    )
    
    INSERT INTO @temp(MyNText)
    SELECT	N'Café' UNION ALL	-- Caps + Accent
    SELECT	N'café' UNION ALL	-- Accent
    SELECT	N'Cafe' UNION ALL	-- Caps
    SELECT	N'cafe'			-- All lower case, no accents
    
    SELECT	MyNText,
    	[Bin] = CASE WHEN MyNText LIKE '%[^-~*a-z0-9]%' COLLATE Latin1_General_BIN
    		THEN 'X'
    		ELSE ''
    		END,
    	[CS_AS] = CASE WHEN MyNText LIKE '%[^-~*a-z0-9]%' COLLATE Latin1_General_CS_AS
    		THEN 'X'
    		ELSE ''
    		END,
    	[SQL CS_AS] = CASE WHEN MyNText LIKE '%[^-~*a-z0-9]%' COLLATE SQL_Latin1_General_CP1_CS_AS
    		THEN 'X'
    		ELSE ''
    		END
    FROM	@temp
    ORDER BY MyID
    

    Chase Excellence - Success Will Follow!
    Tuesday, February 09, 2010 4:24 AM

All replies

  • I get the same result - nothing. Can you check? What collation is that I letter?

    DECLARE @ntext nvarchar(max) = N'120-I03'
    
    SELECT @ntext
    WHERE @ntext LIKE '%[^A-Z0-9_.!-]%' ESCAPE '!'
    

    Kalman Toth SQL SERVER 2012 & BI TRAINING
    New Book: Beginner Database Design & SQL Programming Using Microsoft SQL Server 2012






    Monday, February 08, 2010 12:07 PM
  • I get the same result - nothing. Can you check?

    DECLARE @ntext nvarchar(max) = N'120-I03'
    
    
    
    SELECT @ntext
    
    WHERE @ntext LIKE '%[^A-Z0-9_.!-]%' ESCAPE '!'
    
    
    
    

    Kalman Toth, SQL Server 2008 & BI Training, OLAP, SSIS, SSRS; http://www.SQLUSA.com

     

    not working for chinese characters

     

     


    Chase Excellence - Success Will Follow!
    Monday, February 08, 2010 12:11 PM
  • i am sorry, i do not know what kind of character this is

    İ


    it's like a large "i" with a dot at the top.
    the collation of the table is Latin1_General_CI_AS

    Monday, February 08, 2010 2:40 PM
  • That character (the capital I with the dot) is a perfectly acceptable character in the current collation.  Try using a binary collation with the like operator as shown below.


    DECLARE @t TABLE(
    Col NVARCHAR(10)
    );
    
    INSERT INTO @t VALUES (N'DŽE01');
    INSERT INTO @t VALUES (N'ŽC01');
    INSERT INTO @t VALUES (N'ZC01');
    INSERT INTO @t VALUES (N'120-I03');
    INSERT INTO @t VALUES (N'120-İ03');
    
    SELECT Col
    FROM @t
    WHERE Col collate Latin1_General_BIN LIKE '%[^A-Z0-9_.!-]%' ESCAPE '!'

    http://jahaines.blogspot.com/
    Monday, February 08, 2010 6:03 PM
  • this might help you

    DECLARE @temp TABLE
    (
    	MyID	int IDENTITY(1,1) NOT NULL,
    	MyNText nvarchar(10) COLLATE Latin1_General_CS_AS
    )
    
    INSERT INTO @temp(MyNText)
    SELECT	N'Café' UNION ALL	-- Caps + Accent
    SELECT	N'café' UNION ALL	-- Accent
    SELECT	N'Cafe' UNION ALL	-- Caps
    SELECT	N'cafe'			-- All lower case, no accents
    
    SELECT	MyNText,
    	[Bin] = CASE WHEN MyNText LIKE '%[^-~*a-z0-9]%' COLLATE Latin1_General_BIN
    		THEN 'X'
    		ELSE ''
    		END,
    	[CS_AS] = CASE WHEN MyNText LIKE '%[^-~*a-z0-9]%' COLLATE Latin1_General_CS_AS
    		THEN 'X'
    		ELSE ''
    		END,
    	[SQL CS_AS] = CASE WHEN MyNText LIKE '%[^-~*a-z0-9]%' COLLATE SQL_Latin1_General_CP1_CS_AS
    		THEN 'X'
    		ELSE ''
    		END
    FROM	@temp
    ORDER BY MyID
    

    Chase Excellence - Success Will Follow!
    Tuesday, February 09, 2010 4:24 AM
  • thank you all for your help, problem solved!
    Tuesday, February 16, 2010 8:41 AM