Answered by:
How to find similar numeric or alphanumeric in millions of record

Question
-
I have a issue , i have a packet_no of 10 to 12 digits (Number may be numeric or alphanumeric ) , this packet when reaches its destination then due to bad scan one or two digits of original is changed .
So my issue is that, how I should find mismatched packet_no in millions of record.
My matching condition is date range and store_no where the packet is going.
Any help is highly appreciated
Thanks & regards
Rohit
Thursday, June 13, 2013 6:51 AM
Answers
-
Try this:
create table #t(id varchar(12)) insert into #t values ('110040402811') insert into #t values ('110040402820') insert into #t values ('110040402841') insert into #t values ('110040402856') insert into #t values ('110040402857') insert into #t values ('110040402858') insert into #t values ('110040402859') insert into #t values ('110040402863') Declare @original varchar(12) Set @original = '110040402710' ;WITH C1 AS ( SELECT id FROM #t WHERE SUBSTRING(id,1,1) = SUBSTRING(@original,1,1) ), C2 AS ( SELECT id FROM #t WHERE SUBSTRING(id,2,1) = SUBSTRING(@original,2,1) ), C3 AS ( SELECT id FROM #t WHERE SUBSTRING(id,3,1) = SUBSTRING(@original,3,1) ), C4 AS ( SELECT id FROM #t WHERE SUBSTRING(id,4,1) = SUBSTRING(@original,4,1) ), C5 AS ( SELECT id FROM #t WHERE SUBSTRING(id,5,1) = SUBSTRING(@original,5,1) ), C6 AS ( SELECT id FROM #t WHERE SUBSTRING(id,6,1) = SUBSTRING(@original,6,1) ), C7 AS ( SELECT id FROM #t WHERE SUBSTRING(id,7,1) = SUBSTRING(@original,7,1) ), C8 AS ( SELECT id FROM #t WHERE SUBSTRING(id,8,1) = SUBSTRING(@original,8,1) ), C9 AS ( SELECT id FROM #t WHERE SUBSTRING(id,9,1) = SUBSTRING(@original,9,1) ), C10 AS ( SELECT id FROM #t WHERE SUBSTRING(id,10,1) = SUBSTRING(@original,10,1) ), C11 AS ( SELECT id FROM #t WHERE SUBSTRING(id,11,1) = SUBSTRING(@original,11,1) ), C12 AS ( SELECT id FROM #t WHERE SUBSTRING(id,12,1) = SUBSTRING(@original,12,1) ), combined AS ( SELECT * FROM C1 UNION ALL SELECT * FROM C2 UNION ALL SELECT * FROM C3 UNION ALL SELECT * FROM C4 UNION ALL SELECT * FROM C5 UNION ALL SELECT * FROM C6 UNION ALL SELECT * FROM C7 UNION ALL SELECT * FROM C8 UNION ALL SELECT * FROM C9 UNION ALL SELECT * FROM C10 UNION ALL SELECT * FROM C11 UNION ALL SELECT * FROM C12 ), at_least_10 AS ( SELECT id, COUNT(*) AS cnt FROM combined GROUP BY id HAVING COUNT(*) >= 10 ) SELECT * FROM at_least_10 drop table #t
Gert-Jan
- Proposed as answer by Allen Li - MSFT Saturday, June 22, 2013 8:49 AM
- Marked as answer by Kalman Toth Saturday, June 29, 2013 6:58 AM
Thursday, June 13, 2013 1:13 PM
All replies
-
Provide DDL, Sample Data and Expected Output.
Regards, RSingh
Thursday, June 13, 2013 6:57 AM -
These are the certain packet going from A to B
110040402811
110040402820
110040402841
110040402856
110040402857
110040402858
110040402859
110040402863
When they arrive at B, scanning is done some times due to bad scanning number get changed.
As date and store# are same for a packet#. So I am finding changed number using
Original number
I am using like statement
select @packet_no = replace (@packet_no ,substring(@packet_no ,2 ,1) ,'%')
select @packet_no = replace (@packet_no ,substring(@packet_no ,5 ,1) ,'%')
select @packet_no = replace (@packet_no ,substring(@packet_no ,8 ,1) ,'%')
and finding the changed number , but it is not working fine.
Thanks & regards
Rohit
Thursday, June 13, 2013 7:32 AM -
Hi,
Check if the bultin function SOUNDEX() is useful to solve your issue.
E.g. WHERE SOUNDEX(Source_packet_no) = SOUNDEX(Target_packet_no)
Thanks,
Asharaf
MCTS, http://asharafaliptb.wordpress.com
Thursday, June 13, 2013 7:38 AM -
Try this:
create table #t(id varchar(12)) insert into #t values ('110040402811') insert into #t values ('110040402820') insert into #t values ('110040402841') insert into #t values ('110040402856') insert into #t values ('110040402857') insert into #t values ('110040402858') insert into #t values ('110040402859') insert into #t values ('110040402863') Declare @original varchar(12) Set @original = '110040402710' ;WITH C1 AS ( SELECT id FROM #t WHERE SUBSTRING(id,1,1) = SUBSTRING(@original,1,1) ), C2 AS ( SELECT id FROM #t WHERE SUBSTRING(id,2,1) = SUBSTRING(@original,2,1) ), C3 AS ( SELECT id FROM #t WHERE SUBSTRING(id,3,1) = SUBSTRING(@original,3,1) ), C4 AS ( SELECT id FROM #t WHERE SUBSTRING(id,4,1) = SUBSTRING(@original,4,1) ), C5 AS ( SELECT id FROM #t WHERE SUBSTRING(id,5,1) = SUBSTRING(@original,5,1) ), C6 AS ( SELECT id FROM #t WHERE SUBSTRING(id,6,1) = SUBSTRING(@original,6,1) ), C7 AS ( SELECT id FROM #t WHERE SUBSTRING(id,7,1) = SUBSTRING(@original,7,1) ), C8 AS ( SELECT id FROM #t WHERE SUBSTRING(id,8,1) = SUBSTRING(@original,8,1) ), C9 AS ( SELECT id FROM #t WHERE SUBSTRING(id,9,1) = SUBSTRING(@original,9,1) ), C10 AS ( SELECT id FROM #t WHERE SUBSTRING(id,10,1) = SUBSTRING(@original,10,1) ), C11 AS ( SELECT id FROM #t WHERE SUBSTRING(id,11,1) = SUBSTRING(@original,11,1) ), C12 AS ( SELECT id FROM #t WHERE SUBSTRING(id,12,1) = SUBSTRING(@original,12,1) ), combined AS ( SELECT * FROM C1 UNION ALL SELECT * FROM C2 UNION ALL SELECT * FROM C3 UNION ALL SELECT * FROM C4 UNION ALL SELECT * FROM C5 UNION ALL SELECT * FROM C6 UNION ALL SELECT * FROM C7 UNION ALL SELECT * FROM C8 UNION ALL SELECT * FROM C9 UNION ALL SELECT * FROM C10 UNION ALL SELECT * FROM C11 UNION ALL SELECT * FROM C12 ), at_least_10 AS ( SELECT id, COUNT(*) AS cnt FROM combined GROUP BY id HAVING COUNT(*) >= 10 ) SELECT * FROM at_least_10 drop table #t
Gert-Jan
- Proposed as answer by Allen Li - MSFT Saturday, June 22, 2013 8:49 AM
- Marked as answer by Kalman Toth Saturday, June 29, 2013 6:58 AM
Thursday, June 13, 2013 1:13 PM -
You need to send along a CHECKSUM value with each string:
http://msdn.microsoft.com/en-us/library/ms189788.aspx
At destination the CHECKSUM can be calculated again and compared.
Kalman Toth Database & OLAP Architect sqlusa.com
New Book / Kindle: Exam 70-461 Bootcamp: Querying Microsoft SQL Server 2012Saturday, June 22, 2013 8:46 AM -
Hm, Kalman, how is your thought here? That they should change their bar codes (or how these packed numbers are printed), so that it includes a checksum/digit?
I agree that this is a good idea, and maybe Rohit should investigate whether there are already check digits in place, because that could help him to reconstruct the correct number from failed scan, or at least limit the space of possible numbers.
But if there are no check digits in place, this require a bigger change in the process than a mere programmer like Rohit may be able to enforce.
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
Saturday, June 22, 2013 10:12 AM -
This sounds like a generic string match and error correction question.
Which means there is not likely to be just a single answer based on the digits, since any one or more digits might be randomly changed.
These are both highly specialized areas and not really a database issue.
Are you sure there isn't some kind of error detection or correction built into the numbers already? Because it sounds like there should be!
Josh
Saturday, June 22, 2013 5:12 PM