Answered by:
Numeric vs int and bigint
Question

I have several numbers which can be 10, 15, or 20 characters long which are used in the medical industry such as NDC, NPI numbers and I currently use numeric sql 2000 datatypes for these type of fields. As long as they are all numbers is it better to use bigint, int, numeric or some other datatype when storing these types of values. The main reason I am asking here is b/c I was having trouble finding the answer on MSDN documentation.
Marc Noon
Monday, March 26, 2012 3:58 PM
Answers

@mnoon.
Choosing which of these data types to use depends on the value you want to store for the column or variable. The rule of thumb is to always use the data type that will require the least storage size. Don't always use INT as your data type for whole numbers if you don't need to.For Numberic NUMERIC (p [, s ]),user could declare the precision(p),but it seemed take more storage size than Int type.
Precision Storage Size 1  9 5 bytes 10 19 9 bytes 2028 13 bytes 2938 17 bytes Data Type Minimum Value Maximum Value Storage Size tinyint 0 255 1 byte smallint 2^15 (32,768) 2^15  1 (32,767) 2 bytes int 2^31 (2,147,483,648) 2^31  1 (2,147,483,647) 4 bytes bigint 2^63 (9,223,372,036,854,775,808) 2^63  1 (9,223,372,036,854,775,807) 8 bytes
Best Regards,
Peja
Please remember to click "Mark as Answer" on the post that helps you, and to click "Unmark as Answer" if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread.Tuesday, March 27, 2012 8:22 AM 
hi Marc
as often  it depends.
numeric data type is identical with decimal which represents a fixed precision number.
http://msdn.microsoft.com/enus/library/aa258271(v=sql.80).aspx
Fixed precision and scale numeric data from 10^38 +1 through 10^38 –1.
Functionally equivalent to decimal.
I don't think that the number types you mention are using fixed precision number and therefore bigint is probably the most efficient way to store the number especially if you want to perform some computation in your application.
I don't see really any use for computation with those number and therefore you may even use a string of appropiate length which requires more space in the database but you may be able to allow grouping characters in the numbers.
using BIGINT datatype instead of string you can create efficient indexes.
As you write you're already using numeric datatype and therefore if you upgrade to SQL 2008R2 / 2012 you should consider switching to bigint as you don't need fraction in your number.
In addition I would recommand to create constraint on the accepted number such that they're garantied to be within the defined range (eg. length of characters)
 Marked as answer by Kalman Toth Saturday, October 6, 2012 8:23 AM
Monday, March 26, 2012 4:36 PM 
Since you do not do math on the codes you should use CHAR(n) and an appropriate constraints. For example, see http://en.wikipedia.org/wiki/National_Drug_Code tells us that your NDC would be

The first segment, the labeler code, is 4 or 5 digits long and assigned by the Food and Drug Administration(FDA) upon submission of a Labeler Code Request. A labeler is any firm that manufactures, repacks or distributes a drug product.

The second segment, the product code, is 3 or 4 digits long and identifies a specific strength, dosage form, and formulation for a particular firm.

The third segment, the package code, is 1 or 2 digits long and identifies package forms and sizes.
Since you do not do math on the codes you should use CHAR(n) and appropriate constraints. For example, the http://en.wikipedia.org/wiki/National_Drug_Code tells us that your NDC would be
The first segment, the labeler code, is 4 or 5 digits long and assigned by the Food and Drug Administration (FDA) upon submission of a Labeler Code Request. A labeler is any firm that manufactures, repacks or distributes a drug product.
The second segment, the product code, is 3 or 4 digits long and identifies a specific strength, dosage form, and formulation for a particular firm.
The third segment, the package code, is 1 or 2 digits long and identifies package forms and sizes.
So we protect data integrity, which you are not doing now with this code.
ndc CHAR(12) NOT NULL
CHECK (CASE
WHEN ndc LIKE '[09][09][09][09][09][09][09][09][09][09]'
THEN 't'
WHEN ndc LIKE '[09][09][09][09][09][09][09][09][09][09]' THEN 't'
WHEN ndc LIKE '[09][09][09][09][09][09][09][09][09][09]' THEN 't'
WHEN ndc LIKE '[09][09][09][09][09][09][09][09][09][09]' THEN 't' ELSE 'f' END = 't'
This is SQL and 8090% of the work is in the DDL and the data model. Once you have that eight, the DML is easy.
CELKO Books in Celko Series for MorganKaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL
Tuesday, March 27, 2012 9:50 PM 
All replies

hi Marc
as often  it depends.
numeric data type is identical with decimal which represents a fixed precision number.
http://msdn.microsoft.com/enus/library/aa258271(v=sql.80).aspx
Fixed precision and scale numeric data from 10^38 +1 through 10^38 –1.
Functionally equivalent to decimal.
I don't think that the number types you mention are using fixed precision number and therefore bigint is probably the most efficient way to store the number especially if you want to perform some computation in your application.
I don't see really any use for computation with those number and therefore you may even use a string of appropiate length which requires more space in the database but you may be able to allow grouping characters in the numbers.
using BIGINT datatype instead of string you can create efficient indexes.
As you write you're already using numeric datatype and therefore if you upgrade to SQL 2008R2 / 2012 you should consider switching to bigint as you don't need fraction in your number.
In addition I would recommand to create constraint on the accepted number such that they're garantied to be within the defined range (eg. length of characters)
 Marked as answer by Kalman Toth Saturday, October 6, 2012 8:23 AM
Monday, March 26, 2012 4:36 PM 
@mnoon.
Choosing which of these data types to use depends on the value you want to store for the column or variable. The rule of thumb is to always use the data type that will require the least storage size. Don't always use INT as your data type for whole numbers if you don't need to.For Numberic NUMERIC (p [, s ]),user could declare the precision(p),but it seemed take more storage size than Int type.
Precision Storage Size 1  9 5 bytes 10 19 9 bytes 2028 13 bytes 2938 17 bytes Data Type Minimum Value Maximum Value Storage Size tinyint 0 255 1 byte smallint 2^15 (32,768) 2^15  1 (32,767) 2 bytes int 2^31 (2,147,483,648) 2^31  1 (2,147,483,647) 4 bytes bigint 2^63 (9,223,372,036,854,775,808) 2^63  1 (9,223,372,036,854,775,807) 8 bytes
Best Regards,
Peja
Please remember to click "Mark as Answer" on the post that helps you, and to click "Unmark as Answer" if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread.Tuesday, March 27, 2012 8:22 AM 
Tuesday, March 27, 2012 10:06 AM

This is a great resource:
http://www.sqlserverhelper.com/faq/datatypesp01.aspx
Tuesday, March 27, 2012 7:57 PM 
Since you do not do math on the codes you should use CHAR(n) and an appropriate constraints. For example, see http://en.wikipedia.org/wiki/National_Drug_Code tells us that your NDC would be

The first segment, the labeler code, is 4 or 5 digits long and assigned by the Food and Drug Administration(FDA) upon submission of a Labeler Code Request. A labeler is any firm that manufactures, repacks or distributes a drug product.

The second segment, the product code, is 3 or 4 digits long and identifies a specific strength, dosage form, and formulation for a particular firm.

The third segment, the package code, is 1 or 2 digits long and identifies package forms and sizes.
Since you do not do math on the codes you should use CHAR(n) and appropriate constraints. For example, the http://en.wikipedia.org/wiki/National_Drug_Code tells us that your NDC would be
The first segment, the labeler code, is 4 or 5 digits long and assigned by the Food and Drug Administration (FDA) upon submission of a Labeler Code Request. A labeler is any firm that manufactures, repacks or distributes a drug product.
The second segment, the product code, is 3 or 4 digits long and identifies a specific strength, dosage form, and formulation for a particular firm.
The third segment, the package code, is 1 or 2 digits long and identifies package forms and sizes.
So we protect data integrity, which you are not doing now with this code.
ndc CHAR(12) NOT NULL
CHECK (CASE
WHEN ndc LIKE '[09][09][09][09][09][09][09][09][09][09]'
THEN 't'
WHEN ndc LIKE '[09][09][09][09][09][09][09][09][09][09]' THEN 't'
WHEN ndc LIKE '[09][09][09][09][09][09][09][09][09][09]' THEN 't'
WHEN ndc LIKE '[09][09][09][09][09][09][09][09][09][09]' THEN 't' ELSE 'f' END = 't'
This is SQL and 8090% of the work is in the DDL and the data model. Once you have that eight, the DML is easy.
CELKO Books in Celko Series for MorganKaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL
Tuesday, March 27, 2012 9:50 PM 

I have several numbers which can be 10, 15, or 20 characters long which are used in the medical industry such as NDC, NPI numbers and I currently use numeric sql 2000 datatypes for these type of fields. As long as they are all numbers is it better to use bigint, int, numeric or some other datatype when storing these types of values. The main reason I am asking here is b/c I was having trouble finding the answer on MSDN documentation.
Probably best to store them as alpha if not used for calculations. If there is some overwhelming reason, such as performance, to use numeric storage, that should be the choice.
Demo:
DECLARE @BIGINT BIGINT = 12345678901234567890 SELECT LEN(@BIGINT) /* Msg 8115, Level 16, State 2, Line 1 Arithmetic overflow error converting expression to data type bigint. (1 row(s) affected) */ DECLARE @DEC DECIMAL (30,0) = 12345678901234567890 SELECT LEN(@DEC)  20 DECLARE @string varchar(30) = '12345678901234567890' SELECT LEN(@string)  20
Optimization article:
http://www.sqlusa.com/articles/queryoptimization/
Kalman Toth
New Book: Beginner Database Design & SQL Programming Using Microsoft SQL Server 2016
 Edited by Kalman Toth Wednesday, November 8, 2017 11:23 PM
Wednesday, March 28, 2012 4:27 PM 
I know this post is old but I've worked with NPIs for close to 10 years and have always used Varchar(18). I've worked for a number for top corner pharmacies and they use varchar(18) as well. Hope that helps??Sunday, October 29, 2017 4:25 PM

What if you want to sort your numbers though? I am working with similar data in the pharmaceutical industry. They use numbers for product codes When I store as VarChar, they do not sort numerically.Saturday, March 3, 2018 4:23 PM

RESIST the urge to reply to any thread that is years old. Replying to one, especially one marked as answered, is not an effective way to get answers. Read the thread, digest the information, and then decide how to proceed. If you still have questions, start your own thread and provide sufficient information for others to evaluate your question and respond appropriately. If this thread provides useful information, link to it and explain why.
And yes  you will need to determine your requirements first and that means you need to understand your data and how it is used. Sorting strings and sorting numbers is inherently different. Pick the type that best suits your needs.
Saturday, March 3, 2018 5:33 PM