I understand the differences between all the different number types in SQL Server. What I am wondering is, does anyone know a solid solution for someone who needs their math to be correct? The results of simple math problems differ depending on which type you use. I can safely say my TI-83 from over 10 years ago never gets the math wrong.
Here is a perfect example. My company is trying to calculate number of layers of boxes when we know the total box count and how many can fit on one layer. The logic comes in when you can put boxes of different dimensions on the same layer. For example box A can fit 3 per layer and box B can fit 6. Therefore box B is half the size of box A. In order to calculate the total layers needed you divide the total boxes by the quantity per layer and sum them up. Simple math in the example below (8/6) + (3/6) + (1/6). Once I know the actual number of layers I want to take the ceiling because we can't have a partial layer. Depending on the type of my numbers I could get different results. See the script below. In case 1 the ceiling is 2, in case 2 it is 3, in case 3 it is 2. Can someone help by telling me a solid number type to use because right now my calculator is still better at math then a corporate level database. Sad
SELECT SUM(TOTAL) FROM ( SELECT CONVERT(DECIMAL(18, 10), 8) / 6 AS TOTAL UNION ALL SELECT CONVERT(DECIMAL(18, 10), 1) / 6 AS TOTAL UNION ALL SELECT CONVERT(DECIMAL(18, 10), 3) / 6 AS TOTAL ) X SELECT SUM(TOTAL) FROM ( SELECT CONVERT(REAL, 8) / 6 AS TOTAL UNION ALL SELECT CONVERT(REAL, 1) / 6 AS TOTAL UNION ALL SELECT CONVERT(REAL, 3) / 6 AS TOTAL ) X SELECT SUM(TOTAL) FROM ( SELECT CONVERT(FLOAT, 8) / 6 AS TOTAL UNION ALL SELECT CONVERT(FLOAT, 1) / 6 AS TOTAL UNION ALL SELECT CONVERT(FLOAT, 3) / 6 AS TOTAL ) X
You are making this too hard. Read http://www.tdan.com/view-perspectives/5343
--CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL
No, it is not easy. And I thank you for the example. A customer has made some noise about the fact that we use float in our financial application. Your first example shows that you are not safe with decimal either.
If I understand your problem correctly, you are effectively working with rational numbers, and there is no built-in type for this in SQL Server. You could implement one in the CLR, or just keep denominator and nominator in separate columns.
Another possibility is to implement an "intelligent" ceiling function. If you know that the nominator is 6, then you know that 2,00000004470348 is the number 2 with some noise that the end. (Of course, if you use approxamite numbers, you should use float, not real, since 32-bit float values are a bit too noisy.)
And while your pocket calculator computes this example correctly, you can bet it will go wrong once the numbers become hairy enough. Decimal and floating-point are after all, only approximation for rational numbers.
Erland Sommarskog, SQL Server MVP, email@example.com
I'm not at work or by a SQL box to test this out, but just out of curiosity, how about changing the 6 to the same type as your numerator in each case? It would be considered 'good form' for optimization purposes anyway, but it may impact your answer too.
Additionally, try nesting: So (pseudocode), "Select Cast( cast(8 as decimal) / cast(6 as decimal) as decimal )" and variations thereof.
Anyway, sorry for posting without the benefit of testing, but I'd definitely try those two variations, to see what might happen.
One thing about all the floating point types to keep in mind, is that they do not have infinite precision. There are only so many digits (before and after the decimal point) they can store. 123,456,789,123,456,789 will suffer the same truncation/inprecision as 0.123 456 789 123 456 789.
You calculator might not suffer that, but that is because it has way more time and (relative) more storage. For all we know, it could calculate this out out as strings (the way humans do). Or just have a float type so big, that the display won't be able to show a number that is too big for it (limiting user input/output to not run into problems). Or just use huge Integers (see below).
The SQL server does not have this capability. It has to solve all your actions fast. It has to use every type to the last value in it's range. It has to use the existing RAM well. And it can't just limit the numbers is shows you.
When you only need a certain amount of precision, it might be better to not use a float type at all. When you only need 3 digits behind the decimal point, just multiply by thousand and store it as Integer. You only need to convert/reformat it into the expected format when you show it the user.
In fact that is how money and smallmoney actually work:
money: -922.337.203.685.477,5808 to 922.337.203.685.477,5807 (8 byte)
bigint: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (8 byte)
smallmoney: -214.748,3648 to 214.748,3647 (4 byte)
int: -2,147,483,648 to 2,147,483,647 (4 byte)
Converting the 6 to Decimal doesn't do it, but the outer layer convert seems to do the job:
SELECT SUM(TOTAL) -- Outer convert over entire formula, including converting 6 to decimal FROM ( -- SELECT Convert( Decimal( 18, 10), (CONVERT(DECIMAL(18, 10), 8) / CONVERT(DECIMAL(18, 10),6) ) ) AS TOTAL UNION ALL SELECT Convert( Decimal( 18, 10), (CONVERT(DECIMAL(18, 10), 1) / CONVERT(DECIMAL(18, 10),6) ) ) AS TOTAL UNION ALL SELECT Convert( Decimal( 18, 10), (CONVERT(DECIMAL(18, 10), 3) / CONVERT(DECIMAL(18, 10),6) ) ) AS TOTAL ) X SELECT SUM(TOTAL) -- Outer convert over entire formula without converting 6 to decimal FROM ( -- SELECT Convert( Decimal( 18, 10), (CONVERT(DECIMAL(18, 10), 8) / 6 ) ) AS TOTAL UNION ALL SELECT Convert( Decimal( 18, 10), (CONVERT(DECIMAL(18, 10), 1) / 6 ) ) AS TOTAL UNION ALL SELECT Convert( Decimal( 18, 10), (CONVERT(DECIMAL(18, 10), 3) / 6 ) ) AS TOTAL ) X
I do understand precision and that it can effect the outcome if you don't have enough. Float defaults to 53 precision which isn't a lot. But if your trying to say that a SQL Server cannot perform proper math because it doesn't have "infinite" precision, I don't buy it. A TI-83 calculator has a 6mhz processor and 512kb of flash RAM. While it does only perform single line calculations at a time it is effective enough to consider all numbers the same (integer and decimal) and the math is always correct. To say a database server with dual core at 2.9ghz and 8gb of ram can't match that is slightly embarrasing. I understand that different data types store at different sizes but wouldn't it be nice if Microsoft would adopt a single data type for numerics that would work similar to varchar. Similarly it would have a size of n + 2 bytes. There would no more number type conversion necessary and the math would be to an "infinite" precision if MAX is specified. On a corporate level I find imprecision a major issue because it can cost thousands of dollars.
There are other fixes to the issue but this thread was mainly started to point out a glaring issue that developers have to deal with in a corporate level database. The math is inaccurate and requires weird conversions, which result in performance loss. I was hoping someone could explain why there are so many data types for numbers? I don't buy the smaller size data types were a winning trade off for poor math.
And from the "Cheater's shortcut" files, a way to make the statements shorter and cleaner looking. However, like many sneaky shortcuts, being sneaky may someday bite you back. However, it's still good/fun to know:
SELECT Cast( 8 / 6 as Decimal(18, 10) ) AS Tot_A_Wrong -- Wrong answer,casted integer answer 1 , Cast( 1 * 8 / 6 as Decimal(18, 10) ) AS Tot_B_Wrong -- Wrong answer, casted integer answer 1 , Cast( 1.0 * 8 / 6 as Decimal(18, 10) ) AS Tot_C_Wrongish -- Probably not enough decimal places , Cast( 1.00 * 8 / 6 as Decimal(18, 10) ) AS Tot_D_Wrongish -- Probably not enough decimal places , Cast( 1.0000 * 8 / 6 as Decimal(18, 10) ) AS Tot_E_Wrongish -- Probably not enough decimal places , Cast( 1.000000 * 8 / 6 as Decimal(18, 10) ) AS Tot_F_Wrongish -- Probably not enough decimal places , Cast( 1.00000000 * 8 / 6 as Decimal(18, 10) ) AS Tot_G_Better -- Better: Enough decimal places , Cast( 1E0 * 8 / 6 as Decimal(18, 10) ) AS Tot_H_Better -- Better: Enough decimal places
Why do I say this may bite you back? It's probably NOT one of those cases where a future SQL Server version will change how the statement operates, but it is a case where a future developer may decide to "correct" your code and remove the "unnecessary" multiplication, thereby eliminating your sneaky trick and introducing an error.
So, if you get tricky, document the statement clearly! Or type it out the long way.
In this 512 kb it has to store at tops 4 values. That means your calcualtor has around 131,072 bytes per Number.
UNIQUEIDENTIFIER with his 16 bytes is an extremly big type. In fact it's so big, it's size starts to become a Performance issue (the processor has to compare it as two 8-byte integers). And honestly, 99.99% of all user of SQL Server do not need at type bigger than BIGINT.
For the 0.01% that is left:
Make the calculation in the client application. There is bound to be some pre-created Class libary with wich you can calcualte really big int/really precise floats in any higher language. If not, it's easy enough to make one.
If you need to store your numbers in the Database:
Varbinary. Store teh obejcts direclty
Varchar. Store them as XML.
A Database has the job to allow the save storage and manipulation of Data. That it is what it is desinged for, that is what it is good at.
It is not designed doing high precision mathematics. Or even complex mathematics. It's not even supposed to be able to do that. Doing what you try to do in SQL, is like trying to make a GUI in Assembler.
Yes, there exist db engines which accept 126-bit mantissa. But not MS SQL. This is how it is, if you need more do it in C# or whatever else.
Or find a workaround. For example, the problem you mentioned, to calculate number of layers of boxes, may be under some conditions be reformulated using integer arithmetic only.
- Edited by SergNL Monday, March 12, 2012 1:52 PM
Again, if you want exact results when computing rational numbers, you need to use rational numbers. If you use decimal numbers, be that in base 10 or in base 2, you are using approxamite numbers, and whether you get the correct result on the bottom line is up to chance, no matter you use SQL Server or TI-83.
Erland Sommarskog, SQL Server MVP, firstname.lastname@example.org
Then again, it appears that when do decimal division, SQL Server always
truncates, never rounds. That works to your advantage, so that ceiling
should work for you.
By the way, I tried this on my HP32S 1/3 + 1/3 + 1/3. The result had an
awful lot of nines. Not unexpected.
Erland Sommarskog, SQL Server MVP, email@example.com