# SELECT COUNT(a) / COUNT(DISTINCT a)

• ### Question

• I have some sql which looks like this

SELECT COUNT(a) / COUNT(DISTINCT a)

My maths is not so good. Is this calculating a percentage value?

Mr Shaw... One day I might know a thing or two about SQL Server!

Thursday, January 17, 2019 6:24 PM

### All replies

• If there are no any duplicates in the column a, COUNT(a) = COUNT(DISTINCT a). Otherwise COUNT(a) > COUNT(DISTINCT a).

A Fan of SSIS, SSRS and SSAS

Thursday, January 17, 2019 6:37 PM
• If there are no any duplicates in the column a, COUNT(a) = COUNT(DISTINCT a). Otherwise COUNT(a) > COUNT(DISTINCT a).

A Fan of SSIS, SSRS and SSAS

Does > COUNT(DISTINCT a) tell you the difference?

Mr Shaw... One day I might know a thing or two about SQL Server!

Thursday, January 17, 2019 6:40 PM
• COUNT(DISTINCT a) means to count the rows with unique values in the column. COUNT(a) mean count all rows. For example, the column a has the following values:

a, b, c, d, e. In this case COUNT(a) = COUNT(DISTINCT a) = 5

but if the column has the following values:

a, b, c, a, c. In this case COUNT(a) = 5 but COUNT(DISTINCT a) = 3

A Fan of SSIS, SSRS and SSAS

Thursday, January 17, 2019 7:12 PM
• I have some sql which looks like this

SELECT COUNT(a) / COUNT(DISTINCT a)

My maths is not so good. Is this calculating a percentage value?

No it is not computing a percentage value. Anything that computes a percentage needs to have a 100 or 0.01 somewhere. This gives you the ratio between the number of rows and the number of distinct values. To meaningful, the ratio has to relatively big, since it is integer division. In the example that Guoxiong gave, the result will be 1 and not 1.67.

Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Thursday, January 17, 2019 11:11 PM
• Hi Mr Shaw,

In following script, I will share you an example to understand.  In your original post, the data type of count() is int and the data type of 'COUNT(a) / COUNT(DISTINCT a)'  is also int.

I think it makes more sense if we use the decimal type for the result.  When  the result>1 , it means in your data there are some duplicated values. When   the result>1, the values are  not duplicated .

```--drop table test1
create table test1
(a int)
insert into test1 values (1),(2),(3),(4)

---drop table test2
create table test2
(a int)
insert into test2 values (1),(2),(3),(1)

SELECT COUNT(a) *1.00/ COUNT(DISTINCT a) from test1
/*
---------------------------------------
1.0000000000000
*/

SELECT COUNT(a) *1.00/ COUNT(DISTINCT a) from test2
/*
---------------------------------------
1.3333333333333
*/```