how to generate combination of records in a column....like,,R1...R1,R2...R1,R2,R3...R1,R3....Something like that

# how to generate combination of records in a column....like,,R1...R1,R2...R1,R2,R3...R1,R3....Something like that

• Saturday, March 10, 2012 11:09 AM

hi

i have a table with one only colum with 10 records lets say R1,R2......R10

i want to generate all possible combinations...as the result set

like

R1

R1,R2

R1,R2,R3

R1,R2,R3,R4

.....

R1,R2,R3,R4............R5

R1,R10

R1,R9

like that i need to generate all possible combination

ilikemicrosoft

### All Replies

• Saturday, March 10, 2012 11:55 AM

Generating the Kartesian Product (I hope that is written right; every row with every row) is rather easy, as show in this approach to the Multiplication Table:

DECLARE @Table TABLE(Value INT);

INSERT INTO @Table VALUES
(1),(2),(3),(4),(5),(6), (7),(8),(9),(10);

SELECT A.Value, B.Value, Product = A.Value * B.Value FROM @Table AS A, @Table AS B;

But I seriously doubt you understand the magnitude of the result set you will get. It's beyond an Exponential growth (X^Y, with Y being fixed). It's X ^ X; When you ahve 2 rows, you will get 4 Results. with 3 you get 27. With ten rows, you get 10 ^10  or 10,000,000,000 Cobinations. With 100 Rows, you get 1 * 10^200 rows.
So please don't try this in SQL. Make it in the client application. Or better yet: Don't do it at all.

• Saturday, March 10, 2012 12:07 PM

Hello,

I dont' think this is a good idea from performance point of view. You can use Cartesian product query for getting all the combinations.

• Saturday, March 10, 2012 12:08 PM

but the above query wont give the following combinations

1,2,3

1,9

1,2,3,4,5

like

each and every combination

ilikemicrosoft

• Saturday, March 10, 2012 1:42 PM

So that is quite a challenge. It's not something you normally do with SQL.

Before I starting thinking of any solutions, I have a few questions.

Have you thought about the format? How do you want each combination served? A comma-separated list? A row? In the latter case, since each row in a result set has the same number of columns, how to deal with unused columns?
Or a result set for each combination?

What about permutations? Is <R2, R1, R3, R4> different from <R4, R3, R1, R4>? If you don't want all permutation is there a requirement for the values to be ordered?

You said that you have 10 values in the table. Is that always 10 values or could it be more? Less?

And finally, what is the ultimate purpose that you will use all these combinations for?

Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
• Saturday, March 10, 2012 2:43 PM

I think I must give you figures to realise just how bad 10,000,000,000 Combanations are. Asuming each row in the result would only take up one byte, you would need 10,000,000,000/1024/1024/1024 = 9.3 GB of RAM.

No in reality each of these rows will be at least 1-10 byte, so on average they will be 5 byte each. That means you need 5 time 9.3 GB of RAM. So you would need around 46 GB. If you need them as Commaseperated String, things go up a notch (aroud twice the 46 GB).

And you do remember, that you propably have to send this to the client via the network. Wich measn you need that much on both ends.

• Saturday, March 10, 2012 2:48 PM
Moderator

If you have only 10 rows and the values are unique (not necessarily), then you could enumerate each row (ROW_NUMBER), pivot the rows to columns (PIVOT), and then generate the combinations using GROUP BY CUBE().

DECLARE @T TABLE (
c1 int NOT NULL UNIQUE
);

INSERT INTO @T (c1)
VALUES
(1),
(2),
(3),
(4),
(5),
(6),
(7),
(8),
(9),
(10);

WITH R AS (
SELECT
ROW_NUMBER() OVER(ORDER BY c1) AS rn,
c1
FROM
@T
)
SELECT
*
FROM
R
PIVOT
(
MAX(c1)
FOR rn IN ([1], [2], [3], [4], [5], [6], [7], [8], [9], [10])
) AS P
GROUP BY
CUBE([1], [2], [3], [4], [5], [6], [7], [8], [9], [10])
HAVING
GROUPING_ID([1], [2], [3], [4], [5], [6], [7], [8], [9], [10]) < 1023;
GO
This query does not handle permutations.

AMB

Some guidelines for posting questions...

• Saturday, March 10, 2012 4:24 PM

 MY SELECT QUERY WILL RETURN ONLY ONE COLUMN AS FOLLOWS R1 R2 R3 R4 R5

BUT I WANT to get the following combinations

 R1 R2 R3 R4 R5 R1 R2 R3 R4 R1 R2 R3 R1 R2 R1 R1 R3 R4 R5 R1 R4 R5 R1 R5 R1 R4 R5 R1 R5 R1 R2 R3 R4 R5 R2 R3 R4 R2 R3 R2 R3 R4 R5 R3 R4 R4 R5 R5

ilikemicrosoft

• Saturday, March 10, 2012 6:07 PM

Alejandro's query gave you exactly that, so what's your problem?

And, again, why do you want all these combinations?

Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
• Saturday, March 10, 2012 6:15 PM

Ah, now I get it. You are the same guy who started the thread

OK, then I have an answer for you: you are in a dead end. This is not going to work out. As I said in that original thread, you cannot try permutations to find a unique key in the data at random. You may find several combinations of columns that are unique, but how do you know that they actually describe the data? And, if these table do not have a primary key, you may not find any unique combinations at all, simply because there are rows that are complete duplicates.

If you have no knowledge or documentation about the data you have, you might as well throw it away.

Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
• Marked As Answer by Friday, March 16, 2012 5:12 AM
•
• Saturday, March 10, 2012 11:47 PM
Moderator

I really hope you listen to the advices given to you about your problem. Better to do it right from the begining, and not get a surprise after deploying your data warehouse model into production.

AMB

Some guidelines for posting questions...

• Sunday, March 11, 2012 2:06 PM

Ah, now I get it. You are the same guy who started the thread

OK, then I have an answer for you: you are in a dead end. This is not going to work out. As I said in that original thread, you cannot try permutations to find a unique key in the data at random. You may find several combinations of columns that are unique, but how do you know that they actually describe the data? And, if these table do not have a primary key, you may not find any unique combinations at all, simply because there are rows that are complete duplicates.

If you have no knowledge or documentation about the data you have, you might as well throw it away.

Listen to this. You have either

...the shema and know the primary key

or

...junk data

There is nothing in between. And when you don't have the shema or even an idea of the shema or at least the PK, there is no way to guess it and no way to calculate it. No amount of wishes or orders from superiors, and no mathematical trick can change that.