Asked by:
New to SQL, Need to arrange data in table

Question
-
Hello all,
First, I will say I have only been working with SQL Server Express 2014 since this past October. I have been reading and researching the basics and have an elementary understanding of the language. I work for a company who now needs to use a SQL database for various data storage and retrieval. I am the lucky one to figure it out.
My first adventure involved importing an Excel document into a table which I have done successfully. This table contains various information and at present is breaking some rules in regards to standard data form. i.e., duplicate values
I need to clean up this data and get it up to spec.
I am unable to share the specific data, but I can give an example of what I am wanting to accomplish.
To start off, I have six columns of data. ID(primkey), UPCNumber, MaterialNumber, MaterialDesciption, ComponentNumber, ComponentDesciption
For ease of understanding, I am only going to use the ID, UPC, ComponentNumber, and ComponentDescription
At present the table looks as this except to the order of 8,129rows.
ID UPC ComponentNumber ComponentDesc ----------------------------------------------------- 1 1234 9876 RED 2 1234 6789 BLUE 3 1234 7689 WHITE 4 1234 8769 GREEN 5 2334 4564 RED 6 2334 3456 BLUE 7 2334 2345 WHITE 8 2334 7643 GREEN
I think the best way to store this data is as such:
ID UPC RED BLUE WHITE GREEN ----------------------------------------------------- 1 1234 9876 6789 7689 8769 2 2334 4564 3456 2345 7643
I must mention that each UPC contains a certain RED, BLUE, WHITE, and GREEN "Thing", and some cases a UPC number may use the same RED, BLUE, WHITE, and/or GREEN "Thing", as a different UPC number does
So my question is, how would you go about accomplishing this?
Wednesday, April 12, 2017 9:03 PM
All replies
-
Hi seth350,
You can use the PIVOT function, PIVOT rotates a table-valued expression by turning the unique values from one column in the expression into multiple columns in the output, and performs aggregations where they are required on any remaining column values that are wanted in the final output.
Please refer to the following query:
select ROW_NUMBER() OVER(ORDER BY UPC) AS ID,UPC,RED,BLUE,WHITE,GREEN into [UPC-new] from ( select UPC,ComponentNumber,ComponentDesc from dbo.UPC ) as s pivot ( max(ComponentNumber) for ComponentDesc in (RED,BLUE,WHITE,GREEN) )p
Best Regards,
Teige
MSDN Community Support
Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.- Edited by Teige Gao Thursday, April 13, 2017 5:16 AM
Thursday, April 13, 2017 1:44 AM -
I think the best way to store this data is as such
Hello,
Why do you think so? You have a normalized table, quite right for a OLTP database. Only in DWH we denormalize table to speed up reporting.
What if you get a new color, do you want to always add columns for new colors? With that table design you don't need to make changes.
Olaf Helper
[ Blog] [ Xing] [ MVP]Thursday, April 13, 2017 5:52 AM -
Thank you Teige for your suggestion. Thank you Olaf, I was concerned about the duplicate data within the columns. I see your point where adding a new color would be more trouble. The general purpose for this database is to provide a means to check if a questionable component number is a valid member of a particular UPC#. Select ComponentNum from dbo.UPC where UPC='1234' The result contains all component numbers matching UPC# '1234'. The result is then evaluated in my application against a single component number that was entered by the user. The application compares the user entered component number against each component number in the result. If the application finds a match in the result, then it will notify the user that the component number is valid for UPC# '1234'. My thought for using the 2nd method in my original post was that there will be specific users scanning each color. User #1 only enters RED components User #2 only enters BLUE components Etc... Rather than make the database select all of the components for a given UPC, I thought it would be better to only return the data that is important. The drawback is that it is less dynamic and is locked to one user. In my application, being as dynamic as possible will be key.Thursday, April 13, 2017 10:53 PM