Hi All, i dont know if this is the right forum for my question but i have to normalize the data in EXCEL spreadsheet having around 35 columns, create an ER diagram and sql tables . I know there are plenty of normalization articles online but here i have an excel spreadsheet and dont know Where to start. Any suggestion or if you know any article on Normalizing Excel data to create sql tables then it would be great.
Normalization is not an automated process, its a manual process, we need to see the data, and use our normalization techniques and create tables and then populate accordingly
my suggestion would be export excel data to a single table, from your table insert into the normalized tables using some logic.
If you think my suggestion is useful, please rate it as helpful. If it has helped you to resolve the problem, please Mark it as Answer. DVR
Excel data is not different than any other data if we think about normallization, just analyze your data and try to divide into proper tables. So basically you have to apply your knowledge about normalization to your data.
With kind regards
Follow me on twitter
If all 35 columns are on a single tab of the spreadsheet then it's currently being treated like a single table.
Read the rules that apply to the 1st normal form and start moving columns to new tables (spreadsheet tabs). Once you feel like your data is in the 1st normal form, do the same thing all over again for the 2nd normal form... then the 3rd normal form.
Most real world databases don't go beyond the 3rd normal form but if this is a classroom assignment, normalize to the assigned normal form.
Just take it step by step
Just for the start:
The first normal form is to eliminate repeating groups. What that means is, if you have data like this:
StudentID, Student Name, Subject1, Mark1, Subject2, Mark2, Subject 3, Mark3, Overall Grade, ...,
The (Subject, Mark) combination is repeating. So the first step is to remove them and take them to a separate table linked with Student Key.
If Student Id is the key
The subject Mark table may look like
StudentID, Subject, Mark and repeating groups groups will be multiple rows here..
The simpler the solution the stronger it is
If this post answers you, please mark it as answer..
If this post is useful, please vote it as useful..
Thanks for your responce Guys, all the columns in my spreadsheet are having reapeating data. For example..
GroupName GroupNumber BP PlanName Contract Rx Group PlanId .....
TeamTrack 1001 123 Basic s12 0423 100001 2365
TeamTrack 1001 125 Basic s12 0423 100001 2365
TeamTrack 1001 456 Basic s12 0423 100001 2365
TeamTrack 1001 795 Basic s12 0423 100001 5569
Management 1002 235 Choice s23 0421 100001 5569
Management 1002 235 Choice 23s 0421 100001 5569
Can you tell me how do i normalize this example?
You haven't given enough info to really help here. Based on what you're showing us it's just a bunch of random numbers.
You can start getting the group stuff out...
CREATE TABLE Groups ( GroupNumber INT NOT NULL Primary Key, GroupName VarChar(255) NOT NULL) INSERT INTO Groups SELECT 1001, 'TeamTrack' UNION ALL SELECT 1002, 'Management'
And that's just a guess... There is literally nothing there that tells us what these data columns are, in fact it looks like BP, Contract, Rx, Group & PlanID are all foreign keys to other tables.
I agree with DVR, the initial phase of normalization is not an automated process at all. Breakingdown the columns into manageable chuncks, and remove the redundant data is a major key to get the task accomplished. There are many normalization tools you can use. However, I strongly suggest, to first conceptualize the ERD and map it accordingly.
1- Breakdown the columns into manageable chuncks.
2- Remove redundant data.
3- Know your ERD keys.
4- Map them accordingly.