Normalize data in Excel Spreadsheet and create SQL tables.


  • Hi All, i dont know if this is the right forum for my question but i have to normalize the data in EXCEL spreadsheet having around 35 columns, create an ER diagram and sql tables . I know there are plenty of normalization articles online but here i have an excel spreadsheet and dont know Where to start. Any suggestion or if you know any article on Normalizing Excel data to create sql tables then it would be great.
    Monday, April 25, 2011 5:15 PM

All replies

  • Normalization is not an automated process, its a manual process, we need to see the data, and use our normalization techniques and create tables and then populate accordingly

    my suggestion would be export excel data to a single table, from your table insert into the normalized tables using some logic.


    If you think my suggestion is useful, please rate it as helpful. If it has helped you to resolve the problem, please Mark it as Answer. DVR
    Monday, April 25, 2011 5:20 PM
  • Excel data is not different than any other data if we think about normallization, just analyze your data and try to divide into proper tables. So basically you have to apply your knowledge about normalization to your data.
    With kind regards
    Krystian Zieja
    Follow me on twitter
    My Blog
    Monday, April 25, 2011 5:41 PM
  • If all 35 columns are on a single tab of the spreadsheet then it's currently being treated like a single table.

    Read the rules that apply to the 1st normal form and start moving columns to new tables (spreadsheet tabs). Once you feel like your data is in the 1st normal form, do the same thing all over again for the 2nd normal form... then the 3rd normal form.

    Most real world databases don't go beyond the 3rd normal form but if this is a classroom assignment, normalize to the assigned normal form.

    Just take it step by step

    Jason Long
    Monday, April 25, 2011 6:56 PM
  • Just for the start:

    The first normal form is to eliminate repeating groups. What that means is, if you have data like this:


    StudentID, Student Name, Subject1, Mark1, Subject2, Mark2, Subject 3, Mark3, Overall Grade, ...,

    The (Subject, Mark) combination is repeating. So the first step is to remove them and take them to a separate table linked with Student Key.


    If Student Id is the key

    The subject Mark table may look like

    StudentID, Subject, Mark and repeating groups groups will be multiple rows here..

    The simpler the solution the stronger it is
    If this post answers you, please mark it as answer..
    If this post is useful, please vote it as useful..
    Monday, April 25, 2011 7:06 PM
  • Thanks for your responce Guys, all the columns in my spreadsheet are having reapeating data. For example..

    GroupName     GroupNumber   BP      PlanName  Contract   Rx         Group       PlanId  .....

    TeamTrack      1001               123     Basic          s12        0423     100001      2365

    TeamTrack      1001               125     Basic          s12        0423      100001     2365

    TeamTrack      1001               456     Basic          s12         0423     100001      2365

    TeamTrack      1001               795     Basic          s12         0423      100001     5569

    Management   1002               235     Choice        s23         0421      100001     5569

    Management   1002               235      Choice       23s         0421      100001     5569


    Can you tell me how do i normalize this example?

    Thanks again..

    Monday, April 25, 2011 8:11 PM
  • You haven't given enough info to really help here. Based on what you're showing us it's just a bunch of random numbers.

    You can start getting the group stuff out...

    CREATE TABLE Groups (
    GroupNumber INT NOT NULL Primary Key,
    GroupName VarChar(255) NOT NULL)
    INSERT INTO Groups
    SELECT 1001, 'TeamTrack' UNION ALL
    SELECT 1002, 'Management'

    And that's just a guess... There is literally nothing there that tells us what these data columns are, in fact it looks like BP, Contract, Rx, Group & PlanID are all foreign keys to other tables.

    Jason Long
    Monday, April 25, 2011 8:42 PM
  • I agree with DVR, the initial phase of normalization is not an automated process at all. Breakingdown the columns into manageable chuncks, and remove the redundant data is a major key to get the task accomplished. There are many normalization tools you can use. However, I strongly suggest, to first conceptualize the ERD and map it accordingly.

    To recap:

    1-    Breakdown the columns into manageable chuncks.

    2-    Remove redundant data.

    3-    Know your ERD keys.

    4-    Map them accordingly.

    Good luck


    Thursday, April 19, 2012 6:05 PM