locked
POWER QUERY- help needed RRS feed

  • Question

  • Hi everyone, 

    I have a folder with many identical excel files, each contains data about a different project.

    I want to use power query to import 1 specific sheet out of each file existing in the folder (in each excel file- sheet's locations are the same and so are their names). 

    Here is the tricky part for me- 

    The sheet has 9 summary rows in the upper part of the sheet that I do not need, and the data table starts from row 10 (header row). 

    I do not know how to remove these 9 rows, do that to all files and merge the data into a single table. 

    Appreciate your help!



    • Edited by עמיתת Tuesday, July 7, 2015 7:57 AM
    Tuesday, July 7, 2015 7:56 AM

Answers

  • Gil, thank you! 

    I decided not to use the solution as it was a bit complex for me to understand, and created a template for the users to fill... A small price to pay for what they will get ;) 

    עמית

    • Marked as answer by Charlie Liao Wednesday, July 8, 2015 6:34 AM
    Tuesday, July 7, 2015 6:53 PM

All replies

  • Once you have the data in the power query editor, navigate to the 'Home' tab and click on 'Remove Rows'.  This gives you the option to  'Remove Top Rows' in which you are able to specify how many rows that you want to delete from the top, which in this specific case is 9.

    Once you have deleted the rows for each of the imported sheets you can then can click on the 'Merge' button if you want to join the data together using a particular column (table grows wider), or 'Amend' if you want to add the data from each sheet to the bottom of the original table (table grows longer).

    If you are merging, it will ask which columns that you would like to join the tables on and once selected the UI will tell you how many matches.  Once you click OK, you will then get the original table with another column with a clickable link called 'table'.  On the column header you will have a button with two arrows in, click this and you will be able to choose which columns from your merged data you want to show.

    If you want to append, you select the two tables you want and in order for the append to work properly, the columns have to have the same name, otherwise you won't (in most cases) get what you want.  You can either use the UI to append multiple tables together, or if you are feeling brave, open up the M editor and you can add them in.

    Hope this helps

    Laurence

    Tuesday, July 7, 2015 9:10 AM
  • Hi Laurence!

    Thank you

    I think that what you are suggesting is a local solution- 

    From my experience I noticed that in order to append tables I have to create a query that recognizes them. 

    In my case, I will have a folder with an ever growing amount of excel files which I want the import to be done auto' and not to manually declare each table in the system and then use the Append.

    Maybe I missed something in your answer or in the Append capabilities, and will appreciate your feedback on the core issue. 

    Thanks a lot!!


    עמית

    Tuesday, July 7, 2015 10:00 AM
  • Your best bet is to work against a single file first. Pick a single file from the folder and work through all of the steps needed to transform the data you want from it to the form you need it in, for example using the "Remove Top N" step to remove the 9 summary rows you don't want.

    Once you have done everything you need to do against a single file, you can convert the steps needed into a custom M function that accepts a file as its input and outputs a table of data. That custom function can then be called against all of the files in the folder and the resulting set of tables merged to form your final result set.

    These two pages show examples of this sort of thing:

    http://datapigtechnologies.com/blog/index.php/using-power-query-to-combine-data-from-multiple-excel-files-into-one-table/

    http://www.excelguru.ca/blog/2014/11/12/merge-multiple-files-with-properties/

    Hope that helps!

    Tuesday, July 7, 2015 10:25 AM
  • Hi Amit,

    Here is another option.

    (If you don't have time to learn how I got it, skip the instructions and just copy the first and third Power Query expressions below.

    You should start by creating a single Query from the first workbook. Edit it in the Query Editor, remove first 9 rows as Laurence suggested and save it as a connection.

    Now open the Advanced Editor and change the expression to a function query that receives the workbook filename:

    (filename) =>
    let
        Source = Excel.Workbook(File.Contents(filename), null, true),
        Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
        #"Removed Top Rows" = Table.Skip(Sheet1_Sheet,9),
        #"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows")
    in
        #"Promoted Headers"

    Save the function query as "LoadWorkbook".

    Now create a new query using the From Folder, and select the folder that contains the workbooks.

    Merge the path and the filename into a single column and add a custom column that will call "LoadWorkbook" function with the merged column as the parameter.

    Now you just need to expand the table, delete the merged column (that holds the full path filename) and you are ready to go.

    Here is the second query that will be generated if you follow the instructions (assuming that your files are located in folder name C:\Users\amit\Desktop\append\ and that you have only three columns: "Column1", "Column2", "Column3".

    let
        Source = Folder.Files("C:\Users\amit\Desktop\append"),
        #"Reordered Columns" = Table.ReorderColumns(Source,{"Content", "Extension", "Date accessed", "Date modified", "Date created", "Attributes", "Folder Path", "Name"}),
        #"Removed Columns" = Table.RemoveColumns(#"Reordered Columns",{"Content", "Extension", "Date accessed", "Date modified", "Date created", "Attributes"}),
        #"Merged Columns" = Table.CombineColumns(#"Removed Columns",{"Folder Path", "Name"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged"),
        #"Added Custom" = Table.AddColumn(#"Merged Columns", "Custom", each LoadWorkbook([Merged])),
        #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Column1", "Column2", "Column3"}, {"Column1", "Column2", "Column3"}),
        #"Removed Columns1" = Table.RemoveColumns(#"Expanded Custom",{"Merged"})
    in
        #"Removed Columns1"

    Instead of the query above you can also use this query which is more robust to support any number of columns and names (The column names are not used in the query). In this query we check which columns are used by the first worksheet and assumes all other worksheets share the same structure.

    let
        Source = Folder.Files("C:\Users\amit\Desktop\append"),
        #"Reordered Columns" = Table.ReorderColumns(Source,{"Content", "Extension", "Date accessed", "Date modified", "Date created", "Attributes", "Folder Path", "Name"}),
        #"Removed Columns" = Table.RemoveColumns(#"Reordered Columns",{"Content", "Extension", "Date accessed", "Date modified", "Date created", "Attributes"}),
        #"Merged Columns" = Table.CombineColumns(#"Removed Columns",{"Folder Path", "Name"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged"),
        #"Added Custom" = Table.AddColumn(#"Merged Columns", "Custom", each LoadWorkbook([Merged])),
    
        Headers = Table.ColumnNames(#"Added Custom"{0}[Custom]),
        #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", Headers, Headers),
        #"Removed Columns1" = Table.RemoveColumns(#"Expanded Custom",{"Merged"})
    in
        #"Removed Columns1"

    Hope it helps,

    Gil

    • Proposed as answer by Gil RavivMVP Tuesday, July 7, 2015 10:35 AM
    Tuesday, July 7, 2015 10:34 AM
  • Gil, thank you! 

    I decided not to use the solution as it was a bit complex for me to understand, and created a template for the users to fill... A small price to pay for what they will get ;) 

    עמית

    • Marked as answer by Charlie Liao Wednesday, July 8, 2015 6:34 AM
    Tuesday, July 7, 2015 6:53 PM