Optimal performance for two-dimensional array to DataTable conversion

    Question

  • Hi.

    I was looking for an optimized method to convert a square array of type double[,] to a DataTable.

    I looked around the internet and couldn't find a way faster than the one currently implemented in my method, which relies on two nested for loops.

    The problem is that the performance is very bad. My input array could be anywhere from 500 to 15,000 elements in each dimension. (Math can be scary)...

    This means that it could take ages for the conversion to complete with the current method.

    I was thinking about using some unmanaged code or even assembly code to improve the performance, but I didn't know how to do it.

    My current code is this:

    // helper method to get row as a string array
    private static string[] GetRowFromArray(double[,] array, int column)
    {
       int dim = array.GetUpperBound(0) + 1;
       string[] row = new string[dim];

       for (int i = 0; i < dim; i++)
       {
          row[i] = array[i, column].ToString();
       }

       return row;
    }

    // the static extension method used to convert the array to a DataTable
    public static DataTable ToDataTable(this double[,] array)
    {
       DataTable dt = new DataTable();
       int rowsCount = array.GetUpperBound(0) + 1;
       int colsCount = array.GetUpperBound(1) + 1;

       for (int i = 0; i < colsCount; i++)
       {
          dt.Columns.Add();
       }

       for (int ii = 0; ii < rowsCount; ii++)
       {
          string[] a = GetRowFromArray(array, ii);
          dt.Rows.Add(a);
       }

       return dt;
    }


    Any ideas on how we can optimize this to make it run faster?

    Just a side note: the array is a square array, so ColumnsCount = RowsCount.

    Many thanks for any help.
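
    For reference, here is a rough sketch of a variant I could also try: it skips the intermediate string conversion, declares typed double columns, and wraps the population in BeginLoadData/EndLoadData so the table does not maintain notifications and indexes on every add. I have not benchmarked it, the method name is something I made up, and the column names ("C0", "C1", ...) are placeholders:

    // Sketch only (requires using System.Data;): the same nested loops, but
    // without the string[] step. Typed double columns avoid ToString(), and
    // BeginLoadData/EndLoadData suspend notifications while rows are added.
    public static DataTable ToDataTableDirect(double[,] array)
    {
       int rowsCount = array.GetUpperBound(0) + 1;
       int colsCount = array.GetUpperBound(1) + 1;

       DataTable dt = new DataTable();
       for (int c = 0; c < colsCount; c++)
       {
          dt.Columns.Add("C" + c, typeof(double));   // placeholder column names
       }

       dt.BeginLoadData();
       object[] buffer = new object[colsCount];
       for (int r = 0; r < rowsCount; r++)
       {
          for (int c = 0; c < colsCount; c++)
          {
             buffer[c] = array[r, c];                // boxes the double, no ToString()
          }
          dt.Rows.Add(buffer);                       // the new DataRow copies the values
       }
       dt.EndLoadData();

       return dt;
    }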



    Thursday, December 07, 2017 1:05 AM

All replies

  • IMO the way you're doing this is already the fastest.

    If the amount of data is significantly large, you may consider pre-adding that number of DataRows to the DataTable, then partitioning the assignment code with something like the second example for Parallel.For() here, and seeing if there's any improvement in speed. (By breaking the step into empty-row insertion and point-by-point data assignment, you turn it into two tasks whose subtasks can be executed in any order, which creates an opportunity for parallelization.)
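
    A rough sketch of what I mean, with one caveat: DataTable is documented as safe only for multithreaded reads, so concurrent writes to the table (or its rows) would have to be synchronized. One variant of the idea is therefore to do only the expensive double-to-string conversion in parallel, into a plain jagged buffer, and then add the rows from a single thread. The method name is just illustrative, and I have not measured it:

    // Sketch: parallelize the per-cell ToString() work into plain arrays (each
    // iteration writes only its own row buffer), then do the DataTable writes
    // on one thread, since DataTable is not safe for concurrent writes.
    public static DataTable ToDataTableParallel(double[,] array)
    {
       int rowsCount = array.GetUpperBound(0) + 1;
       int colsCount = array.GetUpperBound(1) + 1;

       DataTable dt = new DataTable();
       for (int c = 0; c < colsCount; c++)
       {
          dt.Columns.Add();
       }

       string[][] buffers = new string[rowsCount][];
       System.Threading.Tasks.Parallel.For(0, rowsCount, r =>
       {
          string[] row = new string[colsCount];
          for (int c = 0; c < colsCount; c++)
          {
             row[c] = array[r, c].ToString();
          }
          buffers[r] = row;                          // each r is written by exactly one iteration
       });

       dt.BeginLoadData();                           // suspend notifications while loading
       for (int r = 0; r < rowsCount; r++)
       {
          dt.Rows.Add(buffers[r]);                   // single-threaded row adds
       }
       dt.EndLoadData();

       return dt;
    }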


    Thursday, December 07, 2017 2:06 AM
    Answerer
  • Hello Sir, and many thanks for the response.

    I will read the example and test whether it suits my needs, even though I am still a beginner and it seems like a long shot for me.

    I've already tried some simple threading using Task, but no performance gain was achieved.

    Anyway, your help is much appreciated. Many thanks to you again.

    Thursday, December 07, 2017 2:21 AM
  • I have been reading and trying to do it as you kindly described, but I can't seem to get it to work.

    I noticed, though, that in the only run of the application that was neither a success nor a failure (it ran without exceptions but returned wrong results), the resulting DataTable had seemingly random nulls scattered among the actual results.

    Anyway, the result is not usable so far.
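
    (If the nulls come from several threads writing into the DataTable at the same time, which is my guess since DataTable is documented as safe only for multithreaded reads, then parallelizing only the value conversion and adding the rows from a single thread, as in the sketch under the first reply, might avoid them.)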

    I will keep trying and hope it goes well.

    Any other suggestions are extremely welcome.

    Thanks

    Thursday, December 07, 2017 5:40 AM
  • I tried using Parallel.For, but I think the overhead was bigger than I wanted.

    It didn't give the desired effect.

    Thursday, December 07, 2017 2:07 PM
  • Your comment is unfair to us, because that really is not an answer to your question.

    In the past we did suggest using a jagged array too, but the suggestion was rejected because the asker needed a DataTable (a DataSet, actually) to write to XML.

    Since you don't say what you're trying to do with the DataTable, the kind of suggestions we can make is really restricted by the information you give us.


    Friday, December 08, 2017 1:23 AM
    Answerer
  • I am sorry for any inconvenience. I've deleted the unfairly accepted answer.

    About the DataTable: I do not need any XML from it.

    And I am still open to any optimized solution or suggestion.

    Again, thanks in advance.

    Friday, December 08, 2017 12:58 PM
  • Very much in addition to cheong:

    You say, "Just a side note: the array is a square array, so ColumnsCount = RowsCount."

    A DataTable is a kind of reflection of a table in a database.

    Therefore, first try to make a database table with 15,000 elements; then you will know what you are doing and why it takes so long.

    The DataTable simply does not fit this kind of operation. And if it does in your case, then accept that it takes a very long time to populate.


    Success
    Cor

    Friday, December 08, 2017 1:33 PM
  • Friday, December 08, 2017 3:13 PM
  • Hi.

    Thanks very much, DA924x, for your reply.

    Those are very interesting articles, actually.

    I am giving it a try to see what comes up.


    Friday, December 08, 2017 8:43 PM
  • Hello Sir,

    Many thanks for your help and the clarifications you gave about DataTables. They were very enlightening.

    Again I am sorry for any inconvenience caused by my previous reply.

    Friday, December 08, 2017 8:48 PM
  • Hi MyPuplicldentity,

    Have you solved your problem now?

    If so, I hope you can close this thread by marking the helpful reply as the answer, as this will help others looking into the same or similar issues down the road.

    Regards,

    Stanly


    MSDN Community Support
    Please remember to click "Mark as Answer" on the responses that resolved your issue, and to click "Unmark as Answer" if they did not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints about MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    16 hours 8 minutes ago