Parallel.ForEach with DataTable Rows Collection

  • Question

  • Struggling with syntax...

    I have the following foreach:  foreach (dsFactset.FactsetRow dr in this.DsFactset.Factset.Rows) { ...}.

    I have tried replacing it with: Parallel.ForEach(this.DsFactset.Factset.Rows, dr => {...});

    I get the following message: Error 8 The type arguments for method 'System.Threading.Tasks.Parallel.ForEach<TSource>(System.Collections.Concurrent.OrderablePartitioner<TSource>, System.Action<TSource,System.Threading.Tasks.ParallelLoopState,long>)' cannot be inferred from the usage. Try specifying the type arguments explicitly.

    Looking for a sample of what the syntax should look like.

    Thanks....

    Monday, February 22, 2010 7:53 PM

All replies

  • Unfortunately, DataRowCollection does not implement IEnumerable<DataRow>, or more importantly, your specific type, so you need to explicitly state the type.  Try doing:

         Parallel.ForEach<dsFactset.FactsetRow>(this.DsFactset.Factset.Rows, dr => { ... });




    Reed Copsey, Jr. - http://reedcopsey.com
    Monday, February 22, 2010 8:18 PM
    Moderator
  • Thanks for your suggestion.

    To keep it a simple test I changed it to:

    Parallel.ForEach<dsFactset.FactsetRow>(this.DsFactset.Factset.Rows, dr => { System.Console.WriteLine("Next Record"); });

    I am now getting:
    Error 35 The best overloaded method match for 'System.Threading.Tasks.Parallel.ForEach<Unisys.Dwp.DsTypes.dsFactset.FactsetRow>(System.Collections.Concurrent.OrderablePartitioner<Unisys.Dwp.DsTypes.dsFactset.FactsetRow>, System.Action<Unisys.Dwp.DsTypes.dsFactset.FactsetRow,System.Threading.Tasks.ParallelLoopState,long>)' has some invalid arguments

    Unfortunately, I am not clear on which part of the statement the error is pointing at or how to correct it.

    Monday, February 22, 2010 11:59 PM
  • By giving Parallel.ForEach an explicit type parameter as in Parallel.ForEach<dsFactset.FactsetRow>, you're telling it that it should expect as its first parameter an IEnumerable<FactsetRow>.  However, you're then passing in a non-generic IEnumerable, and the compiler is telling you that it's unable to find an overload that works with that parameter.

    Try this instead:

        Parallel.ForEach(
            this.DsFactset.Factset.Rows.Cast<dsFactset.FactsetRow>(),
            dr => { ... });

    The Cast here is from LINQ and is used to convert an IEnumerable to an IEnumerable<T>, where you specify what the T should be.
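
As a minimal, self-contained sketch of the same idea, the program below uses a plain DataTable as a hypothetical stand-in for the typed Factset table (the table, the column name, and CountRowsInParallel are assumptions for illustration only). It shows Cast<DataRow>() turning the non-generic Rows collection into an IEnumerable<DataRow> so that Parallel.ForEach can infer TSource. The loop body only reads the rows.

```csharp
using System;
using System.Data;
using System.Linq;                 // provides the Cast<T>() extension method
using System.Threading;
using System.Threading.Tasks;

static class CastDemo
{
    // Counts rows in parallel; the rows are only read, never modified.
    public static int CountRowsInParallel(DataTable table)
    {
        int count = 0;

        // Rows is a non-generic DataRowCollection. Cast<DataRow>() exposes it
        // as IEnumerable<DataRow>, which lets the compiler infer TSource.
        Parallel.ForEach(table.Rows.Cast<DataRow>(), row =>
        {
            Interlocked.Increment(ref count);   // thread-safe increment
        });

        return count;
    }

    static void Main()
    {
        // Hypothetical stand-in for the typed Factset table.
        var table = new DataTable("Factset");
        table.Columns.Add("Id", typeof(int));
        for (int i = 0; i < 5; i++)
        {
            table.Rows.Add(i);
        }

        Console.WriteLine(CountRowsInParallel(table));   // prints 5
    }
}
```

With a typed DataSet, the same call would use Cast<dsFactset.FactsetRow>() instead of Cast<DataRow>().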

    Tuesday, February 23, 2010 12:04 AM
    Moderator
  • Much closer but still not working.

    I am now getting:  Error 29 'System.Data.DataRowCollection' does not contain a definition for 'Cast' and no extension method 'Cast' accepting a first argument of type 'System.Data.DataRowCollection' could be found (are you missing a using directive or an assembly reference?)

    Could this be because the dataset was originally built using the 2.0 Framework and Cast needs the dataset to be built with a newer framework/wizard?
    Tuesday, February 23, 2010 12:09 AM
  • Are you currently using at least .NET 3.5? 

    Have you imported the System.Linq namespace? (i.e. "using System.Linq;")

    Have you added an assembly reference to System.Core.dll?

    Tuesday, February 23, 2010 12:26 AM
    Moderator
  • I just added the using for Linq and the error went away.
    I guess I don't understand the relationship of the Linq and the Parallel.Foreach in this case.

    Been watching your videos on MSDN and found them very informative.  I would love to see a session on Parallel and DataSets.  In the meantime, any other references that might provide some more insight and/or samples?

    Thanks.
    Tuesday, February 23, 2010 12:32 AM
  • "I just added the using for Linq and the error went away.
    I guess I don't understand the relationship of the Linq and the Parallel.Foreach in this case."


    Even though your DataTable is strongly typed, its Rows collection is a non-generic DataRowCollection, so you need to cast the rows to the appropriate type.  System.Linq makes this easier.  There isn't a direct relationship between Linq and Parallel.ForEach here; however, System.Linq adds the .Cast<T>() extension method for IEnumerable.  Basically, what you're doing now is equivalent to the following:

     
    // This requires System.Linq, since it's using an extension method from Linq to Objects
    IEnumerable<dsFactset.FactsetRow> temporaryEnumerable = this.DsFactset.Factset.Rows.Cast<dsFactset.FactsetRow>();
    
    // You can now use Parallel.ForEach directly on the new IEnumerable<T>
    Parallel.ForEach(temporaryEnumerable, dr => { ... });
    


    Reed Copsey, Jr. - http://reedcopsey.com
    Tuesday, February 23, 2010 12:42 AM
    Moderator
  • Making progress.  The Parallel.ForEach is now running.
    In the { ... } I have code which does:
              dr.BeginEdit();
              dr[field] = newValue;
              dr.EndEdit();

    I am now encountering exceptions indicating "DataTable internal index is corrupted: 'n'".

    I was expecting that this should work since each datarow is being processed "independently" of all the others.

    The ultimate goal is to be able to take advantage of multi-core parallelism with the dataset logic that is already in place.

    Is this possible? 

    Tuesday, February 23, 2010 2:14 PM
  • Hi JMWilton-

    Unfortunately, DataTable is not safe to be used in this manner (though I understand and initially had the same expectations as you).  See:
    http://social.msdn.microsoft.com/Forums/en-US/parallelextensions/thread/0790b69b-acbd-4b4c-bbd9-7d1c4da1a4ef/
    and
    http://social.msdn.microsoft.com/forums/en-US/adodotnetdataproviders/thread/18544cd3-1083-45fe-b9e7-bb34482b68dd/
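
One possible workaround, sketched here under assumptions rather than taken from the linked threads, is to split the work into two phases: compute the new values in parallel while only reading the rows, then apply all the edits on a single thread. This relies on the documented guarantee that DataTable is safe for multithreaded reads as long as writes are synchronized. The table, the "Value" column, and ComputeNewValue are hypothetical stand-ins for the real calculation.

```csharp
using System;
using System.Data;
using System.Linq;
using System.Threading.Tasks;

static class ReadThenWriteDemo
{
    // Hypothetical stand-in for the expensive per-row calculation.
    static int ComputeNewValue(DataRow row)
    {
        return (int)row["Value"] * 2;
    }

    public static void UpdateAll(DataTable table)
    {
        // Snapshot the rows so each one can be addressed by index.
        DataRow[] rows = table.Rows.Cast<DataRow>().ToArray();
        var newValues = new int[rows.Length];

        // Parallel phase: rows are only read, and each iteration
        // writes to its own slot of the result array.
        Parallel.For(0, rows.Length, i =>
        {
            newValues[i] = ComputeNewValue(rows[i]);
        });

        // Sequential phase: every DataTable write happens on one thread.
        for (int i = 0; i < rows.Length; i++)
        {
            rows[i].BeginEdit();
            rows[i]["Value"] = newValues[i];
            rows[i].EndEdit();
        }
    }

    static void Main()
    {
        var table = new DataTable("Factset");
        table.Columns.Add("Value", typeof(int));
        for (int i = 1; i <= 4; i++)
        {
            table.Rows.Add(i);
        }

        UpdateAll(table);

        foreach (DataRow row in table.Rows)
        {
            Console.WriteLine(row["Value"]);   // prints 2, 4, 6, 8 on separate lines
        }
    }
}
```

This only pays off when the per-row calculation is expensive compared with the write itself, since the writes still run one at a time.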

    Tuesday, February 23, 2010 3:42 PM
    Moderator
  • Thanks for your response.

    I can appreciate the problem...unfortunately not being able to get the benefits of parallel with datatables seems like a big hole. 

    Maybe a future release.

    Tuesday, February 23, 2010 5:05 PM
  • I know I am quite late with an answer, but I wanted to provide you one as I came across this issue myself and found a solution. 

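    // Needs: using System.Collections.Concurrent; using System.Collections.Generic;
    //        using System.Data; using System.Linq; using System.Threading.Tasks;
    // AsEnumerable() is an extension method from System.Data.DataSetExtensions.
    MySqlConnection con = new MySqlConnection("connectionstring");
    MySqlCommand com = new MySqlCommand();
    MySqlDataAdapter da = new MySqlDataAdapter();
    DataSet ds = new DataSet();
    
    com.Connection = con;
    com.CommandType = CommandType.Text;
    com.CommandText = queryToExecute;
    
    da.SelectCommand = com;
    da.Fill(ds);
    
    // Thread-safe collection for results produced on multiple threads
    var queue = new ConcurrentQueue<WellSample>();
    
    // The rows are only read here, never modified
    Parallel.ForEach(ds.Tables[0].AsEnumerable(), dr =>
      {
          queue.Enqueue(ConvertDataRowIntoWellSample(dr));
      });
    
    // Items come out in completion order, not necessarily row order
    List<WellSample> list = queue.ToList();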

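    The main problem is that the Rows collection is not IEnumerable<T>; it only implements the non-generic IEnumerable. AsEnumerable() returns the rows as an IEnumerable<DataRow>. I hope this helps someone at least.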
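
The ConcurrentQueue in the post above collects items in whatever order the parallel iterations finish. If the result list needs to follow the row order, one alternative is PLINQ with AsOrdered(), sketched below. This swaps Parallel.ForEach for a different technique, and the WellSample class, the "Id" column, and ToWellSample are hypothetical stand-ins for the poster's own types. Cast<DataRow>() is used here instead of AsEnumerable() only to keep the sketch free of extra assembly references.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical stand-in for the WellSample type in the post above.
sealed class WellSample
{
    public int Id { get; set; }
}

static class OrderedConversionDemo
{
    // Hypothetical stand-in for ConvertDataRowIntoWellSample; it only reads the row.
    static WellSample ToWellSample(DataRow row)
    {
        return new WellSample { Id = (int)row["Id"] };
    }

    public static List<WellSample> ConvertAll(DataTable table)
    {
        return table.Rows.Cast<DataRow>()
            .AsParallel()                       // convert rows on multiple threads
            .AsOrdered()                        // keep results in source (row) order
            .Select(row => ToWellSample(row))
            .ToList();
    }

    static void Main()
    {
        var table = new DataTable("Samples");
        table.Columns.Add("Id", typeof(int));
        for (int i = 0; i < 10; i++)
        {
            table.Rows.Add(i);
        }

        List<WellSample> list = ConvertAll(table);
        Console.WriteLine(string.Join(",", list.Select(s => s.Id)));
        // prints 0,1,2,3,4,5,6,7,8,9
    }
}
```

AsOrdered() adds some buffering overhead, so it is worth using only when the order of the results matters.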
    Wednesday, November 21, 2012 5:10 PM
  • That is BRILLIANT!  Thanks!  It solved the issue I was struggling with.
    Friday, November 13, 2015 5:57 AM