multi-threading or parallel processing for each process within recursive function

  • Question

  • I have code that collects a list of directories and saves it to a CSV file (DirectoryOutputfile.csv in my example). Using the directory list in that CSV file, I use a recursive function to perform a task for each directory.

    The task takes each directory as input, starts a process for it, and saves the output to another CSV file. This is repeated for every directory.

    I also save the completed directories to another CSV file, so that I can see live how many have been completed.

    My ask: I am not using any threading at present, so I would like to run the process in parallel to make it faster, with the output still saved as described above. I am new to threading; kindly suggest how I can use parallel processing or multi-threading in my code to speed it up, as there are around 750K directories in the list.

    I am not sure where I would need to implement this in my code, as I have read and write operations using StreamWriter and StreamReader in multiple places.

    Code below:

    public static void OutputSigned(string dir, string outputdir, string checkdir)
    {
             string filenameOutput = "\\checkoutput_" + DateTime.Now.ToString("yyyy_MM_dd_HHmmss") + ".csv";

             if (!File.Exists(outputdir + "\\DirectoryOutputfile.csv"))
             {
                 var directories = DirectorySearch.GetDirectories(dir); //Gets the list of directories
                 directories.Sort();
                 DirectoriesToCSV(directories, outputdir); //Method to output the directory LIST to CSV file
             }

             using (StreamWriter writer = new StreamWriter(outputdir + filenameOutput)) //Creates output CSV file and writes the scanned files update
             {
                 writer.AutoFlush = true;

                 using (FileStream fileStreamDirectory = File.Open(outputdir + "\\DirectoryOutputfile.csv", FileMode.Open, FileAccess.Read, FileShare.ReadWrite)) //Read directories from the DirectoryOutputfile CSV file
                 using (BufferedStream bufferStreamDirectory = new BufferedStream(fileStreamDirectory))
                 using (StreamReader streamReaderDirectory = new StreamReader(bufferStreamDirectory))
                 {
                     var directoryLineCountinCSV = File.ReadLines(outputdir + "\\DirectoryOutputfile.csv").Count(); //Counts the number of directory lines in DirectoryOutputfile.csv output file

                     using (StreamWriter DirectoryLineUpdatedSofarInCSV = new StreamWriter(outputdir + "\\DirectoryLineUpdatedSofar" + filenameOutput.Replace("\\checkoutput", ""))) //Creates CSV file with the list of already scanned directories
                     {
                         DirectoryLineUpdatedSofarInCSV.AutoFlush = true;
                         string Directoryline;

                         while ((Directoryline = streamReaderDirectory.ReadLine()) != null)
                         {
                             if (Directory.Exists(Directoryline))
                             {
                                 ProcessStartInfo ProcessStartInfo = new ProcessStartInfo
                                 {
                                     FileName = checkdir + "\\testing.exe",
                                     Arguments = "-a -h -i -l \"" + Directoryline + "\"", //Quotes the path so directories containing spaces are passed as one argument
                                     RedirectStandardOutput = true,
                                     UseShellExecute = false,
                                     CreateNoWindow = true
                                 };

                                 using (Process checkProcess = Process.Start(ProcessStartInfo))
                                 {
                                     var line = string.Empty;

                                     using (StreamReader streamReader = checkProcess.StandardOutput)
                                     {
                                         while (!streamReader.EndOfStream)
                                         {
                                             var content = streamReader.ReadLine();
                                             // Doing some task here
                                             writer.WriteLine("Updating the data in CSV file");
                                         }
                                     }
                                     checkProcess.WaitForExit(10000); //Waits up to 10 seconds for the process to exit
                                 }
                             }
                             DirectoryLineUpdatedSofarInCSV.WriteLine("{0},Completed", Directoryline);
                         }
                     }
                 }
             }
     }

     public static void DirectoriesToCSV(List<string> DirectoriesList, string outputdir)
     {
         using (StreamWriter directoryWriter = new StreamWriter(outputdir + "\\DirectoryOutputfile.csv"))
         {
             foreach (string directoryline in DirectoriesList)
             {
                 directoryWriter.WriteLine(directoryline);
             }
         }
     }


    Thanks, Dhilip

    Friday, January 4, 2019 11:09 PM

All replies

  • So the loop that starts while((Directoryline = streamReaderDirectory.ReadLine()) != null) is where you read each directory name from the file.  Right?  It would be easy enough to move the contents of that entire loop into a separate function, then fire off a thread to do the processing, something like

        Thread thread = new Thread( Class.HandleOneDirectory );
        thread.Start( Directoryline );

    It might be better to use a thread pool.  If you try to start 750,000 threads, you'll cause a riot.  Also, remember that you have to be careful when writing to a single file from multiple threads.  You might want to synchronize access to DirectoryLineUpdatedSofarInCSV.
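
    As a rough illustration of the thread pool and synchronization points, here is a minimal sketch, not a drop-in replacement. The names ThreadPoolSketch, ProcessAll, ProgressLock, and maxConcurrent are made up for this example, and HandleOneDirectory is only a placeholder for the body of the original while loop. A SemaphoreSlim caps how many work items are in flight, and a lock serializes writes to the shared progress writer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

public static class ThreadPoolSketch
{
    // One lock object guards the shared progress writer.
    private static readonly object ProgressLock = new object();

    // Hypothetical stand-in for the body of the original while loop.
    private static void HandleOneDirectory(string directoryLine, TextWriter progress)
    {
        // ... start testing.exe for directoryLine and process its output here ...

        lock (ProgressLock) // only one thread writes at a time
        {
            progress.WriteLine("{0},Completed", directoryLine);
        }
    }

    public static void ProcessAll(IEnumerable<string> directoryLines, TextWriter progress, int maxConcurrent)
    {
        using (var slots = new SemaphoreSlim(maxConcurrent))
        using (var allDone = new CountdownEvent(1)) // initial count held by this method
        {
            foreach (string directoryLine in directoryLines)
            {
                slots.Wait();        // blocks while maxConcurrent items are already in flight
                allDone.AddCount();  // one more work item to wait for
                string current = directoryLine;

                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try { HandleOneDirectory(current, progress); }
                    finally { slots.Release(); allDone.Signal(); }
                });
            }

            allDone.Signal(); // give up the initial count
            allDone.Wait();   // returns once every queued item has finished
        }
    }

    public static void Main()
    {
        var progress = new StringWriter();
        ProcessAll(new[] { @"C:\a", @"C:\b", @"C:\c" }, progress, 2);
        Console.Write(progress.ToString()); // line order can differ from run to run
    }
}
```

    In the original code the same kind of lock would also be needed around writer.WriteLine, since that StreamWriter is shared too. Because each work item mostly waits on an external process, a sensible value for maxConcurrent is something to measure rather than assume.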


    Tim Roberts | Driver MVP Emeritus | Providenza & Boekelheide, Inc.

    Saturday, January 5, 2019 12:40 AM
  • Hi dhilip Gopalan,

    Thank you for posting here.

    For your question, please try the Parallel class (System.Threading.Tasks.Parallel), for example Parallel.ForEach.

    Here are some examples in the links for your reference.

    https://docs.microsoft.com/en-us/dotnet/api/system.threading.tasks.parallel?view=netframework-4.7.2

    https://docs.microsoft.com/en-us/dotnet/standard/parallel-programming/how-to-write-a-simple-parallel-foreach-loop
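
    To make this concrete for the posted scenario, here is a minimal sketch of Parallel.ForEach with a capped degree of parallelism. It is an illustration under assumptions, not the poster's actual code: ParallelForEachSketch, ProcessAll, CheckDirectory, and maxParallel are hypothetical names, and CheckDirectory only stands in for launching testing.exe and reading its output. The two writers are shared, so writes to them go through a single lock.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

public static class ParallelForEachSketch
{
    // Hypothetical placeholder for launching testing.exe and parsing its output.
    private static string CheckDirectory(string directoryLine)
    {
        return directoryLine + ",checked";
    }

    public static void ProcessAll(IEnumerable<string> directoryLines, TextWriter results,
                                  TextWriter progress, int maxParallel)
    {
        var writeLock = new object();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(directoryLines, options, directoryLine =>
        {
            // This part runs concurrently for different directories.
            string result = CheckDirectory(directoryLine);

            // Both writers are shared, so only one thread may write at a time.
            lock (writeLock)
            {
                results.WriteLine(result);
                progress.WriteLine("{0},Completed", directoryLine);
            }
        });
    }

    public static void Main()
    {
        var results = new StringWriter();
        var progress = new StringWriter();
        ProcessAll(new[] { @"C:\a", @"C:\b", @"C:\c" }, results, progress, 4);
        Console.Write(progress.ToString()); // line order can differ from run to run
    }
}
```

    In the real program, directoryLines could be File.ReadLines(outputdir + "\\DirectoryOutputfile.csv"), which reads the file lazily, and the two TextWriter arguments could be the existing StreamWriter instances. Parallel.ForEach returns only after all items have been processed, so the surrounding using blocks can stay as they are.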

    Best Regards,

    Wendy


    MSDN Community Support

    • Proposed as answer by Stanly Fan Friday, February 1, 2019 7:40 AM
    Monday, January 7, 2019 5:59 AM
    Moderator