Searching string on physical file path

  • Question

  • Hi,

    I want to search for a specific list of strings in the files under a physical file path. There are around 50,000 files. Can anyone please suggest the best approach that does not hurt performance much?

    Kindly waiting for your response.

    Thanks,

    Santosh

    Monday, July 6, 2020 10:02 AM

Answers

  • Hi Santosh Umarani,
    As Tim Roberts said, you can try Parallel.ForEach, which works like a Parallel.For loop. The loop partitions the source collection and schedules the work on multiple threads based on the system environment.
    Here is a code example you can refer to.

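    // Namespaces required at the top of the file:
    using System;
    using System.IO;
    using System.Threading.Tasks;

    private static void GetMessages()
    {
        // Folder to search; replace with your own path.
        DirectoryInfo d = new DirectoryInfo(@"C:\Users\Desktop");
        // Get the text files in that folder (top level only).
        FileInfo[] files = d.GetFiles("*.txt");
        // Process the files on multiple threads.
        Parallel.ForEach(files, (file) =>
        {
            using (StreamReader reader = new StreamReader(file.FullName))
            {
                string line;
                // Read line by line so a large file is never loaded in full.
                while ((line = reader.ReadLine()) != null)
                {
                    if (line.Contains("search string"))
                    {
                        // Console.WriteLine is thread-safe, but the output order across files is not deterministic.
                        Console.WriteLine(line);
                    }
                }
            }
        });
    }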

    Best Regards,
    Daniel Zhang


    Tuesday, July 7, 2020 2:56 AM

All replies

  • For example, create a DirectoryInfo object, then call EnumerateFiles to make a loop. For each FileInfo, get the path using FullName, then call File.ReadLines to make the inner loop. Use string.Contains or string.Equals with your list of strings; specify the parameters to make a case-insensitive comparison.
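
    The steps above could be sketched roughly as follows. This is only a minimal illustration, not a tuned implementation: the class name SequentialSearch, the method Find and its parameters are hypothetical, and it assumes C# 7.0 or later for the tuple syntax. It uses IndexOf with StringComparison.OrdinalIgnoreCase for the case-insensitive "contains" test.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SequentialSearch
{
    // Yields (path, line number, line) for every line that contains
    // at least one of the search terms, ignoring case.
    // All names here are illustrative and not part of any library.
    public static IEnumerable<(string Path, int LineNumber, string Line)> Find(
        string root, string pattern, IReadOnlyList<string> terms)
    {
        var directory = new DirectoryInfo(root);

        // EnumerateFiles yields entries lazily instead of building one large array.
        foreach (FileInfo file in directory.EnumerateFiles(pattern))
        {
            int lineNumber = 0;

            // File.ReadLines streams the file one line at a time.
            foreach (string line in File.ReadLines(file.FullName))
            {
                lineNumber++;
                if (terms.Any(t => line.IndexOf(t, StringComparison.OrdinalIgnoreCase) >= 0))
                {
                    yield return (file.FullName, lineNumber, line);
                }
            }
        }
    }
}
```

    A call such as SequentialSearch.Find(@"C:\Users\Desktop", "*.txt", new[] { "first", "second" }) would then be consumed with an ordinary foreach. Because the method is an iterator, no file is opened until the results are enumerated.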

    Parallelising disk operations does not always help but consider this approach too: create a thread that reads files or portions into memory, and other threads that analyse and discard the text.
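
    One possible shape for that reader/analyser split is sketched below, using a bounded BlockingCollection as the hand-off queue. Treat it as an assumption-laden sketch: the class PipelineSearch, the method FindFiles, the queue capacity of 64 and the default of 4 workers are all made up for illustration, and it reads whole files rather than portions, so it only suits files that fit comfortably in memory.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

public static class PipelineSearch
{
    // Returns the paths of files that contain at least one term (case-insensitive).
    // All names and the numeric defaults are illustrative.
    public static List<string> FindFiles(
        string root, string pattern, IReadOnlyList<string> terms, int workers = 4)
    {
        var matches = new ConcurrentBag<string>();

        // Bounded queue: the reader blocks once 64 files are waiting, which caps memory use.
        using (var queue = new BlockingCollection<(string Path, string Text)>(boundedCapacity: 64))
        {
            // Single producer: only this task touches the disk.
            Task reader = Task.Run(() =>
            {
                try
                {
                    foreach (string path in Directory.EnumerateFiles(root, pattern))
                    {
                        queue.Add((path, File.ReadAllText(path)));
                    }
                }
                finally
                {
                    // Always signal completion so the consumers can exit.
                    queue.CompleteAdding();
                }
            });

            // Several consumers: CPU-bound matching on text that is already in memory.
            Task[] analysers = Enumerable.Range(0, workers).Select(_ => Task.Run(() =>
            {
                foreach (var (path, text) in queue.GetConsumingEnumerable())
                {
                    if (terms.Any(t => text.IndexOf(t, StringComparison.OrdinalIgnoreCase) >= 0))
                    {
                        matches.Add(path);
                    }
                    // The text is no longer referenced after this point and can be collected.
                }
            })).ToArray();

            Task.WaitAll(analysers);
            reader.Wait(); // Surfaces any I/O exception thrown by the reader.
        }

        return matches.ToList();
    }
}
```

    The order of the returned paths is not deterministic, since the analysers finish in whatever order the scheduler allows.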


    • Edited by Viorel_MVP Monday, July 6, 2020 10:55 AM
    Monday, July 6, 2020 10:45 AM
  • Thank you so much for the response.

    Can you please give me more details on parallelising disk operations using threads?

    Monday, July 6, 2020 11:46 AM
  • For this task, it would be very easy to replace the "for each" loop with a Parallel.ForEach.

    However, Viorel gave you very good advice. If the files are small, then there is no point in parallelizing this. You will always be limited by the speed of your disk. It doesn't matter how many threads you run; the disk can only feed data at its own speed.

    Now, if the files are large, then the parallelism might help because you can keep several CPUs busy doing the searching.
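
    Whether the disk or the CPU is the limit on a given machine is easiest to settle by measuring. The sketch below is one hedged way to do that: it runs the same search with one thread and with one thread per core and returns both timings. The class ParallelSearchTimer and its methods are hypothetical names, and it assumes C# 7.0 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelSearchTimer
{
    // Counts lines containing any term (case-insensitive) using at most "degree" threads.
    public static long CountMatches(
        string root, string pattern, IReadOnlyList<string> terms, int degree)
    {
        long total = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = degree };

        Parallel.ForEach(Directory.EnumerateFiles(root, pattern), options, path =>
        {
            long local = 0;
            foreach (string line in File.ReadLines(path))
            {
                if (terms.Any(t => line.IndexOf(t, StringComparison.OrdinalIgnoreCase) >= 0))
                {
                    local++;
                }
            }
            // One atomic add per file keeps contention between threads low.
            Interlocked.Add(ref total, local);
        });

        return total;
    }

    // Times the search with a single thread and with one thread per logical core.
    public static (TimeSpan OneThread, TimeSpan AllCores) Compare(
        string root, string pattern, IReadOnlyList<string> terms)
    {
        var stopwatch = Stopwatch.StartNew();
        CountMatches(root, pattern, terms, 1);
        TimeSpan oneThread = stopwatch.Elapsed;

        stopwatch.Restart();
        CountMatches(root, pattern, terms, Environment.ProcessorCount);
        return (oneThread, stopwatch.Elapsed);
    }
}
```

    Note that the first pass may warm the operating system's file cache, which can flatter the second pass. Repeating the comparison a few times, or swapping the order, gives a fairer picture.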


    Tim Roberts | Driver MVP Emeritus | Providenza & Boekelheide, Inc.

    Monday, July 6, 2020 9:29 PM
  • Thanks again for the response.

    Will searching on multiple threads help here? If so, can you please let me know how I can achieve that?

    Tuesday, July 7, 2020 2:25 AM