When searching for text in files, how can I repeat the search without retrieving the files all over again?

  • Question

  • My search runs in two steps: first it retrieves the files, then it searches inside them.

    In some cases I change only the text I'm searching for, and I want to repeat the search in the same directories and files. How can I make it remember the last retrieved files?

    This is the button click event that starts the BackgroundWorker and the search methods:

    private void startButton_Click(object sender, EventArgs e)
            {
                ListViewCostumControl.lvnf.Items.Clear();
                numberoffiles = 0;
                numberofrestrictedFiles = 0;
                numberofdirs = 0;
                label24.Text = "0";
                label1.Text = "0";
                label15.Text = "0";
                Logger.Write("Operation started");
                label21.Text = "Phase 1: Retrieving files";
                label21.Visible = true;
                startButton.Enabled = false;
                stopButton.Enabled = true;
                pauseresumeButton.Enabled = true;
                label1.Select();
                timer1.Start();
                if (!backgroundWorker1.IsBusy)
                {
                    SetWorkerMode(true);
                    backgroundWorker1.RunWorkerAsync();
                }
            }


    These are the BackgroundWorker events. In the completed event I write some information to a text file, but not the full list of retrieved files.

    private void backgroundWorker1_DoWork(object sender, DoWorkEventArgs e)
            {
                BackgroundWorker worker = sender as BackgroundWorker;
                _stopwatch.Restart();
                string[] values = textBox1.Text.Split(new string[] { ",," }, StringSplitOptions.None);
                DirSearch(textBox3.Text, textBox2.Text, values, worker, e);
                _stopwatch.Stop();
            }
    
            private void backgroundWorker1_ProgressChanged(object sender, ProgressChangedEventArgs e)
            {
                MyProgress mypro = (MyProgress)e.UserState;
                ListViewCostumControl.lvnf.Items.Add(mypro.Report1);
                label15.Text = mypro.Report2;
                label15.Visible = true;
                if (ListViewCostumControl.lvnf.Items.Count > 9)
                    textBox4.Enabled = true;
            }
    
            private void backgroundWorker1_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e)
            {
                if (e.Cancelled == true)
                {
                    label1.Select();
                    _stopwatch.Stop();
                    label24.Text = "0";
                    label5.Text = "00:00:00";
                    label1.Text = "0";
                    label11.Text = "0";
                    label15.Text = "0";
                    label24.Text = "0";
                    pauseresumeButton.Enabled = false;
                    stopButton.Enabled = false;
                    startButton.Enabled = true;
                    timer1.Stop();
                    ListViewCostumControl.lvnf.Items.Clear();
                    Logger.Write("Operation cancelled");
                    button4.Enabled = true;
                }
                if (e.Error != null)
                {
                    
                }
                else
                {
                    label1.Select();
                    _stopwatch.Stop();
                    timer1.Stop();
                    stopButton.Enabled = false;
                    pauseresumeButton.Enabled = false;
                    startButton.Enabled = true;
                    label21.Text = "Last operation ended at: " + DateTime.Now;
                    Logger.Write("Number of retrieved files: " + numberoffiles);
                    Logger.Write("Number of restricted files: " + numberofrestrictedFiles);
                    Logger.Write("Number of searched files: " + numberofdirs);
                    Logger.Write("Number of results: " + label15.Text);
                    Logger.Write("Searched root directory: " + textBox3.Text);
                    Logger.Write("Operation time: " + _stopwatch.Elapsed);
                    Logger.Write("Operation ended");
                    Logger.Write(" ");
                    button4.Enabled = true;
                    mCompleted = true;
                    if (mClosePending) this.Close();
                }
            }


    This is the search method:

    int numberofdirs = 0;
            void DirSearch(string rootDirectory, string filesExtension, string[] textToSearch, BackgroundWorker worker, DoWorkEventArgs e)
            {
                List<string> resultsoftextfound = new List<string>();
                List<string> resultsoftextfound1 = new List<string>();
                List<string> filePathList = new List<string>();
                int numberoffiles = 0;
                try
                {
                    filePathList = SearchAccessibleFilesNoDistinct(rootDirectory, null,worker,e).ToList();
                }
                catch (Exception err)
                {
                    string ad = err.ToString();
                }
                label21.Invoke((MethodInvoker)delegate
                        {
                            label21.Text = "Phase 2: Searching in files";
                        });
                MyProgress myp = new MyProgress();
                myp.Report4 = filePathList.Count.ToString();
                foreach (string file in filePathList)
                {
                    try
                    {
                        _busy.WaitOne();
                        if (worker.CancellationPending == true)
                        {
                            e.Cancel = true;
                            return;
                        }
    
                        bool reportedFile = false;
    
                        for (int i = 0; i < textToSearch.Length; i++)
                        {
                            if (File.ReadAllText(file).IndexOf(textToSearch[i], StringComparison.InvariantCultureIgnoreCase) >= 0)
                            {
                                resultsoftextfound.Add(file + "  " + textToSearch[i]);
                                if (!reportedFile)
                                {
                                    numberoffiles++;
                                    
                                    myp.Report1 = file;
                                    myp.Report2 = numberoffiles.ToString();
                                    myp.Report3 = textToSearch[i];
                                    backgroundWorker1.ReportProgress(0, myp);
                                    reportedFile = true;
                                }
                            }
                        }
                        numberofdirs++;
                        label1.Invoke((MethodInvoker)delegate
                        {
                            label1.Text = string.Format("{0}/{1}", numberofdirs, myp.Report4);
                            label1.Visible = true;
                        });
                    }
                    catch (Exception)
                    {
                      
                    }
                }
            }


    And SearchAccessibleFilesNoDistinct method:

    string restrictedFile = "";
            List<string> restrictedFiles = new List<string>();
            int numberofrestrictedFiles = 0;
            int numberoffiles = 0;
            IEnumerable<string> SearchAccessibleFilesNoDistinct(string root, List<string> files,BackgroundWorker worker, DoWorkEventArgs e)
            {
                _busy.WaitOne();
                if (files == null)
                    files = new List<string>();
                if (Directory.Exists(root))
                {
                    foreach (var file in Directory.EnumerateFiles(root))
                    {
                        if (worker.CancellationPending == true)
                        {
                            e.Cancel = true;
                            return files;
                        }
                        restrictedFile = file;
                        string ext = Path.GetExtension(file);
                        if (!files.Contains(file) && ext == textBox2.Text)
                        {
                            files.Add(file);
                        }
                        numberoffiles++;
                        label24.Invoke((MethodInvoker)delegate
                        {
                            label24.Text = numberoffiles.ToString();
                            label24.Visible = true;
                        });
                    }
                    foreach (var subDir in Directory.EnumerateDirectories(root))
                    {
                        if (worker.CancellationPending == true)
                        {
                            e.Cancel = true;
                            return files;
                        }
                        try
                        {
                            SearchAccessibleFilesNoDistinct(subDir, files,worker, e);
                        }
                        catch (UnauthorizedAccessException)
                        {
                            restrictedFiles.Add(restrictedFile);
                            numberofrestrictedFiles++;
                            label11.Invoke((MethodInvoker)delegate
                            {
                                label11.Text = numberofrestrictedFiles.ToString();
                                label11.Visible = true;
                            });
                            continue;
                        }
                    }
                }
                return files;
            }


    The idea is to give the user an option to save the last retrieved files, so that when he repeats the search it does not retrieve all the files again and only searches in them. In other words, when I repeat the last search, only the second method should run.

    I call it Phase 1 and Phase 2.

    One problem might be that if I save the last retrieved files to a text file or something like that, it could become a large file on the hard disk?

    Another problem: when I type the text to search for in textBox1, two words with a space are not the same as without a space. For example, Form1 is not the same as Form 1. How can I make it search for both Form1 and Form 1?

    private void textBox1_TextChanged(object sender, EventArgs e)
            {
                if (textBox1.Text != "" && textBox3.Text != "" && Directory.Exists(textBox3.Text))
                {
                    startButton.Enabled = true;
                    Properties.Settings.Default["Setting2"] = textBox1.Text;
                    Properties.Settings.Default.Save();
                }
                else
                {
                    startButton.Enabled = false;
                }
            }

    I made it so that if the user types ,, in textBox1, the text after it is treated as another search text.

    void lvnf_SelectedIndexChanged(object sender, EventArgs e)
            {
                if (ListViewCostumControl.lvnf.SelectedItems.Count > 0)
                {
                    results = new List<int>();
                    richTextBox1.Text = File.ReadAllText(ListViewCostumControl.lvnf.Items[ListViewCostumControl.lvnf.SelectedIndices[0]].Text);
                    FileInfo fi = new FileInfo(ListViewCostumControl.lvnf.Items[ListViewCostumControl.lvnf.SelectedIndices[0]].Text);
                    label17.Text = ExtensionMethods.ToFileSize(fi.Length);
                    label17.Visible = true;
                    filePath = Path.GetDirectoryName(fi.FullName);
                    string word = textBox1.Text;
                    string[] test = word.Split(new string[] { ",," }, StringSplitOptions.None);
                    foreach (string myword in test)
                    {
                        HighlightPhrase(richTextBox1, myword, Color.Yellow);
                        label16.Text = results.Count.ToString();
                        label16.Visible = true;
                        if (results.Count > 0)
                        {
                            numericUpDown1.Maximum = results.Count;
                            numericUpDown1.Enabled = true;
                            richTextBox1.SelectionStart = results[(int)numericUpDown1.Value - 1];
                            richTextBox1.ScrollToCaret();
                        }
                    }
                }
            }

    For example, if I search for Form1,,Form 1

    it will find all the results for Form1 and for Form 1.

    Or if I type Form1,,Help

    the results will be for both Form1 and Help.

    My question, which may be more about logic than code: when I type only Form1, should I also treat it as Form 1, or should I leave it as it is now and require Form1,,Form 1?

    Earlier I searched for SwitchCameras and it found results, but before that I searched for Switch Cameras (without ,, between the words) and it didn't find any results.

    So I think that, logically, Switch Cameras should also find SwitchCameras.

    So how can I change the search rule so that if there is no ,, but there is a space between the words, it also searches for them as one word?

    Switch Cameras would then also find SwitchCameras, while Switch,,Cameras might find more results.



    Wednesday, August 23, 2017 6:24 PM

Answers

  • 1) You're doing it right. If you don't want to retrieve the files again, you have to cache or save the list somewhere locally.

    2) To make "Form 1" match both "Form1" and "Form 1", the most efficient way would be to implement a customized search function yourself. When the matcher encounters a space in the search string and the next character in the target text is not a space, it skips the space and continues matching with the next character of the search string. This approach also lets you declare a list of characters to skip, as you wish.

    • Marked as answer by Chocolade1972 Sunday, September 17, 2017 2:05 PM
    Thursday, August 24, 2017 1:46 AM
    Answerer
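
The customized search function described in point 2 could look roughly like the sketch below. It is only an illustration: the class name `LooseSearch` and method `IndexOfIgnoringSpaces` are hypothetical, the comparison is a simple per-character invariant upper-casing (not identical to the `InvariantCultureIgnoreCase` comparison used in the question), and only whitespace in the search string is treated as optional.

```csharp
using System;

// Hypothetical helper, not part of the original code.
static class LooseSearch
{
    // Returns the index in text where pattern first matches, or -1.
    // Whitespace in the pattern is optional, so "Form 1" matches
    // both "Form 1" and "Form1". Matching ignores case.
    public static int IndexOfIgnoringSpaces(string text, string pattern)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (string.IsNullOrEmpty(pattern)) return -1;

        for (int start = 0; start < text.Length; start++)
        {
            if (MatchesAt(text, start, pattern)) return start;
        }
        return -1;
    }

    private static bool MatchesAt(string text, int start, string pattern)
    {
        int t = start;            // current position in the target text
        bool consumedAny = false; // avoids "matching" an all-space pattern against nothing

        foreach (char p in pattern)
        {
            if (t < text.Length &&
                char.ToUpperInvariant(text[t]) == char.ToUpperInvariant(p))
            {
                t++;              // characters agree: advance in the target
                consumedAny = true;
            }
            else if (char.IsWhiteSpace(p))
            {
                continue;         // space in the pattern with no space in the target: skip it
            }
            else
            {
                return false;     // a real mismatch
            }
        }
        return consumedAny;
    }
}
```

In the question's `DirSearch`, a call such as `LooseSearch.IndexOfIgnoringSpaces(fileText, textToSearch[i]) >= 0` could stand in for the `IndexOf` test. The scan is a straightforward nested loop, so it is slower than the built-in `IndexOf` on large files. The rule is also one-directional: "Form 1" finds "Form1", but "Form1" does not find "Form 1".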

All replies

  • Hello Chocolade1972,

    >>The problem might be that if i save the last retrieved files to a text files or something like that it might be a large file on the hard disk ?

    I don't think that is more efficient than repeating the last search. If the file becomes very large, it can even reduce your search efficiency. What you should do is record some useful information for each search. For each text the user enters, you could save the input and the related results, such as file paths, in a database or a file. The next time the same input is entered, the search should be more efficient.


    >>if i type two words with a space it's not the same without a space for example: Form1 is not the same as Form 1 How can i make that it will search for both results Form1 and Form 1 ?

    It sounds like a fuzzy query. I think you could consider installing a database and building a fuzzy query statement to search. This way is easy, and you do not need to implement the matching logic yourself.

    Best regards,
    feih_7


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Thursday, August 24, 2017 6:19 AM
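
For the original question of skipping Phase 1 when only the search text changes, one option is to keep the retrieved list in memory rather than on disk. The sketch below is a minimal illustration under that assumption. `FileListCache`, `GetOrRetrieve`, and `Invalidate` are hypothetical names, and the cache holds only the most recent root/extension pair.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: remembers the Phase 1 result for the last
// root directory / extension pair so Phase 2 can be repeated alone.
sealed class FileListCache
{
    private string _key;
    private List<string> _files;

    // Windows paths are case-insensitive, so the key is normalized.
    private static string MakeKey(string root, string extension)
    {
        return (root + "|" + extension).ToUpperInvariant();
    }

    // Returns the cached list when root and extension are unchanged;
    // otherwise runs the supplied retrieval delegate and stores its result.
    public List<string> GetOrRetrieve(string root, string extension,
                                      Func<List<string>> retrieve)
    {
        string key = MakeKey(root, extension);
        if (_files == null || _key != key)
        {
            _files = retrieve();
            _key = key;
        }
        return _files;
    }

    // Call this when the user asks for a fresh scan, or after a
    // cancelled retrieval that may have produced a partial list.
    public void Invalidate()
    {
        _files = null;
        _key = null;
    }
}
```

In `DirSearch`, the Phase 1 call could then become something like `filePathList = cache.GetOrRetrieve(rootDirectory, filesExtension, () => SearchAccessibleFilesNoDistinct(rootDirectory, null, worker, e).ToList());`, where `cache` is a field on the form. A cached list can go stale if files are added or deleted between searches, so the user should have a way to force a rescan. If the list has to survive a restart, writing the paths to a text file is also an option; such a file holds only paths, not file contents.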
  • Hmm... use a database to search files on a network share, with possibly large files?

    I can guarantee the search would be slow as hell, even under optimal conditions.

    On the other hand, installing SharePoint Foundation (i.e. WSSv4), setting it to index the shared folder you're targeting, and asking WSS to return the search results seems like a good idea. I'm just not sure whether there is syntax that lets you omit spaces when searching. Update: You'll have to use Fast Query syntax to search for both the original string and the string with spaces removed, combined with an "OR" condition.

    Thursday, August 24, 2017 6:34 AM
    Answerer
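
The "original string OR string with spaces removed" idea can also be applied to the in-process search from the question, without SharePoint. The sketch below is a hypothetical pre-processing step (`SearchTerms.Expand` is an invented name). It splits the input on `,,` as the question's code does, then adds a space-free variant of each term. Trimming each term is an extra assumption that the original code does not make.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original code.
static class SearchTerms
{
    // "Switch Cameras"  -> "Switch Cameras", "SwitchCameras"
    // "Form1,,Form 1"   -> "Form1", "Form 1"
    public static List<string> Expand(string input)
    {
        var terms = new List<string>();
        string[] parts = input.Split(new[] { ",," }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string raw in parts)
        {
            string term = raw.Trim();
            if (term.Length == 0) continue;

            AddDistinct(terms, term);

            // Same term with all whitespace removed.
            string compact = new string(term.Where(c => !char.IsWhiteSpace(c)).ToArray());
            AddDistinct(terms, compact);
        }
        return terms;
    }

    // Keeps the list free of case-insensitive duplicates.
    private static void AddDistinct(List<string> terms, string term)
    {
        if (!terms.Contains(term, StringComparer.OrdinalIgnoreCase))
        {
            terms.Add(term);
        }
    }
}
```

The resulting list could be passed to `DirSearch` in place of the plain `Split` result, so each file is reported if it contains any of the variants. As with the FQL approach, this only makes "Switch Cameras" find "SwitchCameras". Typing "Form1" will still not find "Form 1", because there is no way to know where a space should be inserted.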
  • Hello cheong00,

    I mean putting the search content into a database and building a fuzzy query statement to search it, so that you don't have to care about the fuzzy query code logic.

    Sorry if it is not helpful. :-)

    Best regards,

    feih_7



    Thursday, August 24, 2017 6:50 AM