Extracting Key Words Using Regular Expressions

  • Question

  • Hi, Everybody...

    I wish to write a small utility to read a C# file and find all instances of the following "pattern":

    <anyCharacter(s)>[DbRecord("<dbRecordName>"<anyCharacter(s)>class<whitespace><className>:<anyCharacter(s)>

    where

    • <anyCharacter(s)> is one or more characters, including whitespace, carriage return, linefeed, tab, etc.
    • <whitespace> is white space
    • <dbRecordName> is the DbRecord name I wish to extract
    • <className> is the class name I wish to extract
    • bold characters are literals, including the double quotes (") and colon (:)

    For each instance, I wish to extract the DbRecord name and corresponding class name.

    Questions:

    • What is the appropriate regular expression?
    • How would I iterate over each instance and extract the DbRecord name and corresponding class name? Would the results be returned as an ArrayList or some other IEnumerable element?

    Please advise.

    Thank you!

    Thursday, December 11, 2014 7:20 PM

Answers

  • Hello Informatosaurus,

    >>bold characters are literals, including the double quotes (") and colon (:)

    So you mean every instance would have a structure like this:

    [DbRecord(""class:

    If so, the regular expression might look like this:

    string sPattern = ".*\\[DbRecord\\(\".*\".*class .*:.*";

    For example, the string literal "aaa[DbRecord(\"ccccccc\"dddddclass dddd:eeeeeee" would be matched.
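
    As a rough check of the pattern above, the sketch below wraps it in a hypothetical helper, PatternCheck.IsMatch (not part of the original answer). It also illustrates a point that matters for real C# files: by default "." does not match a newline, so the pattern only spans several lines when RegexOptions.Singleline is passed. The two-line input mentioned afterwards is an assumption about what the source file might look like.

```csharp
using System.Text.RegularExpressions;

public static class PatternCheck
{
    // The pattern from the answer, written as a verbatim string
    // (a doubled "" stands for one literal double quote).
    public const string Pattern = @".*\[DbRecord\("".*"".*class .*:.*";

    // Returns true when the pattern is found somewhere in text.
    // With singleline set, '.' also matches '\n', so a match may span lines.
    public static bool IsMatch(string text, bool singleline)
    {
        RegexOptions options = singleline ? RegexOptions.Singleline : RegexOptions.None;
        return Regex.IsMatch(text, Pattern, options);
    }
}
```

    Under these assumptions, PatternCheck.IsMatch("aaa[DbRecord(\"ccccccc\"dddddclass dddd:eeeeeee", false) returns true, while a two-line input such as "[DbRecord(\"Customer\")]\npublic class CustomerRecord : RecordBase" is only matched when singleline is true.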

    >>How would I iterate each instance and extract the DbRecord name and corresponding class name?

    To do this, write additional regular expressions for the two names and call Regex.Matches(text), which returns every match found in the text. For details, see the Regex class documentation:

    http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex(v=vs.110).aspx

    >>Would the results be returned as an ArrayList or some other IEnumerable element?

    Regex.Matches returns a MatchCollection, which implements IEnumerable, so you can iterate over it with foreach.

    Here is an example:

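    using System.Text.RegularExpressions;

    string[] inputs = { "aaa[DbRecord(\"ccccccc\"dddddclass dddd:eeeeeee" };

    // Full pattern, shown for reference; it is not used below.
    string sPattern = ".*\\[DbRecord\\(\".*\".*class .*:.*";

    // Group 1 captures the text between the first pair of double quotes (lazy match).
    string dbRecordName = "\"(.*?)\"";

    // Group 1 captures the identifier between "class" and the colon.
    string className = "class\\s+(\\w+)\\s*:";

    Regex rx = new Regex(dbRecordName, RegexOptions.Compiled);
    MatchCollection dbRecordNames = rx.Matches(inputs[0]);

    rx = new Regex(className, RegexOptions.Compiled);
    MatchCollection classNames = rx.Matches(inputs[0]);

    // Print the captured names without the surrounding quotes, space, or colon.
    System.Console.WriteLine(dbRecordNames[0].Groups[1].Value); // ccccccc
    System.Console.WriteLine(classNames[0].Groups[1].Value);    // dddd

    // Keep the console window open in debug mode.
    System.Console.WriteLine("Press any key to exit.");
    System.Console.ReadKey();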
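
    Since the question asks for each DbRecord name together with its corresponding class name, one possible variation is a single pattern with named groups, so that every match carries both values. The sketch below is only an illustration under stated assumptions: DbRecordScanner, Extract, and the group names "name" and "cls" are hypothetical, and every class is assumed to be followed by a colon, as in the pattern described in the question.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DbRecordScanner
{
    // Hypothetical combined pattern: one match per attribute/class pair.
    //   name: the text between the quotes after [DbRecord(
    //   cls : the identifier after "class", which must be followed by a colon
    // Singleline lets ".*?" cross line breaks between the attribute and the class.
    private static readonly Regex PairPattern = new Regex(
        @"\[DbRecord\(""(?<name>[^""]*)"".*?class\s+(?<cls>\w+)\s*:",
        RegexOptions.Singleline);

    // Returns (DbRecord name, class name) pairs in the order they appear.
    public static List<KeyValuePair<string, string>> Extract(string source)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        foreach (Match m in PairPattern.Matches(source))
        {
            pairs.Add(new KeyValuePair<string, string>(
                m.Groups["name"].Value, m.Groups["cls"].Value));
        }
        return pairs;
    }
}
```

    Calling DbRecordScanner.Extract(System.IO.File.ReadAllText(path)) would then give one pair per match; for the sample string used above, the single pair is ("ccccccc", "dddd"). A class declared without a colon would break the pairing, because the lazy ".*?" would run on to the next class that has one.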
    

    Regards.



    Friday, December 12, 2014 7:33 AM
    Moderator