Excluding Items in Search

Question
Hi,
I'm in a dilemma here. I'm trying to search for a string using RegEx, but I don't want to match occurrences found within a quoted string. Here's a sample:
search: dog
input string:
a dog is in the city 'yet there are many dog also' and ther goes the dog
result:
2 matches
The first one is the first occurrence of "dog".
The second is the third occurrence of "dog".
Based on the sample, I don't want to match occurrences contained in quotes. How can I do that in RegEx? Is it even possible?
Thanks in advance guys...
chow,
Wednesday, September 24, 2008 4:29 PM
Answers
Sorry, I wrote the pattern quickly. Now that I come back to it I see a flaw. Here is an updated pattern and a test jig to experiment with it...
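using System;
using System.Text.RegularExpressions;

// The lookahead accepts "dog" only if the non-quote text after it is
// followed by a complete quoted section or by the end of the input.
string pattern = @"(?>=[^'])*dog(?=[^']*('[^']+'|$))";

string[] tests =
{
    "this is a dog but 'this is not a dog for our' purposes of a dog",
    "this is a dog but",
    "'this is not a dog for our' purposes of a dog",
};

foreach (string test in tests)
{
    Console.WriteLine(test);
    Regex rx = new Regex(pattern, RegexOptions.IgnorePatternWhitespace);
    Match mx = rx.Match(test);
    while (mx.Success)
    {
        // Print the matched text, its start index, and its length.
        Console.WriteLine("\t\t{0} {1} {2}", mx.Value, mx.Index, mx.Length);
        mx = mx.NextMatch();
    }
}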
This isn't perfect, because your string could contain possessive nouns, contractions, or proper names with apostrophes in them, and the pattern cannot tell those apostrophes apart from quote delimiters.
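One further limitation is worth noting: the lookahead in the pattern above only inspects the next quoted run, so it may misfire when the input contains more than one quoted section. A common alternative, sketched below, is to accept a match only when an even number of single quotes follows it up to the end of the input. This is a minimal sketch and not the pattern from this thread; the class name QuoteAwareSearch and the method FindOutsideQuotes are hypothetical names chosen for illustration. It assumes the quotes are balanced and that the text contains no apostrophes.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class QuoteAwareSearch
{
    // Returns every occurrence of `word` that is followed by an even number
    // of single quotes, i.e. that sits outside any '...' section.
    // Assumption: quotes are balanced and the text has no apostrophes.
    public static List<Match> FindOutsideQuotes(string input, string word)
    {
        // Regex.Escape keeps characters such as '.' or '+' in the term literal.
        // The lookahead consumes quotes in pairs, then requires that no
        // further quote appears before the end of the input.
        string pattern = Regex.Escape(word) + @"(?=(?:[^']*'[^']*')*[^']*$)";

        var hits = new List<Match>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            hits.Add(m);
        }
        return hits;
    }
}
```

Calling QuoteAwareSearch.FindOutsideQuotes with the sample input from the question and the term "dog" returns two matches, the first and the third occurrence. Because the lookahead rescans the rest of the input for each candidate, this is best suited to short strings.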
Les Potter, Xalnix Corporation, Yet Another C# Blog
Marked as answer by OmegaMan (Moderator), Saturday, October 4, 2008 9:05 PM
Wednesday, September 24, 2008 11:14 PM
All replies
Try this...
string pattern = @"(?>=[^'])*dog(?=([^']*'[^']+')|$)";
Les Potter, Xalnix Corporation, Yet Another C# Blog
Wednesday, September 24, 2008 6:28 PM