Not Operator in Regular Expressions
-
Thursday, April 26, 2007 7:19 PM
Hi,
I need a regular expression that does not match a pattern. We are comparing descriptions on a bank statement, and for one account we need to say: if the description matches "BBM TFR TO 20-24-61" then it is this category, otherwise it is another category. The regular expression for this description works, but I need a NOT operator that states that if it is NOT "BBM TFR TO 20-24-61" then it is another category.
Sounds really simple, but I am having a lot of trouble finding what I need. I came across "Isaac (?!Asimov)", which matches Isaac when it is not followed by Asimov. I tried replacing Isaac with .* to match everything except Asimov, but no joy. I am using Expresso to test these.
Thanks for the help.
Answers
-
Sunday, May 06, 2007 6:54 PM Moderator
The (?! ...) invalidates the whole match, so anything before or after it will not be found once the match has been invalidated. I recommend breaking the process into steps: 1) determine whether the condition exists and, if it does, do what needs to be done with a separate regex; 2) otherwise, match what needs to be matched with a regex specific to that condition.
Everyone, including myself, tries to do everything all at once in a pattern, and sometimes that is not feasible.
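A minimal Java sketch of that two-step approach: test for the known description with its own regex and treat "no match" as the NOT case in ordinary code. The class name, the category names and the anchoring with ^ and $ are assumptions made for illustration; they are not from the original posts.

```java
import java.util.regex.Pattern;

// Sketch: decide the NOT case in code instead of inside the pattern.
class CategorySketch {
    // Pattern for the one description that has its own category.
    // Anchoring to the whole description is an assumption.
    private static final Pattern TRANSFER =
            Pattern.compile("^BBM TFR TO 20-24-61$");

    // "TRANSFER" and "OTHER" are hypothetical category names.
    static String categorize(String description) {
        if (TRANSFER.matcher(description).find()) {
            return "TRANSFER";
        }
        return "OTHER"; // every description that did not match
    }

    public static void main(String[] args) {
        System.out.println(categorize("BBM TFR TO 20-24-61"));
        System.out.println(categorize("CARD PAYMENT 1234"));
    }
}
```

This only helps when the calling program can branch on the match result. If the program accepts nothing but one regex per category, a lookahead-based pattern is still needed.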
All Replies
-
Thursday, April 26, 2007 11:42 PM Moderator
One must realize that (?! ...) is used to invalidate the whole match. So what you want to do, as in the case of SSNs, is to create matches that will fail and then specify the match(es) that will succeed and capture the data you want.
For example, when matching an SSN there are two obvious failures that should not be matched: any SSN that starts with 666 or 000. So we test for those conditions first with the (?! ...) groups (the invalidating part, red in the original post), then give the pattern for a valid match (green in the original post).
^(?!000)(?!666)(?<SSN3>[0-6]\d{2}|7(?:[0-6]\d|7[012]))([- ]?)(?!00)(?<SSN2>\d\d)\1(?!0000)(?<SSN4>\d{4})$
This says that if the SSN starts with 000 or 666 it is obviously a bad item and should not be matched, so we get a match failure.
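A hedged Java sketch of the SSN pattern above, to show the lookaheads rejecting input before the capturing groups are tried. One deliberate change: in .NET, unnamed groups are numbered before named ones, so \1 in the original refers to the separator group. Java numbers named and unnamed groups together, so this sketch names the separator group "sep" (a name introduced here) and uses \k<sep> instead. The class and method names are also illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SsnLookaheadSketch {
    // Pattern from the post, with the separator group renamed for Java.
    private static final Pattern SSN = Pattern.compile(
            "^(?!000)(?!666)(?<SSN3>[0-6]\\d{2}|7(?:[0-6]\\d|7[012]))"
            + "(?<sep>[- ]?)(?!00)(?<SSN2>\\d\\d)\\k<sep>"
            + "(?!0000)(?<SSN4>\\d{4})$");

    // True when the negative lookaheads pass and the rest matches.
    static boolean isValid(String s) {
        return SSN.matcher(s).find();
    }

    // Returns the first three digits, or null when the match is invalidated.
    static String areaOf(String s) {
        Matcher m = SSN.matcher(s);
        return m.find() ? m.group("SSN3") : null;
    }

    public static void main(String[] args) {
        System.out.println(isValid("123-45-6789"));
        System.out.println(isValid("666-45-6789"));
        System.out.println(areaOf("123-45-6789"));
    }
}
```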
The invalidating patterns do not have to be the same as the valid ones. See the posts "Help on Regex" and "Help Bulilding a regex" for other invalidate examples.
-
Friday, April 27, 2007 7:28 AM
When you say "One must realize that the (?! ...) is used to invalidate the whole match", this is what I am looking for, but I can't get it to invalidate the whole match. We have a program that reads bank statements and looks at the description field; based on the regular expression that matches that description, we assign a category. This all works fine. However, for one account, if the description matches the regular expression then it assigns a certain category, and we then need to say that if it does not match that description then a different category is assigned. In other words, every description except the one that is assigned its own category. So I need to say: if it does not match XXX then this category, which should pick up everything else.
-
Friday, April 27, 2007 9:04 AM
OK, I just found this: ^((?!regexp).)*$ and it works, so it matches everything that is not regexp. However, items that have anything before or after regexp still won't match. How can I tweak this to match everything that's not regexp, including cases where regexp has information in front of or after it, i.e. to not match only the exact string?
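A small Java sketch contrasting the two behaviours described here. The first pattern is the one found above and rejects any string that contains the phrase. The second is one possible tweak (an assumption, not something proposed in this thread): a single lookahead at the start that includes $, so only the exact string is rejected. Class and method names are illustrative.

```java
import java.util.regex.Pattern;

class NotExactlySketch {
    // Quote the phrase so its characters are treated literally.
    static final String PHRASE = Pattern.quote("BBM TFR TO 20-24-61");

    // Rejects every string that contains the phrase anywhere.
    static final Pattern NOT_CONTAINING =
            Pattern.compile("^((?!" + PHRASE + ").)*$");

    // Rejects only the string that is exactly the phrase.
    static final Pattern NOT_EXACTLY =
            Pattern.compile("^(?!" + PHRASE + "$).*$");

    static boolean notContaining(String s) {
        return NOT_CONTAINING.matcher(s).find();
    }

    static boolean notExactly(String s) {
        return NOT_EXACTLY.matcher(s).find();
    }

    public static void main(String[] args) {
        String padded = "REF BBM TFR TO 20-24-61 X";
        System.out.println(notContaining(padded));
        System.out.println(notExactly(padded));
    }
}
```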
-
Monday, October 29, 2007 10:33 AM
To skip a prefix, you can include the ".*" wildcard in your regexp.
E.g. to skip all strings matching '.*Key', the following seems to work fine: "^((?!.*Key).)*$".
Thanks, the info above was very useful. I needed it for the following; all three are equivalent:
// demo 'include types', with user-selected list (user selects all except PK + FK)
String regexp = "(.*Schema)|(.*Table)|(.*Column)";
filter.setType(regexp);
// demo 'exclude types', with user-selected list
regexp = "(.*PrimaryKey)|(.*ForeignKey)";
filter.setType("^((?!(" + regexp + ")).)*$"); // (for end users, GUI should offer a 'NOT' check box :-)
// demo 'free format' expressions; entered by user, or hard coded behind a GUI 'exclude keys' check box
filter.setType("^((?!.*Key).)*$");
Arjan v H -
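A self-contained Java sketch of the three filters above, checking that they accept the same type names. The filter object from the post is not available here, so this assumes it matches the whole type name, and uses Pattern.matches as a stand-in. The not helper, the class name and the five sample type names are illustrative assumptions.

```java
import java.util.regex.Pattern;

class ExcludeTypesSketch {
    // Wraps an expression so that the result matches only strings
    // in which the expression does not match at any position.
    static String not(String regexp) {
        return "^((?!(" + regexp + ")).)*$";
    }

    // Stand-in for the filter: whole-string match (an assumption).
    static boolean accepts(String filterRegexp, String typeName) {
        return Pattern.matches(filterRegexp, typeName);
    }

    public static void main(String[] args) {
        String include = "(.*Schema)|(.*Table)|(.*Column)";
        String exclude = not("(.*PrimaryKey)|(.*ForeignKey)");
        String freeFormat = "^((?!.*Key).)*$";
        String[] names = {"Schema", "Table", "Column", "PrimaryKey", "ForeignKey"};
        for (String name : names) {
            System.out.println(name + ": " + accepts(include, name) + " "
                    + accepts(exclude, name) + " " + accepts(freeFormat, name));
        }
    }
}
```

The three filters agree only for a type list like this one. A name such as "Index" would be accepted by both exclude forms but rejected by the include form.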
Monday, October 05, 2009 12:27 PM
OK I just found this: ^((?!regexp).)*$ and it works...so it matches everything that is not regexp,
I needed to create a list of excluded words for usernames, so that any bad words are rejected. I read that the pattern below is a more efficient way to do what you're trying to do:
^(?:(?!regexp).)*$
- Proposed As Answer by Mark Main Monday, October 05, 2009 12:33 PM
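A rough Java sketch of the excluded-words idea using the non-capturing form. The group (?: ...) repeats without storing a capture on each character, which is where the saving comes from. The word list, class name and method names are placeholders invented for this example, and the check is case-sensitive as written.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class UsernameFilterSketch {
    // Builds ^(?:(?!word1|word2|...).)*$ from a list of literal words.
    static Pattern excluding(List<String> words) {
        String alternation = words.stream()
                .map(Pattern::quote) // treat each word literally
                .collect(Collectors.joining("|"));
        return Pattern.compile("^(?:(?!" + alternation + ").)*$");
    }

    // A username is allowed when no excluded word occurs anywhere in it.
    static boolean isAllowed(Pattern filter, String username) {
        return filter.matcher(username).matches();
    }

    public static void main(String[] args) {
        // Hypothetical excluded words.
        Pattern filter = excluding(Arrays.asList("admin", "root"));
        System.out.println(isAllowed(filter, "alice"));
        System.out.println(isAllowed(filter, "superadmin1"));
    }
}
```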

