Tuesday, January 16, 2007 6:41 PMHello
I'm newbe with reg expressions
I want to build a simple parser to analize a SQL string. I only need to analize a SQL of type Select and get these groups:
- Select and From clauses
- Where clause (if any) without the where reserved word
- Rest of the sql string (if any)
I have archived this with this reg expressions::
string Expr1 = @"(WHERE\s+(.+?))(\s+ORDER BY.+|\s+GROUP BY.+|\s*$)";
string Expr2 = @"(\s+ORDER BY.+|\s+GROUP BY.+)$";
I get the select clauses with:
SelectClause = SQL.Substring(0, m.Groups.Index - 1);
WhereClause = m.Groups.Value;
FinalClauses = m.Groups.Value;
If no match is successfull (no where clause present) I try with Expr2 with similar process.
But how can i do (if possible) to get work with a subquery like this:
where f3 in (select tf1 from table2 group by f4 having f5 > 1 )
order by f1
The problem is that the first 'group by' match belongs to the subquery and I need to get the entire where clause.
thanks in advance
Tuesday, January 16, 2007 8:16 PMModeratorOne of the most powerful tools in RegEx is the Lookahead/lookbehind qualifiers (?= and ?<=. I think of them as bookends. I created this expression which will just grab the where clause.
What it does, not a linear interpolation mind you, is look for text sandwiched between a where and an order or end of text and also, do not match sandwiching items..
Note I prefer named groups and that is what (?<Where>.*?) does vs just saying (.*?)....because of that in the code I specify Explicit capture which says do not capture any non numbered or named groups. It allows one to say m.groups["Where"].value.
Here is the output of the capture, note there is a \n which is not apparent to the display below.
f3 in (select tf1 from table2 group by f4 having f5 > 1 )
// using System.Text.RegularExpressions;
/// Regular expression built for C# on: Tue, Jan 16, 2007, 01:09:13 PM
/// Using Expresso Version: 3.0.2559, http://www.ultrapico.com
/// Capture just the where clause in a single run.
/// A description of the regular expression:
/// Match a prefix but exclude it from the capture. [where]
/// [Where]: A named capture group. [.*?]
/// Any character, any number of repetitions, as few as possible
/// Match a suffix but exclude it from the capture. [order|$]
/// Select from 2 alternatives
/// End of line or string
public Regex CaptureWhere = new Regex(