Regex to parse postal address

    Question

  • Greetings:

     

    I am using the following C# function to parse the premise from a postal address:

     

    public static string ParseHouseNumber(string streetAddress)
    {
        streetAddress = streetAddress.Replace(".", " ").Replace(",", " ");
        streetAddress = WordFunctions.RemoveWhiteSpace(streetAddress);

        var parsedHouseNumber = "";
        var splitAddress = streetAddress.Split(' ');
        foreach (var partOfAddress in splitAddress)
        {
            if (CountLetters(partOfAddress) <= 1 && partOfAddress != "V")
            //if (CountLetters(partOfAddress) < 1)
            {
                parsedHouseNumber += partOfAddress;
            }
        }
        return parsedHouseNumber;
    }

    static int CountLetters(string partOfAddress)
    {
        return partOfAddress.Count(c => Char.IsLetter(c));
    }

    Although this works on some postal addresses, the sample addresses below are causing me problems:

    4Th Floor 15 Basinghall Street
    2Nd Floor 67-74 Saffron Hill
    1St Floor 8 Spencer Parade

    What I end up with is 415, 26774 and 18 for house number.

    I'm not very good with Regex and I've spent this afternoon trying to think of a good way to resolve this so that I can differentiate between a level and a premise. Any suggestions?

    Thanks in advance.

     

    www.SQL4n00bs.com

    Tuesday, October 22, 2013 3:52 PM

Answers

  • It may be a good idea to do as IB00 pointed out. For the selection below, I separated the different parts of each address line with a comma:

    4Th Floor,15,Basinghall Street
    2Nd Floor,67-74,Saffron Hill
    1St Floor,8,Spencer Parade
    1000,Regent Park
    56A,South Clifton Street
    1St Floor,Stanmore House
    1St Floor,Regency Arcade
    6Th Floor,Newbury House
    6Th Floor,Newbury House
    6Th Floor,Newbury House
    77A,Charterhouse Street
    133C,Andersonstown Road
    79A,Digbeth High Street
    61A,Banstead Road South
    16B-18B,Nicholas Street

    and changed my method above to:

    public static string GetHouseNo(string address) {
        string pattern = @"(?<floor>.*Floor)?(?:,)?(?<houseNo>\d+\w?(-\d+\w?)?)?(?:,)(?<street>\w.*\b)";
        Regex regex = new Regex(pattern);
        Match match = regex.Match(address);
    
        return string.Format("{0,-15} HouseNo: {1,-10}	Street: {2}", match.Groups["floor"],match.Groups["houseNo"],match.Groups["street"]);
    }
    

    Keeping the delimiters makes it much simpler to use a regex filter.
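
    As a usage sketch, the pattern above can be applied to a few of the delimited lines to pull out only the house number, which is what the original question asked for. The class and method names (DelimitedAddressDemo, HouseNoOrEmpty) are hypothetical; only the pattern is copied unchanged from the method above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; only the pattern comes from the post above.
public static class DelimitedAddressDemo
{
    private static readonly Regex Pattern = new Regex(
        @"(?<floor>.*Floor)?(?:,)?(?<houseNo>\d+\w?(-\d+\w?)?)?(?:,)(?<street>\w.*\b)");

    // Returns the house number, or an empty string when the line has none.
    public static string HouseNoOrEmpty(string delimitedAddress)
    {
        Match match = Pattern.Match(delimitedAddress);
        return match.Success ? match.Groups["houseNo"].Value : string.Empty;
    }

    public static void Main()
    {
        string[] samples =
        {
            "4Th Floor,15,Basinghall Street",   // house number: 15
            "2Nd Floor,67-74,Saffron Hill",     // house number: 67-74
            "56A,South Clifton Street",         // house number: 56A
            "1St Floor,Stanmore House"          // no house number
        };

        foreach (string sample in samples)
        {
            Console.WriteLine("{0} -> '{1}'", sample, HouseNoOrEmpty(sample));
        }
    }
}
```

    Because the comma marks where each part ends, a line such as "1St Floor,Stanmore House" yields an empty house number instead of picking up the "1" from the floor.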

    wizend

    Wednesday, October 23, 2013 6:02 PM
  • You need to go through your list of addresses and work out what the 'correct' answer is for you.  For example, take these two addresses:

    10A White Hart Parade
    3Rd Floor Vyman House

    They both are in the form

    <Number><letters><space><letters><space><letters><space><letters>

    What result do you want from these two addresses?  From what you have said so far, it seems that you want '10A' from the first (or is it '10' that you want?) and nothing from the second.  To differentiate these you will not be able to use pattern matching.  You will need to parse the string to determine that the first address does have a street number and the second one does not.  This is a hard problem because there are so many possible variations; someone living on 'Floor Road' is just one problematic example.
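
    To make the point about the shared form concrete, here is a minimal sketch that reduces an address to its character-class shape. The class AddressShape and its method Of are hypothetical names invented for this illustration; the two inputs are the addresses quoted above.

```csharp
using System;
using System.Text;

// Hypothetical illustration: reduce an address to its character-class "shape".
public static class AddressShape
{
    // Collapses each run of digits to "N" and each run of letters to "L";
    // every other character (spaces, hyphens, ...) is copied through.
    public static string Of(string address)
    {
        var shape = new StringBuilder();
        char previousClass = '\0';

        foreach (char c in address)
        {
            char currentClass = char.IsDigit(c) ? 'N' : char.IsLetter(c) ? 'L' : c;
            bool isRun = currentClass == 'N' || currentClass == 'L';

            // Skip repeats inside a run of digits or letters.
            if (!(isRun && currentClass == previousClass))
            {
                shape.Append(currentClass);
            }
            previousClass = currentClass;
        }
        return shape.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Of("10A White Hart Parade"));  // NL L L L
        Console.WriteLine(Of("3Rd Floor Vyman House"));  // NL L L L
    }
}
```

    Both addresses collapse to the same shape, so any rule that looks only at digits, letters and spaces cannot tell them apart; something has to know that 'Rd Floor' means a level.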

    My guess is that what you are trying to do is not achievable.  You will need to accept some percentage of incorrect results as a fact of life.

    What is the business problem you are trying to solve?


    Paul Linton

    Thursday, October 24, 2013 12:00 AM
  • I remember some of the hiccups I encountered when first working with regular expressions.  Here is some sample code that should get you started.  String addresses are parsed in the function AddressParser.Parse().  You pass a string address to this function and it returns an instance of the class Address.  To parse a string address, the sample code defines 3 different regular expressions (see comments starting with 'search for pattern 1', 'search for pattern 2' and 'search for pattern 3') - you will want to define your own regular expressions to suit your requirements.  The search stops with the first pattern found in the string address.  The sample code is intentionally long in order to illustrate a few points.  

    This is not a good implementation because its success depends upon the order in which the 3 expressions/patterns are executed.  Specifically, if the last expression, #3, is moved so that it is executed first, you will find that address information is incorrectly parsed - a good example for you to test.

     
    using System;
    using System.Text;
    using System.Text.RegularExpressions;
    
    namespace TestCS
    {
    	internal class Program
    	{
    		private static void Main( string[] args )
    		{
    			try
    			{
    				ParseStreetAddresses();
    				Console.ReadKey();
    			}
    			catch( Exception e )
    			{
    				Console.WriteLine( e.ToString() );
    				Console.ReadKey();
    			}
    		}
    
    		private static void ParseStreetAddresses()
    		{
    			string[] inputAddresses =
    				{
    					//"56A South Clifton Street",
    					"16B-18B Nicholas Street",
    					"266A-271 Broad Street",
    					"1St Floor, Stanmore House",
    					"Charles House, 5Th Floor",
    					"Dome Building, 2Nd Floor",
    					"2Nd Floor, Dome Building"
    				};
    
    			var parser = new AddressParser();
    
    			foreach( var inputAddress in inputAddresses )
    			{
    				Console.WriteLine("input address: {0}", inputAddress);
    
    				var address = parser.Parse( inputAddress );
    				if( null == address )
    				{
    					Console.WriteLine( "error: unable to parse address" );
    					continue;
    				}
    
    				Console.WriteLine( 
    					"\taddress parsed: \n\t\tlocation = {0} \n\t\tbuilding = {1} \n\t\tfloor = {2}\n",
    					String.IsNullOrEmpty( address.Location ) ? "undefined" : address.Location,
    					String.IsNullOrEmpty( address.BuildingName ) ? "undefined" : address.BuildingName,
    					String.IsNullOrEmpty( address.Floor ) ? "undefined" : address.Floor
    				);
    			}
    
    		}
    
    		internal class Address
    		{
    			public string Location
    			{
    				get;
    				set;
    			}
    			public string BuildingName
    			{
    				get;
    				set;
    			}
    			public string Floor
    			{
    				get;
    				set;
    			}
    		}
    
    		internal class AddressParser
    		{
    			public Address Parse( string inputAddress )
    			{
    				const string streetNumberPattern = @"(?<streetNumber>\d+\w+\s*(-\s*\d+\w*\s)?)";
    				const string locationPattern = @"(?<location>\w.*)";
    				const string floorPattern = @"(?<floor>\d+\w+\s+Floor)";
    				const string commaPattern = @"(?:\s*,\s*)";
    
    				const string streetNumberKey = "streetNumber";
    				const string locationKey = "location";
    				const string floorKey = "floor";
    				const string undefined = "undefined";
    
    				// search for pattern 1: floor number , location (e.g. '1St Floor, Stanmore House')
    				var match = Regex.Match(
    					inputAddress,
    					floorPattern + commaPattern + locationPattern,
    					RegexOptions.IgnoreCase
    				);
    				if( match.Success && match.Groups.Count == 3 )
    				{
    					_showMatchGroupValues( match );
    
    					var address = new Address();
    					address.Location = match.Groups[locationKey].Success ? match.Groups[locationKey].Value : undefined;
    					address.Floor = match.Groups[floorKey].Success ? match.Groups[floorKey].Value : undefined;
    					return address;
    				}
    
    				// search for pattern 2: location , floor number (e.g.: 'Dome Building, 2Nd Floor')
    				match = Regex.Match(
    					inputAddress,
    					locationPattern + commaPattern + floorPattern,
    					RegexOptions.IgnoreCase
    				);
    				if( match.Success && match.Groups.Count == 3 )
    				{
    					_showMatchGroupValues( match );
    
    					var address = new Address();
    					address.Location = match.Groups[locationKey].Success ? match.Groups[locationKey].Value : undefined;
    					address.Floor = match.Groups[floorKey].Success ? match.Groups[floorKey].Value : undefined;
    					return address;
    				}
    
    				// search for pattern 3: street number + street name (e.g.: '56A South Clifton Street' or '16B-18B Nicholas Street')
    				 match = Regex.Match(
    					inputAddress,
    					streetNumberPattern + locationPattern,
    					RegexOptions.IgnoreCase
    				);
    				 if( match.Success && match.Groups.Count == 4 )
    				{
    					_showMatchGroupValues( match );
    
    					var address = new Address();
    					address.Location = String.Format(
    						"{0} {1}",
    						match.Groups[streetNumberKey].Success ? match.Groups[streetNumberKey].Value : undefined,
    						match.Groups[locationKey].Success ? match.Groups[locationKey].Value : undefined
    					);
    					return address;
    				}
    
    				return null;
    			}
    
    			private void _showMatchGroupValues( Match match )
    			{
    				var ctGroups = match.Groups.Count;
    				Console.WriteLine( "\tMatch group count: {0}", ctGroups );
    
    				int x = 0;
    				foreach( Group group in match.Groups )
    				{
    					Console.WriteLine( "\tGroup #{0}, value: {1}",
    						++x,
    						string.IsNullOrEmpty( group.Value ) ? "undefined" : group.Value
    					);
    				}
    			}
    
    		}
    	}
    }

    The output from this sample code is:

    input address: 16B-18B Nicholas Street
            Match group count: 4
            Group #1, value: 16B-18B Nicholas Street
            Group #2, value: -18B
            Group #3, value: 16B-18B
            Group #4, value: Nicholas Street
            address parsed:
                    location = 16B-18B  Nicholas Street
                    building = undefined
                    floor = undefined

    input address: 266A-271 Broad Street
            Match group count: 4
            Group #1, value: 266A-271 Broad Street
            Group #2, value: -271
            Group #3, value: 266A-271
            Group #4, value: Broad Street
            address parsed:
                    location = 266A-271  Broad Street
                    building = undefined
                    floor = undefined

    input address: 1St Floor, Stanmore House
            Match group count: 3
            Group #1, value: 1St Floor, Stanmore House
            Group #2, value: 1St Floor
            Group #3, value: Stanmore House
            address parsed:
                    location = Stanmore House
                    building = undefined
                    floor = 1St Floor

    input address: Charles House, 5Th Floor
            Match group count: 3
            Group #1, value: Charles House, 5Th Floor
            Group #2, value: Charles House
            Group #3, value: 5Th Floor
            address parsed:
                    location = Charles House
                    building = undefined
                    floor = 5Th Floor

    input address: Dome Building, 2Nd Floor
            Match group count: 3
            Group #1, value: Dome Building, 2Nd Floor
            Group #2, value: Dome Building
            Group #3, value: 2Nd Floor
            address parsed:
                    location = Dome Building
                    building = undefined
                    floor = 2Nd Floor

    input address: 2Nd Floor, Dome Building
            Match group count: 3
            Group #1, value: 2Nd Floor, Dome Building
            Group #2, value: 2Nd Floor
            Group #3, value: Dome Building
            address parsed:
                    location = Dome Building
                    building = undefined
                    floor = 2Nd Floor
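
    The order dependence mentioned before the sample code can be seen by running pattern 3 on its own against a floor address. This is only a sketch: PatternOrderDemo and ApplyPattern3 are hypothetical names, while the two pattern strings are copied from AddressParser.Parse.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo; the two pattern strings are copied from AddressParser.Parse.
public static class PatternOrderDemo
{
    private const string StreetNumberPattern = @"(?<streetNumber>\d+\w+\s*(-\s*\d+\w*\s)?)";
    private const string LocationPattern = @"(?<location>\w.*)";

    // Applies only pattern 3 (street number + street name) to the input.
    public static Match ApplyPattern3(string inputAddress)
    {
        return Regex.Match(
            inputAddress,
            StreetNumberPattern + LocationPattern,
            RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        Match match = ApplyPattern3("1St Floor, Stanmore House");

        // Pattern 3 succeeds even though the input has no street number:
        // "1St " is captured as the street number and the rest as the location.
        Console.WriteLine("success      = {0}", match.Success);
        Console.WriteLine("streetNumber = '{0}'", match.Groups["streetNumber"].Value);
        Console.WriteLine("location     = '{0}'", match.Groups["location"].Value);
    }
}
```

    If pattern 3 ran first, '1St Floor, Stanmore House' would be reported as a street address, which is why the floor patterns have to be tried before it.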

     


    Thursday, October 24, 2013 12:37 AM

All replies

  • Can you give us sample data showing the results you want, along with samples that already work?

    jdweng

    Tuesday, October 22, 2013 4:33 PM
  • Here is one attempt based on the few examples you have given us so far:

    using System;
    using System.IO;
    using System.Text.RegularExpressions;

    internal static class Program {
        static void Main(string[] args) {
            // Expects one address per line in <temp folder>\Temp\addresses.txt.
            string path = Path.Combine(Path.GetTempPath(), "Temp\\addresses.txt");
            string[] addresses = File.ReadAllLines(path);

            foreach (var address in addresses) {
                Console.WriteLine("{0}", GetHouseNo(address));
            }

            Console.WriteLine("Finished");
            Console.ReadKey();
        }

        // Captures the text before the house number as 'floor', then the house number and the street.
        public static string GetHouseNo(string address) {
            string pattern = @"(?<floor>.*\b)\s(?<houseNo>\d+(-\d+)?)\s(?<street>\w.*\b)";
            Regex regex = new Regex(pattern);
            Match match = regex.Match(address);

            return string.Format("Street: {0},\tHouseNo: {1},\tFloor: {2}", match.Groups["street"], match.Groups["houseNo"], match.Groups["floor"]);
        }
    }
    

    Kind regards,

    wizend


    Tuesday, October 22, 2013 4:58 PM
  • Hi,

    This is great, thanks very much for your efforts. However, I think this assumes that the floor always comes before the street name (which it should), but as you know, with mailing addresses it's never that simple!

    Once again thanks.


    www.SQL4n00bs.com

    Wednesday, October 23, 2013 1:53 PM
  • Perhaps you could give us a more representative selection of examples. To use Regex, you first need to find a recurring pattern in your source strings. My snippet clearly works only for the pattern derived from the three example lines above.

    If the floor is optional, you might try:

    string pattern = @"(?<floor>.*Floor\b)?\s?(?<houseNo>\d+(-\d+)?)\s(?<street>\w.*\b)";

    If the parts of the address lines have no distinguishable pattern or order at all, you could try to filter for each part individually, with as many different patterns as there are possible address line parts.
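
    A minimal sketch of that per-part filtering might look like the following. Everything in it is an assumption made for illustration: the class name PartFilters, the rule that a level is an ordinal number followed by the word 'Floor', and the rule that a premise is a number with at most one trailing letter, optionally extended to a range.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sketch of filtering each address part with its own pattern.
public static class PartFilters
{
    // Assumption: a level is an ordinal number followed by the word "Floor".
    private static readonly Regex FloorPattern = new Regex(
        @"\b\d+(?:st|nd|rd|th)\s+Floor\b", RegexOptions.IgnoreCase);

    // Assumption: a premise is digits with an optional one-letter suffix,
    // optionally followed by a hyphen and a second number of the same form.
    private static readonly Regex HouseNoPattern = new Regex(
        @"\b\d+[A-Z]?(?:-\d+[A-Z]?)?\b", RegexOptions.IgnoreCase);

    public static string Floor(string address)
    {
        Match match = FloorPattern.Match(address);
        return match.Success ? match.Value : string.Empty;
    }

    public static string HouseNo(string address)
    {
        // Blank out any floor text first so its digits cannot be picked up as a premise.
        string withoutFloor = FloorPattern.Replace(address, " ");
        Match match = HouseNoPattern.Match(withoutFloor);
        return match.Success ? match.Value : string.Empty;
    }

    public static void Main()
    {
        string[] samples =
        {
            "4Th Floor 15 Basinghall Street",   // house number: 15
            "2Nd Floor 67-74 Saffron Hill",     // house number: 67-74
            "1St Floor 8 Spencer Parade",       // house number: 8
            "Charles House 5Th Floor"           // no house number
        };

        foreach (string sample in samples)
        {
            Console.WriteLine("{0} -> floor '{1}', house number '{2}'",
                sample, Floor(sample), HouseNo(sample));
        }
    }
}
```

    Because each part is searched for independently, the result does not depend on whether the floor comes before or after the building name. It is still a heuristic: an address that really is on a street called 'Floor Road' with an ordinal in front of it would be misread.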

    wizend

    Wednesday, October 23, 2013 2:35 PM
  • Hi wizend,

    I've been trying out an idea where I split on spaces and then do some work on the various address elements, but it's so messy I feel lost already!

    More example addresses below:

    56A South Clifton Street
    1St Floor Stanmore House
    1St Floor Regency Arcade
    6Th Floor Newbury House
    6Th Floor Newbury House
    6Th Floor Newbury House
    77A Charterhouse Street
    133C Andersonstown Road
    79A Digbeth High Street
    61A Banstead Road South
    16B-18B Nicholas Street
    Dome Building 2Nd Floor
    2Nd Floor Dome Building
    2Nd Floor Dome Building
    2Nd Floor Dome Building
    2Nd Floor Dome Building
    4Th Floor Saltire Court
    3Rd Floor Charter House
    5Th Floor Charles House
    5Th Floor Charles House
    Charles House 5Th Floor
    1St Floor Epworth House

    2Nd Floor Connies House
    2Nd Floor Connies House
    56A Tooting High Street
    148-148A Holland Street
    17A Harlaw Hill Gardens
    Central House 1St Floor
    12C-12D Austhorpe Road
    179A Tottenham Ct Road
    1A Constitution Street
    13A Montpellier Parade
    31A Wolverhampton Road
    5A Branksome Wood Road
    5A Branksome Wood Road
    5A Branksome Wood Road
    106A University Street
    25A Kenton Park Parade
    5Th Floor Regina House
    5Th Floor Regina House
    7Th Floor 11 Old Jewry
    122A Humberston Avenue
    11A Lancaster Crescent
    1St Floor Global House
    1St Floor Global House
    1St Floor Global House
    1St Floor Global House
    1St Floor Global House
    Argyll House 2Nd Floor
    23A Kenilworth Gardens
    4A West Princes Street
    1A Wollstonecraft Road
    74B Kirkintilloch Road
    296A Strathmore Avenue
    20A Wellbrooke Gardens
    76A Richmond Park Road
    11B West Halkin Street
    3A Queensferry St Lane
    51A West Regent Street
    68A East Kilbride Road
    Curzon House 2Nd Floor
    51A South Lambeth Road
    51A Regents Park Road
    3Rd Floor The Heights
    1St Floor Tudor House
    1St Floor Tudor House
    266A-271 Broad Street
    75A Jacobs Wells Road
    224A Holdenhurst Road
    36A Warriston Gardens
    1A King Edward Street
    2Nd Floor Hill Stores
    3E Whitemountain Road
    1A Auchingramont Road
    79A High Street North
    45A Scarbrough Avenue
    20A Mountstewart Road
    27a Stubbington Green
    2A Shillington Street
    2Nd Floor Tower House
    Tower House 2Nd Floor
    Tower House 2Nd Floor
    19B Glenorchy Terrace
    10A White Hart Parade
    3Rd Floor Vyman House
    3Rd Floor The Heights
    5Th Floor Brook House
    30A Wavendon Crescent
    4A Knightrider Street
    3A St Michaels Street
    24A South Park Street
    3Rd Floor Heron House
    2Nd Floor Aquis House
    2Nd Floor Crown House
    1A Glen Douglas Drive
    62A Bloomfield Avenue
    365A Barlow Moor Road
    30A Stephenson Street
    51A-57A Chertsey Road
    453B Lea Bridge Road
    Unit 1A Global House
    308C Victoria Centre
    49A North Bar Street
    60A Marine Promenade
    3A St Vincent Street
    35A Killieser Avenue
    21B Old Broad Street
    32A Courtenay Street
    2A St Georges Square
    1A Gatteridge Street
    41A Hatton Hill Road
    10A St Peters Street
    50A West Main Street
    9A Bucknall New Road
    10A Strathearn Place
    128-132A Burton Road
    32Nd Floor 30 Street
    32Nd Floor 30 Street
    37B Robertson Street
    1A Torphichen Street
    00E Holly Tree Hotel
    Unit 2B Vantage Park
    13A Victoria Gardens
    2Nd Floor The Atrium
    1A/Festing Buildings
    55A Frederick Street
    40A Guildford Street
    39A St Patricks Road
    16A Stephenson Place
    99B Warren Wood Road
    73-75A George Street
    104A Bradford Street
    8A Artillery Passage
    805A Commercial Road
    1A Torphichen Street
    1A Torphichen Street
    61A Frederick Street
    4A South King Street
    2Nd Floor York House
    14A Eccleston Street
    26A Westfield Street
    1A St Martins Street
    39A Thoresby Avenue
    81A Beatrice Street
    52A Braunstone Gate
    30A Manchester Road
    305A North End Road
    13A Crawford Street
    33A-39 Manor Street
    73A Waddon New Road
    201A Normanton Road
    9A Southgate Street
    72A Waterloo Street
    89A Cornwall Street
    1B Laburnum Terrace
    37E Whitegate Drive
    214A Iron Mill Lane

    Thanks.


    www.SQL4n00bs.com



    • Edited by Abu Dina Wednesday, October 23, 2013 3:27 PM
    Wednesday, October 23, 2013 3:21 PM
  • I expect it will be easier to parse the information using Regex on the original street address format (i.e. before you remove the commas and white space).  Can you provide the unedited list again (i.e. including all commas and white space)?

    These are some of the patterns in the samples you provided:

    1. number + suffix + street name  (e.g. 56A South Clifton Street)
    2. (number + suffix) - (number + suffix), street name  (e.g. 16B-18B Nicholas Street)
    3. (number + suffix) - number, street name (e.g. 266A-271 Broad Street)
    4. number + suffix + floor, house name (e.g. 1St Floor Stanmore House)
    5. number + suffix + floor, number street name (e.g. 7Th Floor 11 Old Jewry)
    6. house name, number + suffix + floor (e.g. Charles House 5Th Floor)
    7. building, number + suffix + floor (e.g. Dome Building 2Nd Floor)
    8. number + suffix + floor, building (e.g. 2Nd Floor Dome Building)

    Regex can easily be configured to identify each pattern.   I can get you started with some Regex sample code if you provide the original/unedited lists of addresses.
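
    As a rough sketch of what identifying each pattern could look like on the space-delimited samples already posted, the classifier below folds the eight patterns listed above into five shapes. The class name AddressShapeClassifier, the shape names and the patterns themselves are all assumptions made for illustration, not a tested solution for the full data set.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical classifier; the shape names and patterns are assumptions
// based on the space-delimited samples in this thread.
public static class AddressShapeClassifier
{
    private const string Number = @"\d+[A-Z]?";                  // e.g. 56A
    private const string Floor = @"\d+(?:st|nd|rd|th)\s+Floor";   // e.g. 1St Floor

    // Column 0 is the shape name, column 1 the anchored pattern.
    // The patterns are intended to be mutually exclusive.
    private static readonly string[,] Rules =
    {
        { "floor + number + street", "^" + Floor + @"\s+" + Number + "(?:-" + Number + @")?\s+\D.*$" },
        { "floor + building",        "^" + Floor + @"\s+\D.*$" },
        { "building + floor",        @"^\D.*\s+" + Floor + "$" },
        { "number range + street",   "^" + Number + "-" + Number + @"\s+\D.*$" },
        { "number + street",         "^" + Number + @"\s+\D.*$" }
    };

    // Returns the name of the first matching shape, or "unclassified".
    public static string Classify(string address)
    {
        string trimmed = address.Trim();
        for (int i = 0; i < Rules.GetLength(0); i++)
        {
            if (Regex.IsMatch(trimmed, Rules[i, 1], RegexOptions.IgnoreCase))
            {
                return Rules[i, 0];
            }
        }
        return "unclassified";
    }

    public static void Main()
    {
        string[] samples =
        {
            "7Th Floor 11 Old Jewry",
            "1St Floor Stanmore House",
            "Charles House 5Th Floor",
            "16B-18B Nicholas Street",
            "56A South Clifton Street",
            "Unit 1A Global House"
        };

        foreach (string sample in samples)
        {
            Console.WriteLine("{0,-26} -> {1}", sample, Classify(sample));
        }
    }
}
```

    Addresses that fit none of the shapes, such as 'Unit 1A Global House', come back as unclassified, which gives a list of cases to review by hand rather than a silent wrong answer.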



    • Edited by IB00 Wednesday, October 23, 2013 5:04 PM
    Wednesday, October 23, 2013 4:37 PM