Ask for a regular expression

    Question

  • Hi all,

    Could you give me the regex for the following requirement:

    • abc{0bcd → abc{0}bcd
    • cde0}abc → cde{0}abc

    For example:

    • Sample1:{ 0 is sucker than the 1 }  } sample.
      →
      Sample1:{0} is sucker than the {1} sample.
    • 0} Ab {1}Cde {  2 }  } f3{4 5ij { {{6kl{  7m8}}nop9 10 } qrst11 }12Uvw{ xyz}13{{14
      →
      {0} Ab {1}Cde {2} f3{4} 5ij {6}kl{7}m{8}nop9 {10} qrst{11}12Uvw{ xyz}13{14}

    Note:

    • { or }: may be repeated, with or without spaces, for example "{", "{ ", "}}", "{  {{ "
    • 0: can be multi-digit, for example 0, 12, 123, etc.
    • abc: any string
    • bcd: any string, which can be empty and can include white space, but doesn't start with 0-9
    • cde: any string, which can be empty and can include white space, but doesn't end with 0-9
    • →: means "change into"

    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com







    Tuesday, January 31, 2012 17:25

Answers

  • Hi,

    Try this:

    string pattern = @"(({\s*)+(?<num>\d+)(\s*})*)|(({\s*)*(?<num>\d+)(\s*})+)";
    Regex r = new Regex(pattern);
    string input = @"0} Ab {1}Cde {  2 }  } f3{4 5ij { {{6kl{  7m8}}nop9 10 } qrst11 }12Uvt{ xyz}13{{14";
    Console.WriteLine(r.Replace(input, @"{${num}}"));

    It works fine.

    Please mark it as the answer if it helps solve your problem.

    • Marked as answer by Andrew Huang, Thursday, March 22, 2012 13:06
    Thursday, March 22, 2012 08:08

All Replies

  • If you could provide more samples, the team may be able to solve the problem faster, but here is my guess that uses two Regular Expressions:

    using System;
    using System.Text.RegularExpressions; 
    
    namespace ConsoleApplication5
    {
        class Program
        {
            /// <summary>
            /// Regex Demo
            /// 2012 - Shawn Eary
            /// 
            /// Uses two Microsoft style Regular Expressions and 
            /// two Microsoft style backreferences to **try** and 
            /// solve Mr. Huang's requirement.
            /// </summary>        
            static void Main(string[] args)
            {
                String regExPattern1 = "{[{ ]+([0-9]+)[^}]";
                String replacementPattern1 = "{$1} ";
                String regExPattern2 = "[^{0-9]([0-9]+)}[} ]+";
                String replacementPattern2 = " {$1} ";
    
                const String someSample = 
                    "{   {  382383 is { { 3 nicer than the 3234243} } } sample.";
                
                String transformation = Regex.Replace(
                    input:someSample, 
                    pattern: regExPattern1,
                    replacement: replacementPattern1
                );
                String transformation2 = Regex.Replace(
                    input: transformation,
                    pattern: regExPattern2,
                    replacement: replacementPattern2
                );
                String finalTransformation = transformation2;
                Console.WriteLine("In:'" + someSample + "' [without quotes]");
                // Print the replacement strings that are actually used above
                Console.WriteLine("RegEx1 is /" + regExPattern1 + "/" + replacementPattern1 + "/"); 
                Console.WriteLine("RegEx2 is /" + regExPattern2 + "/" + replacementPattern2 + "/"); 
                Console.WriteLine("Out:'" + finalTransformation + "' [without quotes]"); 
                Console.WriteLine("");
                Console.WriteLine("Press any key to continue."); 
                Console.ReadKey();
            }
        }
    }
    
    

    Regards,

    Shawn

    Saturday, February 4, 2012 04:38
  • Thanks for your code, Shawn.

    Following your kind suggestion, I added a complex sample that covers all the cases. Please refer to the first post.


    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com
    Saturday, February 4, 2012 19:12
  • Does anybody have any idea?


    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Friday, February 17, 2012 01:57
  • Bumping this.

    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Sunday, February 19, 2012 18:05
  • Mr. Huang:

    It appears that I missed the mark the first time.  Your post is interesting, but this conversation is getting a bit deep for me; nevertheless, I hope to wrap up my part gracefully.

    Have you considered the option of starting with a working Finite State Machine model and then converting it into a Regular Expression?  I think that once you have a working Finite State Machine, there are automated algorithms that can help you create a Regular Expression.  One such algorithm *might* be found here: http://qntm.org/algo

    Below is a picture I have attached that is my RE-*guess* at your requirement; unfortunately, I don't have a lot of free time on my hands so my guess could have some serious bugs in it.  I would like to write demo code for the team here, but I really need to move on to something else as I have been looking at this post for several non-contiguous hours now...

    As fascinating as your post is, I really have other, more pressing matters to attend to.  Hopefully, someone here will be able to pick up where I left off or come up with an even better idea than what I have presented.

    Best of Luck,

    Shawn

    Monday, February 20, 2012 05:29
  • Thanks so much, Shawn.

    Sorry, I'm not familiar with the Finite State Machine model, so I hope you can help me move forward.


    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Saturday, February 25, 2012 04:56
  • The thread is still open. Please share any ideas.

    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Thursday, March 1, 2012 14:50
  • Third bump. The thread is still open. Please help.

    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Thursday, March 8, 2012 16:45
  • Mr. Huang:

    I translated the FSM Graph that I previously posted into C# code and came "pretty close" to achieving the transformation specified in your second sample:

    0} Ab {1}Cde {  2 }  } f3{4 5ij { {{6kl{  7m8}}nop9 10 } qrst11 }12Uvt{ xyz}13{{14
    →
    {0} Ab {1}Cde {2} f3{4} 5ij{6}kl{7}m{8}nop9 {10} qrst{11}12Uvw{ xyz}13{14}

    Note that while I didn't get "exactly" what you are looking for, I came "reasonably" close.  Also note that on 9-MAR-2012 you appeared to have a typo in your original input.  Instead of:

    Input: 0} Ab {1}Cde { 2 } } f3{4 5ij { {{6kl{ 7m8}}nop9 10 } qrst11 }12Uvt{ xyz}13{{14

    You probably meant - Input: 0} Ab {1}Cde { 2 } } f3{4 5ij { {{6kl{ 7m8}}nop9 10 } qrst11 }12Uvw{ xyz}13{{14

    Here is my code:

    using System;
    
    namespace parenParse
    {
    	/// <summary>
    	/// Shawn Eary 2012 (Public Domain) 
    	/// 
    	/// Class to transform a String as specified by Mr. Huang in 
    	/// http://social.msdn.microsoft.com/Forums/en-US/regexp/thread/25fd0cc0-fff2-45ab-a41b-2c85822fcdf6
    	/// (Well almost...)
    	/// 
    	/// Mostly uses the state machine that Shawn Eary posted to the 
    	/// same thread.  One particular error in Mr. Eary's state machine 
    	/// figure is the fact that the start state does not accept the 
    	/// anything else input to wrap back around to itself.  That has 
    	/// been corrected in this C# implementation of the state machine. 
    	/// 
    	/// There are some minor spacing bugs in this bracketTransformer
    	/// but it comes "pretty" close to meeting Mr. Huang's needs.
    	/// Perhaps, when time allows, either Mr. Huang, myself or 
    	/// someone else in this forum can correct those minor bugs. 
    	/// </summary>
    	class bracketTransformer
    	{
    		public bracketTransformer(String iStringToTransform) {
    			this.m_stringToTransform = iStringToTransform; 			
    		}
    
    		public bool EOS()
    		{
    			return (m_currentIndex > (m_stringToTransform.Length - 1)); 
    		}
    
    		public void eatExtraSpacesAndCloseCurlyBraketsState()
    		{
    			if (EOS())
    			{
    				return;
    			}
    
    			char curChar = m_stringToTransform[m_currentIndex];
    			m_currentIndex++;
    			if ((curChar == '}') || (curChar == ' '))
    			{
    				// Do nothing.  Stay in this state eating } and spaces	
    				eatExtraSpacesAndCloseCurlyBraketsState(); 			
    			}
    			else
    			{
    				m_transformation += curChar; 
    				startState(); 
    			}
    		}
    
    		public void openedNumberState() {
    			if (EOS())
    			{
    				m_transformation += '}';
    				return;
    			}
    
    			char curChar = m_stringToTransform[m_currentIndex];
    			m_currentIndex++; 
    			if (Char.IsDigit(curChar))
    			{
    				m_transformation += curChar;
    				openedNumberState(); 
    			}
    			else if ((curChar == '}') || (curChar == ' '))
    			{
    				// [2] - Output }
    				m_transformation += '}';
    				eatExtraSpacesAndCloseCurlyBraketsState();
    			}
    			else
    			{				
    				m_transformation += '}';
    				m_transformation += curChar;
    				startState(); 
    			}
    		}
    
    		public void openState()
    		{
    			// The only way to get into this state is to get a '{'
    			// so go ahead and output that '{'
    			// m_transformation += '{';
    
    			if (EOS())
    			{
    				return;
    			}
    
    			char curChar = m_stringToTransform[m_currentIndex];
    			m_currentIndex++; 
    			if ((curChar == '{') || (curChar == ' '))
    			{
    				// Do nothing.   Stay in this state
    				// This "should" toss extra { and spaces
    				openState(); 
    			}
    			else if (Char.IsDigit(curChar))
    			{
    				m_transformation += curChar; 
    				openedNumberState();
    			}
    			else
    			{
    				m_transformation += curChar; 
    				startState(); 
    			}
    		}
    
    		public void closeUnopenedNumberState()
    		{
    			if (EOS())
    			{
    				return;
    			}
    
    			char curChar = m_stringToTransform[m_currentIndex];
    			m_currentIndex++; 
    			if (curChar == '{')
    			{				
    				openState();
    			}
    			else if ((curChar == ' ') || (curChar == '}'))
    			{
    				// Stay in this state.  Don't do anything 
    				closeUnopenedNumberState(); 
    			}
    			else
    			{
    				m_transformation += curChar; 
    				startState(); 
    			}
    		}
    
    		public void unopenedNumberWithTrailingSpacesState()
    		{
    			if (EOS())
    			{
    				return;
    			}
    
    			char curChar = m_stringToTransform[m_currentIndex];
    
    			if (curChar == '}')
    			{
    				m_transformation =
    					m_transformation.Insert(m_openBracketPos, "{");
    				m_transformation += curChar;
    				m_currentIndex++;
    				m_openBracketPos = 0; 
    				closeUnopenedNumberState();
    			}
    			else if (curChar == ' ')
    			{
    				// Do Nothing.  Stay in this state
    				m_currentIndex++;
    				unopenedNumberWithTrailingSpacesState(); 
    			}
    			else
    			{
                    // Don't process alternate AnythingElse characters
    				// just go to the start state and let the start 
    				// state handle it 
    				startState(); 
    			}
    		}
    
    		public void unopenedNumberState()
    		{
    			if (EOS())
    			{
    				return;
    			}
    
    			char curChar = m_stringToTransform[m_currentIndex];
    			m_transformation += curChar;
    			m_currentIndex++; 
    			if (Char.IsDigit(curChar))
    			{
    				// Don't do anything.  Stay in this state
    				unopenedNumberState(); 
    			} else if (curChar == '{') {
    				openState();
    			}
    			else if (curChar == '}')
    			{
    				m_transformation =
    					m_transformation.Insert(m_openBracketPos, "{");
    				m_openBracketPos = 0; 
    				closeUnopenedNumberState();
    			}
    			else if (curChar == ' ')
    			{
    				unopenedNumberWithTrailingSpacesState();
    			}
    			else
    			{
    				startState();
    			}
    		}
    
    		public void startState() {
    			if (EOS()) {
    				return; 
    			}
    
    			char curChar = m_stringToTransform[m_currentIndex];
    			m_transformation += curChar; 
    			if (curChar == '{')
    			{
    				m_currentIndex++;
    				openState();
    			}
    			else if (Char.IsDigit(curChar))
    			{
    				if (m_transformation.Length < 1)
    				{
    					m_openBracketPos = 0;
    				}
    				else
    				{
    					m_openBracketPos = (m_transformation.Length - 1);
    				}				
    				m_currentIndex++;
    				unopenedNumberState();
    			} else { 	
                    // I forgot to include this edge in my State Machine
                    // graph but it is here now		
    				m_currentIndex++; 
    				startState(); 
    			}
    		}
    
    		public String getTransformation() {
    			m_transformation = ""; 
    			m_currentIndex = 0; 
    			m_openBracketPos = 0; 
    			startState(); 
    			return m_transformation; 
    		}
    		
    		protected readonly String m_stringToTransform; 
    		protected String m_transformation; 
    		protected int m_currentIndex;
    		protected int m_openBracketPos; 
    	}
    
    	class Program
    	{
    		static void Main(string[] args)
    		{
    			// Notice how easy it is to put an input string into the 
    			// bracket transformer class and get an output pattern.
                // This is even easier than using the RegEx library for
    			// this specialized case
    			const String inputPattern = 
    				"0} Ab {1}Cde {  2 }  } f3{4 5ij { {{6kl{  7m8}}nop9 10 } qrst11 }12Uvt{ xyz}13{{14";
    			bracketTransformer myBT = 
    				new bracketTransformer(inputPattern); 
    			String outputPattern = myBT.getTransformation(); 
    
    
    			// Show the input and the output
    			Console.WriteLine("input : *" + inputPattern + "*");
    			Console.WriteLine("output: *" + outputPattern + "*"); 
    			Console.WriteLine("Press a key to continue.");
    			Console.ReadKey();
    		}
    	}
    }
    

    And here are my *pretty* close results:

    input : *0} Ab {1}Cde {  2 }  } f3{4 5ij { {{6kl{  7m8}}nop9 10 } qrst11 }12Uvt{ xyz}13{{14*
    output: *{0}Ab {1}Cde {2}f3{4}5ij {6}kl{7}m{8}nop9 {10 }qrst{11 }12Uvt{xyz}13{14}*
    Press a key to continue.
    k

    It's not *exactly* what you are looking for but it isn't too bad and perhaps one of us can fix the minor bugs when time permits.

    Regards,

    Shawn
    (I'm losing interest in this discussion - I would rather be writing video games :-)   )

    Saturday, March 10, 2012 05:42
  • Thanks a lot, Shawn.

    First, you are right about "t" and "w". I'll correct it soon.

    It's almost working, except that some white spaces need to be removed (BOLD):

    output: *{0}Ab {1}Cde {2}f3{4}5ij {6}kl{7}m{8}nop9 {10 }qrst{11 }12Uvt{xyz}13{14}*

    Would you help me take it further?


    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Tuesday, March 13, 2012 23:22
  • Try:

    string pattern = @"[{}]*(?<num>\d+)[{}]*";
    Regex r = new Regex(pattern);
    string input = @"abc{0bcd";
    r.Replace(input, @"{$<num>}");


    Please mark it as the answer if it helps solve your problem.

    Tuesday, March 20, 2012 03:32
  • Thanks, RudeFledgling.

    It doesn't work. Would you explain the "$<num>" in your last statement? The output is: abc{$<num>}bcd


    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Tuesday, March 20, 2012 18:03
  • Hi, sorry. I wrote the wrong symbol.

    Here is the correct one:

    string pattern = @"[{}\s]*(?<num>\d+)[{}\s]*";
    Regex r = new Regex(pattern);
    string input = @"{ 0 is sucker than the 1}  } ";
    Console.WriteLine(r.Replace(input, @"{${num}}"));

    "${num}" substitutes the text captured by the group named "num". I also added "\s" to consume the white space between the digits and the "{" or "}".

    It works fine. I have tested it with the string you provided.
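
    As a minimal sketch of the difference between the two replacement strings: in .NET replacement patterns, "${num}" refers to the named group, while "$<num>" is not a recognized substitution and is copied through literally, which is consistent with the output reported above. The class and method names here are illustrative only and do not come from the thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SubstitutionDemo
{
    // The named group "num" captures a run of digits.
    private static readonly Regex Digits = new Regex(@"(?<num>\d+)");

    // "${num}" is the substitution for the named group "num";
    // the outer braces are literal characters in the replacement.
    public static string WithNamedGroup(string input)
    {
        return Digits.Replace(input, "{${num}}");
    }

    // "$<num>" is not a substitution, so it is emitted as-is.
    public static string WithWrongSyntax(string input)
    {
        return Digits.Replace(input, "{$<num>}");
    }

    public static void Main()
    {
        Console.WriteLine(WithNamedGroup("abc0bcd"));   // abc{0}bcd
        Console.WriteLine(WithWrongSyntax("abc0bcd"));  // abc{$<num>}bcd
    }
}
```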

    Please mark it as the answer if it helps solve your problem.


    • Edited by Phape, Wednesday, March 21, 2012 02:33
    Wednesday, March 21, 2012 02:29
  • Hi RudeFledgling,

    It's almost working, but some results don't fit my requirement.

    Please consider the simple example here:

    Input: { 0 is over 2 times sucker than the1}  }

    Output should be: {0} is over 2 times sucker than the{1}

    Your output: {0}is over{2}times sucker than the{1}

    Problems:

    1. Numbers that are not attached to { or } must be kept as they are.
    2. White space outside the { and/or } must be kept.
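
    Both problems can be reproduced in isolation with the simple example above. This is only a sketch (the class and method names are made up for illustration): because both character classes in the pattern are optional, a bare number such as 2 still matches, and the white space around every number is consumed together with the braces.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LoosePatternDemo
{
    // Both brace/space runs use "*", so neither side requires a brace.
    // A bare number therefore matches too, and adjacent spaces are
    // swallowed into the match and lost in the replacement.
    private static readonly Regex Loose =
        new Regex(@"[{}\s]*(?<num>\d+)[{}\s]*");

    public static string Apply(string input)
    {
        return Loose.Replace(input, "{${num}}");
    }

    public static void Main()
    {
        Console.WriteLine(Apply("{ 0 is over 2 times sucker than the1}  }"));
        // {0}is over{2}times sucker than the{1}
    }
}
```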

    For my sample:

    0} Ab {1}Cde {  2 }  } f3{4 5ij { {{6kl{  7m8}}nop9 10 } qrst11 }12Uvt{ xyz}13{{14

    Your code output:

    {0}Ab{1}Cde{2}f{3}{4}{5}ij{6}kl{7}m{8}nop{9}{10}qrst{11}{12}Uvt{ xyz{13}{14}

    It doesn't match the answer:

    {0} Ab {1}Cde {2} f3{4} 5ij {6}kl{7}m{8}nop9 {10} qrst{11}12Uvw{ xyz}13{14}


    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Wednesday, March 21, 2012 13:42
  • Hi,

    Try this:

    string pattern = @"(({\s*)+(?<num>\d+)(\s*})*)|(({\s*)*(?<num>\d+)(\s*})+)";
    Regex r = new Regex(pattern);
    string input = @"0} Ab {1}Cde {  2 }  } f3{4 5ij { {{6kl{  7m8}}nop9 10 } qrst11 }12Uvt{ xyz}13{{14";
    Console.WriteLine(r.Replace(input, @"{${num}}"));

    It works fine.

    Please mark it as the answer if it helps solve your problem.

    • Marked as answer by Andrew Huang, Thursday, March 22, 2012 13:06
    Thursday, March 22, 2012 08:08
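
    For anyone who wants to check the pattern above against the samples in this thread, here is a small sketch. The wrapper class and method names are assumptions made for illustration; only the pattern and the replacement string come from the post. Each alternative requires at least one brace on one side of the number ("+") and makes the other side optional ("*"), and "\s*" appears only inside the brace groups, so bare numbers and white space outside the braces are left alone.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BraceNormalizer
{
    // Alternative 1: one or more "{" before the number, optional "}" after.
    // Alternative 2: optional "{" before the number, one or more "}" after.
    private static readonly Regex Placeholder = new Regex(
        @"(({\s*)+(?<num>\d+)(\s*})*)|(({\s*)*(?<num>\d+)(\s*})+)");

    public static string Normalize(string input)
    {
        return Placeholder.Replace(input, "{${num}}");
    }

    public static void Main()
    {
        string[] samples =
        {
            "abc{0bcd",
            "cde0}abc",
            "Sample1:{ 0 is sucker than the 1 }  } sample.",
            "{ 0 is over 2 times sucker than the1}  }"
        };

        foreach (string sample in samples)
        {
            Console.WriteLine(Normalize(sample));
        }
        // abc{0}bcd
        // cde{0}abc
        // Sample1:{0} is sucker than the {1} sample.
        // {0} is over 2 times sucker than the{1}
    }
}
```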
  • FINALLY! Thanks a lot, RudeFledgling.

    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Thursday, March 22, 2012 13:07
  • Hi,

    Did you get a solution for the regex?

    Thursday, April 12, 2012 09:59
  • Sure, just above.

    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Friday, April 13, 2012 03:02
  • Hi,

    This might help...

    Thanks, Phape has already solved it; please refer to his solution.

    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Monday, April 16, 2012 00:53