locked
Ask for a regular expression

    Question

  • Hi all,

    Could you give me a regex for the following requirement?

    • abc{0bcd → abc{0}bcd
    • cde0}abc → cde{0}abc

    For example:

    • Sample1:{ 0 is sucker than the 1 }  } sample.
      →
      Sample1:{0} is sucker than the {1} sample.
    • 0} Ab {1}Cde {  2 }  } f3{4 5ij { {{6kl{  7m8}}nop9 10 } qrst11 }12Uvw{ xyz}13{{14
      →
      {0} Ab {1}Cde {2} f3{4} 5ij {6}kl{7}m{8}nop9 {10} qrst{11}12Uvw{ xyz}13{14}

    Notes:

    • { or }: may be repeated, with spaces in between, for example "{", "{ ", "}}", "{  {{ "
    • 0: may have multiple digits, for example 0, 12, 123
    • abc: any string
    • bcd: any string, possibly empty and possibly containing white space, that does not start with 0-9
    • cde: any string, possibly empty and possibly containing white space, that does not end with 0-9
    • →: means "changes into"

    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com







    Tuesday, January 31, 2012 17:25

Answers

  • Hi,

    Try this:

    string pattern = @"(({\s*)+(?<num>\d+)(\s*})*)|(({\s*)*(?<num>\d+)(\s*})+)";
    Regex r = new Regex(pattern);
    string input = @"0} Ab {1}Cde {  2 }  } f3{4 5ij { {{6kl{  7m8}}nop9 10 } qrst11 }12Uvt{ xyz}13{{14";
    Console.WriteLine(r.Replace(input, @"{${num}}"));

    It works fine.

    Please mark it as the answer if it helps solve your problem.

    • Marked as answer by Andrew Huang, Thursday, March 22, 2012 13:06
    Thursday, March 22, 2012 8:08

All replies

  • If you could provide more samples, the team may be able to solve the problem faster, but here is my guess that uses two Regular Expressions:

    using System;
    using System.Text.RegularExpressions; 
    
    namespace ConsoleApplication5
    {
        class Program
        {
            /// <summary>
            /// Regex Demo
            /// 2012 - Shawn Eary
            /// 
            /// Uses two Microsoft-style regular expressions and 
            /// two Microsoft-style backreferences to **try** to 
            /// solve Mr. Huang's requirement.
            /// </summary>        
            static void Main(string[] args)
            {
                String regExPattern1 = "{[{ ]+([0-9]+)[^}]";
                String replacementPattern1 = "{$1} ";
                String regExPattern2 = "[^{0-9]([0-9]+)}[} ]+";
                String replacementPattern2 = " {$1} ";
    
                const String someSample = 
                    "{   {  382383 is { { 3 nicer than the 3234243} } } sample.";
                
                String transformation = Regex.Replace(
                    input:someSample, 
                    pattern: regExPattern1,
                    replacement: replacementPattern1
                );
                String transformation2 = Regex.Replace(
                    input: transformation,
                    pattern: regExPattern2,
                    replacement: replacementPattern2
                );
                String finalTransformation = transformation2;
                Console.WriteLine("In:'" + someSample + "' [without quotes]");
                // Print the replacement strings that are actually used above.
                Console.WriteLine("RegEx1 is /" + regExPattern1 + "/" + replacementPattern1 + "/"); 
                Console.WriteLine("RegEx2 is /" + regExPattern2 + "/" + replacementPattern2 + "/"); 
                Console.WriteLine("Out:'" + finalTransformation + "' [without quotes]"); 
                Console.WriteLine("");
                Console.WriteLine("Press any key to continue."); 
                Console.ReadKey();
            }
        }
    }
    
    

    Regards,

    Shawn

    Saturday, February 4, 2012 4:38
  • Thanks for the code, Shawn.

    Following your kind suggestion, I added a complex sample that covers all the cases. Please refer to the first post.


    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com
    Saturday, February 4, 2012 19:12
  • Does anybody have any ideas?


    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Friday, February 17, 2012 1:57
  • Just bumping this.

    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Sunday, February 19, 2012 18:05
  • Mr. Huang:

    It appears that I missed the mark the first time.  Your post is interesting, but this conversation is getting a bit deep for me; nevertheless, I hope to wrap up my part gracefully.

    Have you considered starting with a working Finite State Machine model and then converting it into a Regular Expression?  I think that once you have a working Finite State Machine, there are automated algorithms that can help you create a Regular Expression.  One such algorithm *might* be found here: http://qntm.org/algo

    Below is a picture I have attached that is my RE-*guess* at your requirement; unfortunately, I don't have a lot of free time on my hands, so my guess could have some serious bugs in it.  I would like to write demo code for the team here, but I really need to move on to something else, as I have been looking at this post for several non-contiguous hours now...

    As fascinating as your post is, I really have other, more pressing matters to attend to.   Hopefully, someone here will be able to fill in the gap where I left off or come up with an even better idea than what I have presented.

    Best of Luck,

    Shawn

    Monday, February 20, 2012 5:29
  • Thanks so much, Shawn.

    Sorry, I am not familiar with the Finite State Machine model, so I hope you can help move this forward.


    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Saturday, February 25, 2012 4:56
  • The thread is still open. Please share any ideas.

    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Thursday, March 1, 2012 14:50
  • Third bump. This is still open. Please help.

    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Thursday, March 8, 2012 16:45
  • Mr. Huang:

    I translated the FSM graph that I previously posted into C# code and came "pretty close" to achieving the transformation specified in your second sample:

    0} Ab {1}Cde {  2 }  } f3{4 5ij { {{6kl{  7m8}}nop9 10 } qrst11 }12Uvt{ xyz}13{{14
    →
    {0} Ab {1}Cde {2} f3{4} 5ij {6}kl{7}m{8}nop9 {10} qrst{11}12Uvw{ xyz}13{14}

    Note that while I didn't get "exactly" what you are looking for, I came "reasonably" close.  Also note that, as of 9-MAR-2012, your original input appeared to contain a typo.  Instead of:

    Input: 0} Ab {1}Cde { 2 } } f3{4 5ij { {{6kl{ 7m8}}nop9 10 } qrst11 }12Uvt{ xyz}13{{14

    You probably meant - Input: 0} Ab {1}Cde { 2 } } f3{4 5ij { {{6kl{ 7m8}}nop9 10 } qrst11 }12Uvw{ xyz}13{{14

    Here is my code:

    using System;
    
    namespace parenParse
    {
    	/// <summary>
    	/// Shawn Eary 2012 (Public Domain) 
    	/// 
    	/// Class to transform a String as specified by Mr. Huang in 
    	/// http://social.msdn.microsoft.com/Forums/en-US/regexp/thread/25fd0cc0-fff2-45ab-a41b-2c85822fcdf6
    	/// (Well almost...)
    	/// 
    	/// Mostly uses the state machine that Shawn Eary posted to the 
    	/// same thread.  One particular error in Mr. Eary's state machine 
    	/// figure is that the start state does not accept the 
    	/// "anything else" input to wrap back around to itself.  That has 
    	/// been corrected in this C# implementation of the state machine. 
    	/// 
    	/// There are some minor spacing bugs in this bracketTransformer
    	/// but it comes "pretty" close to meeting Mr. Huang's needs.
    	/// Perhaps, when time allows, either Mr. Huang, I, or 
    	/// someone else in this forum can correct those minor bugs. 
    	/// </summary>
    	class bracketTransformer
    	{
    		public bracketTransformer(String iStringToTransform) {
    			this.m_stringToTransform = iStringToTransform; 			
    		}
    
    		public bool EOS()
    		{
    			return (m_currentIndex > (m_stringToTransform.Length - 1)); 
    		}
    
    		public void eatExtraSpacesAndCloseCurlyBraketsState()
    		{
    			if (EOS())
    			{
    				return;
    			}
    
    			char curChar = m_stringToTransform[m_currentIndex];
    			m_currentIndex++;
    			if ((curChar == '}') || (curChar == ' '))
    			{
    				// Do nothing.  Stay in this state eating } and spaces	
    				eatExtraSpacesAndCloseCurlyBraketsState(); 			
    			}
    			else
    			{
    				m_transformation += curChar; 
    				startState(); 
    			}
    		}
    
    		public void openedNumberState() {
    			if (EOS())
    			{
    				m_transformation += '}';
    				return;
    			}
    
    			char curChar = m_stringToTransform[m_currentIndex];
    			m_currentIndex++; 
    			if (Char.IsDigit(curChar))
    			{
    				m_transformation += curChar;
    				openedNumberState(); 
    			}
    			else if ((curChar == '}') || (curChar == ' '))
    			{
    				// [2] - Output }
    				m_transformation += '}';
    				eatExtraSpacesAndCloseCurlyBraketsState();
    			}
    			else
    			{				
    				m_transformation += '}';
    				m_transformation += curChar;
    				startState(); 
    			}
    		}
    
    		public void openState()
    		{
    			// The only way to get into this state is to get a '{'
    			// so go ahead and output that '{'
    			// m_transformation += '{';
    
    			if (EOS())
    			{
    				return;
    			}
    
    			char curChar = m_stringToTransform[m_currentIndex];
    			m_currentIndex++; 
    			if ((curChar == '{') || (curChar == ' '))
    			{
    				// Do nothing.   Stay in this state
    				// This "should" toss extra { and spaces
    				openState(); 
    			}
    			else if (Char.IsDigit(curChar))
    			{
    				m_transformation += curChar; 
    				openedNumberState();
    			}
    			else
    			{
    				m_transformation += curChar; 
    				startState(); 
    			}
    		}
    
    		public void closeUnopenedNumberState()
    		{
    			if (EOS())
    			{
    				return;
    			}
    
    			char curChar = m_stringToTransform[m_currentIndex];
    			m_currentIndex++; 
    			if (curChar == '{')
    			{				
    				openState();
    			}
    			else if ((curChar == ' ') || (curChar == '}'))
    			{
    				// Stay in this state.  Don't do anything 
    				closeUnopenedNumberState(); 
    			}
    			else
    			{
    				m_transformation += curChar; 
    				startState(); 
    			}
    		}
    
    		public void unopenedNumberWithTrailingSpacesState()
    		{
    			if (EOS())
    			{
    				return;
    			}
    
    			char curChar = m_stringToTransform[m_currentIndex];
    
    			if (curChar == '}')
    			{
    				m_transformation =
    					m_transformation.Insert(m_openBracketPos, "{");
    				m_transformation += curChar;
    				m_currentIndex++;
    				m_openBracketPos = 0; 
    				closeUnopenedNumberState();
    			}
    			else if (curChar == ' ')
    			{
    				// Do Nothing.  Stay in this state
    				m_currentIndex++;
    				unopenedNumberWithTrailingSpacesState(); 
    			}
    			else
    			{
    				// Don't process alternate AnythingElse characters
    				// just go to the start state and let the start 
    				// state handle it 
    				startState(); 
    			}
    		}
    
    		public void unopenedNumberState()
    		{
    			if (EOS())
    			{
    				return;
    			}
    
    			char curChar = m_stringToTransform[m_currentIndex];
    			m_transformation += curChar;
    			m_currentIndex++; 
    			if (Char.IsDigit(curChar))
    			{
    				// Don't do anything.  Stay in this state
    				unopenedNumberState(); 
    			} else if (curChar == '{') {
    				openState();
    			}
    			else if (curChar == '}')
    			{
    				m_transformation =
    					m_transformation.Insert(m_openBracketPos, "{");
    				m_openBracketPos = 0; 
    				closeUnopenedNumberState();
    			}
    			else if (curChar == ' ')
    			{
    				unopenedNumberWithTrailingSpacesState();
    			}
    			else
    			{
    				startState();
    			}
    		}
    
    		public void startState() {
    			if (EOS()) {
    				return; 
    			}
    
    			char curChar = m_stringToTransform[m_currentIndex];
    			m_transformation += curChar; 
    			if (curChar == '{')
    			{
    				m_currentIndex++;
    				openState();
    			}
    			else if (Char.IsDigit(curChar))
    			{
    				if (m_transformation.Length < 1)
    				{
    					m_openBracketPos = 0;
    				}
    				else
    				{
    					m_openBracketPos = (m_transformation.Length - 1);
    				}				
    				m_currentIndex++;
    				unopenedNumberState();
    			} else { 	
    				// I forgot to include this edge in my State Machine
    				// graph but it is here now
    				m_currentIndex++; 
    				startState(); 
    			}
    		}
    
    		public String getTransformation() {
    			m_transformation = ""; 
    			m_currentIndex = 0; 
    			m_openBracketPos = 0; 
    			startState(); 
    			return m_transformation; 
    		}
    		
    		protected readonly String m_stringToTransform; 
    		protected String m_transformation; 
    		protected int m_currentIndex;
    		protected int m_openBracketPos; 
    	}
    
    	class Program
    	{
    		static void Main(string[] args)
    		{
    			// Notice how easy it is to put an input string into the 
    			// bracket transformer class and get an output pattern.
    			// This is even easier than using the RegEx library for
    			// this specialized case
    			const String inputPattern = 
    				"0} Ab {1}Cde {  2 }  } f3{4 5ij { {{6kl{  7m8}}nop9 10 } qrst11 }12Uvt{ xyz}13{{14";
    			bracketTransformer myBT = 
    				new bracketTransformer(inputPattern); 
    			String outputPattern = myBT.getTransformation(); 
    
    
    			// Show the input and the output
    			Console.WriteLine("input : *" + inputPattern + "*");
    			Console.WriteLine("output: *" + outputPattern + "*"); 
    			Console.WriteLine("Press a key to continue.");
    			Console.ReadKey();
    		}
    	}
    }
    

    And here are my *pretty* close results:

    input : *0} Ab {1}Cde {  2 }  } f3{4 5ij { {{6kl{  7m8}}nop9 10 } qrst11 }12Uvt{ xyz}13{{14*
    output: *{0}Ab {1}Cde {2}f3{4}5ij {6}kl{7}m{8}nop9 {10 }qrst{11 }12Uvt{xyz}13{14}*
    Press a key to continue.
    k

    It's not *exactly* what you are looking for but it isn't too bad and perhaps one of us can fix the minor bugs when time permits.
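
    One possible way to remove the remaining spacing differences is to replace the recursive states with a single left-to-right scan. The sketch below is not the state machine posted above; it is a hypothetical alternative (the names `BraceScanner` and `Normalize` are made up for this example). At each position it tries to read optional `{` runs, a number, and optional `}` runs, and it rewrites the span only when at least one brace was attached to the number. It assumes that white space directly after a `{` or directly before a `}` belongs to the brace and that everything else is copied unchanged.

```csharp
using System;
using System.Text;

public static class BraceScanner
{
    public static string Normalize(string s)
    {
        var result = new StringBuilder();
        int i = 0;
        while (i < s.Length)
        {
            int j = i;
            bool opened = false;

            // Skip any run of '{', each optionally followed by white space.
            while (j < s.Length && s[j] == '{')
            {
                opened = true;
                j++;
                while (j < s.Length && char.IsWhiteSpace(s[j])) j++;
            }

            // Read the number, if there is one.
            int digitsStart = j;
            while (j < s.Length && char.IsDigit(s[j])) j++;
            int digitCount = j - digitsStart;

            if (digitCount > 0)
            {
                // Skip any run of '}', each optionally preceded by white space.
                bool closed = false;
                int end = j;
                while (true)
                {
                    int k = end;
                    while (k < s.Length && char.IsWhiteSpace(s[k])) k++;
                    if (k < s.Length && s[k] == '}')
                    {
                        closed = true;
                        end = k + 1;
                    }
                    else
                    {
                        break;
                    }
                }

                // Rewrite only when a brace was attached on at least one side.
                if (opened || closed)
                {
                    result.Append('{').Append(s, digitsStart, digitCount).Append('}');
                    i = end;
                    continue;
                }
            }

            // No braced number starts here: copy one character unchanged.
            result.Append(s[i]);
            i++;
        }
        return result.ToString();
    }
}
```

    Under these assumptions, `BraceScanner.Normalize("abc{0bcd")` gives `abc{0}bcd` and `BraceScanner.Normalize("cde0}abc")` gives `cde{0}abc`, while a number with no brace on either side is left alone.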

    Regards,

    Shawn
    (I'm losing interest in this discussion - I would rather be writing video games :-)   )

    Saturday, March 10, 2012 5:42
  • Thanks a lot, Shawn.

    First, you are right about the "t" and "w". I'll correct it soon.

    It's almost working, except that some white space needs to be removed (in bold):

    output: *{0}Ab {1}Cde {2}f3{4}5ij {6}kl{7}m{8}nop9 {10 }qrst{11 }12Uvt{xyz}13{14}*

    Would you help me take it further?


    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Tuesday, March 13, 2012 23:22
  • Try:

    string pattern = @"[{}]*(?<num>\d+)[{}]*";
    Regex r = new Regex(pattern);
    string input = @"abc{0bcd";
    r.Replace(input, @"{$<num>}");


    Please mark it as the answer if it helps solve your problem.

    Tuesday, March 20, 2012 3:32
  • Thanks, RudeFledgling.

    It doesn't work. Could you explain the "$<num>" in your last statement? The output is: abc{$<num>}bcd


    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Tuesday, March 20, 2012 18:03
  • Hi, sorry. I wrote the wrong symbol.

    Here is the correct one:

    string pattern = @"[{}\s]*(?<num>\d+)[{}\s]*";
    Regex r = new Regex(pattern);
    string input = @"{ 0 is sucker than the 1}  } ";
    Console.WriteLine(r.Replace(input, @"{${num}}"));

    "${num}" substitutes the string captured by the group named num. I also added "\s" to match the white space between the digits and the "{" or "}".
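
    To illustrate the difference between the two replacement strings used in this thread, here is a minimal sketch. The class and method names (`SubstitutionDemo`, `WithBraceSyntax`, `WithAngleSyntax`) are hypothetical and exist only for this example. In a .NET replacement string, `${num}` refers to the named group, while `$<num>` is not a substitution and is copied literally, which is consistent with the output reported earlier.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SubstitutionDemo
{
    // First pattern posted in the thread: optional braces around a number.
    private const string Pattern = @"[{}]*(?<num>\d+)[{}]*";

    // "${num}" inserts the text captured by the group named "num".
    public static string WithBraceSyntax(string input)
    {
        return Regex.Replace(input, Pattern, "{${num}}");
    }

    // "$<num>" is not a .NET substitution, so it is emitted as literal text.
    public static string WithAngleSyntax(string input)
    {
        return Regex.Replace(input, Pattern, "{$<num>}");
    }
}
```

    For the input `abc{0bcd`, `WithBraceSyntax` returns `abc{0}bcd`, whereas `WithAngleSyntax` returns `abc{$<num>}bcd`.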

    It works fine. I have tested it with the string you provided.

    Please mark it as the answer if it helps solve your problem.


    • Edited by Phape, Wednesday, March 21, 2012 2:33
    Wednesday, March 21, 2012 2:29
  • Hi RudeFledgling,

    It's almost working, but some results don't fit my requirement.

    Please consider the simple example here:

    Input: { 0 is over 2 times sucker than the1}  }

    Output should be: {0} is over 2 times sucker than the{1}

    Your output: {0}is over{2}times sucker than the{1}

    Problems:

    1. Numbers that are not attached to { or } should be kept unchanged.
    2. White space outside { and/or } should be kept.

    For my sample:

    0} Ab {1}Cde {  2 }  } f3{4 5ij { {{6kl{  7m8}}nop9 10 } qrst11 }12Uvt{ xyz}13{{14

    Your code output:

    {0}Ab{1}Cde{2}f{3}{4}{5}ij{6}kl{7}m{8}nop{9}{10}qrst{11}{12}Uvt{ xyz{13}{14}

    It doesn't match the expected output:

    {0} Ab {1}Cde {2} f3{4} 5ij {6}kl{7}m{8}nop9 {10} qrst{11}12Uvw{ xyz}13{14}


    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Wednesday, March 21, 2012 13:42
  • Hi,

    Try this:

    string pattern = @"(({\s*)+(?<num>\d+)(\s*})*)|(({\s*)*(?<num>\d+)(\s*})+)";
    Regex r = new Regex(pattern);
    string input = @"0} Ab {1}Cde {  2 }  } f3{4 5ij { {{6kl{  7m8}}nop9 10 } qrst11 }12Uvt{ xyz}13{{14";
    Console.WriteLine(r.Replace(input, @"{${num}}"));

    It works fine.
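
    A brief sketch of why this pattern behaves as required, assuming the .NET regex engine: both alternatives capture the digits into the same named group `num` (.NET allows a group name to be reused within a pattern). The first alternative demands at least one `{` before the number and the second demands at least one `}` after it, so a number with no brace on either side is never matched. The wrapper class `BraceNormalizer` and its `Normalize` method are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BraceNormalizer
{
    // Alternative 1: one or more "{" (each with optional trailing white space),
    //                the number, then zero or more "}" (each with optional
    //                leading white space).
    // Alternative 2: zero or more "{", the number, then one or more "}".
    private static readonly Regex Pattern = new Regex(
        @"(({\s*)+(?<num>\d+)(\s*})*)|(({\s*)*(?<num>\d+)(\s*})+)");

    // Replaces every matched span with the number wrapped in a single brace pair.
    public static string Normalize(string input)
    {
        return Pattern.Replace(input, "{${num}}");
    }
}
```

    With this wrapper, `BraceNormalizer.Normalize("{ 0 is over 2 times sucker than the1}  }")` yields `{0} is over 2 times sucker than the{1}`: the unbraced `2` and the white space around it are left untouched.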

    Please mark it as the answer if it helps solve your problem.

    • Marked as answer by Andrew Huang, Thursday, March 22, 2012 13:06
    Thursday, March 22, 2012 8:08
  • FINALLY! Thanks a lot, RudeFledgling.

    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Thursday, March 22, 2012 13:07
  • Hi,

    Did you get a solution for the regex?

    Thursday, April 12, 2012 9:59
  • Sure, just above.

    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Friday, April 13, 2012 3:02
  • Hi,

    This might help...

    Thanks, it has already been solved by Phape; please refer to his solution.

    Thanks,

    Andrew Huang

    andrew.huang.2009@gmail.com

    Monday, April 16, 2012 0:53