A few areas not clear when using regex

  • Question

  • I got the code below to sort quarters:

    var list = new List<String>
        {
            "1Q 2014A",
            "2Q 2014A",
            "2012FY",
            "1Q 2013A",
            "2Q 2013A",
            "4Q 2014A",
            "4Q 2013A",
            "2013 FYA",
            "2011FY",
            "3Q 2014A",
            "3Q 2013A",
            "2013FY",
            "2010FY",
            "2014 FYA",
        };
    
    var result =
        list
            .OrderBy( s => Regex.Match( s, @"\d\d\d\d" ).Value )
            .ThenBy( s => Regex.Match( s, @"^.Q" ).Value + '5' )
            .ThenBy( s => !Regex.Match( s, @"FYA$" ).Success )
            .ThenBy( s => !Regex.Match( s, @"FY$" ).Success )
            .ToList( );

    1) Why are four \d used instead of one?

    2) Why is a dot used after the ^ symbol?

    3) Why is the not sign (!) used in !Regex.Match( s, @"FYA$" ).Success?
    If I did not use the not sign, what would happen, and what would the code mean then?

    Please answer my questions above. Thanks.

    Wednesday, October 23, 2019 3:16 PM

Answers

  • 1. Because he wanted to sort these in order by year.  If you put \d, it would match the first digit in each item.  By saying \d\d\d\d, it will only match a series of 4 digits.

    2. He wants the secondary sort key to be the quarter number.  "Dot" matches any character, so this will match 1Q, 2Q, 3Q, or 4Q starting at the beginning of the string.  It will also match XQ and QQ, but I'm guessing he knew what the format of the data would be.

    3. There is no "not" sign here, so I assume you meant the "at" sign: @.  It's just a good habit to use @ with your regular expressions, because you so often need to enter backslashes, and it can be a source of confusion.  The only case in THIS code where the "at" sign makes a difference is the first one, but by using the @ every time, you can add backslash escapes later without having to remember the @.
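
    Points 1 and 2 can be checked with a short sketch. It is only an illustration, written with top-level statements (which assumes C# 9 or later), and the variable names are made up for this example. The two sample strings are taken from the list in the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample items taken from the question's list.
string quarter = "1Q 2014A";
string fiscalYear = "2012FY";

// A single \d stops at the first digit it finds, which for a quarter
// item is the quarter number rather than the year.
string oneDigit = Regex.Match(quarter, @"\d").Value;            // "1"

// Four \d in a row cannot match at the lone quarter digit, so the
// first match is the four-digit year.
string year = Regex.Match(quarter, @"\d\d\d\d").Value;          // "2014"

// ^.Q matches one arbitrary character followed by Q at the start of the string.
string quarterKey = Regex.Match(quarter, @"^.Q").Value;         // "1Q"

// Items without a quarter prefix do not match, so Value is the empty string.
string noQuarterKey = Regex.Match(fiscalYear, @"^.Q").Value;    // ""

// prints: 1 | 2014 | 1Q | ''
Console.WriteLine($"{oneDigit} | {year} | {quarterKey} | '{noQuarterKey}'");
```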


    Tim Roberts | Driver MVP Emeritus | Providenza & Boekelheide, Inc.

    • Marked as answer by Sudip_inn Friday, October 25, 2019 7:15 AM
    Wednesday, October 23, 2019 10:32 PM
  • Hi Sudip_inn,


    Thank you for posting here.

    From your question, you seem to have some doubts about these regular expressions.

    Tim's answer is good, but to help you understand better, I found some explanations in the Microsoft documentation.

    1. \d : Matches any decimal digit. Your code has four \d in a row, which matches four consecutive decimal digits; you could replace them with \d{4}.

    2. ^ : By default, the match must start at the beginning of the string; in multiline mode, it must start at the beginning of the line.
       . : Wildcard: matches any single character except \n.

    So "^.Q" matches a string whose second character is Q, such as 1Q, QQ, or sQ. The match is case-sensitive by default, so 1q does not match.

    3. I guess you mean the $ sign.

    $ : By default, the match must occur at the end of the string or before \n at the end of the string; in multiline mode, it must occur before the end of the line or before \n at the end of the line.

      For example, “FYA$” can match “2014FYA” but does not match “FYA2014”.
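
    The anchor and quantifier behavior described above can be verified with a small sketch. It is an illustration only, written with top-level statements (which assumes C# 9 or later); the variable names are hypothetical, and "1q 2014A" is a made-up lowercase variant of an item from the question.

```csharp
using System;
using System.Text.RegularExpressions;

// \d{4} is shorthand for \d\d\d\d, so both extract the same year.
bool sameYear = Regex.Match("3Q 2013A", @"\d{4}").Value
             == Regex.Match("3Q 2013A", @"\d\d\d\d").Value;     // true

// $ anchors the pattern to the end of the string.
bool endsWithFya = Regex.IsMatch("2014 FYA", @"FYA$");          // true
bool fyaAtStart = Regex.IsMatch("FYA2014", @"FYA$");            // false

// FY$ does not match an item ending in FYA, because the last character is A.
bool fyOnFya = Regex.IsMatch("2013 FYA", @"FY$");               // false

// Matching is case-sensitive unless RegexOptions.IgnoreCase is passed.
bool lowerQ = Regex.IsMatch("1q 2014A", @"^.Q");                // false
bool lowerQIgnoreCase =
    Regex.IsMatch("1q 2014A", @"^.Q", RegexOptions.IgnoreCase); // true

Console.WriteLine($"{sameYear} {endsWithFya} {fyaAtStart} {fyOnFya} {lowerQ} {lowerQIgnoreCase}");
```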

    This is a link to this Microsoft document: Regular Expression Language - Quick Reference.

    Hope my solution could be helpful.

    Best regards,
    Timon

     

    Thursday, October 24, 2019 8:06 AM

All replies

  • The patterns could certainly be expressed in a different way. A "why" question asks about intention, and only the author of this code can answer that.

    Wednesday, October 23, 2019 3:24 PM
  • Sir, you said: There is no "not" sign here, so I assume you meant the "at" sign: @.

    But there is a not sign. See this code: .ThenBy( s => !Regex.Match( s, @"FY$" ).Success ). I mean the ! in front of Regex.Match.

    Why is the not sign used in ThenBy?

    If you run my code above, you will see that the list comes out in the right order, and the FY items also appear in the ordered result. So my question is: what is this line doing in the code? .ThenBy( s => !Regex.Match( s, @"FY$" ).Success )

    Please answer if you understand the purpose of this code: .ThenBy( s => !Regex.Match( s, @"FY$" ).Success )

    Thanks

    Thursday, October 24, 2019 7:42 AM
  • What is the meaning of the ! symbol in this code?

    .ThenBy( s => !Regex.Match( s, @"FYA$" ).Success )
    .ThenBy( s => !Regex.Match( s, @"FY$" ).Success )

    What does !Regex.Match mean here, and why is the ! symbol used?

    One more question: why is '5' appended?

    .ThenBy(s => Regex.Match(s, @"^.Q").Value + '5')



    • Edited by Sudip_inn Wednesday, October 30, 2019 3:04 PM
    Friday, October 25, 2019 7:17 AM
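
    The two questions in the last post can be explored by printing the sort keys directly. The sketch below is one reading of the code, not a statement of the original author's intent. It uses top-level statements (which assumes C# 9 or later), a four-item subset of the question's list, and hypothetical variable names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// A subset of the question's list, all from the same year.
var items = new[] { "2013FY", "2013 FYA", "4Q 2013A", "1Q 2013A" };

// Bool keys sort with false before true, so negating Success gives
// matching items the key false and moves them ahead of non-matching ones.
bool fyaKey = !Regex.Match("2013 FYA", @"FYA$").Success;   // false: sorts first
bool fyKey = !Regex.Match("2013FY", @"FYA$").Success;      // true: sorts later

// Quarter items get keys such as "1Q5"; items with no quarter prefix get "5",
// which compares after "1Q5" through "4Q5". Without the suffix their key would
// be the empty string, which sorts before every quarter.
string quarterKey = Regex.Match("4Q 2013A", @"^.Q").Value + '5';   // "4Q5"
string yearKey = Regex.Match("2013FY", @"^.Q").Value + '5';        // "5"

var sorted = items
    .OrderBy(s => Regex.Match(s, @"\d\d\d\d").Value)
    .ThenBy(s => Regex.Match(s, @"^.Q").Value + '5')
    .ThenBy(s => !Regex.Match(s, @"FYA$").Success)
    .ThenBy(s => !Regex.Match(s, @"FY$").Success)
    .ToList();

// prints: 1Q 2013A, 4Q 2013A, 2013 FYA, 2013FY
Console.WriteLine(string.Join(", ", sorted));
```

    Under this reading, removing the ! would give FYA items the key true, so within a year "2013FY" would sort ahead of "2013 FYA".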