Bug in Uri class with periods

    Question

  • I believe there is a bug in the Uri class.  If a segment of the path in an HTTP Uri ends with a period (for example, http://my.domain.com/top.level.folder/top.level.folder./subfolder/my.cool.picture.jpg), the period is removed.  I believe this contradicts RFCs 2396 and 3986.  Here is a unit test that shows the problem:

    public void TestUriAndPeriods()
    {
        // The two URLs differ only by the trailing period on the first path segment.
        const string urlWithoutPeriod = "http://my.domain.com/top.level.folder/subfolder/my.cool.picture.jpg";
        const string urlWithPeriod = "http://my.domain.com/top.level.folder./subfolder/my.cool.picture.jpg";

        Uri uriWithoutPeriod = new Uri(urlWithoutPeriod);
        Uri uriWithPeriod = new Uri(urlWithPeriod);

        // If Uri preserved the period, the two string forms would differ.
        if (uriWithoutPeriod.ToString() == uriWithPeriod.ToString())
        {
            Console.WriteLine("Without period: {0}", uriWithoutPeriod);
            Console.WriteLine("With period:    {0}", uriWithPeriod);
            throw new ApplicationException("These Uris should NOT be equal");
        }
    }

    Basically, the trailing period in the first segment of the path in the urlWithPeriod variable (top.level.folder.) is removed, and the ToString() values of both Uris are equal.  The RFCs state that dot-segments should be removed only when "they are complete components of a path, but not when they are only part of a segment", as stated in RFC 3986, section 5.4.2:

    1 Similarly, parsers must remove the dot-segments "." and ".." when they are complete components of a path, but not when they are only part of a segment.  
    2  
    3       "/./g"          =  "http://a/g"  
    4       "/../g"         =  "http://a/g"  
    5       "g."            =  "http://a/b/c/g."  
    6       ".g"            =  "http://a/b/c/.g"  
    7       "g.."           =  "http://a/b/c/g.."  
    8       "..g"           =  "http://a/b/c/..g" 

    Note that in the example on line 5 ("g."), the trailing period is retained.  I have also verified that the Uri class removes the period when the last segment of the path ends in one.
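
    For illustration, here is a minimal sketch of the dot-segment removal routine described in RFC 3986, section 5.2.4, which is the rule the quoted passage relies on.  It is not how System.Uri is implemented; the DotSegments class and its method names are hypothetical and exist only for this example.  The input is assumed to be an already-merged path such as "/b/c/g.".

```csharp
using System;
using System.Text;

// Hypothetical helper, written only to illustrate RFC 3986 section 5.2.4.
public static class DotSegments
{
    public static string Remove(string path)
    {
        string input = path;
        var output = new StringBuilder();

        while (input.Length > 0)
        {
            if (input.StartsWith("../", StringComparison.Ordinal))
                input = input.Substring(3);              // step A: drop leading "../"
            else if (input.StartsWith("./", StringComparison.Ordinal))
                input = input.Substring(2);              // step A: drop leading "./"
            else if (input.StartsWith("/./", StringComparison.Ordinal))
                input = input.Substring(2);              // step B: "/./x" becomes "/x"
            else if (input == "/.")
                input = "/";                             // step B: "/." becomes "/"
            else if (input.StartsWith("/../", StringComparison.Ordinal))
            {
                input = input.Substring(3);              // step C: "/../x" becomes "/x"
                RemoveLastSegment(output);
            }
            else if (input == "/..")
            {
                input = "/";                             // step C: "/.." becomes "/"
                RemoveLastSegment(output);
            }
            else if (input == "." || input == "..")
                input = "";                              // step D: lone "." or ".."
            else
            {
                // Step E: move one whole segment (with its leading "/") to the output.
                // A segment such as "g." or "..g" is copied untouched.
                int next = input.IndexOf('/', 1);
                if (next < 0) next = input.Length;
                output.Append(input, 0, next);
                input = input.Substring(next);
            }
        }
        return output.ToString();
    }

    // Drops the last segment already written, including its leading "/".
    private static void RemoveLastSegment(StringBuilder output)
    {
        int lastSlash = output.ToString().LastIndexOf('/');
        output.Length = lastSlash < 0 ? 0 : lastSlash;
    }
}
```

    With this sketch, DotSegments.Remove("/b/c/./g") yields "/b/c/g", while DotSegments.Remove("/b/c/g.") and DotSegments.Remove("/top.level.folder./subfolder/my.cool.picture.jpg") come back unchanged, because a period that is only part of a segment is never touched.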

    Big deal?  Well, not if I have control over all URLs that my application uses.  However, I am allowing users to provide URLs at runtime that we will make automated posts to and/or retrieve information from.  Some clients are now submitting URLs that contain a path segment ending in a period, and they are not working at all.  Telling the Uri class that the URL is already encoded does not help either.  I can't use WebClient by passing it a string as the URL because it ends up internally using the Uri class.  So, the only workaround I can think of now would be to rewrite the WebClient (et al.) classes to use strings for the URLs instead of instances of the Uri class, a prospect I am not fond of.

    Does anyone think that this behavior is expected per the RFCs (which the MSDN documentation refers to)?  Or does anyone else think this is incorrect behavior?

    Any ideas will be greatly appreciated!
    Sunday, December 28, 2008 6:16 AM

All replies

  • This is a known bug.  This was actually discussed on these forums not too long ago.  An MSFT employee acknowledged the problem and stated that it will be considered for a future release.  I looked for quite a while to find that thread, but I couldn't find it... :(

    Central to this issue is that a trailing period is irrelevant for Windows file names.  For example, "test." and "test" both mean the same thing (file named "test" with no extension).  This is not true in general for HTTP URIs, which is the problem with the Uri class.
    • Marked as answer by Zhi-Xin Ye Friday, January 02, 2009 12:39 PM
    Tuesday, December 30, 2008 12:40 AM
  • Well, I tried my best to find any reference to this problem, but was unsuccessful, which is why I posted it.  It is affecting our business right now, so I guess I'll have to open a case with MS under our partner agreement.  I hope they can come up with a workaround or hotfix for it.  IE doesn't suffer from the same problem, so they have some code somewhere that works correctly.

    If you do find a reference to the issue, please post it.  Hopefully the next person who is searching on this issue will at least be able to find this post.
    Monday, January 05, 2009 6:46 PM
  • Confirmed broken in .NET 4.0, still.

    using System;

    namespace ConsoleApplication2
    {
        class Program
        {
            static void Main(string[] args)
            {
                // The first path segment "y." ends with a period.
                var surl = "http://x/y./z";
                var url = new Uri(surl);
                Console.WriteLine(url.ToString()); // spits out http://x/y/z (the period is gone)

                Console.WriteLine("Press ENTER to exit ...");
                Console.ReadLine();
            }
        }
    }
    
    This hurts us significantly as we have to tell our customers to go pooh pooh away. We, too, work with customers that provide their own URLs and we must manage these URLs; we use System.Uri to validate and clean up the URLs, but if the validator is broken, so is our product.

    This is a serious bug that needs to be fixed, MSFT.
    Wednesday, February 17, 2010 11:23 PM
  • Wonderful, so much for standards.  Fortunately, we were able to get our partners to change, but that doesn't mean that some huge partner won't come along that requires this functionality.  I just won't hold my breath.

    Wednesday, February 17, 2010 11:51 PM
  • Hey, good news. I just tried the workaround posted in the Workarounds tab at https://connect.microsoft.com/VisualStudio/feedback/details/386695/system-uri-incorrectly-strips-trailing-dots?wa=wsignin1.0#tabs, and it works. Run it once per application (in Application_Start in Global.asax.cs, in Main(), etc.) and you're good to go. Or at least, that's what I'm trying.
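
    The Connect item's code is not reproduced in this thread, so what follows is only a sketch of the kind of reflection-based workaround that circulated for this bug, not a copy of it.  It assumes that the .NET Framework's UriParser has a non-public static GetSyntax method and a non-public m_Flags field, and that bit 0x1000000 of those flags makes the parser canonicalize paths like Windows file paths.  These are undocumented internals and may not exist on other versions or runtimes, in which case the method below does nothing.  The class name UriTrailingDotWorkaround is made up for this example.

```csharp
using System;
using System.Reflection;

// Hypothetical wrapper; the internals it touches are assumptions (see above).
public static class UriTrailingDotWorkaround
{
    // Assumed value of the internal "canonicalize as file path" flag.
    private const int CanonicalizeAsFilePath = 0x1000000;

    // Call once at startup, before any Uri with a trailing period is created.
    public static void Apply()
    {
        MethodInfo getSyntax = typeof(UriParser).GetMethod(
            "GetSyntax", BindingFlags.Static | BindingFlags.NonPublic);
        FieldInfo flagsField = typeof(UriParser).GetField(
            "m_Flags", BindingFlags.Instance | BindingFlags.NonPublic);

        // If the assumed internals are not present, leave everything alone.
        if (getSyntax == null || flagsField == null)
            return;

        foreach (string scheme in new[] { "http", "https" })
        {
            var parser = (UriParser)getSyntax.Invoke(null, new object[] { scheme });
            if (parser == null)
                continue;

            // Clear the flag on the shared parser for this scheme.
            int flags = (int)flagsField.GetValue(parser);
            if ((flags & CanonicalizeAsFilePath) != 0)
                flagsField.SetValue(parser, flags & ~CanonicalizeAsFilePath);
        }
    }
}
```

    Because this changes process-wide parser state through private members, it is worth checking after calling UriTrailingDotWorkaround.Apply() that new Uri("http://x/y./z") really keeps the period on the runtime you deploy to.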
    • Proposed as answer by jxdavis Thursday, February 18, 2010 12:10 AM
    • Marked as answer by danruehle Thursday, February 18, 2010 2:57 AM
    Thursday, February 18, 2010 12:10 AM