System.Uri unescaping parentheses in 4.8

  • Question

  • I have the following url:

    id.loc.gov/authorities/names/label/Col%C3%B4nia%20Santa%20Izabel%20%28Hospital%20%3A%20Belo%20Horizonte%2C%20Brazil%29

    This URL should resolve, but when I create a URI to pull the data via an HttpWebRequest, the URL is changed to:

    id.loc.gov/authorities/names/label/Col%C3%B4nia%20Santa%20Izabel%20(Hospital%20%3A%20Belo%20Horizonte%2C%20Brazil)

    The "(" and ")" values are being unescaped.  I don't think these values should be unescaped, and I'm not sure how to stop it.  I've already escaped the URL myself and would just like the URI created as is.

    Example of how to recreate:

    string url = "http://id.loc.gov/authorities/names/label/Col%C3%B4nia%20Santa%20Izabel%20%28Hospital%20%3A%20Belo%20Horizonte%2C%20Brazil%29";

    System.Uri search_uri = new Uri(url);

    MessageBox.Show(search_uri.AbsoluteUri);

    Any ideas?

    --tr

    Sunday, February 16, 2020 4:04 AM

Answers

  • OK -- I did more digging -- and found that the problem is my targeted framework.  If you are targeting .NET Framework 4.7.2 or 4.8, you can add this inside the <configuration> element of the app.config file and all is well:

      <uri>
        <schemeSettings>
          <add name="http" genericUriParserOptions="DontUnescapePathDotsAndSlashes"/>      
        </schemeSettings>
      </uri>

    In every earlier version this setting does nothing (I had a component targeting 4.7.1 within a 4.8 app), and the URIs are not parsed properly when passed through a client.

    My solution is to make sure everything targets at least 4.7.2 so that this actually works.


    • Marked as answer by reeset Sunday, February 16, 2020 4:50 PM
    Sunday, February 16, 2020 4:50 PM

All replies


  • Unencoding the "(" and ")" values.  I don't think these values should be unencoding, and I'm not sure how to stop it.  I've escaped the script and would just like the URI created as is.  

    Example of how to recreate:

    string url = "http://id.loc.gov/authorities/names/label/Col%C3%B4nia%20Santa%20Izabel%20%28Hospital%20%3A%20Belo%20Horizonte%2C%20Brazil%29";

    System.Uri search_uri = new Uri(url);

    MessageBox.Show(search_uri.AbsoluteUri);


    What do you see if you use:

    MessageBox.Show(search_uri.OriginalString);

    - Wayne

    Sunday, February 16, 2020 7:05 AM

  • This URL should resolve, but when I create a URI to pull the data via an HttpWebRequest, the URL is changed to:

    id.loc.gov/authorities/names/label/Col%C3%B4nia%20Santa%20Izabel%20(Hospital%20%3A%20Belo%20Horizonte%2C%20Brazil)

    Unencoding the "(" and ")" values.  I don't think these values should be unencoding, and I'm not sure how to stop it.  

    See this thread:

    HttpClient decodes encoded Url?
    https://stackoverflow.com/questions/42072156/httpclient-decodes-encoded-url

    And related:

    HttpClient decodes encoded Url?
    https://github.com/dotnet/runtime/issues/20126

    - Wayne

    Sunday, February 16, 2020 7:24 AM
  • When I use OriginalString, I get back the original string.  But this doesn't help me.  The data is being unescaped when the request is made, after the URI is passed to the HttpWebRequest component.
    Sunday, February 16, 2020 1:31 PM
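
For anyone repeating this comparison, here is a minimal sketch. It assumes a console app instead of the MessageBox call from the question, and the class name UriEscapeCheck is made up for illustration. It prints OriginalString next to AbsoluteUri, plus the framework version the app targets, since the thread suggests the second value depends on that target.

```csharp
using System;

// Hypothetical helper class, not part of the original post.
class UriEscapeCheck
{
    static void Main()
    {
        string url = "http://id.loc.gov/authorities/names/label/Col%C3%B4nia%20Santa%20Izabel%20%28Hospital%20%3A%20Belo%20Horizonte%2C%20Brazil%29";
        Uri searchUri = new Uri(url);

        // OriginalString is the exact string that was passed to the constructor.
        Console.WriteLine("OriginalString: " + searchUri.OriginalString);

        // AbsoluteUri is the form produced by the Uri parser.
        Console.WriteLine("AbsoluteUri:    " + searchUri.AbsoluteUri);

        // True only if the parser left both parentheses percent-encoded.
        bool parensKept = searchUri.AbsoluteUri.Contains("%28")
                       && searchUri.AbsoluteUri.Contains("%29");
        Console.WriteLine("Parentheses still escaped: " + parensKept);

        // The target framework recorded for this app domain (may be null).
        string target = AppDomain.CurrentDomain.SetupInformation.TargetFrameworkName;
        Console.WriteLine("Target framework: " + (target ?? "(not set)"));
    }
}
```

If OriginalString and AbsoluteUri differ only in the parentheses, the change is happening in the Uri parser rather than in the HTTP client.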
  • I'm not sure how this helps.  Per the specification noted in the GitHub link, the characters "(" and ")" shouldn't be unescaped.  However, when the URI is passed to the HttpClient, or the underlying HttpWebRequest -- as in:

    System.Net.HttpWebRequest objRequest =
                (System.Net.HttpWebRequest)System.Net.WebRequest.Create(search_uri);

    The value passed to the server will have the characters "(" and ")" unescaped and cause the URL to fail.  I need these components to stop unescaping the URI.  This feels to me a lot like the same bug that used to be in the URI parser around forward slashes and periods in previous versions.

    --tr

    Sunday, February 16, 2020 1:36 PM
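
One way to look at what the request object holds without contacting the server is sketched below. It assumes .NET Framework and a console app, and the class name RequestUriCheck is made up for illustration. Creating the request sends nothing; the sketch only prints the path carried by the request's Uri. Treat that as an approximation of what is sent, not a capture of the actual request line.

```csharp
using System;
using System.Net;

// Hypothetical helper class, not part of the original post.
class RequestUriCheck
{
    static void Main()
    {
        string url = "http://id.loc.gov/authorities/names/label/Col%C3%B4nia%20Santa%20Izabel%20%28Hospital%20%3A%20Belo%20Horizonte%2C%20Brazil%29";
        Uri searchUri = new Uri(url);

        // No network traffic happens here; a request is only sent once
        // GetResponse() or GetRequestStream() is called.
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(searchUri);

        // RequestUri is the Uri the request was created with.
        string path = request.RequestUri.PathAndQuery;
        Console.WriteLine("PathAndQuery: " + path);

        // True only if both parentheses are still percent-encoded.
        bool parensKept = path.Contains("%28") && path.Contains("%29");
        Console.WriteLine("Parentheses still escaped: " + parensKept);
    }
}
```

Running it once per project that targets a different framework version should show whether an older target is the piece doing the unescaping.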