Web Archive and References to an Anchor

    Question

  • I have a program that creates a Web Archive (.mht) from multiple HTML files. One of the HTML files refers to an anchor in another HTML file placed in the archive. If the file F3.HTM in the archive contains the reference:

    <a href="F2.HTM">Normal Link</a>

    It correctly opens F2.HTM from the archive and the properties show: mhtml:file://C:\MyArchive.mht!F2.htm

    but if I have

    <a href="F2.HTM#LN749">Anchor Link</a>

    It fails to open the page and the properties show: http://localhost/MyArchive/F2.HTM#LN749

    Is it possible to use an Anchor / Name in a Web Archive, or is there any other method of making a link open another page at a specific point?

    From: <test>
    Subject: test
    Date: 21/04/2014 15:48:12
    MIME-Version: 1.0
    Content-Type: multipart/related; boundary="Boundry-test"; type=text/html
    This is a multi-part message in MIME format.
    --Boundry-test
    Content-Type: text/html
    Content-Transfer-Encoding: binary
    Content-Location: http://LocalHost/test/F3.htm
    <html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"><title>test</title></head><body><a href="F2.HTM">Normal Link</a></p><a href="F2.HTM#LN749">Anchor Link</a></body></html>
    --Boundry-test
    Content-Type: text/html
    Content-Transfer-Encoding: binary
    Content-Location: http://LocalHost/test/F2.htm
    <html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"><title>test</title></head><body>
    <a name="LN749"></a>Test</body></html>
    --Boundry-test--
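
    For readers who want to reproduce an archive like the one above, here is a minimal sketch using Python's standard email package. The names build_mht and base_url are hypothetical, and the sketch only shows the multipart/related layout with one Content-Location header per page; it has not been checked against Internet Explorer and does not change the anchor behaviour described in this thread.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_mht(pages, base_url="http://localhost/test/"):
    """Pack {file name: HTML text} into one multipart/related message.

    build_mht and base_url are illustrative names, not part of any
    existing tool mentioned in this thread.
    """
    archive = MIMEMultipart("related", type="text/html")
    archive["Subject"] = "test"
    for name, html in pages.items():
        part = MIMEText(html, "html")
        # Relative hrefs inside a part are resolved against its Content-Location.
        part["Content-Location"] = base_url + name
        archive.attach(part)
    return archive.as_string()
```

    Calling build_mht with a dictionary such as {"F3.htm": "...", "F2.htm": "..."} returns the archive text, which can then be written to a .mht file.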

    Monday, April 21, 2014 4:04 PM

All replies

  • Hello,

    I am trying to bring someone familiar with this issue into this thread. Thank you for your understanding.

    Regards,



    Wednesday, April 23, 2014 1:29 AM
    Moderator
  • Hi,

    So far as I know, there is no way to use an anchor/name in a web archive for IE.

    During my test, IE failed to navigate to a specific position in another page even when I used an absolute URL for the anchor.

    <a href=3D"1.html">normal link</a>
    <a href=3D"mhtml:file://C:\default.mht!1.html#id">test</a>

    Best regards,

    Sheng Jiang | Support Engineer

    Global Business Support | Microsoft Corporation


    Please remember to click “Mark as Answer” on the post that helps you, and to click “Unmark as Answer” if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread.


    Thursday, April 24, 2014 5:33 AM
  • So far as I know, there is no way to use an anchor/name in a web archive for IE.

    Sheng Jiang | Support Engineer

    Global Business Support | Microsoft Corporation

    So do we classify this as a Bug or a Feature?

    I have a very good commercial use for having an Anchor in a Web archive and I am sure there are "Hundreds" more uses. Can this be fixed? (We use a Web archive to transfer medical records between totally disparate systems. The Web archive acts like an interactive view of the medical record as it appears in our software, so when placed as an attachment in other software it can be viewed correctly and in context).

    Thursday, April 24, 2014 8:31 AM
  • Hi,

    This should be by design. I cannot guarantee that there are no other workarounds for your problem. If you need a workaround, want to request a fix, or want a confirmation from Microsoft, you may need to visit the link below to see the various free and paid support options available.

    http://support.microsoft.com/default.aspx?id=fh;en-us;offerprophone

    Best regards,

    Sheng Jiang | Support Engineer

    Global Business Support | Microsoft Corporation



    Friday, May 02, 2014 1:59 AM
  • It's not "by design" per se; it's simply the case that the MHT format has had zero significant investment since the Outlook Express team was disbanded in the mid-2000s, and the only fixes since then have been security-related.

    You might try playing with the encoding (e.g. using %23 for # and the like) to see if it helps but it's entirely possible that fragment navigations were killed as a part of a security fix.
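
    One way to run that encoding experiment is to rewrite the links before they go into the archive. The sketch below is an assumption-laden helper (encode_fragment_delimiter is a hypothetical name) that only handles double-quoted href attributes; this thread does not report whether the resulting links work in IE.

```python
import re

# Matches double-quoted href attributes only; other quoting styles are ignored.
HREF = re.compile(r'(href=")([^"]*)(")', re.IGNORECASE)


def encode_fragment_delimiter(html):
    """Replace '#' with '%23' inside double-quoted href values."""
    return HREF.sub(
        lambda m: m.group(1) + m.group(2).replace("#", "%23") + m.group(3),
        html,
    )
```

    Applied to the anchor link from the question, this turns F2.HTM#LN749 into F2.HTM%23LN749 and leaves links without a fragment untouched.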

    Friday, May 02, 2014 9:12 PM
  • So do we classify this as a Bug or a Feature?

    File protocol is not the same as HTTP.  Anyway, how would we support users who actually had # in their file names?

    What you could do is host the file locally with HTTP.  Then anchors would work.
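
    As a rough sketch of that workaround, the parts of the archive can be unpacked into a directory and served over HTTP. The helper name extract_mht is hypothetical, and the sketch assumes each part carries a Content-Location header and that the archive has a blank line between each part's headers and body.

```python
import email
import posixpath
from pathlib import Path
from urllib.parse import urlsplit


def extract_mht(mht_text, out_dir):
    """Write each part of an MHT archive to out_dir, named after its Content-Location."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for part in email.message_from_string(mht_text).walk():
        location = part["Content-Location"]
        if part.is_multipart() or not location:
            continue  # skip the container and any part without a location
        # Keep only the last path segment, e.g. "F2.htm".
        name = posixpath.basename(urlsplit(location).path)
        (out / name).write_bytes(part.get_payload(decode=True))
        written.append(name)
    return written
```

    The extracted directory can then be served with, for example, python -m http.server --directory followed by the output directory, after which a link such as http://localhost:8000/F2.htm#LN749 is an ordinary HTTP URL with a fragment.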



    Robert Aldwinckle
    ---

    Saturday, May 03, 2014 11:10 PM
  • File protocol is not the same as HTTP.  Anyway, how would we support users who actually had # in their file names?

    What you could do is host the file locally with HTTP.  Then anchors would work.


    Maybe I should explain the use case. We develop doctor's practice management software. Often we need to export a patient's medical record for import into a totally incompatible software package. Many systems print the record to a PDF document and export scanned letters and referral letters to multiple files. Our Windows package employs a tabbed view of the patient record with tabs like "Summary, Visits, Jabs, Scripts, Letters, Biochemistry, etc.". So our solution is to export each of these tabs to individual HTML files and also export the scans, letters, and medical images to TIF files. Because most software packages allow importing any single file and open that file for viewing based on its extension, we take the files we export and create a single .MHT file, which can be imported into any package, opened, and give the doctor an interactive view of the patient record. This file is written to a CD, DVD or USB stick and sent to the new GP.

    As to a File protocol vs an HTTP protocol, I don't agree: the HREF used is exactly the same in both. The only other difference with a .MHT file is that the file mentioned in the HREF is in the archive instead of being a local file in the same directory, and the use case for .MHT files is exactly what we are trying to do.

    Sunday, May 04, 2014 11:56 PM
  • As to a File protocol vs an HTTP protocol, I don't agree: the HREF used is exactly the same in both.

    Ref:

    http://tools.ietf.org/html/rfc3986#section-3.5

    (BING search for
        rfc http url file format fragment

    )

    Apparently URIs have changed since I last looked.  Now all you need to do is find out where the file "scheme" is documented and whether it has significantly changed since RFC 1738.

    http://www.faqs.org/rfcs/rfc1738.html

    <quote>

    3.10 FILES

       The file URL scheme is used to designate files accessible on a
       particular host computer.

    </quote>

    So, what is a fragment of an OS file and how is it implemented?



    Robert Aldwinckle
    ---

    Monday, May 05, 2014 8:09 AM
  • The file protocol for a URL mentions: " the character "#" must be encoded within URLs even in systems that do not normally deal with fragment or anchor  identifiers, so that if the URL is copied into another system that does use them, it will not be necessary to change the URL encoding."
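
    To illustrate the distinction, here is a small sketch using Python's urllib.parse (not IE's own URL handling, which may differ). A literal "#" splits off a fragment, while a percent-encoded "%23" stays part of the path, which is how a file name containing "#" can be told apart from an anchor. The helper name split_fragment and the file name notes#1.htm are made up for the example.

```python
from urllib.parse import quote, unquote, urlsplit


def split_fragment(url):
    """Return (decoded path, fragment) for a URL, following generic URI syntax."""
    parts = urlsplit(url)
    return unquote(parts.path), parts.fragment


# A literal '#' introduces a fragment (the anchor name).
anchor_link = split_fragment("file:///C:/MyArchive/F2.HTM#LN749")

# A '#' that belongs to the file name is percent-encoded as %23 first,
# so it is not mistaken for a fragment delimiter.
hash_in_name = split_fragment("file:///C:/MyArchive/" + quote("notes#1.htm"))
```

    Here anchor_link carries the fragment LN749, while hash_in_name has an empty fragment and a path that still ends in notes#1.htm.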

    So basically the support for anchor identifiers is OS specific in files. Internet Explorer SUPPORTS the anchor from one HTML file to another HTML file as the extracted files work before putting them into a Web archive (MHT) file.

    By definition, a Web archive is a method of placing all documents from a Web site or directory tree into a single file for ease of transport, and it should work exactly as if the files were in a directory tree on disk. Except for Anchors this works as advertised. It is a bug, and there seem to be good use cases for fixing it.

    Carl Beame

    Monday, May 05, 2014 12:04 PM
  • Except for Anchors this works as advertised.

    Did you try Eric's suggestion?  What happens?  Use ProcMon to get a clearer idea?  Who should be seeing the anchor and how?  Perhaps you will now need to escape the # at least once to pass it into the MSHTML protocol but not through the file protocol (otherwise, in my case, how would you do a real # for the file protocol? Escape the escape so MSHTML has to handle an escaped #? And maintain compatibility how?)  Seems WAD (working as designed) to me.

    Try using the data: protocol instead?  But use of fragments in it apparently has a similar controversy.

    https://bugzilla.mozilla.org/show_bug.cgi?id=243917



    Robert Aldwinckle
    ---

    Monday, May 05, 2014 4:00 PM