Get HTML source with JavaScript


  • Hello,

    I would like to get all the HTML of the page I'm viewing, similar to innerHTML and outerHTML.

    I have got this, which works:

    this.htmlsource = this.doctype + this.encode_unicode(this.contentdocument.documentElement.outerHTML);

    This does not include the extra line breaks from the original document. How can I preserve them?
    I need to get the exact data.

    Any help is appreciated.


    Monday, August 24, 2009 10:31 AM

All replies

  • Hi,

    This works for me and puts a line number in the left column.

    srcStr = highLight(htmlEncode(win.document.documentElement.outerHTML));

    function htmlEncode(s) { // ver. 0.70b with line number
      // Replace every space with a non-breaking space entity
      s = s.replace(/ /g, "&nbsp;");
      // Optional line number
      return s;
    }

    The missing function highLight adds keyword color coding.
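
    For reference, a fuller version of the idea might look like the sketch below. It is not the actual code from this post: the name htmlEncodeWithLineNumbers, the escaping of &, < and >, and the "N:" prefix format are all assumptions made for illustration.

```javascript
// Hypothetical sketch: escape markup so it displays as text,
// then prefix each line with its line number.
function htmlEncodeWithLineNumbers(source) {
  // Escape & first so the entities added below are not escaped again.
  var escaped = source
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/ /g, "&nbsp;");

  // Split on any line-ending style (CRLF, CR, or LF).
  var lines = escaped.split(/\r\n|\r|\n/);
  var out = [];
  for (var i = 0; i < lines.length; i++) {
    out.push((i + 1) + ":&nbsp;" + lines[i]);
  }
  // Join with <br> so a browser renders one source line per row.
  return out.join("<br>\n");
}
```

    Note that this only numbers the lines that are present in the string it is given, so it cannot restore line breaks that outerHTML has already dropped.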

    Thanks, your question helped me solve my own problem.

    Tuesday, August 25, 2009 12:50 AM
  • Hello,

    Thanks for replying.
    I don't know if this will work, because when I get the outerHTML from the page, all extra line breaks are removed, along with extra
    white space. I would like to preserve these, most importantly the line breaks. In your code, which is nice, the line breaks are replaced
    by <br>. Sorry, this is not what I want. If I look at the "source" of the page through the right-click menu, I get the extra line breaks, just as
    the .html file is laid out. It seems IE keeps removing these when using innerHTML or outerHTML, and I cannot get around it.

    An example:
    A web developer who uses our CMS would like to see the same structured HTML code (his own code), not a new version that
    IE controls, with stripped line breaks and whitespace.

    So that is what I am trying to accomplish, and I think it's impossible.
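
    One workaround sometimes used for this, offered here only as a sketch and not something confirmed in this thread, is to request the page's URL again with XMLHttpRequest and read responseText. That property holds the response text as the server sent it, not a serialization of the parsed DOM, so the original line breaks and whitespace are still there. The function name fetchRawSource and the optional createRequest parameter are hypothetical, and the approach assumes the page can be re-fetched with a plain same-origin GET and returns the same content the second time.

```javascript
// Hypothetical sketch: re-request a URL and hand back the unparsed response text.
// createRequest is an optional factory (an assumption, used here so the function
// can be exercised without a browser); by default a real XMLHttpRequest is used.
function fetchRawSource(url, onDone, createRequest) {
  var xhr = createRequest ? createRequest() : new XMLHttpRequest();
  xhr.open("GET", url, true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) {
      return; // request not finished yet
    }
    if (xhr.status === 200) {
      onDone(null, xhr.responseText); // text as served, whitespace intact
    } else {
      onDone(new Error("HTTP status " + xhr.status), null);
    }
  };
  xhr.send(null);
}

// Possible usage in a page (assumes the current document was loaded via GET):
// fetchRawSource(document.location.href, function (err, text) {
//   if (!err) { /* text holds the source with its original line breaks */ }
// });
```

    The trade-off is a second request to the server, so a page that was generated from a form POST or that changes between requests may not come back identical to what is currently displayed.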


    Tuesday, August 25, 2009 7:13 AM