Download source of web page

  • Question

  • Hello everybody.

    I want to download a web page and its source. I used the code below, but for a lot of web pages it gives me an error. For example, www.google.com works fine, but www.ask.com or www.anything.x shows: "The underlying connection was closed". Could anybody tell me why?

    try
    {
        using (WebClient client = new WebClient())
        {
            // Save the page to disk, then fetch it again as a string to display it.
            client.DownloadFile(textBox2.Text, @"E:\html\1.html");
            string value = client.DownloadString(textBox2.Text);

            System.Windows.MessageBox.Show(value);
            System.Windows.MessageBox.Show("NOTE: the address of your file is: E:\\html\\1.html", "INFORMATION", MessageBoxButton.OK, MessageBoxImage.Information);
        }
    }
    catch (Exception ex)
    {
        System.Windows.MessageBox.Show("YOU HAVE ONE PROBLEM: " + ex.Message, "Error", MessageBoxButton.OK, MessageBoxImage.Asterisk);
    }

    Sunday, December 11, 2011 8:23 AM

Answers

  • > I need both the HTML file and the source of the file.
    > Does nobody know?

    What do you want that this code doesn't provide?

    using System;
    using System.ComponentModel;
    using System.Windows.Forms;
    using System.Net;
    using System.IO;
    using System.Diagnostics;
    namespace WindowsFormsApplication1
    {
      public partial class Form1 : Form
      {
        public Form1()
        {
          InitializeComponent();
        }
        private void button1_Click(object sender, EventArgs e)
        {
          // Download asynchronously; the handler below runs when the download finishes.
          WebClient wc = new WebClient();
          wc.DownloadFileCompleted += WC_DownloadComplete;
          wc.DownloadFileAsync(new Uri(@"http://www.yahoo.com"), "WebSite.html");
        }
        void WC_DownloadComplete(object sender, AsyncCompletedEventArgs e)
        {
          // Keep a .txt copy so the same content also opens as plain source text.
          if (File.Exists("WebSite.txt")) File.Delete("WebSite.txt");
          File.Copy("WebSite.html", "WebSite.txt");
          // Open the page and its source with the programs registered for each extension.
          Process.Start("WebSite.html");
          Process.Start("WebSite.txt");
          (sender as WebClient).Dispose();
        }
      }
    }
    

     


    • Edited by JohnWein Tuesday, December 13, 2011 7:45 PM
    • Marked as answer by jon smith tommy Sunday, January 1, 2012 3:05 AM
    Tuesday, December 13, 2011 7:10 PM
  • > I want to download a web page and its source [...] when I use www.ask.com or www.anything.x it shows: "The underlying connection was closed". Could anybody tell me why?

    Try using the following code:

    using System;
    using System.Windows.Forms;
    
    namespace WindowsFormsApplication5
    {
        public partial class Form1 : Form
        {
            public Form1()
            {
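                this.Size = new System.Drawing.Size(600, 500);
                var tb = new RichTextBox { Parent = this, Dock = DockStyle.Fill };
                // Append the HTML of each completed document to the text box.
                LoadHtml("www.ask.com", (url, html) =>
                    tb.AppendText(String.Concat("=== ", url, " ===\n", html, "\n\n")));
            }
    
            void LoadHtml(string url, Action<string, string> callback)
            {
                // A WebBrowser control that is never shown loads the page in the background.
                var wb = new WebBrowser();
                // Body.Parent is the <html> element, so OuterHtml is the whole document.
                wb.DocumentCompleted += (s, e) =>
                    callback(e.Url.ToString(), wb.Document.Body.Parent.OuterHtml);
                wb.Navigate(url);
            }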
        }
    }
    
    
      
      
    Thursday, December 22, 2011 10:08 AM

All replies

  • You're downloading the same thing twice.  Download the string and save the string to a file with an html extension.
    Sunday, December 11, 2011 9:46 AM
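
A minimal sketch of the suggestion above, assuming a single `DownloadString` call is enough for both the display and the saved file. The class name `PageSaver` and the method name `DownloadAndSave` are made up for illustration; they are not from the original post.

```csharp
using System.IO;
using System.Net;

static class PageSaver
{
    // Downloads the page once, writes that same text to disk, and returns it.
    public static string DownloadAndSave(string url, string path)
    {
        using (var client = new WebClient())
        {
            string html = client.DownloadString(url); // one request only
            File.WriteAllText(path, html);            // save the string as the .html file
            return html;
        }
    }
}
```

In the original code this would be called with something like `PageSaver.DownloadAndSave(textBox2.Text, @"E:\html\1.html")`, and the returned string shown in the message box.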
  • I need both the HTML file and the source of the file.

    Does nobody know?

    Tuesday, December 13, 2011 7:02 PM
  • No, my friend, it works for me, but when I use some other web sites for the download it gives me an error (The underlying connection was closed), even though my browser can open them.

    Do you know what my problem is?

    Tuesday, December 13, 2011 9:47 PM
  • > When I use some other web sites for the download it gives me an error (The underlying connection was closed)

    Any particular web site? Errors can happen for any web site. Check for an error and retry.

    Tuesday, December 13, 2011 10:01 PM
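
One way to check for an error with the asynchronous `WebClient` calls is to inspect the event arguments in the completion handler. This is only a sketch: `DownloadCheck`, `DescribeFailure`, `Start` and the `report` callback are hypothetical names, and the message texts are placeholders.

```csharp
using System;
using System.ComponentModel;
using System.Net;

static class DownloadCheck
{
    // Returns null on success, otherwise a short description of what went wrong.
    public static string DescribeFailure(AsyncCompletedEventArgs e)
    {
        if (e.Cancelled) return "The download was cancelled.";
        if (e.Error != null) return "The download failed: " + e.Error.Message;
        return null;
    }

    // Starts an asynchronous download and reports the outcome through the callback.
    public static void Start(Uri address, string fileName, Action<string> report)
    {
        var wc = new WebClient();
        wc.DownloadFileCompleted += (sender, e) =>
        {
            report(DescribeFailure(e) ?? "Saved " + fileName);
            wc.Dispose();
        };
        wc.DownloadFileAsync(address, fileName);
    }
}
```

A caller could pass `msg => MessageBox.Show(msg)` as the callback and start the download again when a failure is reported.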
  • Hi jon,
    Welcome to the MSDN forum!

    You may try HttpWebRequest and HttpWebResponse instead. In some cases they give a more stable result when network conditions are poor or when the source file is large.

    I suggest you look at these links:
    http://support.microsoft.com/kb/826210 
    http://weblogs.asp.net/jan/archive/2004/01/28/63771.aspx

    Let me know if it helps.

    Thanks.

    Yoyo.


    Yoyo Jiang[MSFT]
    MSDN Community Support | Feedback to us


    Thursday, December 22, 2011 3:10 AM
    Moderator
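
A rough sketch of the request/response approach suggested above, assuming explicit timeouts are what helps on slow connections. `RequestDownloader` and the `timeoutMs` parameter are illustrative names only, and the right timeout value is not something the thread states.

```csharp
using System.IO;
using System.Net;

static class RequestDownloader
{
    // Fetches the response body as text, with an explicit timeout in milliseconds.
    public static string Download(string url, int timeoutMs)
    {
        WebRequest request = WebRequest.Create(url);
        request.Timeout = timeoutMs; // limit on waiting for the response

        // For http/https URLs the request is an HttpWebRequest,
        // which also has a separate limit for reading the stream.
        var http = request as HttpWebRequest;
        if (http != null) http.ReadWriteTimeout = timeoutMs;

        using (WebResponse response = request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (var reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }
}
```

For example, `RequestDownloader.Download("http://www.ask.com", 10000)` would wait up to ten seconds; a `WebException` still has to be caught by the caller.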
  • Thanks Yoyo, but I can't find Internet Information Services Manager in Administrative Tools to change the timeout value. I use Win7 (x64). Thanks Malobukv, but could you explain your code?
    • Edited by jon smith tommy Saturday, December 24, 2011 8:13 AM to be better
    Saturday, December 24, 2011 7:30 AM
  • What problems are you having now?
    Saturday, December 24, 2011 8:04 AM
  • When I type any website address in the textbox and click the button (download source), I want my program to read the source of the web page, but my code doesn't work for some websites.

    Could you help me?

    Saturday, December 24, 2011 8:19 AM
  • > When I type any website address in the textbox and click the button (download source), I want my program to read the source of the web page, but my code doesn't work for some websites.
    > Could you help me?

    Post your error checking code. What sites do you have problems with? I had no problem with ask.com.
    Saturday, December 24, 2011 8:31 AM
  • I wrote the error in the first post; I don't know exactly. For some websites like www.ask.com it doesn't work for me. I will check it again.

    Saturday, December 24, 2011 8:54 AM
  • The code I posted should do what you want once you add code to check for errors. If there was an error, you have to retry if you want to get the web page. Checking for an error can be as simple as showing a message box if there is an error. I don't understand what problem you are having. If you need a 100% reliable connection to every web site, you are asking for something that isn't possible.
    Personally, if I get 5 errors I assume the site is down, advise the user and abandon further attempts.
    • Edited by JohnWein Saturday, December 24, 2011 9:30 AM
    Saturday, December 24, 2011 9:28 AM
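
The retry policy described above could look roughly like this. It is a sketch, not the poster's code: `RetryDownload`, `TryDownload`, `TryDownloadPage` and the one-second pause between attempts are assumptions; only the limit of 5 attempts comes from the reply.

```csharp
using System;
using System.Net;
using System.Threading;

static class RetryDownload
{
    // Calls download() until it succeeds or maxAttempts is reached.
    // Returns false (and html = null) when every attempt threw a WebException.
    public static bool TryDownload(Func<string> download, int maxAttempts, int delayMs, out string html)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                html = download();
                return true;
            }
            catch (WebException)
            {
                // Pause before the next attempt, but not after the last one.
                if (attempt < maxAttempts) Thread.Sleep(delayMs);
            }
        }
        html = null;
        return false;
    }

    // Convenience wrapper: up to 5 attempts with WebClient, 1 second apart.
    public static bool TryDownloadPage(string url, out string html)
    {
        return TryDownload(() =>
        {
            using (var client = new WebClient())
            {
                return client.DownloadString(url);
            }
        }, 5, 1000, out html);
    }
}
```

When `TryDownloadPage` returns false, the program can tell the user the site seems to be down and stop trying.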
  • I will just add to this that you have written several times "... like www.ask.com". Some sites require a particular user agent string, or generate different content depending on the one passed to them. I have seen some sites return errors when the default agent is used in the WebClient, where specifying something, even something silly like "MOSAIC 0.89", in the user agent string "worked".
     
     

    --
    Mike
    Saturday, December 24, 2011 12:40 PM
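
A small sketch of setting the user agent on a `WebClient` before downloading. `AgentDownloader` and its methods are made-up names, and whether a given site actually needs this is an assumption to test, not a certainty.

```csharp
using System.Net;

static class AgentDownloader
{
    // Sets the User-Agent request header on the client.
    public static void ApplyUserAgent(WebClient client, string userAgent)
    {
        client.Headers[HttpRequestHeader.UserAgent] = userAgent;
    }

    // Downloads the page with a fresh client that sends the given user agent.
    public static string Download(string url, string userAgent)
    {
        using (var client = new WebClient())
        {
            ApplyUserAgent(client, userAgent);
            return client.DownloadString(url);
        }
    }
}
```

For example, `AgentDownloader.Download("http://www.ask.com", "MOSAIC 0.89")` sends the string mentioned above; a browser-like string can be tried the same way.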