Convert Email to Text

  • Question

  • Is there any way in C# to convert an email into plain text? Attachments are not important; I just need the body of the email as plain text. We are saving about 400 emails to a folder every day, and that may double soon. We need to pull just the text content from them.
    Monday, August 29, 2016 12:47 PM

Answers

  • Hello,

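    I have only done this with .eml files, not directly from, say, MS Outlook. In the sample there are several email files (.eml) generated by a test project that redirects SMTP messages to files via PickupDirectoryLocation. The code relies on COM references to the CDO and ADODB libraries for the CDO.Message and ADODB.Stream types.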

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Windows.Forms;
    
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }
        private string[] GetFileNames(string targetDirectory)
        {
            return Directory.GetFiles(targetDirectory);
        }
    
        private DateTime GetFileDateModified(string filePath)
        {
    
            FileInfo fileInfo = new FileInfo(filePath);
            return fileInfo.LastWriteTime;
        }
    
        private void button1_Click(object sender, EventArgs e)
        {
            List<Email> emailList = new List<Email>();
            string targetDirectory = "C:\\MailPickup";
            string[] fileNames = GetFileNames(targetDirectory);
            foreach (string emlFilePath in fileNames)
            {
    
                Email email = new Email();
    
                CDO.Message msg = new CDO.Message();
                ADODB.Stream stream = new ADODB.Stream();
    
                stream.Open(
                    Type.Missing, 
                    ADODB.ConnectModeEnum.adModeUnknown, 
                    ADODB.StreamOpenOptionsEnum.adOpenStreamUnspecified, 
                    String.Empty, 
                    String.Empty);
    
                stream.LoadFromFile(emlFilePath);
                stream.Flush();
                msg.DataSource.OpenObject(stream, "_Stream");
                msg.DataSource.Save();
    
                email.time = GetFileDateModified(emlFilePath);
                email.fileName = Path.GetFileName(emlFilePath);
                email.fromAddress = msg.From;
                email.toAddress = msg.To;
                email.emailSubject = msg.Subject;
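                // HTMLBody returns the HTML part of the message;
                // msg.TextBody returns the plain-text part when the message has one.
                email.content = msg.HTMLBody;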
                emailList.Add(email);
                stream.Close();
            }
    
            emailList.ForEach(mailItem =>
              {
                  Console.WriteLine(mailItem.fileName);
                  Console.WriteLine(mailItem.emailSubject);
                  Console.WriteLine(mailItem.fromAddress);
                  Console.WriteLine();
              }
    
            );
    
        }
    }
    
    
    public class Email
    {
        public Int32 id { get; set; }
        public DateTime time { get; set; }
        public string fileName { get; set; }
        public string fromAddress { get; set; }
        public string toAddress { get; set; }
        public string emailSubject { get; set; }
        public string content { get; set; }
    }
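
    With a few hundred messages arriving per day, the per-file work above can be wrapped in a small batch step that writes one .txt file per message. This is only a sketch: PlainTextExporter, ExportFolder and the extractText delegate are hypothetical names, and the delegate stands in for whatever returns the body text of a single .eml file (for example, the CDO code above).

```csharp
using System;
using System.IO;

public static class PlainTextExporter
{
    // Writes one .txt file into outputDir for every *.eml file in sourceDir.
    // extractText is a caller-supplied (hypothetical) function that returns
    // the plain-text body for a given .eml path.
    // Returns the number of files written during this run.
    public static int ExportFolder(string sourceDir, string outputDir, Func<string, string> extractText)
    {
        Directory.CreateDirectory(outputDir);
        int written = 0;

        foreach (string emlPath in Directory.EnumerateFiles(sourceDir, "*.eml"))
        {
            string target = Path.Combine(
                outputDir, Path.GetFileNameWithoutExtension(emlPath) + ".txt");

            // Skip messages that an earlier run already exported.
            if (File.Exists(target)) continue;

            try
            {
                File.WriteAllText(target, extractText(emlPath) ?? string.Empty);
                written++;
            }
            catch (Exception ex)
            {
                // One unreadable message should not stop the whole batch.
                Console.Error.WriteLine("Skipped " + emlPath + ": " + ex.Message);
            }
        }

        return written;
    }
}
```

    Because existing .txt files are skipped, the same call can be scheduled repeatedly against a folder that keeps growing.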




    • Proposed as answer by Kevin Linq Wednesday, September 7, 2016 4:38 AM
    • Marked as answer by DotNet Wang Thursday, September 8, 2016 5:56 AM
    Monday, August 29, 2016 3:00 PM
  • It depends on how you're getting the emails. If you're reading them straight from Exchange or another API, then you'll have to post in those forums. If you're reading the raw messages from disk, then it would be different.

    Curious to know why you'd be doing this, though. Auditing and archiving are supported by every mail server, so why not just set that up, and then you don't have to worry about it? If you want to track outgoing mail, then set up a centralized mail service that all your apps send emails to. It can then log the message in a DB or something for auditing purposes before forwarding it on to your mail server. Alternatively, you could set up an SMTP server that supports auditing and then simply relays the message to your actual mail server. This isn't really a situation where I think programming is necessary.

    Michael Taylor
    http://www.michaeltaylorp3.net

    • Marked as answer by DotNet Wang Thursday, September 8, 2016 5:56 AM
    Monday, August 29, 2016 3:04 PM
  • Hi HTHP,

    Thanks for posting here.

    I hope my reply helps.

    1. If you want to read the message text directly, you can use the Exchange Web Services API (Microsoft.Exchange.WebServices).

    2. If you want to convert HTML to plain text in C#, you can use the System.Text.RegularExpressions namespace. The sample consists of a single function, StripHTML(). Here is the code sample:

    // Requires: using System.Text.RegularExpressions;
    //           using System.Windows.Forms; (for MessageBox)
    private string StripHTML(string source)
    {
        try
        {
            string result;

            // Remove HTML development formatting.
            // Replace line breaks with spaces because browsers insert a space.
            result = source.Replace("\r", " ");
            result = result.Replace("\n", " ");
            // Remove step-formatting
            result = result.Replace("\t", string.Empty);
            // Remove repeating spaces because browsers ignore them
            result = Regex.Replace(result, @"( )+", " ");

            // Remove the header (prepare first by clearing attributes)
            result = Regex.Replace(result, @"<( )*head([^>])*>", "<head>", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, @"(<( )*(/)( )*head( )*>)", "</head>", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, "(<head>).*(</head>)", string.Empty, RegexOptions.IgnoreCase);

            // Remove all scripts (prepare first by clearing attributes)
            result = Regex.Replace(result, @"<( )*script([^>])*>", "<script>", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, @"(<( )*(/)( )*script( )*>)", "</script>", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, @"(<script>).*(</script>)", string.Empty, RegexOptions.IgnoreCase);

            // Remove all styles (prepare first by clearing attributes)
            result = Regex.Replace(result, @"<( )*style([^>])*>", "<style>", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, @"(<( )*(/)( )*style( )*>)", "</style>", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, "(<style>).*(</style>)", string.Empty, RegexOptions.IgnoreCase);

            // Insert tabs in place of <td> tags
            result = Regex.Replace(result, @"<( )*td([^>])*>", "\t", RegexOptions.IgnoreCase);

            // Insert line breaks in place of <BR> and <LI> tags
            result = Regex.Replace(result, @"<( )*br( )*>", "\r", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, @"<( )*li( )*>", "\r", RegexOptions.IgnoreCase);

            // Insert paragraphs (double line breaks) in place of
            // <P>, <DIV> and <TR> tags
            result = Regex.Replace(result, @"<( )*div([^>])*>", "\r\r", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, @"<( )*tr([^>])*>", "\r\r", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, @"<( )*p([^>])*>", "\r\r", RegexOptions.IgnoreCase);

            // Remove remaining tags like <a>, links, images,
            // comments etc - anything that's enclosed inside < >
            result = Regex.Replace(result, @"<[^>]*>", string.Empty, RegexOptions.IgnoreCase);

            // Replace special characters:
            result = Regex.Replace(result, @"&nbsp;", " ", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, @"&bull;", " * ", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, @"&lsaquo;", "<", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, @"&rsaquo;", ">", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, @"&trade;", "(tm)", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, @"&frasl;", "/", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, @"&lt;", "<", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, @"&gt;", ">", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, @"&copy;", "(c)", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, @"&reg;", "(r)", RegexOptions.IgnoreCase);
            // Remove all others. More can be added, see
            // http://hotwired.lycos.com/webmonkey/reference/special_characters/
            result = Regex.Replace(result, @"&(.{2,6});", string.Empty, RegexOptions.IgnoreCase);

            // Make line breaking consistent
            result = result.Replace("\n", "\r");

            // Remove extra line breaks and tabs:
            // replace over 2 breaks with 2 and over 4 tabs with 4.
            // Prepare first to remove any whitespace in between
            // the escaped characters and remove redundant tabs in between line breaks
            result = Regex.Replace(result, "(\r)( )+(\r)", "\r\r", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, "(\t)( )+(\t)", "\t\t", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, "(\t)( )+(\r)", "\t\r", RegexOptions.IgnoreCase);
            result = Regex.Replace(result, "(\r)( )+(\t)", "\r\t", RegexOptions.IgnoreCase);
            // Remove redundant tabs
            result = Regex.Replace(result, "(\r)(\t)+(\r)", "\r\r", RegexOptions.IgnoreCase);
            // Replace multiple tabs following a line break with just one tab
            result = Regex.Replace(result, "(\r)(\t)+", "\r\t", RegexOptions.IgnoreCase);

            // Initial replacement target string for line breaks
            string breaks = "\r\r\r";
            // Initial replacement target string for tabs
            string tabs = "\t\t\t\t\t";
            for (int index = 0; index < result.Length; index++)
            {
                result = result.Replace(breaks, "\r\r");
                result = result.Replace(tabs, "\t\t\t\t");
                breaks = breaks + "\r";
                tabs = tabs + "\t";
            }

            // That's it.
            return result;
        }
        catch
        {
            MessageBox.Show("Error");
            return source;
        }
    }
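
    The long list of per-entity replacements in StripHTML() can also be left to the framework. The following is a minimal sketch, not a drop-in replacement: it assumes .NET 4.0 or later, where System.Net.WebUtility.HtmlDecode is available, and the names HtmlText and ToPlainText are hypothetical. Tags are removed before entities are decoded so that text such as &lt;now&gt; is not mistaken for a tag.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlText
{
    // Hypothetical helper: strips tags with regular expressions,
    // then decodes HTML entities with WebUtility.HtmlDecode.
    public static string ToPlainText(string html)
    {
        if (string.IsNullOrEmpty(html)) return string.Empty;

        // Source line breaks and indentation carry no meaning in HTML.
        string text = Regex.Replace(html, @"\s+", " ");

        // Drop <head>, <script> and <style> blocks together with their contents.
        text = Regex.Replace(text, @"<(head|script|style)\b[^>]*>.*?</\1\s*>",
            string.Empty, RegexOptions.IgnoreCase | RegexOptions.Singleline);

        // Turn <br> and closing block-level tags into line breaks.
        text = Regex.Replace(text, @"<br\s*/?>|</(p|div|tr|li)\s*>", "\n",
            RegexOptions.IgnoreCase);

        // Remove every remaining tag.
        text = Regex.Replace(text, @"<[^>]+>", string.Empty);

        // Decode named and numeric entities (&amp;, &lt;, &#169;, ...).
        text = WebUtility.HtmlDecode(text);

        // Replace non-breaking spaces, then collapse runs of spaces and tabs.
        text = text.Replace('\u00A0', ' ');
        text = Regex.Replace(text, @"[ \t]+", " ");

        return text.Trim();
    }
}
```

    Like the original function, this is regex-based and will be confused by malformed markup; an HTML parser is the safer choice for messages from unknown senders.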


    In addition, I found a blog post that covers what you need.

    Here is the link: https://blogs.msdn.microsoft.com/ukcrm/2008/07/10/converting-html-e-mail-to-plain-text/

    If you think it is helpful, please mark it.

    Best regards

     Kevin



    • Marked as answer by DotNet Wang Thursday, September 8, 2016 5:57 AM
    Tuesday, August 30, 2016 4:58 AM
  • I thank everyone for the contributions. This community is great!

    It turns out I have to move on from this issue and do not have time to resolve it fully. We have come up with an alternate solution to our problem.

    If I can come back to this, I will. It is not to archive the emails, but to extract data from them. The html was making that a pain, so I wanted to convert it to plain text. However, as I said, I think we are taking a different approach besides email body.

    Thanks again!
    • Proposed as answer by DotNet Wang Wednesday, September 7, 2016 5:51 AM
    • Marked as answer by DotNet Wang Thursday, September 8, 2016 5:56 AM
    Tuesday, August 30, 2016 12:51 PM

All replies

  • Hi HTHP,

    Thanks for the feedback.

    Have you solved your issue now?

    If yes, please remember to mark the useful reply as the answer to close the thread.

    If not, see the reply below. If you use Outlook 2010 or later, you can follow these six steps:

    • Start Outlook.
    • Click the File tab in the Ribbon, and then click Options on the menu.
    • Click Trust Center on the Options menu.
    • Click the Trust Center Settings tab.
    • Click E-mail Security.
    • Under Read as Plain Text, click to select the Read all standard mail in plain text check box.
    You can also refer to this link:

    http://lifehacker.com/5407412/convert-outlook-emails-to-plain-text-one-by-one-or-permanently 

    If you want to extract data, you can use the free and open source HtmlAgilityPack, which has in one of its samples a method that converts HTML to plain text. Here is the code snippet:

    var plainText = ConvertToPlainText(html);

    From your description, I understand you are extracting data from the emails.
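
    Once the body is plain text, pulling values out of it is often a matter of matching the lines of interest. A minimal sketch, assuming the emails contain "Label: value" lines; the thread does not say what the data looks like, so the line format is an assumption and FieldExtractor and ExtractFields are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FieldExtractor
{
    // Matches a single line of the form "Label: value".
    // The label starts with a letter and may contain letters, digits and spaces.
    private static readonly Regex LabeledLine = new Regex(
        @"^[ \t]*(?<label>[A-Za-z][A-Za-z0-9 ]*?)[ \t]*:[ \t]*(?<value>[^\r\n]+?)[ \t]*\r?$",
        RegexOptions.Multiline);

    // Returns the labeled values found in plainText, keyed by label
    // (case-insensitive). The first occurrence of a label wins.
    public static Dictionary<string, string> ExtractFields(string plainText)
    {
        var fields = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        foreach (Match m in LabeledLine.Matches(plainText ?? string.Empty))
        {
            string label = m.Groups["label"].Value;
            if (!fields.ContainsKey(label))
            {
                fields[label] = m.Groups["value"].Value;
            }
        }

        return fields;
    }
}
```

    For example, a body containing the made-up line "Order Number: 12345" would yield an entry with the key "Order Number" and the value "12345".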

    Best regards

    Kevin




    • Edited by Kevin Linq Wednesday, September 7, 2016 1:13 PM
    Wednesday, September 7, 2016 1:13 PM