Extract email list from a csv file

    Question

  • Hello,

    I have hundreds of contacts in Gmail. I exported all of them to a CSV file using Gmail's built-in export function. The contacts fall into several categories, such as "WTCDE New All_2". The CSV file's format is messy; each contact takes one line in the file.
    For example:

    Roger Sune,Roger,,Sune,,,,,,,,,,,,,,,,,,,,,,,WTCDE New All_2 ::: WTCDE Bible ::: WTCDE attendants ::: WTCDE formal members ::: sunday worship coworkers ::: * My Contacts,* ,rogerpkSune@aol.com,,,,,,,,,,,,,,,,,,,,,,,,,
    Rollin Burwers,Rollin,,Burwers,,,,,,,,,,,,,,,,,,,,,,,,* ,burwers@mindspring.com,,,,,,,,,,,,,,,,,,,,,,,,,
    aking@abcd.us,aking@abcd.us,,,,,,,,,,,,,,,,,,,,,,,,,,* ,aking@abcd.us,,,,,,,,,,,,,,,,,,,,,,,,,

    Now I want to extract all the email addresses to a file.
    The output file's format should look like this:
    rogerpkSune@aol.com,burwers@mindspring.com,aking@abcd.us 
    

    Thanks.
    Monday, September 19, 2011 5:53 PM

Answers

  • using System;
    using System.Collections.Generic;
    using System.Text;
    using System.Text.RegularExpressions;
    
    namespace SharxXEmailExtractorTest
    {
        public class SharxXEmailAddressesExtractor
        {
            public static string GetEmailAddressesFromString(string sourceString)
            {
                // Find every substring that looks like an e-mail address.
                MatchCollection matchCollection = Regex.Matches(sourceString, @"([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})");

                // Collect the matches, then join them once instead of
                // concatenating strings inside the loop.
                List<string> addresses = new List<string>();
                foreach (Match match in matchCollection)
                {
                    addresses.Add(match.Value);
                }

                return string.Join(", ", addresses.ToArray());
            }
        }
    }
    
    

    using System;
    using System.Collections.Generic;
    using System.ComponentModel;
    using System.Data;
    using System.Drawing;
    using System.Text;
    using System.Windows.Forms;
    using System.IO;
    
    namespace SharxXEmailExtractorTest
    {
        public partial class SharxXTestForm : Form
        {
            public SharxXTestForm()
            {
                InitializeComponent();
            }
    
            private void button1_Click(object sender, EventArgs e)
            {
                // Dispose the dialog when done, and continue only if the user picked a file.
                using (OpenFileDialog opf = new OpenFileDialog())
                {
                    opf.Filter = "CSV files (*.csv)|*.csv";
                    if (opf.ShowDialog() != DialogResult.OK)
                    {
                        return;
                    }

                    string fileContent = File.ReadAllText(opf.FileName);

                    string addresses = SharxXEmailAddressesExtractor.GetEmailAddressesFromString(fileContent);
                    MessageBox.Show(addresses);
                }
            }
        }
    }
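
    The question asks for the addresses in a file, separated by commas without spaces, and the sample data repeats one address several times on the same line. A minimal sketch of that variant follows, reusing the regular expression from the answer above. The class name `EmailListWriter`, its method names, and the `csvPath`/`outputPath` parameters are hypothetical and not part of the original answer; treating addresses as case-insensitive when removing duplicates is also an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public static class EmailListWriter
{
    // Same pattern as in the answer above.
    private static readonly Regex EmailPattern = new Regex(
        @"([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})");

    // Returns each distinct address once, in order of first appearance.
    public static List<string> ExtractDistinct(string text)
    {
        // Assumption: addresses that differ only in letter case are duplicates.
        HashSet<string> seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        List<string> result = new List<string>();
        foreach (Match match in EmailPattern.Matches(text))
        {
            if (seen.Add(match.Value))
            {
                result.Add(match.Value);
            }
        }
        return result;
    }

    // Reads the exported CSV and writes the addresses as one comma-separated line.
    public static void WriteToFile(string csvPath, string outputPath)
    {
        List<string> addresses = ExtractDistinct(File.ReadAllText(csvPath));
        File.WriteAllText(outputPath, string.Join(",", addresses.ToArray()));
    }
}
```

    In the form above, `MessageBox.Show(addresses)` could then be replaced by a call such as `EmailListWriter.WriteToFile(opf.FileName, outputPath)`, with `outputPath` chosen by the caller.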
    


    • Marked as answer by ardmore Monday, September 19, 2011 8:31 PM
    Monday, September 19, 2011 6:22 PM

All replies

  • Does http://www.stellman-greene.com/CSVReader/ fit your needs? It returns a DataTable, which you can easily iterate over (http://www.dotnetperls.com/datatable-foreach). Then just read the right column, concatenate each value to the result, and handle the cases where the e-mail address field is empty.
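
    The column-based idea can also be sketched without a third-party reader, using `TextFieldParser` from the `Microsoft.VisualBasic` assembly (which has to be referenced from the C# project). This is not the CSVReader library linked above. It assumes the exported file starts with a header row, which the sample in the question does not show; the class name `EmailColumnReader` and the `emailHeader` parameter are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class EmailColumnReader
{
    // Reads one column from a CSV whose first row is a header.
    // emailHeader is the header text of the e-mail column (check it in your file).
    public static List<string> ReadEmailColumn(TextReader reader, string emailHeader)
    {
        List<string> emails = new List<string>();
        using (TextFieldParser parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true; // allows commas inside quoted fields

            if (parser.EndOfData)
            {
                return emails; // empty file
            }

            // Locate the e-mail column in the header row.
            string[] header = parser.ReadFields();
            int emailIndex = Array.IndexOf(header, emailHeader);
            if (emailIndex < 0)
            {
                throw new ArgumentException("Column not found: " + emailHeader);
            }

            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                // Skip short rows and contacts without an address.
                if (fields.Length > emailIndex && fields[emailIndex].Trim().Length > 0)
                {
                    emails.Add(fields[emailIndex].Trim());
                }
            }
        }
        return emails;
    }
}
```

    A call might look like `EmailColumnReader.ReadEmailColumn(new StreamReader(csvPath), headerName)`, where `headerName` is whatever the export actually calls its e-mail column. Compared with the regular-expression approach, this only looks at one column, so text that merely resembles an address in a name or notes field is ignored.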
    • Edited by MCCZ Monday, September 19, 2011 6:07 PM
    Monday, September 19, 2011 6:05 PM
  • Hi,

    Connect to the CSV file via ODBC or OLE DB.

    http://www.switchonthecode.com/tutorials/csharp-tutorial-using-the-built-in-oledb-csv-parser

    http://dotnet-snippets.de/dns/csv-datei-in-datatable-einlesen-SID518.aspx

    BTW: Please do not post *valid* email addresses of third parties; they might get mail they don't want to receive ;-)

    Regards,

      Thorsten


    Monday, September 19, 2011 6:07 PM
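  • Hmm, what you have to do is split the text into fields somehow, and then check which fields contain "@".

    The main problem is the splitting. Try splitting on the comma "," and also on line breaks, so that the last field of one line does not run into the first field of the next.

    Do this:

    //get all into a string (csvPath is a placeholder for the path of the exported file)...
    string wholeText = File.ReadAllText(csvPath);
    //then:
    string[] fields = wholeText.Split(new char[] { ',', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
    List<string> emails = new List<string>();
    foreach (string field in fields)
    {
        // keep only the fields that look like an e-mail address
        if (field.Contains("@"))
            emails.Add(field.Trim());
    }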
    



    Mitja
    Monday, September 19, 2011 6:15 PM
  • I'm very glad to read the active discussions/suggestions.

    Thanks all for your quick help.


    Martin Xie [MSFT]

    Tuesday, September 20, 2011 4:56 AM
    Moderator