comparing string patterns in C#

  • Question

  • Hi -

    Being new to C#, I am unsure where to start. Currently I have a text file with a lot of IPs and domain names. I want to tell the IPs and domain names apart and put them into two separate lists while keeping the matching ID from the first column.

    I do not have any code started at this time. If there is anything I can add to this question to make it clearer, I am happy to do so. Nothing I find by searching is even close to what I am looking for. At times my biggest problem is just knowing what search terms to use to retrieve better results.

    Here is a sample of the file I am working with:

    311632,maxicollection.us
    311631,211.75.103.32
    311630,122.175.245.171
    311629,pastebin.com
    311628,176.113.161.126
    311627,72.2.248.221
    311577,46.248.193.75
    311576,dianrizkisantosa.com
    311575,111.40.111.202

    Thanks - Keith

    Saturday, February 8, 2020 6:03 PM

Answers


  • Then ask yourself what distinguishes a DN from an IP. How do you tell them
    apart when you eyeball them? Then write code that does the same checking.

    Some obvious criteria:

    I expect a DN will always have letters in it, such as the .com, .net, or
    .org suffix. Will an IP address ever have any letters in it?

    As a PoC (Proof of Concept) example, consider this:

    using System.Collections.Generic;

    class Program
    {
        static void Main(string[] args)
        {
            // Sample lines in "ID,value" form
            List<string> ls = new List<string> {
                "311632,maxicollection.us",
                "311631,211.75.103.32",
                "311630,122.175.245.171",
                "311629,pastebin.com",
                "311628,176.113.161.126",
                "311627,72.2.248.221",
                "311577,46.248.193.75",
                "311576,dianrizkisantosa.com",
                "311575,111.40.111.202"};

            List<string> DNs = new List<string>();
            List<string> IPs = new List<string>();

            foreach (string s in ls)
            {
                bool isAlpha = false;
                // line[0] is the ID, line[1] is the DN or IP
                var line = s.Split(',');
                // Look for any letter in the second column
                for (int i = 0; i < line[1].Length; i++)
                {
                    if (char.IsLetter(line[1][i]))
                    {
                        isAlpha = true;
                        break;
                    }
                }
                // A letter means a domain name; otherwise treat it as an IP
                if (isAlpha) DNs.Add(s);
                else IPs.Add(s);
            }
        }
    }
    

    Error checking is left as an exercise for the OP.

    If the file contains lines of a different format then they will have to be
    filtered out.
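
One possible way to do that filtering is sketched below. It is not part of the reply above, only an illustration under some assumptions: the names LineSorter and Classify are made up, every valid line is assumed to have exactly two comma-separated fields, and System.Net.IPAddress.TryParse is swapped in for the letter test. That swap matters for IPv6 addresses such as 2001:db8::1, which contain letters and would otherwise land in the DN list.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

static class LineSorter
{
    // Sorts "ID,value" lines into DN and IP lists; anything else goes to 'rejected'.
    // (Hypothetical helper, names chosen for illustration only.)
    public static void Classify(IEnumerable<string> lines,
        List<string> dns, List<string> ips, List<string> rejected)
    {
        foreach (string raw in lines)
        {
            string s = raw.Trim();
            if (s.Length == 0) continue;   // skip blank lines

            string[] parts = s.Split(',');
            // Assumption: a valid line has exactly two non-empty fields
            if (parts.Length != 2 || parts[0].Length == 0 || parts[1].Length == 0)
            {
                rejected.Add(s);
                continue;
            }

            if (IPAddress.TryParse(parts[1], out _)) ips.Add(s);
            else if (parts[1].Contains(".")) dns.Add(s);   // crude test: a DN has at least one dot
            else rejected.Add(s);
        }
    }
}
```

To read from a file instead of a hard-coded list, the first argument could be File.ReadLines("input.txt") from System.IO, where the file name is a placeholder. TryParse is lenient and also accepts some shorthand numeric forms (a bare integer, for example), so stricter validation may be needed for untrusted input.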

    - Wayne

    • Marked as answer by studysession Saturday, February 8, 2020 11:10 PM
    Saturday, February 8, 2020 9:39 PM
  • >I have learned a lot.

    A little additional:

    As I mentioned in an earlier reply, an alternative to using Split() and working
    with the resulting substrings is to use IndexOf and examine the DN/IP part of
    the overall string itself.

    foreach (string s in ls )
    {
        bool isAlpha = false;
        int idx = s.IndexOf(',');
        for (int n = idx + 1; n < s.Length; n++)
        {
            if (char.IsLetter(s[n]))
            {
                isAlpha = true;
                break;
            }
        }
        if (isAlpha) DNs.Add(s);
        else IPs.Add(s);
    }
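
As a further variation, the inner loop can be folded into one expression with LINQ. This is only a sketch: the FieldCheck class and HasLetterAfterComma method are hypothetical names, and the behaviour for a line without a comma (the whole line is checked, because IndexOf returns -1) mirrors what the loop above does.

```csharp
using System;
using System.Linq;

static class FieldCheck
{
    // True when the text after the first comma contains at least one letter.
    public static bool HasLetterAfterComma(string s)
    {
        int idx = s.IndexOf(',');
        // With no comma, idx is -1 and Substring(0) examines the whole line.
        return s.Substring(idx + 1).Any(char.IsLetter);
    }
}
```

Inside the foreach, the classification then becomes a single test: add s to DNs when FieldCheck.HasLetterAfterComma(s) is true, otherwise to IPs. Substring allocates a new string per line, which the index-based loop avoids.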
    

    - Wayne

    • Marked as answer by studysession Monday, February 10, 2020 12:37 AM
    Monday, February 10, 2020 12:28 AM

All replies


  • Some questions:

    Are you trying to compare on the first column (substring) for matches?

    If so, is the file in sorted order by that column? 
    If so, ascending or descending?

    Is your aim to put strings/lines with matches in separate lists - one with
    lines that have domain names and the other with lines that have IP addresses?

    If so, what about lines that don't have matches?

    e.g. -

    Assuming input:

    311632,maxicollection.us
    311632,211.75.103.32
    311630,122.175.245.171
    311629,pastebin.com
    311629,176.113.161.126
    311628,pastry.com
    311627,72.2.248.221

    Then list1 (DNs) will have:

    311632,maxicollection.us
    311629,pastebin.com

    and list2 (IPs) will have:

    311632,211.75.103.32
    311629,176.113.161.126

    Is that what you want to achieve? If not, then explain more fully and give
    examples of input and expected output.
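
If that matched-key reading were the goal, it could be sketched roughly as follows. The KeyMatcher class and its methods are hypothetical names, the two input lists are assumed to be already separated into DN lines and IP lines, and every line is assumed to contain a comma.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class KeyMatcher
{
    // Returns the ID, i.e. the text before the first comma.
    // Assumption: the line contains a comma; otherwise Substring throws.
    static string KeyOf(string line)
    {
        return line.Substring(0, line.IndexOf(','));
    }

    // Keeps only the lines whose ID appears in both lists.
    public static (List<string> Dns, List<string> Ips) KeepMatched(
        List<string> dns, List<string> ips)
    {
        var dnKeys = new HashSet<string>(dns.Select(l => KeyOf(l)));
        var ipKeys = new HashSet<string>(ips.Select(l => KeyOf(l)));
        return (dns.Where(l => ipKeys.Contains(KeyOf(l))).ToList(),
                ips.Where(l => dnKeys.Contains(KeyOf(l))).ToList());
    }
}
```

With the sample input above, the DN list keeps the 311632 and 311629 lines and drops 311628,pastry.com, while the IP list keeps the 311632 and 311629 lines and drops 311630 and 311627.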

    >Anything I search is not even close to what I am looking for. At times my 
    >biggest problem is just knowing what search terms to use to retrieve better
    >results.

    Apologies if the following sounds patronizing or "preachy":

    <start of sermon>
    I would suggest that the biggest problem is that you are trying to find
    pre-built solutions instead of programming the solutions yourself. Programming
    is about creative use of applied logic, not about finding code that others
    have written for every new task encountered. (Obviously code reuse is an
    important aspect of programming, but it is not the solution to every new
    problem.)

    Writing a program in any language is much like writing a composition using any
    natural language (English, French, Italian, etc.). You take the basic tools of
    the language - syntax, grammar, "alphabet", etc. and you craft the desired
    outcome. You only develop those skills by constantly exercising them, learning
    what works and what doesn't by personal experience.

    To code effectively in C# you need first to become as familiar as possible with
    the language, its quirks, strengths, weaknesses, resources, etc. Then apply
    that knowledge to the creation of task solutions.
    <end of sermon>

    - Wayne

    After-thought: If there are lines in the file with different content/formats
    then you need to specify that as well.

    Saturday, February 8, 2020 7:13 PM
  • Will there be cases where there may be multiple lines that have the same "key"?

    e.g. - 

    311632,maxicollection.us
    311632,211.75.103.32
    311632,maxicollection.us
    311632,maxedcollection.us
    311632,211.75.103.32
    311632,211.75.103.99

    etc.

    - Wayne

    Saturday, February 8, 2020 7:18 PM
  • As Wayne has made very clear, the first step is to clearly define the requirements. Once you have defined them for yourself, the next step is to work out how to do each sub-task; if you need help with a sub-task, you can ask about that. This question asks how to do the entire project, and it works better to ask about one piece at a time. I am confident that you can find previous answers and other material that already exist to help you with most or all of the pieces.


    Sam Hobbs
    SimpleSamples.Info

    Saturday, February 8, 2020 7:49 PM
  • I want to keep the first column matched to the second column, as shown in the example.

    I want to make two lists. 

    • One list showing 
    311631,211.75.103.32
    311630,122.175.245.171
    311628,176.113.161.126
    311627,72.2.248.221
    311577,46.248.193.75
    311575,111.40.111.202

    • The second list showing
    311632,maxicollection.us
    311629,pastebin.com
    311576,dianrizkisantosa.com

    Hope that shows a good enough example of what I am asking.

    I am unsure where to start in comparing the second columns to determine IP vs FQDN.

    Thanks - Keith


    Saturday, February 8, 2020 7:59 PM
  • First you want to isolate the second column from the first for checking/comparison purposes.

    You can do that using String.Split on the comma. Or you can use IndexOf to
    find the location following the comma in the string.
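
Both approaches can be sketched in a few lines. The ColumnDemo class and its method names are made up for illustration, and each line is assumed to contain at least one comma.

```csharp
using System;

static class ColumnDemo
{
    // Second column via Split: element 0 is the ID, element 1 is the value.
    public static string ViaSplit(string line)
    {
        return line.Split(',')[1];
    }

    // Second column via IndexOf: everything after the first comma.
    public static string ViaIndexOf(string line)
    {
        return line.Substring(line.IndexOf(',') + 1);
    }
}
```

For "311629,pastebin.com" both return "pastebin.com". They differ only when the value itself contains a comma: Split stops at the next comma, while the IndexOf version returns the rest of the line.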

    Then ask yourself what distinguishes a DN from an IP. How do you tell them
    apart when you eyeball them? Then write code that does the same checking.

    Some obvious criteria:

    I expect a DN will always have letters in it, such as the .com, .net, or
    .org suffix. Will an IP address ever have any letters in it?

    - Wayne


    • Edited by WayneAKing Sunday, February 9, 2020 3:44 AM
    Saturday, February 8, 2020 9:00 PM

  • Much appreciated - I believe I follow what you did other than the line with .Length.

    Why would the .Length be there?

    Saturday, February 8, 2020 11:10 PM
  • Cannot thank you enough for the quick response and help. This really was a show stopper for me.

    I am trying to teach myself C#. I made up a project manipulating all different kinds of file formats.

    Thank you

    Saturday, February 8, 2020 11:14 PM
  • I changed the formatting a little and then wrote the results to the screen using your code.

    List<string> ls = new List<string> {
        "311632,maxicollection.us",
        "311631,211.75.103.32",
        "311630,122.175.245.171",
        "311629,pastebin.com",
        "311628,176.113.161.126",
        "311627,72.2.248.221",
        "311577,46.248.193.75",
        "311576,dianrizkisantosa.com",
        "311575,111.40.111.202"};

    List<string> DNs = new List<string>();
    List<string> IPs = new List<string>();

    foreach (string s in ls)
    {
        bool isAlpha = false;
        var line = s.Split(',');

        Console.WriteLine(line[1]);

        for (int i = 0; i < line[1].Length; i++)
        {
            if (char.IsLetter(line[1][i]))
            {
                isAlpha = true;
                break;
            }
        }
        //if (isAlpha) DNs.Add(s);
        //else IPs.Add(s);
        if (isAlpha)
        {
            DNs.Add(s);
        }
        else
        {
            IPs.Add(s);
        }
    }
    foreach (string v1 in DNs)
    {
        Console.WriteLine("\tDNs: " + v1);
    }
    foreach (string v1 in IPs)
    {
        Console.WriteLine("\tIPs: " + v1);
    }

    Saturday, February 8, 2020 11:16 PM
  • >I believe I follow what you did other than the line with .Length.
    >Why would the .Length be there?

    line[1].Length is the number of characters in the substring that we are going 
    to check. It controls how many times the loop will iterate, so we can check 
    each character in the string to see if it is a letter.
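
To make the role of .Length concrete, here is a small sketch with hypothetical names (LetterScan, ContainsLetter, ContainsLetterForeach). The first method uses the same index-based loop as the code above; the second does the same scan with foreach, which visits every character without an explicit Length.

```csharp
using System;

static class LetterScan
{
    // Index-based loop: i runs from 0 up to text.Length - 1,
    // returning as soon as a letter is found.
    public static bool ContainsLetter(string text)
    {
        for (int i = 0; i < text.Length; i++)
        {
            if (char.IsLetter(text[i])) return true;
        }
        return false;
    }

    // Equivalent scan with foreach; no index or Length is needed.
    public static bool ContainsLetterForeach(string text)
    {
        foreach (char c in text)
        {
            if (char.IsLetter(c)) return true;
        }
        return false;
    }
}
```

For an empty string, Length is 0, so the loop body never runs and both methods return false.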

    - Wayne

    Saturday, February 8, 2020 11:51 PM
  • Thank you - I have learned a lot.
    Sunday, February 9, 2020 2:54 PM