none
Algorithm the longest common substring of two strings RRS feed

  • Question

  • For example

    string a="qwegf12ytrtk";

    string b="hghgsd12ytrtk";

    output is "12ytrtk". How to achieve it in using C# (most concise code).


    • Edited by Aimee.Young Sunday, November 19, 2017 2:52 PM
    Sunday, November 19, 2017 2:51 PM

Answers

  • Well, if you want a good algorithm, you can find it in Wikipedia:

    https://en.wikipedia.org/wiki/Longest_common_substring_problem

    The algorithm there is written in pseudo-code, but it shouldn't be horribly difficult to translate it into C#.

    However, if you want something quick and dirty but easy to understand, I suggest building two lists of all the substrings in each string, then doing the Intersection of both lists (can be done with LINQ) and taking the longest:

        class Program
        {
            static void Main(string[] args)
            {
                string a = "qwegf12ytrtk";
                string b = "hghgsd12ytrtk";
                var substringsOfA = FindAllSubstrings(a);
                var substringsOfB = FindAllSubstrings(b);
                var commonSubstrings = substringsOfA.Intersect(substringsOfB);
                string longestSubstring = commonSubstrings.OrderByDescending(x => x.Length).FirstOrDefault();
                Console.WriteLine(longestSubstring);
            }
    
            private static IEnumerable<string> FindAllSubstrings(string s)
            {
                List<string> list = new List<string>();
                for (int i = 0; i < s.Length; i++)
                {
                    for (int j = i; j < s.Length; j++)
                    {
                        string ss = s.Substring(i, j - i + 1);
                        list.Add(ss);
                    }
                }
                return list;
            }
        }


    Sunday, November 19, 2017 3:42 PM
    Moderator
  • Hello Aimee,

    Try the below code. which compares each char between the two strings and recode the common string's position and length in the two-dimensional array.

     public static string lcs(string a, string b)
            {
                var lengths = new int[a.Length, b.Length];
                int greatestLength = 0;
                string output = "";
                for (int i = 0; i < a.Length; i++)
                {
                    for (int j = 0; j < b.Length; j++)
                    {
                        if (a[i] == b[j])
                        {
                            lengths[i, j] = i == 0 || j == 0 ? 1 : lengths[i - 1, j - 1] + 1;
                            if (lengths[i, j] > greatestLength)
                            {
                                greatestLength = lengths[i, j];
                                output = a.Substring(i - greatestLength + 1, greatestLength);
                            }
                        }
                        else
                        {
                            lengths[i, j] = 0;
                        }
                    }
                }
                return output;
            }

    Test code

             
       string a1 = "qwegf12ytrtk";
                string b2 = "hghgsd12ytrtk";
    
                string output= lcs(a1, b2);

    Sincerely,

    Fei Hu


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Monday, November 20, 2017 6:23 AM
    Moderator

All replies

  • Well, if you want a good algorithm, you can find it in Wikipedia:

    https://en.wikipedia.org/wiki/Longest_common_substring_problem

    The algorithm there is written in pseudo-code, but it shouldn't be horribly difficult to translate it into C#.

    However, if you want something quick and dirty but easy to understand, I suggest building two lists of all the substrings in each string, then doing the Intersection of both lists (can be done with LINQ) and taking the longest:

        class Program
        {
            static void Main(string[] args)
            {
                string a = "qwegf12ytrtk";
                string b = "hghgsd12ytrtk";
                var substringsOfA = FindAllSubstrings(a);
                var substringsOfB = FindAllSubstrings(b);
                var commonSubstrings = substringsOfA.Intersect(substringsOfB);
                string longestSubstring = commonSubstrings.OrderByDescending(x => x.Length).FirstOrDefault();
                Console.WriteLine(longestSubstring);
            }
    
            private static IEnumerable<string> FindAllSubstrings(string s)
            {
                List<string> list = new List<string>();
                for (int i = 0; i < s.Length; i++)
                {
                    for (int j = i; j < s.Length; j++)
                    {
                        string ss = s.Substring(i, j - i + 1);
                        list.Add(ss);
                    }
                }
                return list;
            }
        }


    Sunday, November 19, 2017 3:42 PM
    Moderator
  • Hello Aimee,

    Try the below code. which compares each char between the two strings and recode the common string's position and length in the two-dimensional array.

     public static string lcs(string a, string b)
            {
                var lengths = new int[a.Length, b.Length];
                int greatestLength = 0;
                string output = "";
                for (int i = 0; i < a.Length; i++)
                {
                    for (int j = 0; j < b.Length; j++)
                    {
                        if (a[i] == b[j])
                        {
                            lengths[i, j] = i == 0 || j == 0 ? 1 : lengths[i - 1, j - 1] + 1;
                            if (lengths[i, j] > greatestLength)
                            {
                                greatestLength = lengths[i, j];
                                output = a.Substring(i - greatestLength + 1, greatestLength);
                            }
                        }
                        else
                        {
                            lengths[i, j] = 0;
                        }
                    }
                }
                return output;
            }

    Test code

             
       string a1 = "qwegf12ytrtk";
                string b2 = "hghgsd12ytrtk";
    
                string output= lcs(a1, b2);

    Sincerely,

    Fei Hu


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Monday, November 20, 2017 6:23 AM
    Moderator