Inserting a char in a text file

  • Question

  • Hi everybody in the forum,


    I am trying to compare two text files. I can compare both of them in C#, but I am unable to point out where the differences actually occurred. For example:

    textfile1.txt                 textfile2.txt

    This is a text file           This is b textfille
    I go to school everyday.      I go to schoool everyday.

    textfile1.txt is my reference text file and textfile2.txt is the one I am comparing against it. I want to insert a symbol like ^ wherever textfile2.txt goes wrong. So, after the comparison, my C# .NET application should write this to textfile2.txt:

    textfile2.txt

    This is b textfille
            ^        ^
    I go to schoool everyday.
                 ^

    How can I do this? Please help me. Thank you.



    Suman
            
    Tuesday, September 23, 2008 7:15 AM

Answers

  • Best if you have the latest NUnit handy ;)

    // Needs: using System; using System.Text; using NUnit.Framework;
    // [RowTest] and [Row] come from NUnit's RowTest extension.

    [RowTest]
    [Row( "abc", "abc", "   ", "perfect match" )]
    [Row( "abcd", "abc", "   ^", "first line is longer" )]
    [Row( "abc", "abcd", "   ^", "second line is longer" )]
    [Row( "adc", "abc", " ^ ", "difference in the middle" )]
    public void TestCompare (string line1, string line2, string expect, string explain) {
        string difference = Compare( line1, line2 ).ToString();
        Assert.AreEqual( expect, difference, explain );
    }


    /// <summary>
    /// Compares 2 strings, and returns a StringBuilder which shows which characters,
    /// in line2, are different from line1
    /// </summary>
    /// <param name="line1">line to compare against</param>
    /// <param name="line2">line whose differences to show</param>
    /// <returns>a string containing (space) for a matching character and
    /// '^' for a difference</returns>
    public StringBuilder Compare (string line1, string line2) {
        int min = Math.Min( line1.Length, line2.Length );
        int max = Math.Max( line1.Length, line2.Length );

        StringBuilder difference = new StringBuilder( max );

        // Compare position by position over the part both lines share
        for ( int i = 0; i < min; i++ ) {
            if ( line1[i] != line2[i] ) {
                difference.Append( '^' );
            } else {
                difference.Append( ' ' );
            }
        }

        // Whatever the longer line has beyond the shorter one counts as a difference
        difference.Append( '^', max - difference.Length );
        return difference;
    }

    You'll figure something out from here, right?

    ...

    OK, the Compare(string, string) method is what you asked for. Just write its result to the file after each line.
    I wrote it in 5 minutes for you. It compares each line character by character rather than word by word, because a word comparison looks much more complicated. If you need a word comparison but can't figure it out, ask again.


    if a problem looks too big, break it into smaller objects
    Tuesday, September 23, 2008 9:53 AM
  • Hi

    If you check the initial post, you will see that the problem is trickier than that.
    If, for example, one word contains an extra character, the application should be able to realign the rest of the string to find the best match with the fewest errors. With the proposed code, every character after an extra one (a position shift) would be treated as a difference, which doesn't help very much.

    This problem basically needs a sort of Diff algorithm like:
    http://www.codeproject.com/KB/recipes/diffengine.aspx

    Or the library Menees Diff:
    http://www.menees.com/index.html
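
    To illustrate the realignment idea, here is a minimal sketch that marks only the characters of the second line that are not part of a longest common subsequence (LCS) with the first line. It is not the code from either linked project; the class name AlignedMarker and the method Mark are made up for this example. Characters that are missing from the second line are not marked, since there is no position to put the caret under.

```csharp
using System;
using System.Text;

// Hypothetical helper, not taken from the linked diff libraries.
public static class AlignedMarker
{
    // Returns a marker line for 'actual': ' ' under characters that align
    // with 'expected', '^' under characters that do not.
    public static string Mark(string expected, string actual)
    {
        int n = expected.Length, m = actual.Length;

        // lcs[i, j] = length of the longest common subsequence
        // of expected[i..] and actual[j..]
        int[,] lcs = new int[n + 1, m + 1];
        for (int i = n - 1; i >= 0; i--)
        {
            for (int j = m - 1; j >= 0; j--)
            {
                lcs[i, j] = expected[i] == actual[j]
                    ? lcs[i + 1, j + 1] + 1
                    : Math.Max(lcs[i + 1, j], lcs[i, j + 1]);
            }
        }

        // Start with every position marked, then clear the aligned ones
        StringBuilder marks = new StringBuilder(new string('^', m));
        int x = 0, y = 0;
        while (x < n && y < m)
        {
            if (expected[x] == actual[y])
            {
                marks[y] = ' ';   // aligned character, no marker
                x++;
                y++;
            }
            else if (lcs[x + 1, y] >= lcs[x, y + 1])
            {
                x++;              // skip a character only 'expected' has
            }
            else
            {
                y++;              // extra or wrong character in 'actual' stays marked
            }
        }
        return marks.ToString();
    }
}
```

    With the second example from the question, Mark("I go to school everyday.", "I go to schoool everyday.") puts a single caret under the extra "o" instead of marking the rest of the line. The table needs memory proportional to the product of both line lengths, which should be acceptable for ordinary text lines.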

    Alex
    Tuesday, September 23, 2008 10:22 AM

All replies

  • Hi

    Just an idea; there may be better ones, and you may already be doing some of these steps.

    #1 Read the texts line by line into an array
    #2 Compare those individual lines
    #3 If a line doesn't match, split it up into an array of characters
    #4 Compare these arrays character by character and store the positions where they don't match
    #5 Recreate those lines from the char array and insert your special character at the stored positions
    #6 Write the marked text file
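
    The steps above can be sketched roughly as follows. This is only an illustration under a few assumptions: the names LineMarker, Mark and WriteMarkedFile are made up, the marker goes on a separate line below each differing line (as in the question), and lines are paired by line number.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper illustrating steps #1 to #6.
public static class LineMarker
{
    // Steps #3 to #5: compare two lines position by position and build
    // a marker line with '^' wherever they differ.
    public static string Mark(string expected, string actual)
    {
        int max = Math.Max(expected.Length, actual.Length);
        StringBuilder marks = new StringBuilder(max);
        for (int i = 0; i < max; i++)
        {
            bool same = i < expected.Length
                     && i < actual.Length
                     && expected[i] == actual[i];
            marks.Append(same ? ' ' : '^');
        }
        return marks.ToString();
    }

    // Steps #1, #2 and #6: read both files, compare them line by line,
    // and write the compared text with marker lines added.
    public static void WriteMarkedFile(string referencePath, string comparedPath, string outputPath)
    {
        string[] reference = File.ReadAllLines(referencePath);
        string[] compared = File.ReadAllLines(comparedPath);
        List<string> output = new List<string>();

        for (int i = 0; i < compared.Length; i++)
        {
            // A line with no counterpart is compared against an empty line
            string expected = i < reference.Length ? reference[i] : string.Empty;
            output.Add(compared[i]);
            if (compared[i] != expected)
            {
                output.Add(Mark(expected, compared[i]).TrimEnd());
            }
        }

        File.WriteAllLines(outputPath, output.ToArray());
    }
}
```

    Writing to a separate output path keeps the compared file intact; passing the same path for comparedPath and outputPath overwrites it instead.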


    Regards

    Alex
    Tuesday, September 23, 2008 7:28 AM
  • Or use string.Split() to create two arrays of words, then loop over them to compare each word. That should give you a list of differences down to the word and character location.
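
    A rough sketch of that word-based approach, assuming words are separated by single spaces and are paired by their index. WordComparer and Differences are hypothetical names. Each result holds the index of a differing word and the column in line2 where it first disagrees with its counterpart.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for the word-by-word idea.
public static class WordComparer
{
    // Key = index of the differing word, Value = column in line2
    // of the first character that differs.
    public static List<KeyValuePair<int, int>> Differences(string line1, string line2)
    {
        string[] words1 = line1.Split(' ');
        string[] words2 = line2.Split(' ');
        List<KeyValuePair<int, int>> result = new List<KeyValuePair<int, int>>();
        int offset = 0; // start column of the current word within line2

        for (int w = 0; w < words2.Length; w++)
        {
            // A word with no counterpart is compared against an empty word
            string expected = w < words1.Length ? words1[w] : string.Empty;
            string actual = words2[w];

            if (actual != expected)
            {
                // Find the first character at which the two words disagree
                int c = 0;
                while (c < expected.Length && c < actual.Length && expected[c] == actual[c])
                {
                    c++;
                }
                result.Add(new KeyValuePair<int, int>(w, offset + c));
            }

            offset += actual.Length + 1; // +1 for the separating space
        }
        return result;
    }
}
```

    Because the words are paired by index, a missing or extra space (such as "text file" against "textfille") shifts every later word, so this only works well when both lines have the same number of words.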

    Thanks
    Software Development Expert, playing in .NET
    Tuesday, September 23, 2008 8:55 AM
  • Hi. Thank you for the reply.

    Can you provide a code snippet showing how to compare the words in the arrays and insert a character below the line rather than in the line itself?
    Tuesday, September 23, 2008 9:02 AM