Splitting strings?

  • Question

  • Hey, I have a string and I am splitting it on some values, and that works pretty well. Is there any way I can split the string so that it returns a key/value pair dictionary? This is what I have right now, which gets me one side of the values. If I could get these keys and their values (the splits), that would be perfect. Any help?

     

         string fileContent = reader.ReadToEnd();
         string [] separators = new string[] {"ZZ","BA","FL", "DT","IX", "TZ", "TI", "SK", "SX", "SQ", "NK", "NX", "NQ", "IQ", "AB", "FR", "SC", "CG", "TC", "SO", "UZ", "PF", "XZ"};
         string[] spilts = fileContent.Split(separators, StringSplitOptions.None);
    

     

     

     


    Give yourself a round of applause!!
    Tuesday, December 7, 2010 1:45 AM


All replies


  • I'm not sure I understand your plan.  Suppose that the FileContent string contains "New YorkZZConnecticutFLAruba".  After the split, you have an array with "New York", "Connecticut" and "Aruba".  You say these are the keys or the values, and the splits are the others.  Do you mean the "ZZ" and "FL" are important?  If so, I think this will require regular expressions, rather than the Split function to get what you want.  Please clarify if I'm understanding you correctly.  A good, simple, input and output example showing what you want would really help.
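
    A rough sketch of the regular-expression route mentioned above, assuming the separators themselves should become the keys. The class and method names here are made up for illustration. It relies on the fact that Regex.Split includes the text matched by a capturing group in its result:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SplitKeepingSeparators
{
    // Hypothetical helper: pairs each separator with the text that follows it.
    public static List<KeyValuePair<string, string>> Parse(string input, string[] separators)
    {
        // Build "(ZZ|FL|...)"; the capturing group makes Regex.Split return the separators too.
        string[] escaped = Array.ConvertAll(separators, s => Regex.Escape(s));
        string pattern = "(" + string.Join("|", escaped) + ")";
        string[] parts = Regex.Split(input, pattern);

        var pairs = new List<KeyValuePair<string, string>>();
        // parts[0] is the text before the first separator; separators then sit at odd indexes.
        for (int i = 1; i + 1 < parts.Length; i += 2)
            pairs.Add(new KeyValuePair<string, string>(parts[i], parts[i + 1].Trim()));
        return pairs;
    }

    public static void Main()
    {
        foreach (var kvp in Parse("New YorkZZConnecticutFLAruba", new[] { "ZZ", "FL" }))
            Console.WriteLine("{0} -> {1}", kvp.Key, kvp.Value);
        // Prints:
        // ZZ -> Connecticut
        // FL -> Aruba
    }
}
```

    Note that this treats every occurrence of a separator as a key, so a two-letter separator that happens to appear inside a value would split that value as well.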

    --
    Mike
    Tuesday, December 7, 2010 2:16 AM
  • I am getting a bunch of data in this format, and it will always be in this format:

     

    ZZ 2501600000

    BA 0KEH  

     

    I want to split those strings on those first two keys but keep the values in a dictionary if I can. Right now, after the split, I have this data from fileContent:

     

    2501600000

    0KEH  

     

    But is there any way I can split them and create a dictionary that will have values like

    Dictionary.Add("ZZ","2501600000");

    Dictionary.Add("BA","0KEH");

     


    Tuesday, December 7, 2010 2:32 PM
  • If the format is line by line, why not split each line again?

                Dictionary<string, string> dictionary = new Dictionary<string, string>();
                string[] strings = new string[] { "test 1", "test 2", "test 3" };
                foreach (string s in strings)
                {
                    string[] tmpString = s.Split(new char[] {' '});
                    // Note: in this sample the second token ("1", "2", "3") is used as the key.
                    dictionary.Add(tmpString[1], tmpString[0]);
                }
    Not the best, but it works. Or are you looking for something more efficient?
    Tuesday, December 7, 2010 2:55 PM
  • Yeah, I want to make sure the values and the keys are always correct:

     

    BA 0KEH

    FL sic_0100.txt.dlgb

    DT Industry Overview

     

    So I want to make sure 'BA' still has the value '0KEH' and not anything else. I am going to try what you suggested. Thanks!


    Tuesday, December 7, 2010 2:58 PM
  • If there are multiple spaces in one line, there might be an issue: you would have to concatenate element 1 and everything after it back into one string.

    I did not know that multiple spaces were possible within the same line.

    Tuesday, December 7, 2010 3:21 PM
  • Try this.

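    Dictionary<string, string> dictionary = new Dictionary<string, string>();
    string[] strings = new string[] { "10 test this", "20 test that", "30 test those" };
    foreach (string s in strings)
    {
        string[] tmpString = s.Split(new char[] {' '});
        // The key is the first token; the value is everything after it.
        // Substring avoids removing later occurrences of the key text, which Replace would do.
        dictionary.Add(tmpString[0], s.Substring(tmpString[0].Length).Trim());
    }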

     

    Tuesday, December 7, 2010 3:25 PM
  •    string msg = "ZZ 2501600000 BA 0KEH FL sic_0100.txt.dlgb DT Industry Overview";
       string[] separators = new string[] { "ZZ", "BA", "FL", "DT" };
       string[] spilts = msg.Split(separators, StringSplitOptions.RemoveEmptyEntries);
    
       Dictionary<string, string> dic = new Dictionary<string, string>();
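       // Assumes every separator occurs exactly once, in the same order as in the array.
       for (int i = 0; i < separators.Length; i++)
        dic.Add(separators[i], spilts[i].Trim()); // Trim drops the spaces left around each value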
    

    I have one question: are the separators always 2 characters long, and are all the other words at least 3 characters long?

    Tuesday, December 7, 2010 4:55 PM
  • Regular expressions to do word/string parsing should be learned by every developer. Once one learns the basics of regex one never stops using it. For example:

     

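    // Requires: using System.Collections.Generic; using System.Linq; using System.Text.RegularExpressions;
    string data = @"BA 0KEH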
    
    FL sic_0100.txt.dlgb
    
    DT Industry Overview";
    
    
    string pattern = @"^(?<ID>[^\s]+)(?:\s+)(?<Data>[^\r\n]+)";
    
    Dictionary<string, string> dtc = Regex.Matches( data, pattern, RegexOptions.Multiline )
                        .OfType<Match>()
                        .ToDictionary( m => m.Groups["ID"].Value, m => m.Groups["Data"].Value );
    
    foreach (var kvp in dtc)
      Console.WriteLine(string.Format("{0} contains {1}", kvp.Key, kvp.Value));
    /* Output:
    BA contains 0KEH
    FL contains sic_0100.txt.dlgb
    DT contains Industry Overview
      * */
    

    To start your regular expression journey, check out the .NET regular expression forum (Regular Expressions) and its informative top-level post, .Net Regex Resources Reference, which is a useful reference for beginners and experts alike.
    You can also use the free tool Expresso to test and learn about regex patterns outside of your .NET code.
    William Wegerson (www.OmegaCoder.Com)
    • Marked as answer by Tryin2Bgood Tuesday, December 7, 2010 8:13 PM
    Tuesday, December 7, 2010 5:01 PM
    Moderator
  • Wow, thanks for that explanation. But I have run into another issue: I have keys that may be repeated, as in BA appearing in the raw string twice, so now I get an exception saying the key is already in the dictionary.
    Tuesday, December 7, 2010 7:51 PM
  • lol... I thought there would be an issue, but I didn't ask you. Now you are stuck with this kind of pattern. It's better to look for some other solution where the keys won't be duplicated.

    Another option: before inserting into the Dictionary, check whether the key already exists in it. If it does, change the key, for example by adding a suffix (_2), like BA aaaaa, BA_2 bbbbb, or something similar. There are plenty of options, but none will be 100% sure. The problem is that your key is only 2 characters long, which means repeats can show up very quickly.
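
    A minimal sketch of the suffix idea, assuming a plain Dictionary<string, string>. The helper name AddWithSuffix is hypothetical, not something from the framework:

```csharp
using System;
using System.Collections.Generic;

public static class SuffixedKeys
{
    // Hypothetical helper: adds the pair, renaming the key to key_2, key_3, ... if it is taken.
    // Returns the key that was actually used.
    public static string AddWithSuffix(Dictionary<string, string> dict, string key, string value)
    {
        string candidate = key;
        for (int n = 2; dict.ContainsKey(candidate); n++)
            candidate = key + "_" + n;
        dict.Add(candidate, value);
        return candidate;
    }

    public static void Main()
    {
        var dict = new Dictionary<string, string>();
        AddWithSuffix(dict, "BA", "aaaaa");
        AddWithSuffix(dict, "BA", "bbbbb");
        Console.WriteLine(dict["BA"]);   // aaaaa
        Console.WriteLine(dict["BA_2"]); // bbbbb
    }
}
```

    This keeps every value, but callers then need to know about the renamed keys, and it would clash if the raw data ever contained a real key such as BA_2.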

    It's up to you how you decide...

    Personally, I would go for a completely different solution, or raise the number of characters in the key (4 or 5 characters); that greatly reduces the chance of getting a key that already exists.

     

    Hope it helps,

    Mitja

    Tuesday, December 7, 2010 7:59 PM
  • This depends on how you want to handle the duplicate key. Of course, each key must be unique in the Dictionary.

    However, if you would like to store more than one item (value) per key, then the dictionary must be defined as

    Dictionary<string, List<string>>
    If not, then you would have to do some dirty work to compare the new entry's value against the existing value and decide what to do about the difference.
    Tuesday, December 7, 2010 8:02 PM
  • @ Trying2Bgood,

    Try a Dictionary<string, List<string>>. That way, for each key (BA, for example) you can store multiple values.
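
    One way this could look, as a sketch that reuses the regex pattern from the marked answer. The class name and the "1XYZ" sample value are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MultiValueParser
{
    public static Dictionary<string, List<string>> Parse(string data)
    {
        // Same pattern as the marked answer: an ID, whitespace, then the rest of the line.
        const string pattern = @"^(?<ID>[^\s]+)(?:\s+)(?<Data>[^\r\n]+)";
        var result = new Dictionary<string, List<string>>();

        foreach (Match m in Regex.Matches(data, pattern, RegexOptions.Multiline))
        {
            string key = m.Groups["ID"].Value;
            List<string> values;
            if (!result.TryGetValue(key, out values))
            {
                // First time this key is seen: start a new list for it.
                values = new List<string>();
                result.Add(key, values);
            }
            values.Add(m.Groups["Data"].Value.Trim());
        }
        return result;
    }

    public static void Main()
    {
        // "BA" appears twice; the second value is invented sample data.
        string data = "BA 0KEH\nFL sic_0100.txt.dlgb\nBA 1XYZ";
        foreach (var kvp in Parse(data))
            Console.WriteLine("{0}: {1}", kvp.Key, string.Join(", ", kvp.Value));
    }
}
```

    With this shape, a repeated key no longer throws; its values simply accumulate in the order they were read.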

    Tuesday, December 7, 2010 8:05 PM
  •     MatchCollection co = Regex.Matches(t, pattern, RegexOptions.Multiline);

     

    will do, and I can just use the values in the MatchCollection. Thanks, guys, for the help.
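
    For reference, reading the pairs straight from the MatchCollection might look roughly like this. It assumes the same pattern as the marked answer; the class and method names and the sample input are illustrative only:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchCollectionDemo
{
    // Returns one "ID contains Data" line per match, keeping duplicates and input order.
    public static List<string> ReadPairs(string t)
    {
        string pattern = @"^(?<ID>[^\s]+)(?:\s+)(?<Data>[^\r\n]+)";
        MatchCollection co = Regex.Matches(t, pattern, RegexOptions.Multiline);

        var lines = new List<string>();
        foreach (Match m in co)
            lines.Add(string.Format("{0} contains {1}", m.Groups["ID"].Value, m.Groups["Data"].Value));
        return lines;
    }

    public static void Main()
    {
        // "BA" is repeated on purpose; nothing throws because no dictionary is involved.
        foreach (string line in ReadPairs("BA 0KEH\nFL sic_0100.txt.dlgb\nBA 1XYZ"))
            Console.WriteLine(line);
    }
}
```

    Since no dictionary is built, repeated keys are not a problem, at the cost of losing direct lookup by key.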

     

     

     

     


    Tuesday, December 7, 2010 8:14 PM
  • Post your regex questions to our regex forum I mentioned. We would love to help you learn.

    William Wegerson (www.OmegaCoder.Com)
    Wednesday, December 8, 2010 12:15 AM
    Moderator
  • thanks! I will do just that!
    Wednesday, December 8, 2010 2:35 PM