Separate a string into two strings with different cultures

  • Question

  • User360451555 posted

    Hello, I have this piece of text: "SHARIF SIRAJ MUWONGE     شريف سراج موونغي"

    I would like to split the above text so that the English name is separate from the Arabic name.

    Wednesday, August 3, 2011 7:16 AM

Answers

  • User2060565753 posted
    // Requires: using System.Text.RegularExpressions;
    string input = "ABDUL RAHUM عبد الرحيم";
    string arabic = "";

    foreach (char character in input)
    {
        // Keep Arabic letters, plus whitespace so the Arabic words stay separated.
        if (HasArabicCharacters(character.ToString()) || char.IsWhiteSpace(character))
        {
            arabic += character;
        }
    }

    // Drop the spaces collected from the English part.
    arabic = arabic.Trim();

    static bool HasArabicCharacters(string text)
    {
        Regex regex = new Regex(
            "[\u0600-\u06ff]|[\u0750-\u077f]|[\ufb50-\ufc3f]|[\ufe70-\ufefc]");
        return regex.IsMatch(text);
    }

    // The issue is resolved now.
    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Thursday, August 4, 2011 6:00 AM
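
The accepted answer collects only the Arabic half. A minimal sketch of returning both halves in one pass is shown here, using plain character comparisons over the same four ranges as the regex in the answer. The class name `NameSplitter` and its `Split` method are hypothetical, not part of the thread, and the sketch assumes each script appears as one contiguous run.

```csharp
using System;
using System.Text;

public static class NameSplitter
{
    // Same four ranges as the regex in the answer above.
    private static bool IsArabic(char c)
    {
        return (c >= '\u0600' && c <= '\u06FF')
            || (c >= '\u0750' && c <= '\u077F')
            || (c >= '\uFB50' && c <= '\uFC3F')
            || (c >= '\uFE70' && c <= '\uFEFC');
    }

    // Hypothetical helper: returns { nonArabicPart, arabicPart }.
    public static string[] Split(string input)
    {
        StringBuilder latin = new StringBuilder();
        StringBuilder arabic = new StringBuilder();
        StringBuilder current = null; // the part that received the last letter

        foreach (char c in input)
        {
            if (char.IsWhiteSpace(c))
            {
                // Whitespace stays with the part it follows; leading whitespace is skipped.
                if (current != null) current.Append(c);
            }
            else
            {
                current = IsArabic(c) ? arabic : latin;
                current.Append(c);
            }
        }

        // Trim the trailing spaces left at the end of each part.
        return new[] { latin.ToString().Trim(), arabic.ToString().Trim() };
    }
}
```

Calling `NameSplitter.Split` on the text from the question gives "SHARIF SIRAJ MUWONGE" as the first element and the Arabic name, with its internal spaces intact, as the second.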

All replies

  • User2060565753 posted

    You can compare ASCII values.

    Loop through the string and check each character's ASCII equivalent.

    Wednesday, August 3, 2011 7:29 AM
  • User360451555 posted

    Here is what I am doing, but I am still not getting the required result:

            static void Main(string[] args)
            {
                const String str = "ABDUL RAHUM عبد الرحيم";
    
                int i = 0;
                foreach (byte b in Encoding.UTF8.GetBytes(str.ToCharArray()))
                {
                    if (b < 32 && b > 90)
                    {
                        //Console.Write(b + "\n");
                        str.Insert(i - 1, "_ ");
                        break;
                    }
                    i++;
                }
    
                ArrayList texts = new ArrayList(str.Split('_'));
    
                foreach (String V in texts)
                {
                    Console.Write(V + "\n");
                }
    
                Console.ReadLine();
            }
    Wednesday, August 3, 2011 8:16 AM
  • User2060565753 posted

    Convert.ToChar(myvar.Substring(1, 1)) == 65

    Don't convert to bytes; compare the character values directly, as in the line above.

     

    http://www.dotnetperls.com/ascii-table

    Wednesday, August 3, 2011 8:26 AM
  • User2060565753 posted
    const String str = "عبد الرحيم";

    int i = 0;
    foreach (byte b in Encoding.UTF8.GetBytes(str.ToCharArray()))
    {
        if (b < 32 && b > 90)
        {
            //Console.Write(b + "\n");
            str.Insert(i - 1, "_ ");
            break;
        }
        i++;
    }

    I tried this. Check your if condition; it looks wrong, because (b < 32 && b > 90) can never be true.
    If I change it to if (b < 32 || b > 90) and collect the matching values in a new string,
    I can get the Arabic text in a separate string.

    Do you see what I mean?
     
    Wednesday, August 3, 2011 8:29 AM
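
Besides the condition, the byte loop in this exchange has two other problems: `Encoding.UTF8.GetBytes` produces two bytes for each of these Arabic letters, so the counter `i` is a byte offset that only matches the character index while the text is pure ASCII, and `String.Insert` returns a new string rather than modifying `str`. A minimal sketch of the same "compare against the ASCII range" idea, working on `char` values instead of bytes, might look like the following. The class and method names are hypothetical, and the sketch assumes the English part comes first.

```csharp
using System;

public static class AsciiBoundary
{
    // Hypothetical helper: index of the first char outside 7-bit ASCII, or -1 if none.
    public static int FirstNonAscii(string text)
    {
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] > 127) return i;
        }
        return -1;
    }

    // Splits at the first non-ASCII char; returns { asciiPart, rest }.
    public static string[] SplitAtFirstNonAscii(string text)
    {
        int index = FirstNonAscii(text);
        if (index < 0)
        {
            return new[] { text.Trim(), "" };
        }
        // Substring returns new strings; the original string is never modified.
        return new[] { text.Substring(0, index).Trim(), text.Substring(index).Trim() };
    }
}
```

Because the split happens at a character index, no marker such as "_" has to be inserted and removed again.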
  • User360451555 posted

    Here is my new, improved code, but I am still not getting it right. Can someone please give me working code?

    static void Main(string[] args)
            {
                const String str = "A ب";
    
                int i = 0;
                foreach (byte b in Encoding.UTF8.GetBytes(str.ToCharArray()))
                {
                    if (b > 122)
                    {
                        str.Insert(i - 1, "_" + " ");
                        break;
                    }
                    i++;
                }
    
                ArrayList texts = new ArrayList(str.Split('_'));
    
                foreach (String V in texts)
                {
                    Console.Write(V + "\n");
                }
    
                Console.ReadLine();
            }
    Wednesday, August 3, 2011 8:37 AM
  • User2060565753 posted
    This should help you:

    public static bool IsEnglish(string input)
    {
        const string Numeric = "0123456789";
        const string Alpha = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
        const string Punctuation = " .,-=+()[]{}¬";
        const string Special = "\"\\\t\r\n";
        const string Lookup = Numeric + Alpha + Punctuation + Special;
        input = "" + input; // treat null as an empty string
        for (int i = 0; i < input.Length; i++)
        {
            // Any character outside the lookup set means the text is not plain English.
            if (!Lookup.Contains(input[i].ToString()))
            {
                return false;
            }
        }
        return true;
    }
    Wednesday, August 3, 2011 10:01 AM
  • User2060565753 posted

    Finally I got it. Sorry for giving you a solution without trying it first.

    string input = "A ب;";
    string spe = "";

    foreach (char character in input)
    {
        // Append each Arabic character to the result.
        if (HasArabicCharacters(character.ToString()))
            spe += character.ToString();
    }

    static bool HasArabicCharacters(string text)
    {
        Regex regex = new Regex(
            "[\u0600-\u06ff]|[\u0750-\u077f]|[\ufb50-\ufc3f]|[\ufe70-\ufefc]");
        return regex.IsMatch(text);
    }
     
                 
    Wednesday, August 3, 2011 11:08 AM
  • User360451555 posted

    Hello Mehta, I am very pleased with your solution, but not yet fully. Now if I pass in this block of text, "RAJAB ISHAQ ZEMBE   رجب إسحاق زيمبي", I get ????????? marks, which I think represent the Arabic letters. How can I have my Arabic letters returned and printed just the way they look in the unsplit string?

    Wednesday, August 3, 2011 11:25 AM
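
Question marks in a console window usually mean the output encoding or console font cannot represent the characters, not that the string itself was damaged. A minimal sketch for checking this, assuming a console application: set `Console.OutputEncoding` to UTF-8 and also print the UTF-16 code units as hex, which stay readable even when the glyphs do not render. The class `ConsoleUnicode` and its methods are hypothetical names, and whether the Arabic text then displays correctly still depends on the console host and its font.

```csharp
using System;
using System.Linq;
using System.Text;

public static class ConsoleUnicode
{
    // Hypothetical helper: lists each UTF-16 code unit as four hex digits.
    public static string ToCodeUnits(string text)
    {
        return string.Join(" ", text.Select(c => ((int)c).ToString("X4")));
    }

    // Writes the text as UTF-8, followed by its code units for verification.
    public static void Print(string text)
    {
        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine(text);
        Console.WriteLine(ToCodeUnits(text));
    }
}
```

If the hex line shows values in the 0600 to 06FF range, the split string still holds the Arabic letters and only the display is at fault.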
  • User2060565753 posted

    I installed an Arabic font on my machine as well, and it works here.

    I am not sure whether it works only because I installed the font.

    Wednesday, August 3, 2011 12:04 PM
  • User2060565753 posted

    https://sites.google.com/site/rahulkmlx/test/specialchar.png

     

    Please find the screenshot: the debugger shows the Arabic text after the font was installed. I have also added the code above.

    Please let me know if you still get the issue after installing the font.

    Wednesday, August 3, 2011 1:02 PM
  • User360451555 posted

    OK, it worked for me. It doesn't display well in console applications, but in Windows Forms and WPF it works superbly. There is one more thing remaining for this to be perfect.

    If I have text such as "RAJAB ISHAQ ZEMBE رجب إسحاق زيمبي", the application successfully extracts the Arabic portion. Unfortunately, it returns the Arabic portion without its original spaces, and I would like to keep that whitespace so that the person's name is not misspelled in Arabic.

    Wednesday, August 3, 2011 2:50 PM
  • User2060565753 posted

    At the start of the logic, split on spaces:

     

    stringvar.ToString().Split(' ');

    Once you have the array, loop through it and apply the regular expression I gave you.

    Does that answer your question?

    Thursday, August 4, 2011 5:25 AM
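
The word-by-word approach suggested in this reply can be sketched as follows: split on whitespace, test each word with the regex from the earlier reply, and rejoin each group with single spaces. The names `TokenSplitter` and `SplitByScript` are hypothetical, and the sketch assumes a word containing any Arabic character belongs to the Arabic name.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TokenSplitter
{
    // Same pattern as HasArabicCharacters in the thread, compiled once.
    private static readonly Regex Arabic = new Regex(
        "[\u0600-\u06ff]|[\u0750-\u077f]|[\ufb50-\ufc3f]|[\ufe70-\ufefc]");

    // Hypothetical helper: returns { nonArabicWords, arabicWords }.
    public static string[] SplitByScript(string input)
    {
        List<string> latin = new List<string>();
        List<string> arabic = new List<string>();

        // A null separator splits on any whitespace; empty entries from
        // runs of spaces are dropped.
        string[] words = input.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        foreach (string word in words)
        {
            if (Arabic.IsMatch(word))
                arabic.Add(word);
            else
                latin.Add(word);
        }

        // Rejoin with single spaces so each name keeps its word boundaries.
        return new[] { string.Join(" ", latin), string.Join(" ", arabic) };
    }
}
```

Unlike the character loop, this normalizes runs of spaces to one space, which may or may not be wanted depending on how the names are stored.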