
Answered: Fastest way to see if string contains any of three chars

  • Wednesday, November 4, 2009 15:19 tcCoder
     
    I have a string and I want to know whether it is longer than 4 characters and whether it contains any of the following three characters: " ", "/", "\" (space, forward slash, or backslash).

    What is the fastest way to do that?  I tried:

     

    if (mystring.Trim().Length > 4 && Regex.Matches(mystring.Trim(), " ").Count > 0)

    That works, but as soon as I try to add two OR conditions after the first Regex.Matches(), it bombs and tells me I cannot use ||.

    mystring.Contains() would work, but isn't that pretty slow?

Answers

  • Wednesday, November 4, 2009 15:33 David M Morton (MVP, Moderator)
     Answered

    The only way to tell is to test:

    using System;
    using System.Diagnostics;
    using System.Text.RegularExpressions;

    class Program
    {
        public static void Main()
        {
            int repetitions = 100000;

            // Same input and character set for both approaches.
            Measure(() => { bool result = Regex.IsMatch("1234567890", "[357]"); }, repetitions, "Regex");
            Measure(() => { bool result = "1234567890".IndexOfAny(new[] { '3', '5', '7' }) > -1; }, repetitions, "IndexOfAny");
        }

        // Runs the action the given number of times and prints the elapsed milliseconds.
        public static void Measure(Action action, int repetitions, string title)
        {
            Stopwatch sw = Stopwatch.StartNew();

            for (int i = 0; i < repetitions; i++)
                action();

            sw.Stop();
            Console.WriteLine("{0}: {1}", title, sw.ElapsedMilliseconds);
        }
    }

    Output (elapsed milliseconds):

    Regex: 144
    IndexOfAny: 7

    Nope, looks like IndexOfAny is much faster.

    (By the way, for future reference, Regex is typically slow compared to most string operations. It's usually a good bet). 


    Coding Light - Illuminated Ideas and Algorithms in Software
    • Marked as answer by tcCoder, Wednesday, November 4, 2009 15:49
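
The benchmark in the answer above allocates a new char array on every IndexOfAny call and goes through the static Regex.IsMatch overload on every Regex call. A minimal sketch of a variant that hoists both into static fields and runs one warm-up call before timing could look like the following. The names SearchBenchmark, ViaRegex, ViaIndexOfAny, Targets, and TargetPattern are hypothetical and not from the thread, and no timings are claimed here; the numbers depend on the machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class SearchBenchmark
{
    // Hoisted so that neither approach pays a per-call setup cost.
    private static readonly char[] Targets = { '3', '5', '7' };
    private static readonly Regex TargetPattern = new Regex("[357]", RegexOptions.Compiled);

    public static bool ViaRegex(string input)
    {
        return TargetPattern.IsMatch(input);
    }

    public static bool ViaIndexOfAny(string input)
    {
        return input.IndexOfAny(Targets) >= 0;
    }

    // Calls the check once as a warm-up, then times the repetitions.
    public static long Measure(Func<string, bool> check, string input, int repetitions)
    {
        check(input);

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < repetitions; i++)
            check(input);
        sw.Stop();

        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const int repetitions = 100000;
        const string input = "1234567890";

        Console.WriteLine("Regex: {0} ms", Measure(ViaRegex, input, repetitions));
        Console.WriteLine("IndexOfAny: {0} ms", Measure(ViaIndexOfAny, input, repetitions));
    }
}
```

Both checks return the same answer for the same input, so any difference in the printed numbers reflects only the search technique.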

All replies

  • Wednesday, November 4, 2009 15:23 David Anton
     
    Have you tried 'IndexOfAny' ?

    Convert between VB, C#, C++, & Java (http://www.tangiblesoftwaresolutions.com)
  • Wednesday, November 4, 2009 15:24 tcCoder
     
    I was just going to add that to my post (along with Contains()). Isn't that slow in comparison to using Regex.Matches?

    var match = str.IndexOfAny(new char[] { ' ', '/', '\\' }) != -1;
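
Putting the two conditions from the original question together (longer than 4 characters after trimming, and containing a space, forward slash, or backslash), one possible helper is sketched below. The class name PathLikeCheck, the method name IsMatch, and the field name Separators are hypothetical, and treating null as "no match" is an assumption the thread does not discuss.

```csharp
using System;

public static class PathLikeCheck
{
    // Allocated once rather than on every call.
    private static readonly char[] Separators = { ' ', '/', '\\' };

    // True when the trimmed text is longer than 4 characters and contains
    // a space, forward slash, or backslash.
    public static bool IsMatch(string text)
    {
        if (text == null)
            return false; // assumption: null counts as "no match"

        string trimmed = text.Trim(); // trim once, reuse for both tests
        return trimmed.Length > 4 && trimmed.IndexOfAny(Separators) >= 0;
    }

    public static void Main()
    {
        Console.WriteLine(IsMatch("ab/cd"));   // True
        Console.WriteLine(IsMatch("abcde"));   // False
        Console.WriteLine(IsMatch("a b"));     // False
        Console.WriteLine(IsMatch("  a/b  ")); // False
    }
}
```

Trimming once and reusing the result avoids the second Trim() call in the original condition, and the length test comes first so short strings are rejected before any search happens.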
  • Wednesday, November 4, 2009 16:15 OmegaMan (MVP, Moderator)
     
    "(By the way, for future reference, Regex is typically slow compared to most string operations. It's usually a good bet)."
    Any higher-level language paradigm costs processor time, so regex and LINQ are going to eat up more cycles; i.e., if you want something really fast, program in assembly.

    But frankly, when you are talking about taking 5-20 milliseconds longer, is it really failing? I would prefer to use LINQ over ADO.NET any day. I believe Dave was presenting a fact and not an opinion about regex, so this is not aimed at his remark. Don't count out regex because it takes 10 milliseconds longer. It was designed as a higher-level programming paradigm for text manipulation, and to that end it shines.

    Once one gets past basic examples, where string methods are no longer feasible, regex is the best bet for true string manipulation; so don't avoid it because of any perceived "it takes longer" comments. IMHO.

    To that end I wrote a blog article entitled: Are C# .Net Regular Expressions Fast Enough for You?
    William Wegerson (www.OmegaCoder.Com)
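
For readers who would rather stay with regex for the check in the original question, a single character class covers all three characters, so no chain of || conditions over several Regex.Matches calls is needed. This is a minimal sketch only; the class name SeparatorRegex and the field name AnySeparator are hypothetical, and RegexOptions.Compiled is an optional choice rather than something the thread recommends.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SeparatorRegex
{
    // Character class: space, forward slash, or backslash.
    // In a verbatim string, \\ is the regex escape for one literal backslash.
    private static readonly Regex AnySeparator = new Regex(@"[ /\\]", RegexOptions.Compiled);

    public static bool IsMatch(string text)
    {
        string trimmed = text.Trim();
        return trimmed.Length > 4 && AnySeparator.IsMatch(trimmed);
    }

    public static void Main()
    {
        Console.WriteLine(IsMatch(@"dir\file")); // True
        Console.WriteLine(IsMatch("abcdef"));    // False
    }
}
```

Regex.IsMatch answers the yes/no question directly, whereas Regex.Matches(...).Count > 0 asks for the full match collection only to compare its size with zero.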
  • Wednesday, November 4, 2009 17:42 David Anton
     
    Use the most direct solution - then try to speed it up if you need to (usually there will be no performance issue).

    You might find that 'IndexOfAny' is the most direct - if you live, eat, and breathe regex, then you'll find regex more direct.
    Convert between VB, C#, C++, & Java (http://www.tangiblesoftwaresolutions.com)