c implements an example of sunday algorithm

  • 2020-05-17 06:12:36
  • OfStack

Died from a regular expression search is always a cycle, start thinking about change to other search way, because. net own IndexOf default can find only the first or the last one, if you want to find out all the matches, you also need to write their own circulation SubString, so did want to find the ready-made, was discovered in this area, BM algorithm is king, is said to be the best sunday algorithm improved version of the 1 point I don't have found on site especially wiki from abroad, but talking about sunday articles in Chinese a lot, I hope that it is the best.


public static int SundaySearch(string text, string pattern)
        {
            int i = 0;
            int j = 0;
            int m = pattern.Length ;
            int matchPosition = i;
            while (i < text.Length && j < pattern.Length)
            {
                if (text[i] == pattern[j])
                {
                    i++;
                    j++;
                }
                else
                {
                    if(m==text.Length-1)break;

                    int k = pattern.Length - 1;
                    while (k >= 0 && text[m ] != pattern[k])
                    {
                        k--;
                    }
                    int gap = pattern.Length - k;
                    i += gap;
                    m = i + pattern.Length;
                    if (m > text.Length) m = text.Length - 1;
                    matchPosition = i;
                    j = 0;
                }
            }
            if (i <= text.Length)
            {
                return matchPosition;
            }
            return -1;
        }

Okay, now let's test the performance:

public static void PerformanceTest()
        {
            StreamReader reader = new StreamReader("D:\\LogConfiguration.xml", Encoding.ASCII);
            string context = reader.ReadToEnd();
            string pattern = "xxxx";
            int count = 1000*10;
            Stopwatch watch=new Stopwatch();
            //watch.Start();
            //for (int i = 0; i < count; i++)
            //{
            //    int pos= Sunday.GetPositionFirst(context, pattern, true);
            //}
            //watch.Stop();
            //Console.WriteLine(watch.ElapsedMilliseconds);
            watch.Reset();
            watch.Start();
            for (int i = 0; i < count; i++)
            {
                int pos = context.IndexOf(pattern);
            }
            watch.Stop();
            Console.WriteLine(watch.ElapsedMilliseconds);
            watch.Reset();
            watch.Start();
            for (int i = 0; i < count; i++)
            {
                int pos = Sunday.SundaySearch(context, pattern);
            }
            watch.Stop();
            Console.WriteLine(watch.ElapsedMilliseconds);
        }

In the case that a match can be found and a match cannot be found, the time of sunday algorithm is about 20% of that of indexof. The algorithm does work.

But never use substring to implement the algorithm, which will generate a lot of new string intermediate variables. The benefit of the algorithm is far less than the cost of allocating memory to copy the string. The commented out part is implemented using substring, which is much slower than indexof.


Related articles: