python longest palindrome string algorithm

  • 2020-10-31 21:48:41
  • OfStack

Given a string, find the oldest string in the string that conforms to the palindromic nature. Palindromes are strings such as "aba "," ababa", and "abba", although a single character and two adjacent characters of the same character also satisfy palindromes.

Seeing this problem, the first solution that comes to mind is, of course, violent enumeration. By enumerating the starting point of all strings in a string, the substring that satisfies palinisability is determined one by one, and the length is recorded and the maximum length is updated. Obviously, the time complexity of this algorithm is very high, and can reach O (N*N) in the worst case. Therefore, an optimized scheme is proposed here. By enumerating the center of the character string substring instead of the starting point and spreading to both sides simultaneously, palindromes of the substring are still judged one by one. This optimization algorithm is much more efficient than previous algorithms in the worst case (i.e., strings with only one character).

From the above optimization scheme, we know that the enumeration center is more efficient than the enumeration starting point, but this is not the optimal algorithm. Since the enumeration center algorithm affects the characters on both sides of the center at the same time, we can judge the palindromes of the characters on the right of the enumeration center as the central substring by the palindromes of the characters on the left of the enumeration center as the central substring, which is manacher algorithm.

The manacher algorithm is very clever in that it first traverses the string, assuming that i is the center of the enumeration, then j (j) < The length of f[j], the longest context substring at the center of i, has been calculated, and the influence range of j is [j-ES24en [j]/2,j+f [j]]. To maximize the impact of the character j on the right side of the enumeration center, you need to maximize j+f[j]/2. After finding the largest j that satisfies j+f[j]/2, if i is in [j,j+f[j]/2], there are two cases:

1. i The scope of influence of the symmetric character i' with respect to j is entirely contained within the scope of influence of j, then the scope of influence of i is greater than or equal to the scope of influence of i', i.e. f[i], due to palindrome > =f[i']

2. The influence range of i on the symmetric character i' is not completely included in the influence range of j, then the influence range on the right side of i is greater than or equal to [ES61en-ES62en [j]/2,i'], that is, i+f[i]/2 > =i'-j+f[j]/2

Due to symmetry, i+i" = 2*j. So in case 1, f[i] > = f (2 * j - i]; In case 2, f[i] > = f j [j] + 2 * 2 * i.

To sum up 1,2, f[i] > = min (f j i] - [2 * and f [j] j - 2 + 2 * * i). Since there are untraversed characters to the right of i, you can build on this and continue to expand to both sides until you find the longest textual substring.

If i still follows j+f[j]/2, then i is not affected by the preceding character and can only expand to both sides by 1.

This algorithm can achieve a time complexity of O (N) because it only needs to traverse the string once and has a limited number of extensions.

The following is the program of Pthon3. In order to test the efficiency of the algorithm, the original violence enumeration algorithm is still provided as a reference for the worst algorithm.

python code:


# Find the longest palindrome string class  
class LPS:      
 # Initialize, need to provide 1 A string  
 def __init__(self,string): 
  self.string = string 
  self.lens = len(self.string) 
  
 # Enumeration of violence: as a reference for algorithm efficiency  
 def brute_force(self): 
  maxcount = 0 
  for j in range(self.lens):      
   for k in range(j,self.lens): 
    count = 0 
    l,m = j,k 
    while m>=l: 
     if self.string[l]==self.string[m]: 
      l,m = l+1,m-1 
     else: 
      break 
    if m<l: 
     count = k-j+1 
    if count>maxcount : 
     maxcount = count 
  return maxcount 
  
 # Optimized version: Enumerate substring centers  
 def brute_force_opti(self): 
  maxcount = 0 
  if self.lens == 1:        # only 1 The characters are returned directly 1 
   return 1 
  for j in range(self.lens-1):     # Enumeration center  
   count,u = 1,j 
   # For odd substrings, extend directly  
   for k in range(1,j+1):      # Both sides extension  
    l,m = u+k,j-k 
    if (m>=0)&(l<self.lens): 
     if(self.string[l]==self.string[m]): 
      count += 2 
     else: 
      break 
   if count>maxcount :       # Update the longest length of the callback substring  
    maxcount = count 
   if self.string[j]==self.string[j+1]:  # Process even substrings, treating two adjacent identical elements as a whole  
    u,count= j+1,2 
   for k in range(1,j+1):      # Both sides extension  
    l,m = u+k,j-k 
    if (m>=0)&(l<self.lens): 
     if(self.string[l]==self.string[m]): 
      count += 2 
     else: 
      break 
   if count>maxcount :       # Update the longest length of the callback substring  
    maxcount = count 
  return maxcount 
   
 #manacher algorithm  
 def manacher(self): 
  s = '#'+'#'.join(self.string)+'#'    # String processing, with special character isolation string, easy to handle even substring  
  lens = len(s) 
  f = []           # Auxiliary list: f[i] said i The length of the longest textual substring at the center  
  maxj = 0          # Records of i The most influential character position on the right j 
  maxl = 0          # record j The right edge of the sphere of influence  
  maxd = 0          # Record the longest textual substring length  
  for i in range(lens):       # Traversal string  
   if maxl>i:         
    count = min(maxl-i,int(f[2*maxj-i]/2)+1)# Here to facilitate the use of subsequent calculations count , which represents the distance from the current character to the right edge of its range of influence  
   else :          
    count = 1 
   while i-count>=0 and i+count<lens and s[i-count]==s[i+count]:# Both sides extension  
    count +=1 
   if(i-1+count)>maxl:       # Updates the character with the largest impact range j And its right boundary  
     maxl,maxj = i-1+count,i               
   f.append(count*2-1) 
   maxd = max(maxd,f[i])      # Update the longest length of the callback substring  
  return int((maxd+1)/2)-1      # Remove special characters  

Through the above program, using a pure 'a' string of 1000 length as a sample, it has been tested:

Enumeration of violence: 49.719844s

Center enumeration: 0.334019s

manacher: 0.008000 s

Thus it can be seen that when the length is 1000, the time of violent enumeration is unbearable, while in comparison, the efficiency of central enumeration has been greatly improved, and the optimal manacher time is shorter.


Related articles: