Implementing the quicksort algorithm in Java, with optimizations of the basic algorithm

  • 2020-05-09 18:41:13
  • OfStack

Time complexity

Average: O(n log n)
Worst case: O(n²), which occurs when the data is already sorted (and the first element is chosen as the pivot)
The basic principle of the quicksort algorithm

1. Select a value a[i] from the data as the pivot (reference value)
2. Using a[i] as the pivot, divide the data into two parts P1 and P2, where every value in P1 is ≤ a[i] and every value in P2 is > a[i]; the data becomes {{P1} {P2}}
3. Repeat the above steps for P1 and P2 until each part contains only one value
4. The data is now arranged in ascending order

Basic example:

Original data:


   { 3, 9, 8, 5, 2, 1, 6 }

Step 1: select the first value, 3, as the pivot
Step 2: divide the data into two parts, with values ≤ 3 on the left and values > 3 on the right:


   { 2, 1 }   { 3 }   { 9, 8, 5, 6 }

Step 3: repeat the above steps for each part until each part contains only one value:


   { 2, 1 }  =>  { 1 }   { 2 }
   { 9, 8, 5, 6 }  =>  { 8, 5, 6 }   { 9 }  =>  { 5, 6 }   { 8 }   { 9 }  =>  { 5 }   { 6 }   { 8 }   { 9 }

Step 4: the data is now in ascending order:


   { 1 }   { 2 }   { 3 }   { 5 }   { 6 }   { 8 }   { 9 } 

In a program the data is usually stored in an array. For an int array, the steps above can be written as a quickSort method that takes the array together with a begin index and an end index:


static void quickSort(int[] in, int begin, int end) {
  // begin is the index of the first element; end is the index of the last element + 1
  if (begin == end || begin == end - 1) return; // 0 or 1 elements: nothing to sort
  int p = in[begin];     // p is the pivot; here the first element is chosen
  int a = begin + 1;     // a is the boundary index between the two partitions
  int b = a;             // b is the index of the element currently being compared
  for (; b < end; b++) { // compare each element with the pivot in turn
    if (in[b] < p) {     // elements smaller than the pivot belong in the left part
      if (a == b) { a++; continue; } // already in the left part: just advance the boundary
      int temp = in[a];  // otherwise swap it into the left part
      in[a] = in[b];
      in[b] = temp;
      a++;               // and move the boundary one step to the right
    }
  }
  in[begin] = in[a - 1]; // place the pivot between the two parts
  in[a - 1] = p;
  if (a - 1 > begin) {   // if there are elements on the left, repeat the steps for them
    quickSort(in, begin, a);
  }
  if (end - 1 > a) {     // if there are elements on the right, repeat the steps for them
    quickSort(in, a, end);
  }
}
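
As a quick check, here is a minimal driver for the method above (assuming quickSort is declared as a static method of the same class), using the array from the worked example:

public static void main(String[] args) {
  int[] in = { 3, 9, 8, 5, 2, 1, 6 };   // the array from the worked example
  quickSort(in, 0, in.length);          // begin = 0, end = index of last element + 1
  System.out.println(java.util.Arrays.toString(in)); // prints [1, 2, 3, 5, 6, 8, 9]
}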

Optimizing the algorithm
The quicksort above is the most basic version of the algorithm: it makes no assumptions about the input data. Its flaw is easy to spot: when the input is mostly or even completely sorted, the algorithm degenerates into something close to bubble sort, and the running time is no longer O(n log n) but O(n²).
The root cause lies in our implementation, which always takes the first element of the subarray as the pivot. If we instead take the median of arr[low], arr[high] and arr[(low + high) / 2] as the pivot, we can greatly improve quicksort's worst-case performance. Even so, we cannot push its performance down to O(n) for an already-sorted array. There are other techniques that improve quicksort's worst-case time to varying degrees as well.
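
As a rough sketch of median-of-three pivot selection (the helper name medianOfThreeToFront and the step of swapping the chosen pivot to the front are not part of the original code; they are only for illustration):

// Move the median of arr[low], arr[mid], arr[high] to the front of the range,
// so the first-element partitioning in quickSort above still works unchanged.
static void medianOfThreeToFront(int[] arr, int low, int high) { // high is inclusive here
  int mid = low + (high - low) / 2; // written this way to avoid integer overflow
  int m; // index of the median of the three sampled values
  if ((arr[low] <= arr[mid] && arr[mid] <= arr[high]) ||
      (arr[high] <= arr[mid] && arr[mid] <= arr[low])) {
    m = mid;
  } else if ((arr[mid] <= arr[low] && arr[low] <= arr[high]) ||
             (arr[high] <= arr[low] && arr[low] <= arr[mid])) {
    m = low;
  } else {
    m = high;
  }
  int tmp = arr[low]; arr[low] = arr[m]; arr[m] = tmp; // move the median to the front
}

Calling medianOfThreeToFront(in, begin, end - 1) at the top of quickSort, before p = in[begin] is read, would leave the rest of the partitioning code unchanged while making a sorted array behave like an average case.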

In addition, quicksort needs a recursion stack. The stack is usually not very deep, on the order of log n. But if the two partitions are badly unbalanced at every step, which is the worst case, the stack depth grows to O(n), and the space taken by the stack can no longer be ignored: counting the local variables held in every frame, the auxiliary space in the worst case is O(n) rather than the O(log n) of the average case. The worst-case space complexity of quicksort is therefore not a single fixed value.
To address this, after each partition we can compare the lengths of the two parts, recurse into the shorter one, and handle the longer one iteratively (the short branch finishes and frees its stack frames before the long one is processed); this brings the maximum recursion depth back down to the O(log n) level.
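
A minimal sketch of that idea, using the same exclusive-end convention as the quickSort above (partition here is simply the partitioning loop from quickSort factored out into its own method; the method names are illustrative):

// Partition in[begin, end) around in[begin] and return the pivot's final index.
static int partition(int[] in, int begin, int end) {
  int p = in[begin];
  int a = begin + 1;
  for (int b = a; b < end; b++) {
    if (in[b] < p) {
      int temp = in[a]; in[a] = in[b]; in[b] = temp;
      a++;
    }
  }
  in[begin] = in[a - 1];
  in[a - 1] = p;
  return a - 1;
}

// Recurse only into the shorter part and loop on the longer one, so the
// recursion depth stays at the O(log n) level even for unbalanced splits.
static void quickSortShallow(int[] in, int begin, int end) {
  while (end - begin > 1) {
    int pivot = partition(in, begin, end);
    if (pivot - begin < end - pivot - 1) {
      quickSortShallow(in, begin, pivot);   // shorter left part: recurse
      begin = pivot + 1;                    // longer right part: continue in the loop
    } else {
      quickSortShallow(in, pivot + 1, end); // shorter right part: recurse
      end = pivot;                          // longer left part: continue in the loop
    }
  }
}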

Here, three optimization ideas for quicksort are proposed:
For small subarrays, switch to insertion sort to avoid further recursive calls; for example, when hi <= lo + M (with M a small constant), fall through to insertion sort. A sketch of this cutoff appears after the 3-way partitioning code below.
Partition the array around the median of a small sample of elements taken from the subarray. This yields better splits, at the cost of computing the median.
If the array contains a large number of duplicate elements, use 3-way partitioning: divide the array into three parts containing the elements that are less than, equal to, and greater than the partitioning element, respectively. The code is implemented as follows:


  // 3-way partitioning: afterwards a[lo..lt-1] < v, a[lt..gt] == v, a[gt+1..hi] > v
  // (lo and hi are both inclusive here, unlike the quickSort above)
  private static void sort1(int[] a, int lo, int hi) {
    if (hi <= lo)
      return;
    int lt = lo, i = lo + 1, gt = hi;
    int v = a[lo]; // partitioning element
    while (i <= gt) {
      if (a[i] < v) {
        swap(a, lt++, i++);  // smaller: grow the "less than" region
      } else if (a[i] > v) {
        swap(a, i, gt--);    // larger: grow the "greater than" region
      } else {
        i++;                 // equal to v: leave it in the middle region
      }
    }
    sort1(a, lo, lt - 1);    // recurse on the part strictly less than v
    sort1(a, gt + 1, hi);    // recurse on the part strictly greater than v
  }

  private static void swap(int[] a, int i, int j) {
    int t = a[i];
    a[i] = a[j];
    a[j] = t;
  }
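
For completeness, here is a sketch of the small-array cutoff from the first idea above (the constant M, its value, and the insertionSort helper are illustrative; end is exclusive here, matching the quickSort earlier in the article):

static final int M = 10; // subarrays of at most M elements are handled by insertion sort

// Replacement for the base case at the top of quickSort:
//   if (end - begin <= M) { insertionSort(in, begin, end); return; }

static void insertionSort(int[] in, int begin, int end) {
  for (int i = begin + 1; i < end; i++) {
    int key = in[i];
    int j = i - 1;
    while (j >= begin && in[j] > key) { // shift larger elements one slot to the right
      in[j + 1] = in[j];
      j--;
    }
    in[j + 1] = key;
  }
}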

