The basic sorting algorithm is implemented by C language

  • 2020-05-07 20:06:41
  • OfStack

first takes a look at some concepts related to 1 sorting algorithm:
1, stable sort and unstable sort
The short answer is that all equal Numbers can be sorted in the same relative order as they were before they were sorted by some sort method, and we say that this sort method is stable. On the contrary, it is unstable.
Such as: 1 set number is a1 before sorting, a2, a3, a4, a5, including a2 = a4, after some sort of a1, a2, a4, a3, a5, we say this order is stable, because in front of a4 a2 sorting before it in front of the a4 after sorting. If become a1, a4, a2 a3, a5, it is not stable.

2. Inner sort and outer sort
In the sorting process, all the Numbers to be sorted are in memory, and in memory to adjust their storage order, called internal sort;
In the process of sorting, only part of the number is called into memory, and with the help of memory adjustment in external storage storage order sort method is called external sort.

3. Time and space complexity of the algorithm
The so-called time complexity of the algorithm refers to the computational workload required to execute the algorithm.
The spatial complexity of an algorithm, 1 generally refers to the memory space required to execute the algorithm.

Next, we will actually look at the specific C language implementation of several sorting algorithms:

bubble sort (Bubble Sort)

If the sequence is small to large, then any two adjacent elements should satisfy a[i-1]. < = relation of a[i]. In bubble sort, we iterate over groups from right to left, comparing two adjacent elements. If two elements are in the wrong order, then swap them. If the order of the two elements is correct, no exchange is made. After one iteration, we can ensure that the smallest element (the bubble) is in the leftmost position.

After one iteration, bubble sort does not guarantee that all the elements are sorted from smallest to largest. Therefore, we need to iterate over the array elements from right to left again and do the bubble sort. So this one time, we don't have to worry about the leftmost element. Then proceed with a maximum of n-1 traversal.

If the elements are not exchanged during a traversal, the array is sorted and you can abort the sorting. In the worst case, the largest element in the starting array is at the far left, so the bubbling algorithm has to go through n-1 times to arrange the array, instead of finishing the sorting ahead of time.


/*By Vamei*/
/*swap the neighbors if out of order*/
void bubble_sort(int a[], int ac)
{
  /*use swap*/
  int i,j;
  int sign;
  for (j = 0; j < ac-1; j++) {
    sign = 0;
    for(i = ac-1; i > j; i--)
    {
      if(a[i-1] > a[i]) {
        sign = 1;
        swap(a+i, a+i-1);
      }
    }
    if (sign == 0) break;
  }
}

insertion sort (Insertion Sort)

Let's say that when the freshmen arrive, we line them up by height. If a student joins at this time, we add that student to the end of the line. If the student is lower than the student in front, then let the student switch places with the student in front. The student will eventually switch places. This is the basic principle of insertion sort.

For the starting array, we think that initially, we have 1 student, the leftmost element (i=0), forming an ordered queue.

Then a second student (i=1) joins the team, and the second student is transferred to the right place; Then a third student joined the team, and a third student was transferred to his place... When all the n students join the team, our sorting is complete.


/*By Vamei*/
/*insert the next element 
 into the sorted part*/
void insert_sort(int a[], int ac)
{
  /*use swap*/
  int i,j;  
  for (j=1; j < ac; j++) 
  {
    i = j-1;
    while((i>=0) && (a[i+1] < a[i])) 
    {
      swap(a+i+1, a+i);
      i--;
    }
  }
}

selection sort (Selection Sort)

The end result of sorting: no element is greater than the one to its right (a[i]). < = a[j], if i < = j). So, in an ordered sequence, the smallest element is in the leftmost position, and the second smallest element is in the i=1 position... The biggest element comes last.

Selection sort is to find the smallest element in the starting array and swap it to i=0; Then find the smallest of the remaining elements and swap it to i=1... Until you find the second largest element, swap it to n-2. At this point, the sorting of the entire array is complete.


/*By Vamei*/
/*find the smallest of the rest,
 then append to the sorted part*/
void select_sort(int a[], int ac) 
{
  /*use swap*/
  int i,j;
  int min_idx;
  for (j = 0; j < ac-1; j++) 
  {
    min_idx = j;
    for (i = j+1; i < ac; i++) 
    {
      if (a[i] < a[min_idx]) 
      {
        min_idx = i;
      }
    }
    swap(a+j, a+min_idx);
  }  
}

hill sort (Shell Sort)

As we mentioned in bubble sort, the worst happens when large elements are at the beginning of the array. These large elements at the beginning of the array need to be traversed many times before they can be swapped to the end of the array. Such elements are called turtles (turtle).

The reason for the turtle element is that bubble sort always compares and swaps adjacent elements. So every time you walk from right to left, the big element can only move 1 bit to the right. (the smaller element at the end of the line is called the rabbit (rabbit), which can be quickly swapped to the head of the line.)

Hill sort compares and swaps elements at larger intervals, so that larger elements can move more than one position to the right while swapping, moving the tortoise element faster. For example, you can divide an array into four subarrays (i=4k, i=4k+1, i=4k+2, i=4k+3) and bubble sort each subarray. For example, the subarray i= 0,4,8,12... . At this point, the interval between each exchange is 4.

After sorting the four subarrays, the order of the arrays is not always good. The hill sort reduces the spacing, reforms the subarray, and bubbles the sort into the subarray... When the interval is reduced to 1, it is equivalent to bubble sort the entire array once. The order of the arrays is then sorted.

Hill sort can be done not only with bubble sort, but also with other sort methods.


/*By Vamei*/
/*quickly sort the turtles at the tail of the array*/
void shell_sort(int a[], int ac)
{
  int step;
  int i,j;
  int nsub;
  int *sub;

  /* initialize step */
  step = 1;
  while(step < ac) step = 3*step + 1;

  /* when step becomes 1, it's equivalent to the bubble sort*/
  while(step > 1) {
    /* step will go down to 1 at most */
    step = step/3 + 1;
    for(i=0; i<step; i++) {
      /* pick an element every step, 
       and combine into a sub-array */
      nsub = (ac - i - 1)/step + 1;      
      sub = (int *) malloc(sizeof(int)*nsub);
      for(j=0; j<nsub; j++) {
        sub[j] = a[i+j*step]; 
      }
      /* sort the sub-array by bubble sorting. 
       It could be other sorting methods */
      bubble_sort(sub, nsub);
      /* put back the sub-array*/
      for(j=0; j<nsub; j++) {
        a[i+j*step] = sub[j];
      }
      /* free sub-array */
      free(sub);
    }  
  }
}

Shell Sorting depends on the selection of the interval (step). A common choice is to set this interval to 1/1.3 of the previous interval. See reference books.

 

merge sort (Merge Sort)

If we want to sort a deck of CARDS by number. Two people had already sequenced half of them. So we can put these two piles of CARDS up here, and let's say the smaller CARDS are on top. At this point, we're going to see the top two CARDS in the deck.

We take the smaller of the two CARDS and we take it out and we put it in our hand. Two more CARDS in the deck are exposed at the top, and the smaller one is placed in the hand... Until all the CARDS are in the hand, the deck is in order. That's merge sort.

Recursion is used in the following implementation:


/*By Vamei*/
/*recursively merge two sorted arrays*/
void merge_sort(int *a, int ac)
{
  int i, j, k;  
  int ac1, ac2;
  int *ah1, *ah2;
  int *container;

  /*base case*/  
  if (ac <= 1) return;

  /*split the array into two*/
  ac1 = ac/2;
  ac2 = ac - ac1;
  ah1 = a + 0;
  ah2 = a + ac1;

  /*recursion*/
  merge_sort(ah1, ac1);
  merge_sort(ah2, ac2);
 
  /*merge*/
  i = 0;
  j = 0;
  k = 0;
  container = (int *) malloc(sizeof(int)*ac);
  while(i<ac1 && j<ac2) {
    if (ah1[i] <= ah2[j]) {
      container[k++] = ah1[i++];
    } 
    else {
      container[k++] = ah2[j++];
    }
  }
  while (i < ac1) {
    container[k++] = ah1[i++];
  }
  while (j < ac2) {
    container[k++] = ah2[j++];
  }

  /*copy back the sorted array*/
  for(i=0; i<ac; i++) {
    a[i] = container[i];
  }
  /*free space*/
  free(container);
}

quicksort (Quick Sort)

We still consider ranking students by height. In quicksort, we randomly select a student and take the height of the student as a reference (pivot). Then have the student who is lower than the student stand on the student's right and the rest on the student's left.

Obviously, all the students were divided into two groups. The student on the right is taller than the student on the left.

Let's go ahead and pick a random student from the low height group and divide the low height group into two groups (very low and not so low). Again, divide the high school students into two groups (not so high and very high).

This continues until there is only one student in the group. When there is only one student in all the groups, the sorting is complete.

  USES recursion in the following implementation:


/*By Vamei*/
/*select pivot, put elements (<= pivot) to the left*/
void quick_sort(int a[], int ac)
{
  /*use swap*/

  /* pivot is a position, 
    all the elements before pivot is smaller or equal to pvalue */
  int pivot;
  /* the position of the element to be tested against pivot */
  int sample;

  /* select a pvalue. 
    Median is supposed to be a good choice, but that will itself take time.
    here, the pvalue is selected in a very simple wayi: a[ac/2] */
  /* store pvalue at a[0] */
  swap(a+0, a+ac/2);
  pivot = 1; 

  /* test each element */
  for (sample=1; sample<ac; sample++) {
    if (a[sample] < a[0]) {
      swap(a+pivot, a+sample);
      pivot++;
    }
  }
  /* swap an element (which <= pvalue) with a[0] */
  swap(a+0,a+pivot-1);

  /* base case, if only two elements are in the array,
    the above pass has already sorted the array */
  if (ac<=2) return;
  else {
    /* recursion */
    quick_sort(a, pivot);
    quick_sort(a+pivot, ac-pivot);
  }
}

The ideal pivot is to take the median in the grouping element. However, the algorithm to find the median needs to be implemented separately. You can also randomly select elements as pivot, and random selection needs to be implemented separately. For simplicity, I used the middle element as pivot each time.

 

heap sort (Heap Sort)

Heap (heap) is a common data structure. It's a queue with a priority. The most common implementation of the heap is an Complete Binary Tree with a qualified operation. This Complete Binary Tree maintains the nature of the heap, that is, the parent node (parent) is greater than the child node (children). Therefore, the root node of the heap is the smallest of all heap elements. The heap defines the insert node and delete root node operations, both of which maintain the characteristics of the heap.

We can make an unordered array into a heap, and then pull out the root node again and again, and finally make an ordered array.

Read the bibliography for a more detailed description of the heap.

Below is the data structure of the heap, as well as the insert node and delete root node operations. You can easily build the heap and pull out the root nodes to form an ordered array.


/* By Vamei 
  Use an big array to implement heap
  DECLARE: int heap[MAXSIZE] in calling function
  heap[0] : total nodes in the heap
  for a node i, its children are i*2 and i*2+1 (if exists)
  its parent is i/2 */

void insert(int new, int heap[]) 
{
  int childIdx, parentIdx;
  heap[0] = heap[0] + 1;
  heap[heap[0]] = new;
  
  /* recover heap property */
  percolate_up(heap);
}

static void percolate_up(int heap[]) {
  int lightIdx, parentIdx;
  lightIdx = heap[0];
  parentIdx = lightIdx/2;
  /* lightIdx is root? && swap? */
  while((parentIdx > 0) && (heap[lightIdx] < heap[parentIdx])) {
    /* swap */
    swap(heap + lightIdx, heap + parentIdx); 
    lightIdx = parentIdx;
    parentIdx = lightIdx/2;
  }
}


int delete_min(int heap[]) 
{
  int min;
  if (heap[0] < 1) {
    /* delete element from an empty heap */
    printf("Error: delete_min from an empty heap.");
    exit(1);
  }

  /* delete root 
    move the last leaf to the root */
  min = heap[1];
  swap(heap + 1, heap + heap[0]);
  heap[0] -= 1;

  /* recover heap property */
  percolate_down(heap);
 
  return min;
}

static void percolate_down(int heap[]) {
  int heavyIdx;
  int childIdx1, childIdx2, minIdx;
  int sign; /* state variable, 1: swap; 0: no swap */

  heavyIdx = 1;
  do {
    sign   = 0;
    childIdx1 = heavyIdx*2;
    childIdx2 = childIdx1 + 1;
    if (childIdx1 > heap[0]) {
      /* both children are null */
      break; 
    }
    else if (childIdx2 > heap[0]) {
      /* right children is null */
      minIdx = childIdx1;
    }
    else {
      minIdx = (heap[childIdx1] < heap[childIdx2]) ?
             childIdx1 : childIdx2;
    }

    if (heap[heavyIdx] > heap[minIdx]) {
      /* swap with child */
      swap(heap + heavyIdx, heap + minIdx);
      heavyIdx = minIdx;
      sign = 1;
    }
  } while(sign == 1);
}

summary

In addition to the algorithm above, there are also things like Bucket Sorting, Radix Sorting involved. I will add to this article after implementing relevant algorithms in the future. Time complexity analysis of related algorithms can be found in the bibliography. I did my own crude analysis. If the blog garden can support the display of mathematical formulas, I will post my own analysis process for the jade.

I wrote the code myself and tested it very simply. If there is any mistake, thank you for correcting me.

Finally, the exchange function used above is:


/* By Vamei */
/* exchange the values pointed by pa and pb*/
void swap(int *pa, int *pb)
{
  int tmp;
  tmp = *pa;
  *pa = *pb;
  *pb = tmp;
}

Comparison and selection of several sorting algorithms
1. Factors to consider when selecting the sorting method:
            (1) number of elements to be sorted n;
            (2) the amount of information in the element itself;
The structure and distribution of             (3) keywords;
          (4) conditions of language tools, size of auxiliary space, etc.

2. 1 some Suggestions:
    (1) if n is smaller (n) < = 50), then you can use direct insertion sort or direct selection sort. Because direct insertion sort requires more record movement than direct selection sort, it is better to use direct selection sort when the record itself has more information.
    (2) if the initial state of the file is basically ordered by keyword, direct insertion or bubble sort is preferred.
If n is large, the sorting method with time complexity of O(nlog2n) should be adopted: quick sort, heap sort or merge sort. Quicksort is currently considered the best comparison based internal sort method.
    sort (4) based on the comparative method, compare the size of the two key words every time, after only two possible transfer, therefore 1 two forks tree can be used to describe the decision process, which can prove that: when the file n a keyword random distribution, any sorting algorithm by means of "comparison", at least need O (nlog2n) time.
    (5) when the record itself has a large amount of information, a linked list can be used as a storage structure to avoid spending a lot of time moving the record.


Related articles: