Example of binary search algorithm in C and C++ program

  • 2020-05-09 19:02:03
  • OfStack

The idea of the   2-point search algorithm is simple, as described in programming gems: in an array containing t, 2-point search solves the problem by following the sum of the range. Initially, the scope is the entire array. The scope is narrowed by comparing the middle element of the scope with t and dropping a half of the scope. This process continues until t is found, or the range that can contain t becomes empty.
In his book Sorting and Searching 1,               Donald Knuth points out that although the first two-point search algorithm was published as early as 1946, the first two-point search algorithm without bug bug was not published until 12 years later. One common bug is the calculation of the intermediate subscript, if written as (low+high)/2, when low+high is large, it may overflow, resulting in array access errors. An improved method is to write the calculation as follows: low+ ((high-low)) > > 1). The modified algorithm code is given below:


int binarysearch1(int a[],int n,int x) 
{ 
 int l,u,m; 
 l=0;u=n; 
 while(l<u) 
 { 
  m=l+((u-l)>>1); 
  if(x<a[m]) 
   u=m; 
  else if(x==a[m]) 
   return m; 
  else 
   l=m+1; 
 } 
 return -1; 
} 

Note 1 here that the adjustment of the subscript looks a bit irregular due to the use of an asymmetric interval. One is u=m, and the other is l=m+1. Actually is easy to understand, before the adjustment should be in the form of interval [) form, if the middle value than find small, then adjust the left border, which is part of the closed, so add 1; otherwise, the adjustment is right border, was part of the open, so you don't have to minus 1. After adjustment is still [) form. Of course also can be written as symmetrical form. The code is as follows:


int binarysearch1(int a[],int n,int x) 
{ 
 int l,u,m; 
 l=0;u=n-1; 
 while(l<=u) 
 { 
  m=l+((u-l)>>1); 
  if(x<a[m]) 
   u=m-1; 
  else if(x==a[m]) 
   return m; 
  else 
   l=m+1; 
 } 
 return -1; 
} 

            this also looks neat, but there is a catch. If you want to change the program to "pure pointer" form, you will have trouble. The code modified to pure pointer is as follows:


int binarysearch2(int *a,int n,int x) 
{ 
 int *l,*u,*m; 
 l=a;u=a+n-1; 
 while(l<=u) 
 { 
  m=l+((u-l)>>1); 
  if(x<*m) 
   u=m-1; 
  else if(x==*m) 
   return m-a; 
  else 
   l=m+1; 
 } 
 return -1; 
} 

When             is 0, an invalid address is referenced. You don't have that problem with asymmetric intervals. The code is as follows:


int binarysearch2(int *a,int n,int x) 
{ 
 int *l,*u,*m; 
 l=a;u=a+n; 
 while(l<u) 
 { 
  m=l+((u-l)>>1); 
  if(x<*m) 
   u=m; 
  else if(x==*m) 
   return m-a; 
  else 
   l=m+1; 
 } 
 return -1; 
} 

            the two-point lookup given above is implemented iteratively, but it can also be implemented recursively. The code is as follows:


int binarysearch3(int a[],int l,int u,int x) 
 
int m=l+((u-l)>>1); 
if(l<=u) 
{ 
 if(x<a[m]) 
  return binarysearch3(a,l,m-1,x); 
 else if(x==a[m]) 
  return m; 
 else 
  return binarysearch3(a,m+1,u,x); 
} 
return -1; 

       
            these two-point algorithms, if the array element is repeated, return one element of the repeating element. If you want to return the first occurrence of the element being looked for, you need to modify the code. The following solution is given:


int binarysearch4(int a[],int n,int x) 
{ 
 int l,u,m; 
 int flag=-1; 
 l=0;u=n; 
 while(l<u) 
 { 
  m=l+((u-l)>>1); 
  if(x<a[m]) 
   u=m; 
  else if(x==a[m]) 
   flag=u=m; 
  else 
   l=m+1; 
 } 
 return flag; 
} 

           :


int binarysearch4(int a[],int n,int x) 
{ 
 int l,u,m; 
 l=-1;u=n; 
 while(l+1<u) 
 { 
  m=l+((u-l)>>1); 
  if(a[m]<x) 
   l=m; 
  else 
   u=m; 
 } 
 return (u>=n||a[u]!=x)?-1:u; 
} 

 
                              "The beauty of the code" has a chapter dedicated to 2 points search algorithm test, very beautiful. Here, a few simple test cases are presented. In view of the binarysearch1. The test procedure is as follows:


#include <iostream> 
#include <cassert> 
#include <algorithm> 
#include <ctime> 
using namespace std; 
 
int calmid(int l,int u) { return l+((u-l)>>1); } 
int binarysearch1(int a[],int n,int x); 
 
#define bs1 binarysearch1 
 
int main() 
{ 
 long start,end; 
 start=clock(); 
 
 int a[9]={-2147483648,-13,-10,-5,-3,0,1,400,2147483647}; 
 // The test of the median subscript calculation  
 assert(calmid(0,1)==0); 
 assert(calmid(0,2)==1); 
 assert(calmid(1000000,2000000)==1500000); 
 assert(calmid(2147483646,2147483647)==2147483646); 
 assert(calmid(2147483645,2147483647)==2147483646); 
 
 // Smoke test  
 assert(bs1(a,9,0)==5); 
 assert(bs1(a,9,1)==6); 
 assert(bs1(a,9,2)==-1); 
 
 // Boundary test  
 assert(bs1(a,0,1)==-1);       //0 An element  
 assert(bs1(a,1,-2147483648)==0);  //1 An element   successful  
 assert(bs1(a,1,-2147483647)==-1);  //1 An element   failure  
 
 assert(bs1(a,9,-2147483648)==0);  // The first element  
 assert(bs1(a,9,-3)==4);       // In the middle of the element  
 assert(bs1(a,9,2147483647)==8);  // At the end of the element  
 
 // Automated testing  
 int b[10000]; 
 int i,j; 
 for(i=0;i<10000;i++) 
 { 
  b[i]=i*10; 
  for(j=0;j<=i;j++) 
  { 
   assert(bs1(b,i+1,j*10)==j); 
   assert(bs1(b,i+1,j*10-5)==-1); 
  } 
 } 
 
 // Automated testing   Introduce random number  
 srand(time(0)); 
 for(i=0;i<10000;i++) 
 { 
  b[i]=rand()%1000000; 
  sort(&b[0],&b[i]); 
  for(j=0;j<=i;j++) 
  { 
   int x=rand(); 
   int k=bs1(b,i+1,x); 
   if(k!=-1) 
    assert(b[k]==x); 
  } 
 } 
 
 end=clock(); 
 cout<<(end-start)/1000.0<<'s'<<endl; 
 return 0; 
} 

            notes that the elements of an array have positive, negative, zero, maximum, and minimum values. It is common to forget negative number tests and introduce maximum and minimum values, mainly for boundary testing.
            1, tests the calculation of the median subscript. And I wrote a little function, and I tested it separately. Considering that memory might not be able to fit such a large array, it was only a simulation test, and did not really apply for such a large space, but the tests for the median subscript were enough.
            no. 2, smoke test. Do some basic tests. The boundary test is performed after the test passes.
            3, boundary testing. There are three types, one is for the number of elements in the array, zero and one. 2 is for the position of the element, which is the first element, the middle element and the last element respectively. 3 is for element value, has the maximum value, the minimum value, 0 and so on.
            4, automated testing. Here an array of tests is automatically generated and a successful lookup test is performed for each element.
            no. 5, automated testing, except that the elements of the array are random values.
            no. 5, performance test. The relevant code is not listed here. When all the above tests pass, you can modify the lookup algorithm and add the code for the performance test. You can simply add a comparison counter. The return value can be changed from the original lookup result to the counter value of the comparison. The code is simpler, so it's not going to be a column.

Note: 2 points look for an bug that is easily ignored
For 2 points search algorithm, I believe you will not be unfamiliar. The algorithm looks for the specified element in an ordered array, returns the index of that element in the array if it is found, or returns -1. Here's how to do it.


//a For a sorted array, n Is the size of the array, x For the specified element  
int binarySearch(int a[], int n, int x) 
{ 
 int left = 0, right = n-1, middle = 0; 
 int tmp = 0; 
 while(left <= right) 
 { 
   middle = (left + right)/2; 
   tmp = a[middle]; 
   if(x < tmp) right = middle - 1; 
   else if(x > tmp) left = middle + 1; 
   else return middle; 
 } 
 return -1; 
} 

          at first glance there are no errors, but unfortunately there is one bug in the program. When the array is extremely large, (left+right) may be negative, then the array index overflows and the program crashes.
Solution: change middle=(left+right)/2 to middle=left+(right-left)/2. That is, the use of subtraction instead of addition, so as to eliminate the overflow.
         


Related articles: