Micro blog short link algorithm php version of the implementation code

  • 2020-05-24 05:15:30
  • OfStack

Ideas:
1) generate 32-bit signature strings for the long website md5, which are divided into 4 segments, each containing 8 bytes;
2) for the 4 segments of the loop processing, take 8 bytes, regard it as hexadecimal string and 0x3fffffff(30-bit 1) and operation, that is, more than 30 bits of ignore processing;
3) the 30 digits are divided into 6 segments. Each 5-digit digit is used as the index of the alphabet to obtain specific characters, and the 6-digit string is obtained in turn;
4) the total md5 string can obtain 4 6-bit strings; Take any one of them as the short url address of the long url;
Here's the PHP code:
 
function shorturl($url='', $prefix='', $suffix='') { 
$base = array ( 
'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 
'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 
'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 
'y', 'z', '0', '1', '2', '3', '4', '5'); 
$hex = md5($prefix.$url.$suffix); 
$hexLen = strlen($hex); 
$subHexLen = $hexLen / 8; 
$output = array(); 
for ($i = 0; $i < $subHexLen; $i++) { 
$subHex = substr ($hex, $i * 8, 8); 
$int = 0x3FFFFFFF & (1 * ('0x'.$subHex)); 
$out = ''; 
for ($j = 0; $j < 6; $j++) { 
$val = 0x0000001F & $int; 
$out .= $base[$val]; 
$int = $int >> 5; 
} 
$output[] = $out; 
} 
return $output; 
} 
$urls = shorturl('https://www.ofstack.com/'); 
var_dump($urls); 

The results of
 
array(4) { 
[0]=> 
string(6) "alms1l" 
[1]=> 
string(6) "2ipmby" 
[2]=> 
string(6) "avo1hu" 
[3]=> 
string(6) "fdlban" 
} 

Another version:
 
function shorturl($url='', $prefix='', $suffix='') { 
$base = array( 
"a","b","c","d","e","f","g","h", 
"i","j","k","l","m","n","o","p", 
"q","r","s","t","u","v","w","x", 
"y","z","0","1","2","3","4","5", 
"6","7","8","9","A","B","C","D", 
"E","F","G","H","I","J","K","L", 
"M","N","O","P","Q","R","S","T", 
"U","V","W","X","Y","Z"); 
$hex = md5($prefix.$url.$suffix); 
$hexLen = strlen($hex); 
$subHexLen = $hexLen / 8; 
$output = array(); 
for ($i = 0; $i < $subHexLen; $i++) { 
$subHex = substr ($hex, $i * 8, 8); 
$int = 0x3FFFFFFF & (1 * ('0x'.$subHex)); 
$out = ''; 
for ($j = 0; $j < 6; $j++) { 
$val = 0x0000003D & $int; 
$out .= $base[$val]; 
$int = $int >> 5; 
} 
$output[] = $out; 
} 
return $output; 
} 

Results:
 
array(4) { 
[0] => 
string(6) "6jmMVj" 
[1] => 
string(6) "2EnIby" 
[2] => 
string(6) "6vIVfu" 
[3] => 
string(6) "B7Fb6n" 
} 

However, the collision rate of the upgraded version is higher, I don't know why.
Test code for test collision:
 
$result = array(); 
$repeats= array(); 
$loop = 20000; 
for($i=0;$i<$loop;$i++){ 
$url = 'https://www.ofstack.com/?id='.$i; 
$shorta = shorturl($url); 
$short = $shorta[0]; 
if(in_array($short, $result)){ 
$repeats[] = $short; 
} 
$result[] = $short; 
} 
$result = array(); 
for($i=0;$i<$loop;$i++){ 
$url = 'https://www.ofstack.com/?id='.$i; 
$shorta = shorturl($url); 
$short = $shorta[0]; 
if(in_array($short, $repeats)){ 
$result[$short][] = $url; 
} 
} 
var_dump($repeats); 
var_dump($result); 

Results:
 
array(8) { 
[0] => 
string(6) "3eQBzq" 
[1] => 
string(6) "uQFnay" 
[2] => 
string(6) "qEZbIv" 
[3] => 
string(6) "fMneYf" 
[4] => 
string(6) "FJj6Fr" 
[5] => 
string(6) "3Eviym" 
[6] => 
string(6) "j2mmuy" 
[7] => 
string(6) "jyQfIv" 
} 
array(8) { 
'jyQfIv' => 
array(2) { 
[0] => 
string(26) "https://www.ofstack.com/?id=1640" 
[1] => 
string(27) "https://www.ofstack.com/?id=18661" 
} 
'fMneYf' => 
array(2) { 
[0] => 
string(26) "https://www.ofstack.com/?id=2072" 
[1] => 
string(26) "https://www.ofstack.com/?id=8480" 
} 
'3eQBzq' => 
array(2) { 
[0] => 
string(26) "https://www.ofstack.com/?id=4145" 
[1] => 
string(26) "https://www.ofstack.com/?id=4273" 
} 
'j2mmuy' => 
array(2) { 
[0] => 
string(26) "https://www.ofstack.com/?id=7131" 
[1] => 
string(27) "https://www.ofstack.com/?id=17898" 
} 
'qEZbIv' => 
array(2) { 
[0] => 
string(26) "https://www.ofstack.com/?id=7320" 
[1] => 
string(26) "https://www.ofstack.com/?id=8134" 
} 
'uQFnay' => 
array(2) { 
[0] => 
string(26) "https://www.ofstack.com/?id=7347" 
[1] => 
string(26) "https://www.ofstack.com/?id=7962" 
} 
'FJj6Fr' => 
array(2) { 
[0] => 
string(26) "https://www.ofstack.com/?id=8628" 
[1] => 
string(26) "https://www.ofstack.com/?id=9031" 
} 
'3Eviym' => 
array(2) { 
[0] => 
string(27) "https://www.ofstack.com/?id=11175" 
[1] => 
string(27) "https://www.ofstack.com/?id=14437" 
} 
} 

Related articles: