php short link algorithm collection and analysis
- 2020-05-10 17:51:41
- OfStack
Short links don't say, everyone has been clear, as shown below is a short link:
Sina weibo http: / / t cn/SVpONM
Tencent weibo http: / / url cn yor / 302
Yun.io http://d.yun.io/PNri2v
Benefits of short links: 1. Content needs; 2. 2. User-friendly; 3. Easy to manage.
How to do this, there are about three steps:
1. Define an URL mapping algorithm, which can map long URL into short strings;
2. Use 1 store (database? NoSQL?) To store the completed mapping;
3. Implemented its own URL mapping algorithm;
1 generally speaking, step 3 is a bit of a headache for us, how to map a long URL string into a shorter one. I've summed up three ways:
Common implementation
I think you've learned before how to convert from base 10 to base 2, or from base 10 to base 106, so to make it shorter, we can use base 62, transcode a number ID, and convert it to a short string.
The disadvantage of this approach is that there is no way to guarantee that all links are of a fixed bit length, and in the case of high concurrency, how to ensure fast distribution is a problem.
Specific implementation method:
Implementation of literature and art
Algorithm description: using 6 characters to represent short links, we use 'a'-'z','0'-'5' in ASCII characters, a total of 32 characters as a set. Each character has 32 states, 6 characters can represent 32^6 (1073741824), then how to get these 6 characters, described as follows:
Md5 is applied to the long URL passed in, resulting in a 32-bit string, which varies a lot and is 16 to the thirty-second power, basically guaranteeing the uniqueness. Divide the 32 bits into 4 parts of 8 characters each, and the probability becomes 16 to the eighth power, which is 4294967296. The probability of collision of this number is also small. The key is the following processing. We think of this 8-bit character as a hexadecimal integer, which is 1*('0x'.$val), and then we take the 0-30 bits, one in every five, calculate its integer value, and map it to the 32 characters we have prepared, and finally we can get a 6-bit short link address.
PHP is implemented as follows:
2 forced implementation
The following function USES a purely random method to generate a short link, although we can query to make sure we don't reuse the short link, but... Does that really make sense
Technorati tag: short link,Short Url, map, hash
Sina weibo http: / / t cn/SVpONM
Tencent weibo http: / / url cn yor / 302
Yun.io http://d.yun.io/PNri2v
Benefits of short links: 1. Content needs; 2. 2. User-friendly; 3. Easy to manage.
How to do this, there are about three steps:
1. Define an URL mapping algorithm, which can map long URL into short strings;
2. Use 1 store (database? NoSQL?) To store the completed mapping;
3. Implemented its own URL mapping algorithm;
1 generally speaking, step 3 is a bit of a headache for us, how to map a long URL string into a shorter one. I've summed up three ways:
Common implementation
I think you've learned before how to convert from base 10 to base 2, or from base 10 to base 106, so to make it shorter, we can use base 62, transcode a number ID, and convert it to a short string.
The disadvantage of this approach is that there is no way to guarantee that all links are of a fixed bit length, and in the case of high concurrency, how to ensure fast distribution is a problem.
Specific implementation method:
/**
* using 62 Base against Numbers ID Code short links. The disadvantage is that you cannot guarantee that each short link is a fixed length
*
* @author wanshiqiang<wangshiqiang@360.cn>
* @param integer $integer
* @param string $base
*/
private function getShortenedURLFromID ($integer, $base = ALLOWED_CHARS)
{
$length = strlen($base);
while($integer > $length - 1)
{
$out = $base[fmod($integer, $length)] . $out;
$integer = floor( $integer / $length );
}
return $base[$integer] . $out;
}
/**
* right 62 Base coded short links are decoded
*
* @author wangshiqiang<wangshiqiang@360.cn>
* @param string $string
* @param string $base
*/
private function getIDFromShortenedURL ($string, $base = ALLOWED_CHARS)
{
$length = strlen($base);
$size = strlen($string) - 1;
$string = str_split($string);
$out = strpos($base, array_pop($string));
foreach($string as $i => $char)
{
$out += strpos($base, $char) * pow($length, $size - $i);
}
return $out;
}
Implementation of literature and art
Algorithm description: using 6 characters to represent short links, we use 'a'-'z','0'-'5' in ASCII characters, a total of 32 characters as a set. Each character has 32 states, 6 characters can represent 32^6 (1073741824), then how to get these 6 characters, described as follows:
Md5 is applied to the long URL passed in, resulting in a 32-bit string, which varies a lot and is 16 to the thirty-second power, basically guaranteeing the uniqueness. Divide the 32 bits into 4 parts of 8 characters each, and the probability becomes 16 to the eighth power, which is 4294967296. The probability of collision of this number is also small. The key is the following processing. We think of this 8-bit character as a hexadecimal integer, which is 1*('0x'.$val), and then we take the 0-30 bits, one in every five, calculate its integer value, and map it to the 32 characters we have prepared, and finally we can get a 6-bit short link address.
PHP is implemented as follows:
function shorten( $long_url )
{
$base32 = "abcdefghijklmnopqrstuvwxyz012345";
$hex = md5( $long_url );
$hexLen = strlen( $hex );
$subHexLen = $hexLen / 8;
$output = array();
for( $i = 0; $i < $subHexLen; $i++ )
{
$subHex = substr( $hex, $i * 8, 8 );
$subHex = 0x3FFFFFFF & ( 1 * ('0x' . $subHex ) );
$out = '';
for( $j = 0; $j < 6; $j++ )
{
$val = 0x0000001F & $int;
$out .= $base32[$val];
$int = $int >> 5;
}
$output[] = $out;
}
return $output;
}
2 forced implementation
The following function USES a purely random method to generate a short link, although we can query to make sure we don't reuse the short link, but... Does that really make sense
function random($length, $pool = '') {
$random = '';
if (empty($pool)) { $pool = 'abcdefghkmnpqrstuvwxyz'; $pool .=
'23456789'; }
srand ((double)microtime()*1000000);
for($i = 0; $i < $length; $i++) { $random .=
substr($pool,(rand()%(strlen ($pool))), 1); }
return $random;
}
Technorati tag: short link,Short Url, map, hash
Reference:
1. Principle analysis of short address on microblog
2. The principle and function of short domain name of microblog
3, Yours org
4, Free PHP URL Shorten script that that kicks ass
5. PHP Short Url Algorithm Implementation
6. Implement your own short URL
7, short url algorithm preliminary summary
8. Implementation method of Short Url