PHP function code that uses regular expressions to download the remote images referenced in content
- 2020-05-10 17:50:17
- OfStack
This code is an important part of a "thief program" (a content scraper). The section shown here covers only the step that downloads remote images. The program is fairly simple, and most of it is commented.
// Find every http(s) image URL in the content
if (preg_match_all('~https?://[^\s"\'<>]+\.(?:jpe?g|gif|png)~ui', stripcslashes($content), $aliurl)) {
    $i = 0; // Counter that keeps file names unique when several images are saved in the same second
    foreach ($aliurl[0] as $v) { // each() was removed in PHP 8, so iterate with foreach
        //echo $v."<br />";
        $filetype = pathinfo($v, PATHINFO_EXTENSION); // Get the file extension
        if (stripos($v, "ofstack.com") === false) { // Skip images already hosted on your own site
            $ff = @file_get_contents($v); // Fetch the binary file content
            if (!empty($ff)) { // If the download succeeded, save the file
                $dir = "upload/".date("Ymd")."/"; // New storage path
                if (!file_exists($dir)) { // Check whether the directory exists
                    @mkdir($dir, 0777, true); // Create nested directories; octal 0777 is 511 in decimal and grants read, write and execute permissions
                }
                $nfn = $dir.date("YmdHis").$i.".".$filetype; // Build the new file name (H is the 24-hour clock)
                $nf = @fopen($nfn, "wb"); // Create the file in binary mode
                if ($nf === false) { // Skip this image if the file cannot be created
                    continue;
                }
                fwrite($nf, $ff); // Write the content to the file
                fclose($nf); // Close the file
                $i++; // Next file
                echo '<img src="'.$nfn.'">';
                $content = str_replace($v, $nfn, $content); // Replace the remote URL in $content with the local path
            } else { // If the image cannot be fetched, substitute the default image
                $content = str_replace($v, "/upload/201204/20120417213810742.gif", $content); // Replace the remote URL in $content with the default image
            }
        }
    }
}
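The URL pattern can be tried on its own before any download is attempted. Below is a minimal sketch, assuming a hypothetical helper named `extract_image_urls()` (it is not part of the original code) and made-up example.com URLs. It only lists the unique image URLs that the loop would go on to fetch.

```php
<?php
// Hypothetical helper (not in the original article): list the unique
// http(s) image URLs found in a block of content, without downloading them.
function extract_image_urls(string $content): array
{
    // Scheme, then a run of characters that cannot end a URL inside HTML,
    // then one of the accepted image extensions.
    $pattern = '~https?://[^\s"\'<>]+\.(?:jpe?g|gif|png)~i';
    if (!preg_match_all($pattern, $content, $matches)) {
        return []; // no match, or a regex error
    }
    // Remove duplicates and reindex so each image is handled only once.
    return array_values(array_unique($matches[0]));
}

// Sample content with made-up URLs; the first image appears twice.
$sample = '<p><img src="http://example.com/a.jpg"> text '
        . '<img src=\'https://example.com/img/b.PNG\'> '
        . '<img src="http://example.com/a.jpg"></p>';

print_r(extract_image_urls($sample));
```

Removing duplicates first means an image that appears several times in the content is downloaded once, and a single `str_replace()` call still rewrites every occurrence.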
PHP implementation that uses regular expressions to download images to the local disk
<?php
/*
author: ssh_kobe
date: 20110602
limitation: images referenced by a relative path in the web page cannot be captured
*/
set_time_limit(0); // Remove the execution time limit while fetching
$URL = 'http://pp.baidu.com/'; // Any web page
get_pic($URL);

function get_pic($pic_url) {
    // Fetch the page source
    $data = CurlGet($pic_url);
    /* Extract the image links with a regular expression */
    $pattern_src = '/<img\b[^>]*?\ssrc=[\'"]([^\'"]+?\.(?:gif|jpe?g|png|bmp))[\'"][^>]*>/i';
    $num = preg_match_all($pattern_src, $data, $match_src);
    $arr_src = $match_src[1]; // Array of image URLs
    get_name($arr_src);
    echo "<br>finished!!!";
    return 0;
}

/* Determine each image's type and save it to the current directory */
function get_name($pic_arr)
{
    // Image type
    $pattern_type = '/(\.(jpg|bmp|jpeg|gif|png))/i';
    foreach ($pic_arr as $pic_item) { // Loop over the address of each image
        $num = preg_match_all($pattern_type, $pic_item, $match_type);
        $pic_name = get_unique().$match_type[1][0]; // Name the file after a microsecond timestamp
        // Save the image as a binary stream
        $write_fd = @fopen($pic_name, "wb");
        if ($write_fd === false) { // Skip this image if the file cannot be created
            continue;
        }
        @fwrite($write_fd, CurlGet($pic_item));
        @fclose($write_fd);
        echo "[OK]..!";
    }
    return 0;
}

// Build a unique ID from the microsecond time
function get_unique() {
    list($msec, $sec) = explode(" ", microtime());
    return $sec.intval($msec * 1000000);
}

// Fetch the content at a URL
function CurlGet($url) {
    $url = str_replace('&amp;', '&', $url); // Decode HTML-escaped ampersands in the URL
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_HEADER, false);
    //curl_setopt($curl, CURLOPT_REFERER,$url);
    curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 6.0; SeaPort/1.2; Windows NT 5.1; SV1; InfoPath.2)");
    curl_setopt($curl, CURLOPT_COOKIEJAR, 'cookie.txt');
    curl_setopt($curl, CURLOPT_COOKIEFILE, 'cookie.txt');
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 0);
    $values = curl_exec($curl);
    curl_close($curl);
    return $values;
}
?>
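The header comment of the script notes that images referenced by a relative path cannot be captured. One possible way around this is to convert each path to an absolute URL before fetching it. The sketch below assumes a hypothetical helper named `resolve_url()` (not part of the original script) and made-up example.com URLs. It handles the common cases only and does not treat query strings or fragments specially.

```php
<?php
// Hypothetical helper (not in the original script): turn an image path
// found in a page into an absolute URL, given the URL of that page.
function resolve_url(string $base, string $path): string
{
    // Already absolute (has a scheme such as http:// or https://).
    if (preg_match('~^[a-z][a-z0-9+.-]*://~i', $path)) {
        return $path;
    }
    $parts  = parse_url($base);
    $scheme = $parts['scheme'] ?? 'http';
    $host   = $parts['host'] ?? '';
    $port   = isset($parts['port']) ? ':' . $parts['port'] : '';
    $root   = $scheme . '://' . $host . $port;

    // Protocol-relative: //cdn.example.com/a.jpg
    if (strncmp($path, '//', 2) === 0) {
        return $scheme . ':' . $path;
    }
    // Root-relative: /img/a.jpg
    if ($path !== '' && $path[0] === '/') {
        return $root . $path;
    }
    // Document-relative: resolve against the directory of the base path.
    $basePath = $parts['path'] ?? '/';
    $dir = substr($basePath, 0, strrpos($basePath, '/') + 1);
    $segments = [];
    foreach (explode('/', $dir . $path) as $segment) {
        if ($segment === '..') {
            array_pop($segments);      // go up one level
        } elseif ($segment !== '.' && $segment !== '') {
            $segments[] = $segment;    // keep normal segments
        }
    }
    return $root . '/' . implode('/', $segments);
}

// prints http://example.com/img/a.jpg
echo resolve_url('http://example.com/news/index.html', '../img/a.jpg'), "\n";
```

In the script above, a call such as `resolve_url($pic_url, $pic_item)` could be applied to each matched path before it is handed to `CurlGet()`.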
Here is another example: a function that returns the paths of all images found in the content.
function get_pic_url($content) {
    $pattern = '/<img\b[^>]*?\ssrc=[\'"]([^\'"]+?\.(?:gif|jpe?g|png|bmp))[\'"][^>]*>/i'; // Regular expression for <img> tags
    preg_match_all($pattern, $content, $match); // Match the images
    return $match[1]; // Return the paths of all images
}
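A quick way to check a function like this is to run it on a small HTML fragment. The sketch below repeats `get_pic_url()` so that it runs on its own. The pattern it uses is an assumption of this example: a literal `img` tag name, whitespace before `src`, and a fixed list of extensions, matched case-insensitively. The HTML fragment is made up for illustration.

```php
<?php
// get_pic_url() repeated here only so that this sketch is self-contained.
// The pattern is this example's assumption about what counts as an image tag.
function get_pic_url($content) {
    $pattern = '/<img\b[^>]*?\ssrc=[\'"]([^\'"]+?\.(?:gif|jpe?g|png|bmp))[\'"][^>]*>/i';
    preg_match_all($pattern, $content, $match);
    return $match[1]; // the captured src values
}

// Made-up HTML fragment: two images and one ordinary link.
$html = '<p><img src="http://example.com/a.jpg" alt="first">'
      . "<IMG class='thumb' SRC='/img/b.gif' />"
      . '<a href="http://example.com/page.html">link</a></p>';

foreach (get_pic_url($html) as $src) {
    echo $src, "\n";
}
```

Only the two `src` values are printed. The link is ignored because it is not an `<img>` tag, and the second result is a relative path that still has to be made absolute before it can be downloaded.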
That is the PHP function code for downloading the remote images in content via regular expressions.