PHP code to crawl novels from the Sina reading channel and generate a TXT ebook

  • 2020-03-31 20:04:32
  • OfStack


 
<?php
//To pick an ebook, search for it on the Sina reading site first, then fill in the matching parameters below
//http://vip.book.sina.com.cn/
//Ebook parameters
$array_book[0] = 38884; //Novel id
$array_book[1] = 22172; //Id of the first chapter
$array_book[2] = 32533; //Id of the last chapter
$array_book[3] = 'Chinese special forces survive: spike'; //Novel name, also used as the TXT file name

//Matching parameters
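//"#" is used as the delimiter so the "/" in closing tags needs no escaping; "s" lets "." match newlines
$title_pre = '#<h1>(.*?)</h1>#s'; //Pattern for the chapter title
$contents_pre = '#<div id="contTxt" class="contTxt1">(.*?)</div>#s'; //Pattern for the chapter text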

//Generate the ebook
for ($i = $array_book[1]; $i <= $array_book[2]; $i++) {
    $url = "http://vip.book.sina.com.cn/book/chapter_{$array_book[0]}_{$i}.html";
    $html = file_get_contents($url);
    //Skip chapter ids that cannot be downloaded or whose page does not match the patterns
    if ($html === false
        || !preg_match($title_pre, $html, $title)
        || !preg_match($contents_pre, $html, $contents)) {
        continue;
    }
    //Append the chapter to "<novel name>.txt"
    $fh = fopen($array_book[3] . '.txt', 'a+');
    //Each <p> becomes a leading space and each </p> a CRLF line break
    $write_contents = $title[1] . "\r\n" . str_replace('</p>', "\r\n", str_replace('<p>', ' ', $contents[1])) . "\r\n";
    if (fwrite($fh, $write_contents)) {
        echo 'Chapter ' . $i . ' fetched<br />';
    }
    fclose($fh);
}
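
The extraction step can also be tried offline, without touching the network. The following is only a minimal sketch: the function name `chapter_to_text` and the sample page are hypothetical, and the markup is assumed from the patterns used in the script, not checked against the live site.

```php
<?php
// Hypothetical helper: turns the HTML of one chapter page into plain text.
// Returns null when the page does not match the assumed layout.
function chapter_to_text(string $html): ?string
{
    // "#" delimiters avoid escaping "/" in closing tags; "s" lets "." span newlines
    $title_pre = '#<h1>(.*?)</h1>#s';
    $contents_pre = '#<div id="contTxt" class="contTxt1">(.*?)</div>#s';

    if (!preg_match($title_pre, $html, $title)
        || !preg_match($contents_pre, $html, $contents)) {
        return null;
    }

    // Each <p> becomes a leading space, each </p> a CRLF line break
    $body = str_replace(['<p>', '</p>'], [' ', "\r\n"], $contents[1]);

    return trim($title[1]) . "\r\n" . $body . "\r\n";
}

// Made-up sample page, for illustration only
$sample = '<h1>Chapter 1</h1>'
    . '<div id="contTxt" class="contTxt1"><p>First line.</p><p>Second line.</p></div>';

echo chapter_to_text($sample);
```

Keeping the parsing in a function of its own makes it possible to check the patterns against a saved page before crawling a whole range of chapter ids.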
