How to crawl data in PHP: principle and implementation

  • 2021-11-01 02:47:29
  • OfStack

From the official website: QueryList is a simple, flexible and powerful PHP collection (scraping) tool that makes collecting data a little easier.

Brief introduction

QueryList uses jQuery selectors for collection, so you can say goodbye to complex regular expressions. It offers jQuery-like DOM manipulation, HTTP request handling, repair of garbled (mis-encoded) text, content filtering, and extensibility. Complex network requests such as simulated login, browser spoofing and HTTP proxying are easy to set up. It also has a rich set of plug-ins, supports multi-threaded collection, and can use PhantomJS to collect pages that are rendered dynamically by JavaScript.
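To make the idea of selector-based collection concrete, here is a minimal sketch that pulls the href and text of every a element out of an HTML string. It deliberately uses only PHP's bundled DOM extension rather than QueryList, so it shows the principle and not the library's internals. The sample HTML and the function name collectLinks() are made up for illustration.

```php
<?php
// Sketch only: uses PHP's bundled DOM extension, not QueryList.
// The sample HTML and the function name collectLinks() are hypothetical.

function collectLinks(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings silently
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $rows = [];
    // Walk every <a> element instead of matching tags with a regular expression
    foreach ($doc->getElementsByTagName('a') as $a) {
        $rows[] = [
            'link' => $a->getAttribute('href'),
            'text' => trim($a->textContent),
        ];
    }
    return $rows;
}

$sample = '<ul><li><a href="/14_14778/1.html">Chapter 1</a></li>'
        . '<li><a href="/14_14778/2.html">Chapter 2</a></li></ul>';

print_r(collectLinks($sample));
```

Each row of the returned array holds one link and its label, which is the kind of structure a selector-driven tool produces without any regular expressions.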

Installation

Install via Composer:


composer require jaeger/querylist

Usage:

Straight to the code:


<?php
// Load the Composer autoloader (created by the Composer installation)
include './vendor/autoload.php';
// Import the QueryList class
use QL\QueryList;

// Fetch the page manually...
$html = file_get_contents('https://www.biqudu.com/14_14778/');
// ...and hand the HTML to QueryList
$data = QueryList::html($html);
// Alternatively (assuming QueryList 4.x), let QueryList fetch the page itself.
// Note that setHtml() expects an HTML string, not a URL.
// $data = QueryList::get('https://www.biqudu.com/14_14778/');

// Set the collection rules; these replace traditional regular expressions
$data->rules([
  // Collect the href attribute of every a tag
  'link' => ['a','href'],
  // Collect the text content of every a tag
  'text' => ['a','text']
]);
// Run the query; $data is still the QueryList object created above
$data->query();
// Get the results (a collection object, assuming QueryList 4.x)
$result = $data->getData();
// Convert the results into a two-dimensional array
$rows = $result->all();
// Print the results
print_r($rows);

That is the basic usage; with it we can already capture a set of data.
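The two-dimensional array printed by the code above usually needs a little cleanup before it is useful. The following is a rough sketch of one way to do that, assuming each row has the 'link' and 'text' keys defined in the rules. The sample rows, the $baseUrl parameter and the function name chapterLinks() are hypothetical and not part of QueryList.

```php
<?php
// Sketch only: $rows stands in for the array produced by the collection rules.
// The sample rows, $baseUrl and chapterLinks() are hypothetical.

function chapterLinks(array $rows, string $baseUrl): array
{
    $links = [];
    foreach ($rows as $row) {
        $href = $row['link'] ?? '';
        $text = trim($row['text'] ?? '');
        // Skip anchors without a usable target or label, and in-page anchors
        if ($href === '' || $text === '' || $href[0] === '#') {
            continue;
        }
        // Turn root-relative paths ("/x") into absolute URLs;
        // protocol-relative ("//host/x") and full URLs are left unchanged
        if ($href[0] === '/' && substr($href, 0, 2) !== '//') {
            $href = rtrim($baseUrl, '/') . $href;
        }
        $links[] = ['text' => $text, 'url' => $href];
    }
    return $links;
}

$rows = [
    ['link' => '/14_14778/1.html', 'text' => ' Chapter 1 '],
    ['link' => '#top', 'text' => 'Back to top'],
    ['link' => '', 'text' => 'Placeholder'],
];

print_r(chapterLinks($rows, 'https://www.biqudu.com/'));
```

Keeping this step separate from the collection rules means the rules stay a plain description of what to extract, while filtering and URL handling remain ordinary PHP.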

