Writing a Java crawler from scratch: preparation

  • 2020-04-01 03:35:09
  • OfStack

As before, we start by talking through the idea behind the crawler and the knowledge you need to prepare. Experts can skip this part.

First, let's think a little about what we want to build and make a simple list of requirements.

The requirements are as follows:

1. Simulate a visit to the Zhihu website (link: http://www.zhihu.com/)

2. Download the contents of specified pages, including: today's hottest, this month's hottest, editor's picks

3. Download all the questions and answers in specified categories, e.g., investing, programming, flunking courses

4. Download all answers from a specified answerer

5. Ideally, add an over-the-top one-click upvote feature (so I can upvote all of raelen's answers in one go).

The list of technical problems to be solved is as follows:

1. Simulate a browser to visit a web page

2. Grab key data and save it locally

3. Handle content that web pages load dynamically

4. Use a tree structure to crawl all of Zhihu's content
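
The first two problems in this list can be sketched in a few lines of plain Java. This is only a minimal illustration, not the crawler this series will build: the class name `PageFetcher`, its helper methods, and the User-Agent string are all assumptions made for the example. It sends a browser-like `User-Agent` header with the standard `HttpURLConnection` class and copies the response body into a local file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class PageFetcher {

    // Assumed User-Agent value; any realistic browser string would do.
    public static final String USER_AGENT =
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:74.0) Gecko/20100101 Firefox/74.0";

    // Problem 1: prepare a request that looks like it comes from a browser.
    // Nothing is sent over the network until the response is read.
    public static HttpURLConnection openLikeBrowser(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("User-Agent", USER_AGENT);
        conn.setRequestProperty("Accept", "text/html");
        conn.setConnectTimeout(10_000);
        conn.setReadTimeout(10_000);
        return conn;
    }

    // Problem 2: save the downloaded data locally, overwriting any old copy.
    // Returns the number of bytes written.
    public static long saveTo(InputStream body, Path target) throws IOException {
        return Files.copy(body, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java PageFetcher <url> <output-file>");
            return;
        }
        HttpURLConnection conn = openLikeBrowser(args[0]);
        try (InputStream in = conn.getInputStream()) {
            long bytes = saveTo(in, Paths.get(args[1]));
            System.out.println("HTTP " + conn.getResponseCode() + ", saved " + bytes + " bytes");
        } finally {
            conn.disconnect();
        }
    }
}
```

Run it as, for example, `java PageFetcher http://www.zhihu.com/ zhihu.html`. Whether the site actually returns the page you expect to a request like this is a separate question; dynamically loaded content (problem 3) will not appear in the saved file.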

Ok, that's all I've thought about so far.

Now comes the preparation.

1. Choose the crawler language: I have already written a series of crawler tutorials (Baidu Tieba, Qiushibaike, Shandong University GPA lookup, etc.), all in Python, so this time I decided to write the crawler in Java (and yes, the two have nothing whatsoever to do with each other).

2. Crawler basics: a web crawler, also called a web spider, has a very vivid name. The spider is a program that crawls around the web, and it finds web pages by following the links between them.

3. Prepare the crawler environment: I won't go into the installation and configuration of the JDK and Eclipse here. By the way, a good browser is very important for writing a crawler, because you first need to browse the pages yourself and find out where the things you need are, so that you can tell the crawler where to go and how to crawl. I recommend Firefox or Google Chrome; both are very good at inspecting elements via right-click and viewing the page source.
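
The link-following idea from item 2 can be sketched as a breadth-first traversal. This is a rough illustration under stated assumptions: the class `LinkSpider`, the `fetch` hook, and the `example` URLs are made up for the sketch, and the regular expression only catches absolute links written as `href="http..."`. A real crawler would use an HTML parser and a real HTTP fetcher.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkSpider {

    // Simplified pattern for absolute links in double-quoted href attributes.
    private static final Pattern HREF = Pattern.compile("href=\"(http[^\"#]+)\"");

    // Returns the links found in a page's HTML, in order of appearance.
    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    // Breadth-first crawl: visit the start page, then the pages it links to, and so on.
    // 'fetch' is a hypothetical hook mapping a URL to its HTML (null if unavailable).
    // Returns every URL that was attempted, in visiting order.
    public static Set<String> crawl(String start, Function<String, String> fetch, int maxPages) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty() && visited.size() < maxPages) {
            String url = queue.poll();
            if (!visited.add(url)) {
                continue; // already handled this URL
            }
            String html = fetch.apply(url);
            if (html == null) {
                continue; // nothing to parse
            }
            for (String link : extractLinks(html)) {
                if (!visited.contains(link)) {
                    queue.add(link);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // A tiny in-memory "web" so the sketch runs without network access.
        Map<String, String> pages = new HashMap<>();
        pages.put("http://a.example/",
                "<a href=\"http://b.example/\">b</a> <a href=\"http://c.example/\">c</a>");
        pages.put("http://b.example/", "<a href=\"http://a.example/\">back</a>");
        pages.put("http://c.example/", "no links here");
        System.out.println(crawl("http://a.example/", pages::get, 10));
        // prints [http://a.example/, http://b.example/, http://c.example/]
    }
}
```

The `visited` set is what keeps the spider from going around in circles when pages link back to each other, and `maxPages` is a simple safety limit.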

Now our crawler journey officially begins! Where exactly to start, though? Well, that's a good question. Let me think about it. No rush.

