One Linux command per day (61): the wget command in detail

  • 2020-05-13 04:28:10
  • OfStack

wget is a command-line tool for downloading files on Linux systems, and an essential one for Linux users: we often need to download software or pull a backup from a remote server to a local machine. wget supports the HTTP, HTTPS, and FTP protocols and can work through an HTTP proxy. It also downloads unattended: wget can keep running in the background after the user logs out, so you can log in, start a wget download task, and then leave the system; wget will run until the task completes. Compared with most browsers, which need the user's continuous attention when downloading large amounts of data, this saves a lot of trouble.

wget can follow the links on an HTML page and download them in turn, creating a local copy of the remote server that fully recreates the directory structure of the original site. This is often called "recursive downloading". When downloading recursively, wget respects the Robot Exclusion standard (robots.txt), and it can convert links to point at the local files to make offline browsing easier.

wget is very stable and copes well with narrow bandwidth and unstable networks. If a download fails because of a network problem, wget keeps retrying until the whole file has been fetched. If the server interrupts the transfer, wget reconnects and resumes from where it left off. This is very useful when downloading large files from servers that limit connection time.
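The retry-and-resume behaviour described above can also be requested explicitly. A minimal sketch, assuming a placeholder URL and arbitrary (not recommended) timeout values:

```shell
# Placeholder URL, for illustration only.
URL="http://example.com/big-file.iso"

# -c             resume a partially downloaded file instead of restarting
# -t 0           retry indefinitely (0 = unlimited attempts)
# -T 30          treat a connection as stalled after 30 seconds
# --waitretry=10 wait up to 10 seconds between retries
CMD="wget -c -t 0 -T 30 --waitretry=10 $URL"
echo "$CMD"
```

Dropping the `echo` and running the command directly would start the actual download.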

1. Command format:

wget [parameter] [URL address]

2. Command functions:

Used to download resources from the network; if no directory is specified, downloaded files are saved to the current directory. wget is powerful but simple to use:

1) Resumable downloads are supported. This used to be the biggest selling point of NetAnts and FlashGet; now that wget supports it too, users with poor network connections can rest easy.

2) Both FTP and HTTP downloads are supported. Although most software can be fetched over HTTP, downloading over FTP is still sometimes necessary.

3) Proxy servers are supported. High-security systems are generally not exposed to the Internet directly, so proxy support is essential for downloading software on them.

4) Configuration is convenient and simple. Users accustomed to graphical interfaces may not be comfortable on the command line, but for configuration the command line actually has advantages: far fewer mouse movements, and no worrying about clicking the wrong thing.

5) The program is small and completely free. Size hardly matters now that disks are so large; and while there is plenty of so-called free software on the Internet, the advertisements bundled with it are not something we enjoy.

3. Command parameters:

Startup parameters:

-V, --version           display the version of wget and exit
-h, --help              print help information
-b, --background        go to background after startup
-e, --execute=COMMAND   execute a command in `.wgetrc` format; see /etc/wgetrc or ~/.wgetrc for the wgetrc format
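-e is handy for applying a one-off wgetrc setting without editing ~/.wgetrc. A hedged sketch (robots=off is a real wgetrc command that disables robots.txt processing; the URL is a placeholder):

```shell
# Placeholder URL, for illustration only.
URL="http://example.com/docs/"

# -e passes a command in .wgetrc syntax; here it disables
# robots.txt processing for this single run.
CMD="wget -e robots=off -r -np $URL"
echo "$CMD"
```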

Record and input file parameters:

-o, --output-file=FILE     write log messages to FILE
-a, --append-output=FILE   append log messages to FILE
-d, --debug                print debug output
-q, --quiet                quiet mode (no output)
-v, --verbose              verbose mode (this is the default)
-nv, --non-verbose         turn off verbose mode, but not quiet mode
-i, --input-file=FILE      download the URLs listed in FILE
-F, --force-html           treat the input file as HTML
-B, --base=URL             prepend URL to relative links read from the file given with -F -i
--sslcertfile=FILE         optional client certificate (older wget versions)
--sslcertkey=KEYFILE       optional key file for this certificate (older wget versions)
--egd-file=FILE            file name of the EGD socket
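The difference between -o and -a matters for repeated runs: -o starts a fresh log each time, while -a appends to the existing one. A small sketch with a placeholder URL (combining either flag with -nv keeps the log compact without silencing it):

```shell
# Placeholder URL, for illustration only.
URL="http://example.com/file.zip"

# First run: create (or truncate) download.log.
FRESH="wget -o download.log -nv $URL"
# Later runs: append to the same log.
APPEND="wget -a download.log -nv $URL"
echo "$FRESH"
echo "$APPEND"
```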

Download parameters:

--bind-address=ADDRESS     bind to ADDRESS (host name or IP) on the local machine
-t, --tries=NUMBER         set the maximum number of retries (0 = unlimited)
-O, --output-document=FILE write the document to FILE
-nc, --no-clobber          do not overwrite existing files or use .# suffixes
-c, --continue             resume getting a partially downloaded file
--progress=TYPE            select the progress gauge type
-N, --timestamping         do not re-download files unless they are newer than the local copy
-S, --server-response      print server responses
--spider                   do not download anything
-T, --timeout=SECONDS      set the read timeout to SECONDS
-w, --wait=SECONDS         wait SECONDS between retrievals
--waitretry=SECONDS        wait 1...SECONDS between retries of a retrieval
--random-wait              wait 0...2*WAIT seconds between retrievals
-Y, --proxy=on/off         turn the proxy on or off
-Q, --quota=NUMBER         set the download quota to NUMBER
--limit-rate=RATE          limit the download rate to RATE

Directory parameters:

-nd, --no-directories          do not create directories
-x, --force-directories        force creation of directories
-nH, --no-host-directories     do not create host directories
-P, --directory-prefix=PREFIX  save files to PREFIX/...
--cut-dirs=NUMBER              ignore NUMBER remote directory components
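The interaction of -nH, --cut-dirs, and -P decides where recursively fetched files land. A sketch with an assumed URL, showing the resulting path in a comment:

```shell
# Fetching http://example.com/pub/software/a.tar.gz with:
#   -nH           drop the "example.com/" host directory
#   --cut-dirs=1  drop the first remote component ("pub/")
#   -P mirror     prefix everything with "mirror/"
# so the file would be saved as mirror/software/a.tar.gz
CMD="wget -r -nH --cut-dirs=1 -P mirror http://example.com/pub/software/a.tar.gz"
echo "$CMD"
```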

HTTP option parameters:

--http-user=USER         set the HTTP user name to USER
--http-passwd=PASS       set the HTTP password to PASS
-C, --cache=on/off       allow/disallow server-side data caching (normally allowed)
-E, --html-extension     save all text/html documents with an .html extension
--ignore-length          ignore the 'Content-Length' header field
--header=STRING          insert STRING among the request headers
--proxy-user=USER        set the proxy user name to USER
--proxy-passwd=PASS      set the proxy password to PASS
--referer=URL            include a 'Referer: URL' header in the HTTP request
-s, --save-headers       save the HTTP headers to the file
-U, --user-agent=AGENT   identify as AGENT instead of Wget/VERSION
--no-http-keep-alive     disable HTTP keep-alive (persistent connections)
--cookies=off            do not use cookies
--load-cookies=FILE      load cookies from FILE before the session starts
--save-cookies=FILE      save cookies to FILE after the session ends
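--header and --referer are often combined when a server expects a browser-like request. A sketch with placeholder URL and header values:

```shell
# Placeholder URL, for illustration only.
URL="http://example.com/protected/file.zip"

# --header injects an arbitrary request header;
# --referer sets the Referer header some servers require.
CMD="wget --header='Accept-Language: en' --referer=http://example.com/ $URL"
echo "$CMD"
```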

FTP option parameters:

-nr, --dont-remove-listing   do not remove the '.listing' files
-g, --glob=on/off            turn file-name globbing on or off
--passive-ftp                use passive transfer mode (the default)
--active-ftp                 use active transfer mode
--retr-symlinks              when recursing, retrieve linked-to files (not directories)

Recursive download parameters:

-r, --recursive           recursive download
-l, --level=NUMBER        maximum recursion depth (inf or 0 for infinite)
--delete-after            delete files locally after downloading them
-k, --convert-links       convert non-relative links to relative ones
-K, --backup-converted    back up file X as X.orig before converting it
-m, --mirror              equivalent to -r -N -l inf -nr
-p, --page-requisites     download all images and other files needed to display the HTML page

Include and not include in recursive download (accept/reject) :

-A, --accept=LIST                comma-separated list of accepted extensions
-R, --reject=LIST                comma-separated list of rejected extensions
-D, --domains=LIST               comma-separated list of accepted domains
--exclude-domains=LIST           comma-separated list of rejected domains
--follow-ftp                     follow FTP links from HTML documents
--follow-tags=LIST               comma-separated list of followed HTML tags
-G, --ignore-tags=LIST           comma-separated list of ignored HTML tags
-H, --span-hosts                 go to foreign hosts when recursing
-L, --relative                   follow relative links only
-I, --include-directories=LIST   list of allowed directories
-X, --exclude-directories=LIST   list of excluded directories
-np, --no-parent                 do not ascend to the parent directory
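Several of these filters are usually combined in one recursive fetch. A sketch with a placeholder URL and a hypothetical excluded directory:

```shell
# Placeholder URL, for illustration only.
URL="http://example.com/docs/"

# -r           recurse through the site
# -np          never ascend above the starting directory
# -A pdf,ps    accept only .pdf and .ps files
# -X /docs/old skip this (hypothetical) remote directory
CMD="wget -r -np -A pdf,ps -X /docs/old $URL"
echo "$CMD"
```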

4. Examples of use:

Example 1: download a single file using wget

Command:


wget http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Description:

This command downloads a file from the network and saves it in the current directory. A progress bar is displayed during the download, showing the percentage completed, bytes downloaded, current download speed, and estimated remaining time.

Example 2: download using wget -O and save under a different file name

Command:


wget -O wordpress.zip http://www.minjieren.com/download.aspx?id=1080

Description:

By default, wget names the saved file after the last part of the URL following the final "/", which is usually wrong for dynamically generated links.

Wrong: the following example saves the file under the name download.aspx?id=1080


wget http://www.minjieren.com/download.aspx?id=1080

Even though the downloaded file is a zip archive, it is still saved as download.aspx?id=1080.

Correct: to solve this problem, we can specify a file name using the parameter -O:


wget -O wordpress.zip http://www.minjieren.com/download.aspx?id=1080

Example 3: limit the download speed using wget --limit-rate

Command:


wget --limit-rate=300k http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Description:

By default, wget uses all the available bandwidth. But when you are about to download a large file and still need to download other things, it can be necessary to limit the speed.

Example 4: resume an interrupted download using wget -c

Command:


wget -c http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Description:

Using wget -c restarts a download of a file that was interrupted. This is very helpful when a large download is suddenly cut off because of network or other problems: we can continue the download instead of fetching the whole file again. Use the -c parameter whenever you need to continue an interrupted download.

Example 5: download in the background using wget -b

Command:


wget -b http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Description:

When downloading very large files, we can use the parameter -b for background downloading.


wget -b http://www.minjieren.com/wordpress-3.1-zh_CN.zip
Continuing in background, pid 1840.
Output will be written to `wget-log'.

You can check the download progress by using the following command:


tail -f wget-log

Example 6: download with a disguised user-agent string

Command:


wget --user-agent="Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16" http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Description:

Some sites refuse download requests when the user-agent string does not look like a browser. You can disguise it with the --user-agent parameter.

Example 7: test a download link using wget --spider

Command:


wget --spider URL

Description:

When you plan a scheduled download, you should test the download link in advance. We can add the --spider parameter to check it.


wget --spider URL

If the download link is correct, it will be displayed


wget --spider URL
Spider mode enabled. Check if remote file exists.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.

This ensures that the download will succeed at the scheduled time. But when you give a wrong link, output like the following appears:


wget --spider url
Spider mode enabled. Check if remote file exists.
HTTP request sent, awaiting response... 404 Not Found
Remote file does not exist -- broken link!!!

You can use the spider parameter in the following situations:

Checking a link before a scheduled download
Checking at intervals whether a site is available
Checking web pages for dead links
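The situations above can be scripted around wget's exit status: --spider succeeds (exit 0) when the link is valid. A minimal sketch, suitable for a cron job (the function name is our own choice, and the commented URL is a placeholder):

```shell
# Report whether a URL is reachable, without downloading it.
# --spider performs the check only; -q suppresses output.
check_url() {
  if wget --spider -q "$1"; then
    echo "up"
  else
    echo "down"
  fi
}

# Example call (requires network access):
# check_url http://example.com/
```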

Example 8: increase the number of retries using wget --tries

Command:


wget --tries=40 URL

Description:

A download may fail because of network problems or because the file is large. By default, wget retries a download 20 times; if necessary, you can increase the number of retries with --tries.

Example 9: download multiple files using wget -i

Command:


wget -i filelist.txt

Description:

First, save the download links to a file:


cat > filelist.txt
url1
url2
url3
url4

Then use this file and the parameter -i to download

Example 10: mirror a website using wget --mirror

Command:


wget --mirror -p --convert-links -P ./LOCAL URL

Description:

Download the entire site locally.

--mirror: turn on mirroring
-p: download all files needed for the HTML pages to display properly
--convert-links: convert links to local links
-P ./LOCAL: save all files and directories to the specified local directory

Example 11: filter out a file format during download using wget --reject

Command:


wget --reject=gif url

Description:

To download a whole site but skip the images, use --reject as shown above.

Example 12: write download messages to a log file using wget -o

Command:


wget -o download.log URL

Description:

If you do not want the download messages displayed on the terminal but written to a log file instead, use the -o parameter.

Example 13: limit the total download size using wget -Q

Command:


wget -Q5m -i filelist.txt

Description:

Use this when you want the download to stop once the total exceeds 5 MB. Note: the quota does not take effect when downloading a single file; it only applies to recursive downloads or downloads from a URL list.

Example 14: download files of a specified format using wget -r -A

Command:


wget -r -A.pdf url

Description:

You can use this feature in the following situations:

Download all the images from a site
Download all the videos from a site
Download all the PDF files from a site

Example 15: download over FTP using wget

Command:


wget ftp-url
wget --ftp-user=USERNAME --ftp-password=PASSWORD url

Description:

You can use wget to download from FTP links.

Anonymous FTP download with wget:


wget ftp-url

FTP download with user name and password authentication:


wget --ftp-user=USERNAME --ftp-password=PASSWORD url

Note: compile and install

Compile and install wget using the following commands:


# tar zxvf wget-1.9.1.tar.gz 
# cd wget-1.9.1 
# ./configure 
# make 
# make install 
