Linux wget Command Examples, Tips and Tricks
wget is a Linux command line tool for downloading web pages and files from the internet. The wget command supports the HTTP, HTTPS, and FTP protocols.
In this tutorial we will see how to use the wget command, with examples.
- Install wget command on Linux.
- Download Web pages with wget command.
- Recursive Download with wget command.
- Site mirroring with wget command.
- Download Specific file types.
- Download Files From FTP Server.
- Set download speed.
- Read URLs from a text file.
- Continue incomplete download with wget command.
- Run wget command in the background.
- Run wget in debug mode.
- Run wget command as a web spider.
Install wget command on Linux
To install Wget on Debian and Ubuntu-based Linux systems, run the following command.
apt-get install wget
To install Wget on Red Hat/CentOS and Fedora, use the following command:
yum install wget
Download Web pages with wget command
Capturing a single web page with wget is straightforward. To download a web page or file, simply use the wget command followed by the URL of the web page or file.
wget example.com
Since we only specified the URL, not a specific file name, the output will be saved as "index.html".
The following command will download the 'latest.tar.gz' file from the wordpress.org website.
wget https://wordpress.org/latest.tar.gz
The file will be saved with the same name as the remote file.
Save with different filename
By default, the wget command saves the downloaded file with the same name as the remote file. With the -O (uppercase O) option we can specify a different output file name.
The following wget command will download the latest.tar.gz file and save it as wordpress.tar.gz.
wget -O wordpress.tar.gz https://wordpress.org/latest.tar.gz
Download Multiple files and pages
The wget command can download multiple files or web pages at once; simply list the URLs one after another.
wget URL1 URL2
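For example, assuming two hypothetical archive URLs, both files will be downloaded in a single run:
wget https://example.com/file1.zip https://example.com/file2.zip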
Set User Agent in wget command
The --user-agent option changes the default user agent. The following example will retrieve example.com using 'Mozilla/4.0' as the wget User-Agent.
wget --user-agent='Mozilla/4.0' example.com
View Server Response Headers
Sometimes you will want to see the headers sent by the server. The -S or --server-response option will print the response headers.
wget -S example.com
Save verbose output to a log file
By default, the wget command prints verbose output to the Linux terminal. The -o (lowercase o) option will log all messages to a log file instead.
wget -o log.txt example.com
The above wget command will save verbose output to the 'log.txt' file.
Recursive Download with wget command
The -r or --recursive option turns on recursive retrieving.
wget -r example.com
The default maximum depth of a recursive download is 5, but we can specify a different maximum recursion depth using the -l option.
wget -r -l 10 wsldp.com
In recursive mode, wget crawls the website and follows all links up to the maximum depth level.
Note that '-l 0' means infinite recursion, so if you set the maximum depth to zero, wget will download every file on the website.
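For example, the following command (using the same example.com placeholder) sets the depth to zero and keeps following links until the whole site has been retrieved:
wget -r -l 0 example.com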
Convert Links
The --convert-links option is useful: it converts the links in downloaded documents to make them suitable for local viewing.
wget -r -l 2 --convert-links example.com
Set max download size
We can set the max download size when retrieving files recursively. The download process will be aborted when the limit is exceeded. The value can be specified in bytes (default), kilobytes (with k suffix), or megabytes (with m suffix).
wget -r --quota=1m example.com
Note that quota will never affect downloading a single file.
Site Mirroring with wget command
Mirroring is similar to a recursive download, but there is no maximum depth level, so it will download the entire website. The --mirror option is shorthand for -r -N -l inf --no-remove-listing.
wget --mirror --convert-links example.com
Download Specific File Types
The -A option tells the wget command to download only specific file types. This is used together with recursive download. For example, you may need to download only the PDF files from a website:
wget -A '*.pdf' -r example.com
Note that recursive retrieval is still limited to the maximum depth level, which defaults to 5.
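The -A option can be combined with -l to control both the accepted file types and the crawl depth. The following sketch (using the example.com placeholder) downloads only PDF files up to two levels deep:
wget -r -l 2 -A '*.pdf' example.com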
Download Files From FTP Server
We can use the wget command to download files from an FTP server.
wget --ftp-user=username --ftp-password=pass ftp://192.168.1.10/file1.txt
In the above example, wget downloads 'file1.txt' from the FTP server at 192.168.1.10.
The recursive option can also be used with the FTP protocol to download files recursively.
wget -r --ftp-user=username --ftp-password=pass ftp://192.168.1.10/
Set Download Speed with wget
We can also limit the download speed when downloading files with the wget command. The following command will download 'file.zip' and limit the download speed to 20 KB/s.
wget --limit-rate=20k url/file.zip
The download rate may be expressed in bytes (no suffix), kilobytes (k suffix), or megabytes (m suffix).
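For example, to allow roughly one megabyte per second instead, you could use the m suffix (same placeholder URL as above):
wget --limit-rate=1m url/file.zip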
Read URLs from a text file
The Linux wget command can read URLs from a text file provided with the -i option.
wget -i url.txt
The input file can contain multiple URLs, but each URL must be on its own line, as in the example below.
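A minimal url.txt might look like the following, with each (hypothetical) URL on a separate line:
https://example.com/file1.zip
https://example.com/file2.zip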
Continue incomplete download with wget command
The -c / --continue option of the wget command continues downloading a partially downloaded file. This is useful when you want to finish a download started by a previous wget instance or by another program.
wget -c example.com/file.zip
Run wget command in the background
The -b / --background option sends the wget process to the background immediately after startup. This is useful if you are downloading a large file that will take a long time to finish.
wget -b wordpress.org/latest.tar.gz
The output of the wget process is redirected to 'wget-log' unless you specify a different log file with the -o option.
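For example, the following command (the log file name is just an illustration) runs in the background and writes all messages to 'download.log' instead of 'wget-log':
wget -b -o download.log https://wordpress.org/latest.tar.gz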
We can monitor the download progress with the tail command.
tail -f wget-log
Run wget in debug mode
When debug mode is turned on, wget prints various information that is useful for developers and system administrators.
wget --debug example.com
The debug output includes the request headers sent by wget and the response headers sent by the remote server.
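Because debug output can be long, it is often convenient to combine it with the -o option and save it to a file, for example (the log file name is just an illustration):
wget --debug -o debug.log example.com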
Run wget command as a web spider
When the --spider option is used, wget behaves like a web spider: it does not download the pages, it only checks that they are there. Any broken links found on the pages will be reported.
wget -r --spider example.com
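The spider output is usually easiest to review from a log file. The following sketch saves the spider run to 'spider.log' (a hypothetical file name), which you can then search for errors such as 404 responses:
wget -r --spider -o spider.log example.com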