Sep 11
28
How to Download an Entire Website with WGET
Introduction
If you have ever needed to copy or move a website, wget is quite handy. Wget is open source, available on Linux, OS X and Windows, and easy to use. A whole website can be downloaded with a single command.
Usage
Wget is a command-line program, so you type its name into a terminal followed by the required arguments. Downloading a single file from a website is simple:
wget http://www.website.com/file.zip
To recursively download the files linked from a page, use -r. By default wget follows links up to five levels deep:
wget -r http://www.website.com/index.html
If a web server rejects download managers, -U sets the User-Agent string that wget sends, so the request looks like it comes from a common web browser:
wget -r -U Mozilla http://www.website.com/index.html
Some web servers may blacklist an IP address if they notice that all pages are being downloaded in quick succession. The --wait=15 option makes wget pause 15 seconds between retrievals, which helps avoid this:
wget --wait=15 -r -U Mozilla http://www.website.com/index.html
The download rate can also be capped with the --limit-rate=64K option. The value is in bytes per second by default, so append K to the number for kilobytes per second:
wget --wait=15 --limit-rate=64K -r -U Mozilla http://www.website.com/index.html
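To get a feel for what the wait and rate limit cost in time, the arithmetic can be sketched in shell. This is only a rough lower bound: the function name estimate_seconds and the sample figures (200 files, 10240 KB in total) are made up for illustration, and the estimate ignores network latency and assumes one pause between each pair of consecutive retrievals.

```shell
#!/bin/sh
# Rough lower bound on crawl time with --wait and --limit-rate.
# The function name and the sample numbers are illustrative assumptions.

# Usage: estimate_seconds FILES TOTAL_KB WAIT_SECONDS RATE_KB_PER_SECOND
estimate_seconds() {
    files=$1
    total_kb=$2
    wait_s=$3
    rate_kb=$4
    # One pause between each pair of consecutive retrievals.
    pause=$(( (files - 1) * wait_s ))
    # Transfer time at the capped rate, rounded up to whole seconds.
    transfer=$(( (total_kb + rate_kb - 1) / rate_kb ))
    echo $(( pause + transfer ))
}

# 200 files totalling 10240 KB, with --wait=15 and --limit-rate=64K
estimate_seconds 200 10240 15 64   # prints 3145
```

With these sample numbers almost all of the roughly 52 minutes is spent waiting rather than transferring, so on a site with many small files the wait value matters far more than the rate limit.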
To make sure that wget does not ascend into parent directories while recursing, use --no-parent:
wget --wait=15 --limit-rate=64K --no-parent -r -U Mozilla http://www.website.com/index.html
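Since the full command is long, it may be convenient to wrap it in a small shell function. The following is a minimal sketch, not part of wget itself: the name mirror_site, the DRY_RUN variable and the default values are assumptions chosen for this example. By default it only prints the command it would run, so it can be checked before any download starts.

```shell
#!/bin/sh
# Hypothetical wrapper around the final command from this article.
# mirror_site and DRY_RUN are illustrative names, not wget features.

# Usage: mirror_site URL [WAIT_SECONDS] [RATE]
mirror_site() {
    url=$1
    wait_s=${2:-15}   # default: pause 15 seconds between retrievals
    rate=${3:-64K}    # default: cap the download at 64 KB/s
    # Build the argument list once so the printed and executed commands match.
    set -- wget --wait="$wait_s" --limit-rate="$rate" --no-parent -r -U Mozilla "$url"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"     # dry run: only show the command
    else
        "$@"          # real run: start the download
    fi
}

# Dry run by default; set DRY_RUN=0 to actually download.
mirror_site "http://www.website.com/index.html"
```

Running the script as is prints the same command shown above. Calling it with DRY_RUN=0 in the environment starts the actual download.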
