How to create an offline copy of a website with wget in Kali Linux
During the "Reconnaissance" phase we might need to access the targeted website frequently, and this can trigger some alarms. I used to rely on HTTrack (or WebHTTrack) for making one-to-one offline copies of a given web page, but for some odd reason it doesn't work on my current Kali installation. For those who want to give WebHTTrack a chance, one thing you need to remember: it's not included by default in Kali. To install it, type the following:
apt-get update
apt-get install webhttrack
to get the full GUI version, or
apt-get update
apt-get install httrack
to get the command-line version only.
Searching for alternative easy ways to do it, I've found this tutorial from kossboss – all the credit goes there.
Open a terminal and type mkdir /mywebsitedownloads/ and then
cd /mywebsitedownloads/ (you can name the folder any way you wish).
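As a small variation on the two commands above, the folder can also live under your home directory, which avoids needing write access to the filesystem root. This is only a sketch: the DOWNLOAD_DIR variable name and its default location are my own assumptions, not part of the original tutorial.

```shell
# Hypothetical variable: where the offline copy will be stored.
# Override it by exporting DOWNLOAD_DIR before running these lines.
DOWNLOAD_DIR="${DOWNLOAD_DIR:-$HOME/mywebsitedownloads}"

# -p creates missing parent folders and does not fail if the folder already exists.
mkdir -p "$DOWNLOAD_DIR"

# Move into the folder and print where we ended up.
cd "$DOWNLOAD_DIR" && pwd
```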
Now (copy and paste):
wget --limit-rate=200k --no-clobber --convert-links --random-wait -r -p -E -e robots=off -U mozilla http://www.nameofthesiteyouwanttocopy.com
Replace nameofthesiteyouwanttocopy.com with the actual address of the site you are targeting. Below is an explanation of each option:
--limit-rate=200k: limit the download speed to 200 KB/s (kilobytes per second); higher download rates might seem suspicious.
--no-clobber: don't overwrite any existing files (used in case the download is interrupted and resumed). Recent wget versions print a warning and ignore this option when --convert-links is also given.
--convert-links: convert links so that they work locally, off-line, instead of pointing to a website online.
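--random-wait: randomizes the pause between requests (between 0.5 and 1.5 times the value given with --wait), for the same reason as the limit-rate. It works on the --wait value, so consider adding one (for example --wait=1) if you want an actual delay.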
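-r: recursive download; follows links from the start page (up to 5 levels deep by default) instead of fetching a single page.
-p: downloads everything needed to display each page properly, including pictures and stylesheets.
-E: saves HTML pages with an .html extension even when the URL doesn't end in one.
-e robots=off: tells wget to ignore the site's robots.txt rules, which would otherwise keep it out of the areas closed to crawlers.
-U mozilla: sends "mozilla" as the User-Agent string instead of wget's default, so the requests look less like an automated tool.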
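The short flags are easy to mix up, so here is a rough sketch of the same command spelled out with wget's long option names and wrapped in a shell function. The function name mirror_site and the DRY_RUN switch are hypothetical helpers of mine, not something wget or the original tutorial provides; the options themselves are the ones listed above.

```shell
# mirror_site URL: run the wget command from this post using long option names.
# Set DRY_RUN=1 to print the command instead of contacting the site.
mirror_site() {
    url="$1"
    if [ -z "$url" ]; then
        echo "usage: mirror_site URL" >&2
        return 1
    fi

    # Rebuild the argument list: same options as the one-liner, long form.
    set -- \
        --limit-rate=200k \
        --no-clobber \
        --convert-links \
        --random-wait \
        --recursive \
        --page-requisites \
        --adjust-extension \
        --execute robots=off \
        --user-agent=mozilla \
        "$url"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only show what would be executed.
        echo wget "$@"
    else
        wget "$@"
    fi
}
```

Calling it with DRY_RUN=1 first is a cheap way to double-check the options before any traffic reaches the target; unset DRY_RUN (or set it to 0) to start the real download.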
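Once the download is complete, you can find the offline copy in the folder you used for saving the download (/mywebsitedownloads in this example). wget places the files in a subfolder named after the site's hostname; look for the home page, index.html, in there.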
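If the copy contains many folders, a quick search can point you to the entry page. The helper below is a minimal sketch; find_index is a name I made up, and it simply lists every index.html under the folder you give it (or the current folder if you give none).

```shell
# find_index [DIR]: list all index.html files under DIR (default: current folder).
find_index() {
    find "${1:-.}" -type f -name 'index.html'
}
```

Run it from the download folder and open the result that sits directly inside the site's own folder in your browser.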
You'll notice that it is an identical copy – it preserves the link structure, pictures, code and other formatting. Remember that anytime you interact directly with any online resources owned by the 'target', there's a chance you'll leave your digital fingerprint behind.