Making a static copy of a website
Sometimes, it is necessary to retrieve a full static copy of a website (hopefully one you own). There are tools that help you do this such as httrack.
However, for Linux systems primarily, using the wget command can achieve the same result.
Assuming I want to mirror the https://forum.phalcon.io
site, the wget
command used is:
wget -E -F -k -K -l 100 -N -nH -p -r -v http://forum.phalcon.io/
The options used are:
-E
: rename html files to.html
(adjust extensions)-F
: Force reading inputs as HTML files-k
: Convert links to relative (local viewing)-K
: Backup converted files-l 100
: Recurse 100 levels deep (it should be enough)-N
: Time stamp on-nH
: Disable generation of host-prefixed directories.-p
: Download all assets for a page (css, js, images)-r
: Recursive (important)-v
: Verbose
If you are interested in mirroring a site, you are more than welcome to use the above wget
command, adjusting it to your needs.
NOTE: Please do not be that guy that tries to index a site without the owner knowing about it. It is not nice!
-
Nikolaos Dimopoulos
Boldly goes where no other coder has gone before.... and other ramblings
Recent Posts
-
Setting up Docker for Qubes OS
2024-10-05 -
PhpStorm cannot create scratch files
2023-12-07 -
PHP 8.2 Deprecation of Dynamic Properties
2023-07-18 -
New Look
2023-06-12 -
Linux Swap file in RAM
2023-04-17
Tag Cloud
-
amazon (3)
android (1)
angularjs (7)
apps (1)
aurora (1)
aws (1)
backup (2)
bash (1)
bitbucket (1)
blog (2)
books (1)
bootstrap (1)
buzz (1)
cPanel (1)
cache (1)
celebrations (4)
chromium (3)
chromium os (3)
cloud computing (3)
codacy (1)
codecov (1)
communications (1)
composer (1)
conversion (1)
copy (1)
degoogle (5)
design (1)
design patterns (3)
discord (1)
docker (1)
docs (3)
documentation (1)
ec2 (3)
emerge (1)
encoding (1)
factory (1)
froyo (1)
fujitsu (1)
gentoo (7)
git (3)
github (2)
gmail (3)
google (16)
google apps (4)
google maps (1)
gource (1)
ha (1)
hosting (2)
how to (36)
igbinary (1)
information (5)
input (1)
installation (6)
internet (1)
iphone (1)
json (2)
libreoffice (1)
linux (13)
localization (1)
lts (1)
mariadb (1)
memorial day (1)
metrics (1)
migration (1)
mod_rewrite (1)
mov (1)
mp4 (1)
mysql (6)
nas (1)
netlify (1)
new look (1)
nexus one (2)
nfs (1)
notebook (1)
online storage (1)
openoffice (1)
opinion (1)
oracle (1)
patterns (1)
payroll (1)
performance (3)
personal (9)
phalcon (12)
php (23)
php8 (2)
php82 (1)
phpstorm (1)
phpunit (2)
picasa (2)
portage (1)
privacy (1)
programming (9)
proxy (1)
qubes os (1)
rant (5)
rdbms (1)
rds (1)
relationships (1)
release (1)
remove (1)
replication (1)
review (9)
rsync (2)
s1300 (1)
scan (1)
scratch (1)
serialize (1)
series (9)
singleton (1)
sorting (1)
spaceship (1)
spam (1)
ssl (1)
static (1)
storage (6)
submodules (1)
subversion (2)
svn (1)
swap (1)
tdd (1)
technorati (1)
test driven development (1)
testability (1)
testing (2)
titles (1)
traits (1)
ua (1)
ubuntu (1)
update (6)
upgrade (1)
usa (2)
usort (1)
utf8 (1)
video (1)
visualization (1)
vps (1)
webm (1)
website (1)
wget (1)
zend framework (4)
zram (1)
zstd (1)