Mirror Remote Files With Perl and PHP
The index page of one of the sites I maintain dynamically includes a small HTML page sourced from another server. Sometimes the other server was slow to respond, which delayed the loading of my page. I decided that if the local index page used a mirrored copy of the remote HTML content, it would always load quickly. The trick would be to script and schedule a regular update of the local copy of the included file.
The Script
I show two mirroring techniques here, one in Perl and one in PHP.
A Google search turned up the perlipc documentation at http://www.perl.com/newdocs/pod/perlipc.html#Sockets_Client_Server_Communica where there was a very helpful example script for Internet TCP clients and servers. I added my host and file specifications in lines 8 and 10 of that script.
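For readers who want to see the shape of such a TCP client, here is a minimal sketch in Python rather than the Perl of the original script. The function names `build_request` and `fetch_raw`, and the host and path in the example, are assumptions made for illustration. It opens a socket, sends a bare HTTP/1.0 GET request, and returns everything the server sends back, including the status line and headers.

```python
import socket


def build_request(host, path):
    """Build a minimal HTTP/1.0 GET request as bytes."""
    return (
        "GET %s HTTP/1.0\r\n"
        "Host: %s\r\n"
        "\r\n" % (path, host)
    ).encode("ascii")


def fetch_raw(host, path, port=80, timeout=10):
    """Return the full response text, status line and headers included."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(host, path))
        while True:
            data = conn.recv(4096)
            if not data:  # HTTP/1.0: the server closes the connection when done
                break
            chunks.append(data)
    # latin-1 maps every byte to a character, so decoding never fails
    return b"".join(chunks).decode("latin-1")


# Hypothetical host and file, standing in for the two values edited in the script.
request = build_request("remote.example.com", "/included.html")
print(request)
```

Because the client talks to the socket directly, the response is not stripped of its headers the way a higher-level HTTP library would do it.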
The file I wanted to retrieve came back with a header from the remote server and contained an image reference to their logo. The address for the image file was, of course, on the remote server. The fetched file also had a non-HTML server prologue, so I added code to save only HTML lines (those containing "<") and to replace the remote path with a local "images" path (line 25). (NOTE: I show the scripts as images here because some characters, like the backslash "\", are not treated well in browsers and content managers.)
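The cleanup step described above can be sketched as follows. This is a Python illustration under stated assumptions, not the original Perl: the function name `localize_html`, the sample response, and the remote path are all hypothetical. It keeps only lines that contain "<" and rewrites the remote image path to a local "images" path.

```python
def localize_html(raw_text, remote_prefix, local_prefix="images/"):
    """Keep only HTML lines and point image references at a local path."""
    cleaned = []
    for line in raw_text.splitlines():
        # The server prologue (status line, headers, blank line) contains
        # no "<", so this test drops it and keeps the markup lines.
        if "<" not in line:
            continue
        cleaned.append(line.replace(remote_prefix, local_prefix))
    return "\n".join(cleaned) + "\n"


# Hypothetical response: a short prologue followed by the HTML body.
sample = (
    "HTTP/1.1 200 OK\n"
    "Content-Type: text/html\n"
    "\n"
    '<p><img src="http://remote.example.com/graphics/logo.gif"></p>\n'
    "<p>Headline text</p>\n"
)
result = localize_html(sample, "http://remote.example.com/graphics/")
print(result)
```

The line-based test is deliberately crude: a line of body text with no tag on it would also be dropped, so this approach only suits small pages whose every line carries markup.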
Automation
The next step was to schedule the update using cron on the local server. I created a cron specification file with:
echo "0,20,40 * * * * perl /home/sites/site64/cgi-bin/mir.pl" > /home/sites/site64/cgi-bin/mirsched
and then installed the "mirsched" file as my crontab (note that this replaces any crontab entries the user already has):
crontab /home/sites/site64/cgi-bin/mirsched
Lo and behold, "lthead.html" is captured, cleaned and localized three times an hour.
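To see why that entry fires three times an hour, it can help to expand the minute field by hand. The Python sketch below does this with a hypothetical helper, `expand_minutes`. It handles only `*`, comma lists, ranges and `/step` suffixes, and is not a full crontab parser.

```python
def expand_minutes(field):
    """Expand a crontab minute field into the sorted minutes it matches."""
    minutes = set()
    for part in field.split(","):
        # An optional "/step" suffix thins out the range.
        rng, _, step = part.partition("/")
        step = int(step) if step else 1
        if rng == "*":
            start, end = 0, 59
        elif "-" in rng:
            low, high = rng.split("-")
            start, end = int(low), int(high)
        else:
            start = end = int(rng)
        minutes.update(range(start, end + 1, step))
    return sorted(minutes)


print(expand_minutes("0,20,40"))  # the schedule used for mir.pl
print(expand_minutes("*/20"))     # step form, on cron versions that support it
```

Both fields expand to minutes 0, 20 and 40, so either spelling runs the mirror script every twenty minutes.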
Here’s another mirror script that grabs the daily bikini image from http://www.thedaily.com/bikini.html/
This is a very adaptable script; yet another version hunts down Yahoo!'s "top stories" headlines.
Here's yet another version, this time in PHP, that saves the top 10 Yahoo Top Stories headlines into a local file. You can then pull the file into a web page with an SSI include statement.
Install the script in your web /cgi-bin/ directory, then schedule it by running the "crontab -e" command and entering the line:
00 * * * * php /var/www/cgi-bin/mir-yahoots-file.php
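The idea of saving the first ten headlines to an includable file can be sketched like this. It is a Python illustration, not the PHP script itself, and it rests on a loud assumption: the real Yahoo markup is not shown here, so the sketch simply treats every link in the page as a headline. The names `LinkCollector`, `headlines_fragment` and `save_fragment` are hypothetical.

```python
from html import escape
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every anchor in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._text).strip()
            if text:
                self.links.append((self._href, text))
            self._href = None


def headlines_fragment(page_html, limit=10):
    """Return an HTML list of the first `limit` links found in page_html."""
    collector = LinkCollector()
    collector.feed(page_html)
    items = [
        '<li><a href="%s">%s</a></li>' % (escape(href), escape(text))
        for href, text in collector.links[:limit]
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>\n"


def save_fragment(fragment, path):
    """Write the fragment to the file that the web page includes."""
    with open(path, "w", encoding="utf-8") as out:
        out.write(fragment)


# Hypothetical page with 12 story links; only the first 10 are kept.
page = "".join(
    '<div><a href="/story%d.html">Story %d</a></div>' % (n, n)
    for n in range(1, 13)
)
fragment = headlines_fragment(page)
print(fragment)
```

A real page would need a narrower selection rule than "every link", for example matching a class or URL pattern specific to the headline block.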