I also found some code to build a site outline, as mentioned yesterday. I’m using WWW::SimpleRobot, and it took only a few small tweaks to the example to get what I needed. What I’m really after is a spider I can point at a site and have it show me all the URLs it can find, so I can compare them against the site’s files (on my local filesystem) and see what doesn’t get spidered. In effect, it’s a search engine robot simulator.

(Note: WWW::SimpleRobot does not respect the robots.txt file, so use it with care.)
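The comparison step could look something like the sketch below. It is written in Python rather than Perl and does not use WWW::SimpleRobot at all: it assumes you already have the list of URLs the spider reported. The function name `unspidered_files`, the `doc_root` argument, and the `index.html` default are all made up for illustration, and the sketch assumes that URL paths map directly onto files under the document root.

```python
import os
from urllib.parse import urlparse, unquote


def unspidered_files(crawled_urls, doc_root, index_name="index.html"):
    """Return site-relative paths of local files that no crawled URL maps to.

    Assumes URL paths correspond one-to-one to files under doc_root, and
    that a URL ending in "/" is served from index_name in that directory.
    """
    # Reduce each crawled URL to a site-relative file path.
    seen = set()
    for url in crawled_urls:
        path = unquote(urlparse(url).path).lstrip("/")
        if path == "" or path.endswith("/"):
            path += index_name
        seen.add(path)

    # Collect every file under the local document root, using "/" separators.
    local = set()
    for dirpath, _dirnames, filenames in os.walk(doc_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            local.add(os.path.relpath(full, doc_root).replace(os.sep, "/"))

    # Whatever exists locally but was never reached by the spider.
    return sorted(local - seen)
```

Calling it with the spider's output and the path to the local copy of the site returns a sorted list of files the spider never reached. Query strings and fragments are ignored, so dynamic pages that share one script file count as a single hit.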


Jun 24, 2003 11:36 am
