PhD Masterclass - How to Build a Web Crawler
Revision as of 18:23, 28 January 2011
This page provides resources for the PhD Masterclass "How to Build a Web Crawler", which I gave on Friday 28th January 2011 to interested PhD students at Haas.
Tools
- Perl - Available for Windows from ActiveState as ActivePerl, which bundles a large set of useful modules
- Komodo - An integrated development environment for Perl available from ActiveState
- Textpad - A powerful shareware text editor that supports regular expressions
You should download a trial of Komodo to help you learn. The trial is valid for 21 days (longer if you keep changing your system clock). Komodo will let you step through your code, line by line, and see the values that your variables take on.
Perl is a free and open language, with a rich history, so you will find a wealth of information on the web to help you learn and use it.
Modules
One of the joys of Perl is CPAN, the Comprehensive Perl Archive Network, which acts as a repository for Perl modules (as well as scripts, distributions and much else). It holds modules written by people from all over the world for almost every conceivable purpose. There is usually no need to reinvent the wheel in Perl - just grab a module (e.g. Wheel::Base)!
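To give a feel for what "grabbing a module" looks like in a crawler, here is a minimal sketch. It assumes HTTP::Tiny, which ships with Perl 5.14 and later and is on CPAN for older versions; this is my choice for illustration and not necessarily the module used in the masterclass. The extract_links helper and the command-line URL are likewise hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;   # assumption: core since Perl 5.14, otherwise install from CPAN

# Return every href value found in a chunk of HTML.
# A regular expression is enough for a sketch; a real crawler
# would more likely use an HTML parsing module from CPAN.
sub extract_links {
    my ($html) = @_;
    my @links;
    while ($html =~ /<a\b[^>]*?\bhref\s*=\s*["']([^"']+)["']/gi) {
        push @links, $1;
    }
    return @links;
}

# Fetch a page only when a URL is given on the command line.
if (my $url = shift @ARGV) {
    my $response = HTTP::Tiny->new(timeout => 10)->get($url);
    die "Fetch failed: $response->{status} $response->{reason}\n"
        unless $response->{success};
    print "$_\n" for extract_links($response->{content});
}
```

Saved under a name of your choosing and run with a URL as its only argument, the script prints one link per line. The module does all the HTTP work, so the script itself only has to decide what to do with the page content.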