PhD Masterclass - How to Build a Web Crawler
Revision as of 18:23, 28 January 2011
This page provides resources for the PhD Masterclass "How to Build a Web Crawler", which I gave on Friday 28th January 2011 to interested PhD students at Haas.
==Tools==
*[http://www.perl.org/ Perl] - Available with a large set of useful modules for Windows from ActiveState as [http://www.activestate.com/activeperl ActivePerl]
*[http://www.activestate.com/komodo-ide Komodo] - An integrated development environment for Perl available from ActiveState
*[http://www.textpad.com/ TextPad] - A powerful shareware text editor that supports [http://en.wikipedia.org/wiki/Regular_expression regular expressions]
You should [http://www.activestate.com/komodo-ide/downloads download a trial of Komodo] to help you learn. The trial is valid for 21 days (longer if you keep changing your system clock). Komodo will let you step through your code, line by line, and see the values that your variables take on.
Perl is a free and open language, with a rich history, so you will find a wealth of information on the web to help you learn and use it.
==Modules==
One of the joys of Perl is [http://www.cpan.org/ CPAN - The Comprehensive Perl Archive Network], which acts as a repository for Perl modules (as well as scripts, distros and much else). There are modules written by people from all over the world for almost every conceivable purpose. There is usually no need to reinvent the wheel in Perl - just grab a module (e.g. Wheel::Base)!
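As a rough illustration of what "grabbing a module" looks like in a crawler, the sketch below fetches pages with HTTP::Tiny instead of hand-written socket code. Both choices are assumptions made for this example and are not named in the masterclass material: HTTP::Tiny was picked because it ships with Perl itself (from version 5.14), and <code>fetch_page</code> is a hypothetical helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;    # assumption: chosen for this sketch; core module since Perl 5.14

# fetch_page is a hypothetical helper: it returns the page body,
# or undef (with a warning) if the request did not succeed.
sub fetch_page {
    my ($url) = @_;
    my $response = HTTP::Tiny->new( timeout => 10 )->get($url);
    unless ( $response->{success} ) {
        warn "Could not fetch $url: $response->{status} $response->{reason}\n";
        return;
    }
    return $response->{content};
}

# Fetch every URL given on the command line and report its size.
for my $url (@ARGV) {
    my $html = fetch_page($url);
    next unless defined $html;
    printf "%s: %d bytes\n", $url, length $html;
}
```

Saved under a name of your choosing (say <code>fetch.pl</code>), it could be run as <code>perl fetch.pl http://www.cpan.org/</code>. A real crawler would still need to extract links, respect robots.txt and pause between requests, and there are CPAN modules that may help with each of those.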