NHL

From edegan.com
Revision as of 14:41, 14 March 2016 by imported>Sahil

Old Material

Downloading PostgreSQL on Mac

Download package from:

http://www.enterprisedb.com/products-services-training/pgdownload#osx

Follow the instructions given on the website. Macs already come with Perl. Using the StackBuilder application, which is downloaded through the same link, install the PL/Perl package.

Variables

List of necessary variables and where to find them in the Dropbox.

For all skaters we need:

NHLIDDetails.txt (likely a file we generate)
 ID (int) 
 Playername from NHL, Playername from CapGeek, Playername from GeneralFanager
 DOB (transform to ISO8601)
NHLHistoric_Player_summary.txt & NHLPlayer_summary.txt (the historic data set includes the NHL Player summary data except for two games of the 2013-2014 season)
 Playername
 Current Team (string)
 Position (F, D) 
 season (YYYY) 
 goals (int) 
 TOI (float)
NHLPlayer_points.txt
 Playername
 DOB
 PPG (float)
NHLPlayer_bios.txt
 playername
 dob 
 game type (overtime or no overtime)
 weight (int)
 height (int)
 age (int) - calculated from DOB
NHLPlayer_faceOffPercentageAll.txt
 playername
 face-off wins (int) 
Capgeek_10_processed-notepad.txt
 playername
 dob
 salary (int)
 length (int)
 contract start date (MM/DD/YYYY)
 contract type (EL, RFA, UFA, TFP)
 caphit (int)
 
In a separate Table:
 Year and CPI (2010 Base Year)
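
The normalization steps implied by the variable list (DOB to ISO 8601, age from DOB, salary against a 2010-base CPI) could be sketched in Perl roughly as follows. This is only an illustration: it assumes dates arrive as MM/DD/YYYY (the format listed for contract start date; DOB may be stored differently in the source files) and that the CPI table is scaled so the base year equals 100. The function names and sample values are hypothetical.

```perl
use strict;
use warnings;

# Convert a MM/DD/YYYY date string to ISO 8601 (YYYY-MM-DD).
# Returns undef if the input does not match the assumed format.
sub to_iso8601 {
    my ($date) = @_;
    return unless defined $date;
    my ($m, $d, $y) = $date =~ m{^\s*(\d{1,2})/(\d{1,2})/(\d{4})\s*$}
        or return;
    return sprintf('%04d-%02d-%02d', $y, $m, $d);
}

# Age in whole years on a reference date; both arguments are ISO 8601 strings.
sub age_on {
    my ($dob_iso, $ref_iso) = @_;
    my ($by, $bm, $bd) = split /-/, $dob_iso;
    my ($ry, $rm, $rd) = split /-/, $ref_iso;
    my $age = $ry - $by;
    # Subtract a year if the birthday has not yet occurred in the reference year.
    $age-- if $rm < $bm || ($rm == $bm && $rd < $bd);
    return $age;
}

# Express a nominal salary in base-year dollars, given the CPI for the
# contract year. The base-year CPI defaults to 100 (an assumption).
sub real_salary {
    my ($nominal, $cpi_year, $cpi_base) = @_;
    $cpi_base = 100 unless defined $cpi_base;
    return $nominal * $cpi_base / $cpi_year;
}

# Hypothetical sample values, for illustration only.
my $dob = to_iso8601('8/7/1987');
my $age = age_on($dob, '2016-03-14');
print "DOB: $dob, age: $age\n";
```

Returning undef for unparseable dates lets the caller decide whether to skip the record or flag it for manual review.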

Next Tasks

Spec General Fanager!

General Fanager Webcrawler

The Perl libraries I used to create this web crawler are:

use strict;
use LWP::Simple;
use HTML::Tree;

The LWP::Simple library makes it easy to fetch the HTML from the website:

my $content = get($url);  # $url holds the page URL as a string

The URL used to access the General Fanager page containing data from all the players is http://www.generalfanager.com/players.
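
Because get from LWP::Simple returns undef when a request fails, it may be worth wrapping the fetch in a small check. The sketch below is one possible way to do that; the helper name fetch_page is hypothetical and not part of the original crawler.

```perl
use strict;
use warnings;
use LWP::Simple qw(get);

# URL taken from the text above.
my $url = 'http://www.generalfanager.com/players';

# Fetch a page and die with a readable message if the request fails.
# LWP::Simple::get returns undef on any failure.
sub fetch_page {
    my ($page_url) = @_;
    my $content = get($page_url);
    die "Could not fetch $page_url\n" unless defined $content;
    return $content;
}

# Usage (performs a network request):
# my $content = fetch_page($url);
# printf "Fetched %d characters\n", length $content;
```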

Now the HTML::Tree library allows us to parse the HTML code into a more accessible tree structure.

my $tree = HTML::Tree->new();
$tree->parse($content);
$tree->eof();  # signal end of input so the tree is finalized
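
The parsing step can be tried without touching the network by feeding in a small HTML string. The sketch below uses HTML::TreeBuilder directly, which is the class that HTML::Tree->new redirects to; the sample markup is a hypothetical stand-in for the real page content.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical stand-in for the page content returned by get().
my $content = '<html><head><title>Players</title></head>'
            . '<body><table><tr><td>Example Player</td></tr></table></body></html>';

my $tree = HTML::TreeBuilder->new();
$tree->parse($content);
$tree->eof();    # signal end of input so the tree is finalized

# In scalar context, look_down returns the first matching element (or undef).
my $title = $tree->look_down('_tag', 'title');
my $title_text = defined $title ? $title->as_text : '';
print "Page title: $title_text\n";

$tree->delete;   # free the tree when finished
```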

Now, with the HTML code parsed we can look down the tree to find what we are searching for.

$tree->look_down( '_tag', 'tag of what you are looking for here')

This call returns a list in which each element is the HTML subtree rooted at a matching tag. I used the table tag because it was the most specific tag above the player stats, and stored the results in the @tables variable. To access the data for each individual player, you must look inside the @tables variable.
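
As a rough illustration of looking inside @tables, the sketch below walks each table row by row and collects the text of every cell. The markup, column names, and values are hypothetical; the real General Fanager page may be laid out differently, so the tags used here would need to be checked against the actual HTML.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical markup; the real page's tables and columns may differ.
my $content = <<'HTML';
<html><body>
<table>
  <tr><th>Player</th><th>Cap Hit</th></tr>
  <tr><td>Player A</td><td>1000000</td></tr>
  <tr><td>Player B</td><td>2500000</td></tr>
</table>
</body></html>
HTML

# new_from_content parses the string and finalizes the tree in one call.
my $tree = HTML::TreeBuilder->new_from_content($content);

# In list context, look_down returns every matching element.
my @tables = $tree->look_down('_tag', 'table');

my @players;
for my $table (@tables) {
    for my $row ($table->look_down('_tag', 'tr')) {
        # Collect the text of each data cell; header rows have no td cells.
        my @cells = map { $_->as_text } $row->look_down('_tag', 'td');
        next unless @cells;
        push @players, \@cells;
    }
}

print join("\t", @$_), "\n" for @players;
$tree->delete;   # free the tree when finished
```

Each entry of @players is an array reference holding one row's cell text, which could then be written out as a tab-delimited line for loading into the database.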