Difference between revisions of "Parallel Enclosing Circle Algorithm"

McNair Project
Parallel Enclosing Circle Algorithm
Project Information
Project Title	Parallel Enclosing Circle Algorithm
Owner	Oliver Chang
Start Date	July 31, 2017
Deadline	October 4, 2017
Primary Billing
Notes
Has project status	Complete
Is dependent on	Enclosing Circle Algorithm
	Copyright © 2016 edegan.com. All Rights Reserved.

Revision as of 16:30, 3 November 2017

A thin wrapper around the enclosing circle algorithm that allows for instance-level parallelization. This project consists of the Python files in E:\McNair\Projects\OliverLovesCircles\src\python.

Parallelization is implemented via Python 2's subprocess.Popen(), which is non-blocking and available in the standard library.
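The non-blocking behavior can be illustrated with a minimal sketch. The child command here is a hypothetical stand-in for one run of the circle algorithm, not the project's actual invocation.

```python
import subprocess
import sys

# Hypothetical child command: stands in for one run of the circle algorithm.
command = [sys.executable, "-c", "print('done')"]

# Popen returns immediately; the child runs concurrently with this script.
process = subprocess.Popen(command, stdout=subprocess.PIPE)

# poll() returns None while the child is still running, else its exit code.
still_running = process.poll() is None

# communicate() blocks until the child exits and collects its output.
output, _ = process.communicate()
exit_code = process.returncode
```

Because Popen does not wait for the child, several instances can be started back to back and checked later with poll().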

The Problem

Note that this is not the classical enclosing circle algorithm. Rather, we seek to minimize the sum of enclosing circles, each of which contains at least n points. Thus, multiple circles are allowed, and a point may be included in more than one circle.
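As a rough illustration of this objective, the sketch below scores a candidate set of circles. It assumes that "sum" means summed area and that coordinates can be treated as planar; neither is confirmed on this page. The function names and the tolerance are hypothetical.

```python
import math

def points_inside(circle, points, tolerance=1e-9):
    """Return the points lying inside or on a (center_x, center_y, radius) circle."""
    cx, cy, r = circle
    return [p for p in points
            if math.hypot(p[0] - cx, p[1] - cy) <= r + tolerance]

def total_area(circles, points, min_points):
    """Sum the circle areas (assumed objective); reject circles with too few points."""
    for circle in circles:
        if len(points_inside(circle, points)) < min_points:
            raise ValueError("circle %r holds fewer than %d points"
                             % (circle, min_points))
    return sum(math.pi * r ** 2 for _, _, r in circles)
```

A point that falls inside two circles counts toward the minimum of both, which matches the statement that inclusion in multiple circles is possible.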

This algorithm has poor time performance, so we assume that a large set of points can be divided with k-means and the resulting subproblems solved independently. In other words, we make the simplifying assumption that the Enclosing Circle Algorithm has optimal substructure.
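The splitting step might look like the following sketch, which uses a plain Lloyd's k-means written from scratch rather than the project's actual clustering code. The function names, the fixed iteration count, and the rule for choosing k (number of points divided by the threshold, rounded up) are all assumptions.

```python
import math
import random

def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns k clusters as lists of points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = [points]
    for _ in range(iterations):
        # Assign every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: distance(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old center if a cluster emptied out
                n = float(len(cluster))
                centers[i] = (sum(x for x, _ in cluster) / n,
                              sum(y for _, y in cluster) / n)
    return clusters

def split_if_large(points, split_threshold):
    """Return [points] unchanged, or k-means pieces if above the threshold."""
    if len(points) <= split_threshold:
        return [points]
    k = int(math.ceil(len(points) / float(split_threshold)))  # assumed rule for k
    return [cluster for cluster in kmeans(points, k) if cluster]
```

Note that k-means does not bound cluster sizes, so a piece can still exceed the threshold; this sketch does not split recursively.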

Parameters

in circles.py:
- ITERATIONS: the number of iterations to attempt for each k when searching for the minimum for that k
- MIN_POINTS_PER_CIRCLE (AKA n): the minimum number of data points that must be included in a circle
in vc_circles.py:
- NUMBER_INSTANCES: the number of parallel instances to run; assumes no data races between instances
- SWEEP_CYCLE_SECONDS: the number of seconds between sweeps; each sweep removes completed jobs from the current job list and adds new jobs if any files are left to process
- TIMEOUT_MINUTES: the maximum running time of a parallel instance of the algorithm
- SPLIT_THRESHOLD: if a dataset has more data points than this threshold, it is split via k-means
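How NUMBER_INSTANCES, SWEEP_CYCLE_SECONDS, and TIMEOUT_MINUTES could interact is sketched below. This is not the code in vc_circles.py; the function name, the values chosen, and the return format are assumptions made for illustration.

```python
import subprocess
import sys
import time

NUMBER_INSTANCES = 2        # illustrative values, not the project's settings
SWEEP_CYCLE_SECONDS = 0.1
TIMEOUT_MINUTES = 1

def run_pool(commands):
    """Run commands with at most NUMBER_INSTANCES alive at once.

    Returns a dict mapping command index to exit code (None if timed out).
    """
    pending = list(enumerate(commands))
    running = {}   # index -> (process, start time)
    results = {}
    while pending or running:
        # Sweep: collect finished jobs and kill any that ran too long.
        for index, (process, started) in list(running.items()):
            if process.poll() is not None:
                results[index] = process.returncode
                del running[index]
            elif time.time() - started > TIMEOUT_MINUTES * 60:
                process.kill()
                process.wait()
                results[index] = None
                del running[index]
        # Refill free slots with jobs that are still waiting.
        while pending and len(running) < NUMBER_INSTANCES:
            index, command = pending.pop(0)
            running[index] = (subprocess.Popen(command), time.time())
        if pending or running:
            time.sleep(SWEEP_CYCLE_SECONDS)
    return results
```

For example, run_pool([[sys.executable, "-c", "pass"]] * 3) starts two children, waits one sweep cycle, and then starts the third as soon as a slot is free.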

Example Usage

$ python vc_circles.py --infile E:/McNair/Projects/OliverLovesCircles/CoLevelForCirclesNotRunGTE200.txt

where CoLevelForCirclesNotRunGTE200.txt is a tab-separated values file with the columns placestate, place, statecode, year, latitude, longitude, coname, datefirstinv, placens, geoid, city.
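A file with these columns could be read and grouped roughly as follows. The sketch assumes the file has a header row, that grouping uses the city, statecode, and year columns, and that it runs under Python 3 (under Python 2 the csv module expects byte strings). The function name is hypothetical.

```python
import csv
from collections import defaultdict

def group_points(tsv_file):
    """Group (latitude, longitude) pairs by (city, statecode, year).

    tsv_file is an open text file whose first line names the columns.
    """
    groups = defaultdict(list)
    reader = csv.DictReader(tsv_file, delimiter="\t")
    for row in reader:
        key = (row["city"], row["statecode"], row["year"])  # assumed grouping key
        groups[key].append((float(row["latitude"]), float(row["longitude"])))
    return dict(groups)
```

Each group would then be one instance of the problem, to be split further if it is too large.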

This command will populate data/ and reports/, overwriting any existing files. Filenames in these directories have the format {city}{sep}{state}{sep}{year}{sep}{num}.tsv, where num is a 0-indexed integer identifying one piece of a city/state/year infile that was split because it has more than SPLIT_THRESHOLD data points.
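The naming scheme can be expressed as a small helper. The separator is not given on this page, so the underscore default below is an assumption, as are the function name and the example city.

```python
import os

def split_filename(city, state, year, num, sep="_", directory="data"):
    """Build {city}{sep}{state}{sep}{year}{sep}{num}.tsv inside a directory.

    sep="_" is a placeholder; the project's real separator is not stated.
    """
    name = sep.join([city, state, str(year), str(num)]) + ".tsv"
    return os.path.join(directory, name)

# Illustrative call: first piece (num=0) of a split city/state/year file.
example = split_filename("Houston", "TX", 2005, 0)
```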

Bugs/Issues

"St. Paul" and "St. Louis" have un-enclosed points; we speculate that this is caused by file path issues
Some place/state/year combinations do not run to completion, regardless of how tractable the number of points is
How to merge small enclosing circles? Doing so would give a better measure of agglomeration in any case
How to separate outliers?
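Un-enclosed points such as those reported for "St. Paul" and "St. Louis" can be detected mechanically. The check below is a sketch that assumes circles are stored as (center_x, center_y, radius) tuples in the same planar coordinates as the points; the function name and tolerance are hypothetical.

```python
import math

def unenclosed_points(circles, points, tolerance=1e-9):
    """Return every point that lies outside all of the given circles."""
    return [(x, y) for x, y in points
            if not any(math.hypot(x - cx, y - cy) <= r + tolerance
                       for cx, cy, r in circles)]
```

An empty result means every point is covered by at least one circle.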

Related Pages

External Links

Git Repository

