We could normalize the number of clusters, dividing it by the maximum, to deal with the 'cities are different' problem. That is, we could put %unclustered (later called %complete) on the x-axis and %variance explained on the y-axis and fit a curve to a plot of city-year-layers. We could then pick a %unclustered value and apply it across cities. The difference between this and the 'heuristic method' is that we'd be choosing based on diminishing marginal returns in variance explained as opposed to in percentage locations in hulls.
Addendum: We could do the elbow method on a per city-year basis. The number of statistical clusters is equal to the number of layers, so we'd be indexing over layers, and selecting a layer, for each city-year. It might be worth trying this for some city-year, say Tulsa, 2003. The code would be reusable for a bigger sample. Estimate: 3hrs. ====The Elbow Heuristic Method Justification====
An attempt at a paragraph justifying the 'heuristic' method: