tecznotes

Michal Migurski's notebook, listening post, and soapbox.

Nov 28, 2012 11:37pm

gamma shapes for obama

Back in April, I bailed to Chicago for a week to volunteer with the Obama campaign tech team. It’s the team you’ve read all about, the one CTO’d by Harper Reed that blew the doors off the Romney campaign, and I was unbelievably lucky to spend one too-short week seven months ago working with the group right in their office.

Interestingly, what I ended up working on was a direct descendant of the precinct geometries that Marc Pfister mentioned in the comments of my old post:

We were manually fixing precinct data. We’d look at address-level registration data and rebuild precincts from blocks. A lot of it could probably have been done with something like Clustr and some autocorrelation. But basically you would look at a bunch of points and pick out the polygons they covered.

Clustr is a tool developed by Schuyler Erle and immortalized by Aaron Cope’s work on Flickr alpha shapes:

For every geotagged photo we store up to six Where On Earth (WOE) IDs. … Over time this got us wondering: If we plotted all the geotagged photos associated with a particular WOE ID, would we have enough data to generate a mostly accurate contour of that place? Not a perfect representation, perhaps, but something more fine-grained than a bounding box. It turns out we can.

Schuyler followed up on the alpha shapes project with beta shapes, a project used at SimpleGeo to correlate OSM-based geometry with point-based data from other sources like Foursquare or Flickr, generating neighborhood boundaries that match streets through a process of simple votes from users of social services. For campaign use, up-to-date boundaries for legislative precincts remain something of a holy grail. We would need an evolution of alpha- and beta-shapes that I’m going to go ahead and call “gamma shapes”.

Gamma shapes are defined by information from the Voter Information Project, which provides data from secretaries of state who define voting precincts prior to election day. While data providers like the U.S. Census provide clean shapefiles for nationwide districts, these typically lag the official definitions by many months, and those official definitions are in notable flux in the immediate lead-up to a national election. Everything was changing, and we needed a picture like this to accurately help field offices before November 6:

Unlike the shapes Flickr and SimpleGeo needed, precinct outlines are unambiguous and non-overlapping. It’s not enough to guess at a smoothed boundary or make a soft judgement call. As Anthea Watson-Strong and Paul Smith showed me, legislative precincts are defined in terms of address lists, and if your block is entirely within precinct A except for one house that’s in B, that’s an anomaly that must be accounted for in the map, such as this district with a weird, non-contiguous island:

The shapes above should look familiar to you if you’ve ever seen the Voronoi diagram, one of this year’s smoking-hot fashion algorithms.

(Fred Scharmen)

When you remove the clipping boundaries defined by TIGER/Line, you reveal a simple Voronoi tessellation of our original precinct-defining address points, each one with a surrounding cell of influence that shows where that precinct begins and ends. Voronoi allows us to define a continuous space based on distance to known points, and establishes cut lines that we can use to subdivide a TIGER-bounded block into two or more precincts. In most cases it’s not needed, because precinct boundaries are generally good about following obvious lines like roads or rivers. Sometimes, for example in cases of sparse road networks outside towns or demographic weirdness or even gerrymandering, the Voronoi cuts are necessary.
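
The "cell of influence" idea can be sketched in a few lines. This is not the campaign's code, only a minimal illustration: a location belongs to the Voronoi cell of whichever address point is nearest, so it inherits that point's precinct. The function name `nearest_precinct`, the sample coordinates, and the labels A and B are all made up for the example, and distances are plain Euclidean on already-projected coordinates.

```python
import math

def nearest_precinct(location, address_points):
    """Return the precinct label of the address point closest to location.

    address_points is a list of ((x, y), label) pairs. Picking the nearest
    point is the same as asking which Voronoi cell the location falls in.
    """
    best_label, best_dist = None, math.inf
    for (x, y), label in address_points:
        dist = math.hypot(location[0] - x, location[1] - y)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Hypothetical block: every house is in precinct A except one in B.
addresses = [
    ((0.0, 0.0), "A"),
    ((1.0, 0.0), "A"),
    ((2.0, 0.0), "B"),
    ((3.0, 0.0), "A"),
]

print(nearest_precinct((2.1, 0.2), addresses))  # near the lone B house
print(nearest_precinct((0.4, 0.5), addresses))  # surrounded by A houses
```

A real pipeline would build the cells as polygons and clip them to the block outline, but the nearest-point test is enough to see why a single stray house carves its own island out of a block.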

The points in some of the images below are meant to represent individual address points (houses), but for most states precincts are defined in terms of address ranges. For example, the address numbers 100 through 180 on the even side of a street might be one precinct, while numbers 182 through 198 are another. In these cases, I’m using TIGER/Line and PostGIS 2.0’s new curve offset features to fake a line of houses by simply offsetting the road in either direction and clipping it to a range.
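
As a rough illustration of faking houses from an address range, here is a sketch in plain Python rather than PostGIS. It assumes the road is a single straight segment in projected coordinates and that house numbers are spaced evenly between the range ends; the function `fake_houses` and all of its parameters are hypothetical names, not anything from the original project.

```python
import math

def fake_houses(start, end, range_lo, range_hi, numbers, offset):
    """Place synthetic house points beside a straight road segment.

    start and end are (x, y) road endpoints carrying address numbers
    range_lo and range_hi. Each number in numbers is interpolated along
    the road, then pushed sideways by offset (positive means left of the
    direction of travel, negative means right).
    """
    dx, dy = end[0] - start[0], end[1] - start[1]
    length = math.hypot(dx, dy)
    if length == 0 or range_hi == range_lo:
        raise ValueError("road segment and address range must be non-empty")
    # Unit normal pointing to the left of the start-to-end direction.
    nx, ny = -dy / length, dx / length
    houses = []
    for number in numbers:
        t = (number - range_lo) / (range_hi - range_lo)
        x = start[0] + t * dx + offset * nx
        y = start[1] + t * dy + offset * ny
        houses.append((number, (x, y)))
    return houses

# Even side of a made-up street numbered 100 to 198, split into two precincts.
first = fake_houses((0.0, 0.0), (100.0, 0.0), 100, 198, range(100, 182, 2), 10.0)
second = fake_houses((0.0, 0.0), (100.0, 0.0), 100, 198, range(182, 200, 2), 10.0)
print(len(first), len(second))
```

Restricting `numbers` to one precinct's sub-range plays the role of clipping the offset line, and flipping the sign of `offset` gives the odd side of the street.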

For a large city, such as this image of Milwaukee, there can be hundreds of precincts each just a few blocks in size:

In the end, the Voter Information Project source data ended up being of insufficient quality to support an accurate mapping operation, so this work will need to continue at some future date with fresh data. Precincts change constantly, so I would expect to revisit this as a process rather than an end-product. Sadly, much of the code behind this project is not mine to share, so I can’t simply open up a repository as I normally would. Next time!

Comments (1)

  1. "Sadly, much of the code behind this project is not mine to share, so I can’t simply open up a repository as I normally would. Next time!" That's a concern I had when working on the first campaign - what happens to this work after Obama is termed out? What if it is given, or god forbid *sold* to a campaign that we wouldn't normally support? That's the breaks, I guess. Thanks for the shout-out, I'm glad to see you got to work on this problem.

    Posted by Marc Pfister on Thursday, November 29 2012 8:35am PST
