You probably already have a favorite software or database system for data manipulation. No problem, as long as it gets your data where it needs to be. We do highly recommend using Google Docs in conjunction, for it does certain tasks very well. It’s also free, has versioning, and allows you to use our nifty Geocoding and GeoJSON Export scripts. Importing your data into a Google Spreadsheet is relatively painless. These tutorials will guide you through a few of the more advanced tools offered in Google Spreadsheets for pre-TileMill data preparation.
For demonstration purposes we will be using 2010 tornado data, originally found at the NOAA’s Storm Prediction Center.
A pivot table is a summarization tool for your data. It creates a new table that allows you to sort, count, total, or average the selected data. When preparing data for mapping, pivot tables are often used to aggregate a list of data to a specific geographical level, such as countries or states.
Take, for example, this tornado data. The raw data is a list of all the tornados that occurred in 2010 and their coordinates and other statistics. Mapping these points is relatively easy and results in an informative and straightforward map. But say we want to do some analysis on this data and map the total number of occurrences per state? Using the pivot table tool, we can quickly create a new table based on this calculation.
These steps serve only a basic example of how to use this tool. There are many more things you can accomplish with pivot tables and you may need to make adjustments based on your own data.
In order to import your CSV or GeoJSON into TileMill, it must have columns that specify latitude and longitude coordinates. Quite often your data will have location names, but not coordinates. So how do you get the coordinate of a point based on the name of a location? This exact process is called geocoding, and there is a script you can add to your google spreadsheet that will automate this process for you.
This tutorial will guide you through geocoding using this script. For further information regarding geocoding services see this article.
Geocoding is not always 100% accurate. There are times when the service will return a wrong coordinate or a blank, so it is important to double check these values. You might not catch any errors until viewing the data on a map. In these cases you may have to manually adjust the coordinates.
Once your data is ready, you will want to publish it as a CSV. TileMill can import a CSV file as long as it has the appropriate longitude and latitude columns. Publishing with Google Docs will generate a link that you can use to import your data into TileMill. Of course, you can also simply save your spreadsheet to CSV and load that file into TileMill, but publishing gives you the option to forgo a lot of hassle in the future if your data requires updating.
TileMill caches layer data for better performance, but this can sometimes cause updated data from the same source to not be immediately available. To rectify this we can manually tell TileMill to flush the cache.
Say you make an edit to the previously published spreadsheet. To see those changes immediately take effect in TileMill:
Click the edit button on the relative layer.
Click Flush cache underneath the datasource field.
Click Save. TileMill will now load the fresh data.
Add a CSV layer to your TileMill project.