In a previous post, I showed how to extract data from the Google Maps API, which leaves a series of JSON files, like this:
{"address_components": [{"long_name":"576","short_name":"576","types":["street_number"]}, {"long_name":"Concord Road","short_name":"Concord Road","types":["route"]},{"long_name":"Glen Mills","short_name":"Glen Mills","types":["locality","political"]},{"long_name":"PA","short_name":"PA","types":...
Ideally we want selections from these as a CSV, for manual review and for import into mapping software. First, we build a list of files:
import os
folder = 'G:\\maps\\churches\\db.tar\\db\\db'
files = []
for root, dirs, filenames in os.walk(folder):
    # os.walk yields bare filenames; join them with root so each entry is a full path
    files += [os.path.join(root, name) for name in filenames]
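If you prefer, the same list can be built with glob, also from the standard library. This is just an equivalent sketch, and it assumes Python 3.5+ for the recursive '**' pattern:
import glob
import os
# '**' with recursive=True descends into sub-folders; keep only actual files
files = [p for p in glob.glob(os.path.join(folder, '**'), recursive=True)
         if os.path.isfile(p)]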
Then we define a function that reads the contents of a file and parses it as JSON, returning None for anything that isn't valid JSON so those files can be filtered out:
import json
def js(fn):
    # parse a single file, returning None for anything that isn't valid JSON
    try:
        with open(fn, 'r') as f:
            return json.loads(f.read())
    except (OSError, ValueError):
        return None
text = [t for t in (js(f) for f in files) if t]
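Before deciding which fields to pull out, it can help to pretty-print one parsed record and look at its structure. A quick sketch (the index 0 simply picks an arbitrary record):
import json
# pretty-print one record to see which keys are available
print(json.dumps(text[0], indent=2, sort_keys=True))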
We can then pull out the rows we want as tuples – since JSON is a hierarchical format, you may need to drill down a few levels to reach the values you care about.
csv_rows = [(j['name'],
             j.get('website', ''),              # not every place has a website
             j['geometry']['location']['kb'],   # latitude/longitude keys as they appear in these dumps
             j['geometry']['location']['jb'],
             j['formatted_address']) for j in text]
Finally, we use the csv module from Python's standard library to write the file. Values in a CSV file are typically separated by commas and quoted where necessary, and the module lets you control both:
import csv
with open('maps.csv', 'w', newline='') as csvfile:
    mapswriter = csv.writer(csvfile, delimiter=',',
                            quotechar='"', quoting=csv.QUOTE_MINIMAL)
    for r in csv_rows:
        mapswriter.writerow(r)
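Mapping software is often happier when the CSV has a header row. If you want one, csv.DictWriter from the same module will write it for you. A minimal sketch, where the column names (name, website, lat, lng, address) are just my own labels for the tuple fields:
import csv
fieldnames = ['name', 'website', 'lat', 'lng', 'address']
with open('maps.csv', 'w', newline='') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames, quoting=csv.QUOTE_MINIMAL)
    writer.writeheader()
    for row in csv_rows:
        # each tuple is in the same order as fieldnames
        writer.writerow(dict(zip(fieldnames, row)))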
One of the great things about Python as a scripting language for data analysis is the breadth of its standard library – as you can see, everything above is handled by built-in modules (os, json and csv), with no additional libraries to download.