Retrieving OpenStreetMap Data in Python
![](https://patrickthiel.com/wp-content/uploads/2023/06/subway_osm-800x450.jpg)
OpenStreetMap (OSM) data can be a highly useful source of spatial information about your local community. Besides being broadly available, it is easy to obtain with Python. This article shows you how to download OSM data like the data mapped below. The figure shows restaurants in Berlin.
![](https://patrickthiel.com/wp-content/uploads/2022/11/map_berlin_restaurants_edit.png)
The goal is to collect data from various years, for various geographical locations, and also for different types of establishments. You can, of course, easily simplify the task by only picking one option.
First, let’s start by setting up our file by loading the necessary libraries. I also like to define global variables for paths leading to the file directories. So, let’s do this as well.
# importing libraries
import os
from os.path import join
import osmnx
import pandas as pd
# global path variables
main_path = "your_main_path"
data_path = join(main_path, "data")
os is needed for handling our directory paths. osmnx is the main library I use to retrieve the data from OSM. I recommend using a virtual environment to avoid any problems with the installation of the library. I typically set up a virtual environment in Anaconda and install osmnx via the channel conda-forge. Finally, pandas is required mainly for data manipulation and export.
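Before running anything, it can help to confirm that the active environment actually contains the libraries. The following is a minimal sketch using only the standard library; the helper name installed_version is my own and not part of osmnx or pandas.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# report what the currently active environment provides
for name in ("osmnx", "pandas"):
    found = installed_version(name)
    print(f"{name}: {found if found else 'not installed in this environment'}")
```

Knowing the osmnx version is useful because function names have changed between releases.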
The second step requires us to specify all the relevant parameters for the data we are interested in, including what kind of places we want to extract and for which geographical locations.
cities = ["Berlin, Germany", "Hamburg, Germany"]
places = ["restaurant", "bar"]
# Note: If you have a list of cities stored externally, you could read it in here and loop through those cities instead.
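If the cities live in an external file, reading them could look roughly like this. This is a sketch under the assumption that the file is plain text with one city per line; the function name read_cities and the file name cities.txt are hypothetical.

```python
from pathlib import Path


def read_cities(file_path):
    """Return the non-empty, stripped lines of a text file as a list of city names."""
    lines = Path(file_path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


# Example usage (assumes a hypothetical file "cities.txt" in the data folder):
# cities = read_cities(join(data_path, "cities.txt"))
```

Blank lines and surrounding whitespace are dropped, so the resulting list can be passed to the rest of the code unchanged.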
I like to store the different places previously defined in their own folder. This is completely optional. The following code sets up the folder structure automatically.
for p in places:
    # create the folder for this place type if it does not exist yet
    os.makedirs(join(data_path, p), exist_ok=True)
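The same folder setup can also be written with pathlib, which some people find easier to read. The sketch below is an alternative, not the approach used in the rest of this article; the function name make_place_folders is my own, and the demo writes into a temporary directory so it does not touch your real data path.

```python
from pathlib import Path
import tempfile


def make_place_folders(base_dir, place_names):
    """Create one subfolder per place type and return the folder paths."""
    created = []
    for name in place_names:
        folder = Path(base_dir) / name
        # parents=True also creates base_dir if needed; exist_ok=True makes reruns safe
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    # demo in a throwaway directory
    with tempfile.TemporaryDirectory() as tmp:
        for folder in make_place_folders(tmp, ["restaurant", "bar"]):
            print(folder.name, folder.is_dir())
```

Running it twice on the same directory does not raise an error, because existing folders are simply left in place.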
To obtain different years, we define a timestamp, i.e., the point in time for which the OSM snapshot is retrieved. This is done as follows:
# set timestamp
settings = '[out:json][timeout:180][date:"{year}-12-31T00:00:00Z"]'
The timestamp is set to December 31 at 00:00 UTC, i.e., the start of the last day of the year. Any other date would work as well. {year} is a placeholder that gets filled in when looping through multiple years. So, the last parameter to define is the time range.
years = ["2020", "2021"]
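To see what the placeholder does, here is a small sketch that fills the template for each year and prints the resulting settings strings. The helper name overpass_settings_for is my own; the template is the one defined above.

```python
SETTINGS_TEMPLATE = '[out:json][timeout:180][date:"{year}-12-31T00:00:00Z"]'


def overpass_settings_for(year):
    """Fill the year into the settings template; reject non-numeric years early."""
    int(year)  # raises ValueError for input such as "20x1"
    return SETTINGS_TEMPLATE.format(year=year)


for year in ["2020", "2021"]:
    print(overpass_settings_for(year))
# prints:
# [out:json][timeout:180][date:"2020-12-31T00:00:00Z"]
# [out:json][timeout:180][date:"2021-12-31T00:00:00Z"]
```

Each string differs only in the date, which is what makes every request return a different snapshot.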
Finally, we need a list to store the collected data.
# list to store the data
extracted_data = []
Having everything ready, we can loop through all options.
# loop through years and get snapshot at the time
for place in places:
    for city in cities:
        for year in years:
            # define tag
            tag = {"amenity": place}
            # set extraction year
            osmnx.settings.overpass_settings = settings.format(year=year)
            # extract data for tags and year
            # (features_from_place replaces geometries_from_place, which was removed
            # in OSMnx 2.0; on an older release, use geometries_from_place instead)
            tagged_data = osmnx.features_from_place(city, tags=tag)
            # add snapshot year
            tagged_data["snap_year"] = year
            # export data
            filename = str(place) + "_" + str(year) + "_" + str(city) + ".csv"
            path = join(data_path, place)
            tagged_data.to_csv(join(path, filename))
            # append list to store data
            extracted_data.append(tagged_data)
            # print to see where the code is at
            print(f"Extraction of {year} and {city} for {tag} completed")
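Once the loop has filled extracted_data, the individual snapshots can be combined into one table with pandas. The sketch below uses small hand-made DataFrames as stand-ins for the real results (which carry many more columns, including a geometry column), so the names and values are purely illustrative.

```python
import pandas as pd

# stand-ins for the frames collected in the loop (illustrative values only)
extracted_data = [
    pd.DataFrame({"name": ["A", "B"],
                  "amenity": ["restaurant", "restaurant"],
                  "snap_year": ["2020", "2020"]}),
    pd.DataFrame({"name": ["A"],
                  "amenity": ["restaurant"],
                  "snap_year": ["2021"]}),
]

# stack all snapshots; columns missing in one snapshot are filled with NaN
combined = pd.concat(extracted_data)

# count the number of places per type and snapshot year
counts = combined.groupby(["amenity", "snap_year"]).size()
print(counts)
```

With the real data, a comparison like this gives a quick sense of how the number of mapped places changes between snapshots.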
Note that I used amenities (like restaurants and bars) here. You could also use other keys (like leisure), depending on the OSM tag you are interested in. The OSM Wiki documents the available keys and values on its "Map features" page.
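To extend the loop to other keys, one option is to store key-value pairs instead of bare amenity names. The following is a rough sketch that only plans the extraction without contacting OSM; the helper names safe_name and extraction_plan are my own, and leisure=park serves as an example of a non-amenity tag. It also builds file names without commas or spaces, which differs from the naming used above.

```python
from itertools import product
import re


def safe_name(text):
    """Lowercase text and replace runs of other characters with underscores.

    Note: non-ASCII letters (e.g. umlauts) are replaced as well.
    """
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def extraction_plan(tag_pairs, cities, years):
    """Return one (tag, city, year, filename) tuple per combination."""
    plan = []
    for (key, value), city, year in product(tag_pairs, cities, years):
        tag = {key: value}
        filename = f"{safe_name(value)}_{year}_{safe_name(city)}.csv"
        plan.append((tag, city, year, filename))
    return plan


tag_pairs = [("amenity", "restaurant"), ("leisure", "park")]
plan = extraction_plan(tag_pairs, ["Berlin, Germany"], ["2020", "2021"])
for tag, city, year, filename in plan:
    print(tag, city, year, filename)
```

Each tuple contains everything the extraction loop needs, so the three nested loops could be replaced by a single loop over the plan.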
That’s it. Taken together, the code retrieves OSM data for the specified set of parameters. The complete code in one block can be found on my GitHub. There, you can also find the code for generating the map at the top.
Suggestions or questions? Leave a comment or contact me.