This tutorial will cover retrieving data from web APIs, loading data from text files, and constructing DataFrames.
pynoon_plus_2.ipynb
Let’s import the requests library, which provides a simple set of functions for making web requests.
Normally we’d have to install requests using pip, but Colab already has it installed.
import requests
We can use requests.get() to make an HTTP GET request, which is the standard method for requests that retrieve data.
Other methods include POST and PUT, which are commonly used for submitting new data or data updates to an API.
r = requests.get(
    'https://nominatim.openstreetmap.org/search',
    params={
        'q': '221B Baker Street, London',
        'format': 'jsonv2',
        'addressdetails': 1,
    },
    headers={'referer': 'pynoon demo'},
)
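To see what the params dictionary contributes to the request, the sketch below builds an equivalent query string with the standard library's urllib.parse.urlencode. This is an illustration only: requests does its own encoding when it sends the request, and the names base_url, query_string and full_url are ours rather than part of the tutorial.

```python
from urllib.parse import urlencode

base_url = 'https://nominatim.openstreetmap.org/search'
params = {
    'q': '221B Baker Street, London',
    'format': 'jsonv2',
    'addressdetails': 1,
}

# urlencode joins key=value pairs with '&' and escapes characters
# (such as spaces and commas) that cannot appear as-is in a query string.
query_string = urlencode(params)
full_url = f'{base_url}?{query_string}'
print(full_url)
```

After a real request, r.url shows the URL that requests actually sent, which is a handy way to check that the parameters were passed as intended.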
Let’s look at the response:
r
We can check the status of the response (200 means the
request was successful):
r.status_code
We can look at the data returned by the API:
r.text
requests can convert the JSON data to a structure of Python strings, numbers, lists and dictionaries:
data = r.json()
data
type(data)
data[0]['address']
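r.json() does roughly what the standard library's json.loads does with r.text. Here is a minimal sketch of that conversion. The sample_text string is made up for illustration and is not real output from the API. It only shows how JSON arrays and objects map to Python lists and dictionaries.

```python
import json

# Made-up sample for illustration only; real responses contain more fields.
sample_text = (
    '[{"name": "Example Place", '
    '"address": {"road": "Example Road", "city": "Exampleton"}}]'
)

sample_data = json.loads(sample_text)

print(type(sample_data))       # a JSON array becomes a list
print(type(sample_data[0]))    # a JSON object becomes a dict
print(sample_data[0]['address']['city'])
```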
What happens if we change our request to specify an invalid format?
r = requests.get(
    'https://nominatim.openstreetmap.org/search',
    params={
        'q': '221B Baker Street, London',
        'format': 'oops',
        'addressdetails': 1,
    },
    headers={'referer': 'pynoon demo'},
)
The text is not the JSON we expect:
r.text
And the status code of 400 indicates a failure -
specifically a “Bad Request”:
r.status_code
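If a status code is unfamiliar, the standard library's http.HTTPStatus enumeration can translate the number into its standard name. A small sketch follows. The describe_status helper is our own and is not part of requests.

```python
from http import HTTPStatus

def describe_status(code):
    """Return a string such as '400 Bad Request' for a standard HTTP status code."""
    status = HTTPStatus(code)  # raises ValueError for codes it does not recognise
    return f'{status.value} {status.phrase}'

print(describe_status(200))
print(describe_status(400))
```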
We can tell requests to raise an exception if the response has any non-successful status code:
r.raise_for_status()
r.raise_for_status() only raises an exception when the status code indicates a failure - if the request succeeds, it effectively does nothing.
We can handle the exception with a try/except statement. If any code in the try clause raises an exception, Python will stop executing the try block and execute the matching except block instead.
as ex saves the exception itself in a variable called ex so that we can get more details from it.
Any variable name would work, but ex is conventional.
try:
    r.raise_for_status()
    print('Request succeeded!')
except requests.HTTPError as ex:
    print(f'Failed request: {ex}')
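To trace this control flow without making a web request, here is a minimal sketch with a stand-in function. check_status is hypothetical: it only imitates the idea of raise_for_status() by raising an exception for codes from 400 to 599, and it uses a plain RuntimeError rather than requests.HTTPError.

```python
def check_status(status_code):
    """Stand-in for raise_for_status(): raise for 4xx and 5xx codes, otherwise do nothing."""
    if 400 <= status_code < 600:
        raise RuntimeError(f'status code {status_code} indicates a failure')

def describe_outcome(status_code):
    try:
        check_status(status_code)
        # Only reached when check_status did not raise.
        return 'Request succeeded!'
    except RuntimeError as ex:
        # ex holds the exception object, including its message.
        return f'Failed request: {ex}'

print(describe_outcome(200))
print(describe_outcome(400))
```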
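Let’s wrap these steps in a function. It calls sleep to pause for 1 second after each request, as the API we are using has a rate limit of 1 request per second.
If the request fails or no matching addresses are found, the function returns None.
from time import sleep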
def get_address_details(address):
    """
    Given a loosely-formatted address string, return a dictionary of standard address details.
    If the request fails or no matching addresses are found, return None.
    """
    r = requests.get(
        'https://nominatim.openstreetmap.org/search',
        params={
            'q': address,
            'format': 'jsonv2',
            'addressdetails': 1,
        },
        headers={'referer': 'pynoon demo'},
    )
    # Avoid hitting the API rate limit
    sleep(1)
    try:
        r.raise_for_status()
    except requests.HTTPError as ex:
        print(f'Failed request: {ex}')
        return None
    data = r.json()
    # Check for at least one matching address.
    if len(data) == 0:
        return None
    return data[0]['address']
get_address_details('221B Baker Street, London')
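The function above always sleeps for a full second, even when the previous request was made long ago. One possible refinement, offered here as an assumption rather than as part of the tutorial, is to wait only for the part of the interval that has not yet elapsed. The Throttle class and its names are hypothetical.

```python
import time

class Throttle:
    """Ensure at least min_interval seconds pass between calls to wait()."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self._last_call = None  # time of the previous wait() call, if any

    def wait(self):
        if self._last_call is not None:
            elapsed = time.monotonic() - self._last_call
            remaining = self.min_interval - elapsed
            if remaining > 0:
                time.sleep(remaining)
        self._last_call = time.monotonic()

throttle = Throttle(min_interval=1.0)
throttle.wait()  # the first call returns without sleeping
```

In get_address_details, calling throttle.wait() just before requests.get() could take the place of the fixed sleep(1).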