PyNoon Plus Lesson 2 - Tutorial

This tutorial will cover retrieving data from web APIs, loading data from text files, and constructing DataFrames.

Setup

  1. Make a new notebook for this lesson
  2. What’s the first thing to do? RENAME IT!
  3. Name it pynoon_plus_2.ipynb

Making Web Requests

Let’s import the requests library, which provides a simple set of functions for making web requests.

Normally we’d have to install requests using pip, but Colab already has it installed.

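import requests
r = requests.get(
    'https://nominatim.openstreetmap.org/search',
    params={
        'q': '221B Baker Street, London',
        'format': 'jsonv2',
        'addressdetails': 1,
    },
    # Nominatim's usage policy asks clients to identify themselves.
    # This User-Agent value is a placeholder: substitute your own.
    headers={'User-Agent': 'pynoon-plus-tutorial'},
    # Give up after 10 seconds instead of waiting indefinitely.
    timeout=10,
)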
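To see what the params dictionary does, here is a minimal sketch that builds the same kind of query string with the standard library's urlencode function. It illustrates the encoding only and is not necessarily the exact code path requests uses internally; the helper name build_search_url is made up for this example.

```python
from urllib.parse import urlencode

BASE_URL = 'https://nominatim.openstreetmap.org/search'

def build_search_url(address):
    """Return the search URL with the parameters encoded into the query string."""
    params = {
        'q': address,
        'format': 'jsonv2',
        'addressdetails': 1,
    }
    # urlencode escapes spaces, commas and other special characters.
    return f'{BASE_URL}?{urlencode(params)}'

url = build_search_url('221B Baker Street, London')
print(url)
```

After a real request, r.url shows the URL that requests actually sent.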

Let’s look at the response:

r

We can check the status of the response (200 means the request was successful):

r.status_code

We can look at the data returned by the API:

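r.text

The text is JSON, which the json() method parses into Python objects, in this case a list with one dictionary per matching address:

data = r.json()
data
type(data)
data[0]['address']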
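The shape of the parsed response can also be explored without calling the API. The sketch below uses a hand-written, heavily trimmed stand-in for r.text; its field values are illustrative and are not real API output.

```python
import json

# Hand-written stand-in for r.text: NOT real API output, just the same shape
# (a JSON array of objects, each with an 'address' object inside).
sample_text = '''
[
  {
    "display_name": "Example Place, Example Road, London",
    "address": {"road": "Example Road", "city": "London", "country": "United Kingdom"}
  }
]
'''

# Roughly what r.json() does with r.text.
data = json.loads(sample_text)
print(type(data))    # the outer JSON array becomes a Python list

first = data[0]
print(type(first))   # each JSON object becomes a Python dict
print(first['address']['city'])
```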

Handling Errors

What happens if we change our request to specify an invalid format?

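r = requests.get(
    'https://nominatim.openstreetmap.org/search',
    params={
        'q': '221B Baker Street, London',
        'format': 'oops',
        'addressdetails': 1,
    },
    # Placeholder User-Agent (substitute your own) and a 10-second timeout.
    headers={'User-Agent': 'pynoon-plus-tutorial'},
    timeout=10,
)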

The text is not the JSON we expect:

r.text

The status code of 400 indicates a failure, specifically a “Bad Request”:

r.status_code
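Status codes are grouped by their first digit: 2xx codes mean success, 4xx codes mean the client sent a bad request, and 5xx codes mean the server failed. As a small sketch, the standard library's HTTPStatus enum can translate a code into its standard phrase; the helper name describe_status is made up for this example.

```python
from http import HTTPStatus

def describe_status(code):
    """Return a short description such as '400 Bad Request (client error)'."""
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    if 200 <= code < 300:
        kind = 'success'
    elif 400 <= code < 500:
        kind = 'client error'
    elif 500 <= code < 600:
        kind = 'server error'
    else:
        kind = 'other'
    return f'{code} {status.phrase} ({kind})'

for code in (200, 400, 404, 500):
    print(describe_status(code))
```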

We can tell requests to raise an exception if the response has any non-successful status code:

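r.raise_for_status()

An uncaught exception stops the cell, so we can wrap the call in a try block to handle the failure ourselves:

try:
    r.raise_for_status()
    print('Request succeeded!')
except requests.HTTPError as ex:
    print(f'Failed request: {ex}')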

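Defining a Request Function

Let’s wrap the request and its error handling in a reusable function, so that looking up an address takes a single call: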

from time import sleep

def get_address_details(address):
    """
    Given a loosely-formatted address string, return a dictionary of standard address details.

    If the request fails or no matching addresses are found, return None.
    """
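    try:
        r = requests.get(
            'https://nominatim.openstreetmap.org/search',
            params={
                'q': address,
                'format': 'jsonv2',
                'addressdetails': 1,
            },
            # Placeholder User-Agent (substitute your own) and a 10-second timeout.
            headers={'User-Agent': 'pynoon-plus-tutorial'},
            timeout=10,
        )
        r.raise_for_status()
    except requests.RequestException as ex:
        # Covers connection errors and timeouts as well as HTTP error statuses.
        print(f'Failed request: {ex}')
        return None
    finally:
        # Avoid hitting the API rate limit (runs whether or not the request succeeded).
        sleep(1)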

    data = r.json()
    # Check for at least one matching address.
    if len(data) == 0:
        return None
    return data[0]['address']

get_address_details('221B Baker Street, London')
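A natural next step is to look up several addresses and collect the results as a list of dictionaries, a shape that is convenient for constructing a DataFrame. The following is a sketch under stated assumptions: lookup_all and fake_lookup are hypothetical names, and fake_lookup returns made-up data in place of get_address_details so that the example runs without network access.

```python
def lookup_all(addresses, lookup):
    """Call lookup on each address and collect the results that are not None."""
    rows = []
    for address in addresses:
        details = lookup(address)
        if details is None:
            # Skip failed requests and addresses with no match.
            continue
        # Keep the original query next to the details for later reference.
        rows.append({'query': address, **details})
    return rows

def fake_lookup(address):
    """Hypothetical stand-in for get_address_details; returns made-up data."""
    known = {'221B Baker Street, London': {'city': 'London'}}
    return known.get(address)

rows = lookup_all(['221B Baker Street, London', 'nowhere at all'], fake_lookup)
print(rows)
```

In the notebook, lookup_all(addresses, get_address_details) would call the real API instead, pausing for one second per address.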