PyNoon Starter Lesson 2 - Tutorial

This tutorial is based on:

DEMO-ONLY TUTORIAL BEGINS HERE

Types

In Python, every value has a type

The type of a value determines what operations we can perform with it

Checking the type of a value

We can use the type() function to find out the type of a value:

type(42)

An int or “integer” is a number without a decimal part.

type(42.0)

float is the name Python and other programming languages use for numbers with decimal parts (aka floating point numbers).

type('hello')

A str or “string” is a string of characters - a piece of text.
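As a small sketch, the three checks above can be run together. The comparisons against int and str at the end are an illustrative addition, not part of the original lesson; they show that the quotes decide whether a value is text or a number.

```python
# type() reports the type of whatever value it is given.
print(type(42))        # <class 'int'>
print(type(42.0))      # <class 'float'>
print(type('hello'))   # <class 'str'>

# The quotes matter: '42' is a piece of text, 42 is a number.
text_is_str = type('42') == str
number_is_int = type(42) == int
print(text_is_str, number_is_int)   # True True
```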

Operations vary by type

The type of a value determines what operations we can perform with it:

'hello' + 'world'

Note that “adding” strings joins them together, while we’ve seen that adding numbers performs arithmetic addition.

What if we try to subtract strings?

'hello' - 'h'

Python complains with a TypeError, telling us that the - operation is not supported for strings.

We can get the length of a string with the len() function:

len('hello')

But the length of an integer is not defined:

len(10)
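To read both error messages without stopping the notebook, one option is to catch the TypeError. This is only a sketch: try/except is not covered in this lesson, and the variable names subtract_message and len_message are made up for illustration.

```python
# Subtracting strings is not supported, so Python raises a TypeError.
try:
    'hello' - 'h'
except TypeError as error:
    subtract_message = str(error)

# Integers have no length, so len() also raises a TypeError.
try:
    len(10)
except TypeError as error:
    len_message = str(error)

print(subtract_message)   # unsupported operand type(s) for -: 'str' and 'str'
print(len_message)        # object of type 'int' has no len()
```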

Different types also support different variables and functions attached to the value itself:

'hello'.upper()

upper() is a function like print, type, and len, but it is attached to string values and takes no arguments inside its parentheses. A function attached to a value like this is called a method.
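A short sketch comparing a function call with a method call may help here. The variable names are illustrative, and hasattr() is a built-in that is not part of the original lesson; it is used only to show that integers have no upper() method.

```python
greeting = 'hello'

# A function takes the value as an argument...
length = len(greeting)

# ...while a method is called on the value itself, using a dot.
shouted = greeting.upper()

# upper() returns a new string; the original is unchanged.
print(length, shouted, greeting)   # 5 HELLO hello

# Which methods exist depends on the type: integers have no upper().
int_has_upper = hasattr(42, 'upper')
print(int_has_upper)   # False
```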

Operations with mixed types

Some operations do not work with mixed types:

1 + '1'

Functions are available to convert values from one type to another, like str() to convert values to strings:

str(1) + '1'
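Conversion works in the other direction too. The sketch below assumes we sometimes want arithmetic rather than joined text; int() and float() are built-in functions not mentioned in the original lesson.

```python
# Convert the number to text, then join the two strings.
joined = str(1) + '1'
print(joined)     # 11

# Convert the text to a number, then add arithmetically.
added = 1 + int('1')
print(added)      # 2

# float() converts text that contains a decimal part.
as_float = float('2.5')
print(as_float)   # 2.5
```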

Some operations do work with mixed types:

'<>' * 10

What do you think the result of this will be?

type(42 + 42.0)

Maths with both integers and floats results in a float. Note the decimal part after the resulting number:

42 + 42.0
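The two mixed-type examples above can be checked side by side. This is a minimal sketch with illustrative variable names, using 3 repeats instead of 10 to keep the output short.

```python
# Multiplying a string by an integer repeats the string.
repeated = '<>' * 3
print(repeated)      # <><><>

# Adding an int and a float gives a float.
total = 42 + 42.0
print(total)         # 84.0
print(type(total))   # <class 'float'>
```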

FOLLOW-ALONG TUTORIAL BEGINS HERE

Setup

Variables

A quick recap on variables for those who haven’t reached that part of futurecoder yet. Run the following:

height_cm = 180

The = symbol assigns the name on the left to refer to the value on the right.

Now we can use the variable anywhere we would have used the value:

height_cm
height_cm + 10

Note: The Python convention for naming variables is “snake case” (lowercase words separated by underscores).

Note: Variables are case-sensitive.

Variables are useful for giving meaningful names to values:

This is bad, because the meaning of 2.54 is unclear:

height_cm / 2.54

This is better:

cm_per_inch = 2.54
height_cm / cm_per_inch

Also note how helpful it is to include the units in the name of the variable.

Variables are also useful for storing the result of an operation for later use:

cm_per_inch = 2.54
height_inch = height_cm / cm_per_inch
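Putting the recap together, a stored result can be reused in later steps. In this sketch, the rounding step and the name height_inch_rounded are illustrative additions.

```python
height_cm = 180
cm_per_inch = 2.54

# Store the result of the conversion under a meaningful name.
height_inch = height_cm / cm_per_inch

# Reuse the stored result later without repeating the calculation.
height_inch_rounded = round(height_inch, 1)
print(height_inch_rounded)   # 70.9
```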

Indexing and Slicing

Indexing

We can use indexing to get particular characters from a string:

email = 'pynoon@example.com'
email

To get a new string containing only the first character of the email string:

email[0]

Note that Python starts counting indexes at 0, so index 1 is the second character:

email[1]

You can also specify negative indexes to count from the end of the string:

email[-1]
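As a sketch of how positive and negative indexes relate, the last character can be reached either from the end or by counting from the start with len(). The variable names are illustrative.

```python
email = 'pynoon@example.com'

first = email[0]     # 'p'
second = email[1]    # 'y'
last = email[-1]     # 'm'

# Index -1 refers to the same character as index len(email) - 1.
also_last = email[len(email) - 1]

print(first, second, last, also_last)   # p y m m
```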

Slicing

We can use slicing to get a new string that is a subset of the target string:

email[0:2]

Taking an index or slice does not change the contents of the original string, but returns a copy of part of the string:

email
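A few more slices, as a sketch, show two details worth knowing: the end index is not included in the result, and either index can be left out to mean "from the start" or "to the end". The variable names are illustrative.

```python
email = 'pynoon@example.com'

# The slice 0:2 includes indexes 0 and 1, but not 2.
first_two = email[0:2]
print(first_two)    # py

# Leaving out the start index means "from the beginning".
same_first_two = email[:2]
print(same_first_two)   # py

# Leaving out the end index means "to the end"; negative indexes also work.
last_three = email[-3:]
print(last_three)   # com

# The original string is unchanged by any of these.
print(email)        # pynoon@example.com
```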

Here’s a more concrete example: splitting an email address around its @ sign.

We can use the find() method supported by strings to get the index of the first @ sign:

at_index = email.find('@')
at_index

We can then use that index to slice the string:

email[at_index]
email[0:at_index]
email[at_index:len(email)]
email[(at_index + 1):len(email)]
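The same split can be written a little more compactly by leaving out the start and end of each slice. This is a sketch: the names username and domain are illustrative. Note that find() returns -1 when the text is not found, so this approach assumes the string really contains an @ sign.

```python
email = 'pynoon@example.com'

at_index = email.find('@')
print(at_index)   # 6

# Everything before the @ sign.
username = email[:at_index]

# Everything after the @ sign (the + 1 skips the @ itself).
domain = email[at_index + 1:]

print(username)   # pynoon
print(domain)     # example.com

# find() returns -1 when there is no match.
missing = 'no-at-sign'.find('@')
print(missing)    # -1
```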

DataFrames

More generally, indexing and slicing work on any type that is an ordered list of elements.
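For instance, a plain Python list supports the same indexing and slicing syntax as a string. This is a minimal sketch: lists are not otherwise covered in this lesson, and the prices values are made up for illustration.

```python
# Hypothetical nightly prices, for illustration only.
prices = [120, 85, 240, 60]

first_price = prices[0]      # indexing from the start
last_price = prices[-1]      # indexing from the end
middle_prices = prices[1:3]  # slicing: indexes 1 and 2, but not 3

print(first_price)     # 120
print(last_price)      # 60
print(middle_prices)   # [85, 240]
print(len(prices))     # 4
```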

Let’s try that out with the DataFrame type from the pandas library, which stores tabular data. We look at DataFrames in more detail in the PyNoon Data course.

To import the pandas library as the alias pd:

import pandas as pd

To load a CSV file into a DataFrame:

df = pd.read_csv('http://pynoon.github.io/data/inside_airbnb_listings_nz_2023_09.csv')

Note: Your notebook must not be inside a subfolder for this command to work.

Look at the contents of the DataFrame:

df

Check the type of the DataFrame:

type(df)

Get the first row of the DataFrame by indexing:

df.iloc[0]

Get the first three rows of the DataFrame by slicing:

df.iloc[0:3]

Check the length of the DataFrame:

len(df)

Remember, the type of a value determines how operations behave. Maths operations applied to a DataFrame are applied to every cell individually:

df * 5
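The effect is easier to see on a tiny table. The sketch below builds a small DataFrame by hand instead of loading the CSV; the column names and values are made up for illustration and are not taken from the Airbnb data.

```python
import pandas as pd

# Hypothetical data, for illustration only.
listings = pd.DataFrame({
    'price': [100, 150, 200],
    'minimum_nights': [1, 2, 3],
})

# Indexing and slicing rows by position with iloc.
first_row = listings.iloc[0]
first_two = listings.iloc[0:2]

# len() counts the rows.
row_count = len(listings)
print(row_count)   # 3

# Multiplying the DataFrame multiplies every cell individually.
scaled = listings * 5
print(scaled)
```

Here the price column of scaled holds 500, 750 and 1000, and listings itself is left unchanged.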