This tutorial is based on:
DEMO-ONLY TUTORIAL BEGINS HERE
In Python, every value has a type
The type of a value determines what operations we can perform with it
We can use the type() function to find out the type of a value:
type(42)
An int or “integer” is a number without a decimal part.
type(42.0)
float is the name Python and other programming languages use for numbers with decimal parts (aka floating point numbers).
type('hello')
A str or “string” is a sequence of characters - a piece of text.
The type of a value determines what operations we can perform with it:
'hello' + 'world'
Note that “adding” strings joins them together, while we’ve seen that adding numbers performs arithmetic addition.
What if we try to subtract strings?
'hello' - 'h'
Python complains with a TypeError, telling us that the - operation is not supported for strings.
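To read the error message without stopping the notebook, we can catch the TypeError. This is only an illustrative sketch and not part of the original tutorial: try/except and the replace() method are standard Python, and the variable names are our own.

```python
# Attempt the unsupported operation and capture the error message.
try:
    result = 'hello' - 'h'
except TypeError as error:
    message = str(error)

print(message)

# If the aim was to remove a character, the string method replace() can do it.
without_h = 'hello'.replace('h', '')
print(without_h)  # ello
```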
We can get the length of a string with the len() function:
len('hello')
But the length of an integer is not defined:
len(10)
Different types also support different variables and functions attached to the value itself:
'hello'.upper()
upper() is a function like print, type, and len, but it doesn’t take any arguments, and it is attached to string values. A function attached to a value like this is called a method.
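A small sketch of how methods behave, using the built-in string methods upper() and count(). The variable names are our own. Note that calling a method on a string gives back a new value and leaves the original alone.

```python
greeting = 'hello'

# upper() returns a new string; the original is left unchanged.
shouted = greeting.upper()
print(shouted)   # HELLO
print(greeting)  # hello

# Some methods do take arguments, for example count().
l_count = greeting.count('l')
print(l_count)  # 2
```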
Some operations do not work with mixed types:
1 + '1'
Functions are available to convert values from one type to another, like str() to convert values to strings:
str(1) + '1'
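Conversion works in the other direction too. The sketch below assumes the built-in int() and float() functions, which are not mentioned above, and shows how the chosen type changes what + does.

```python
# str() turns a number into text, so + joins the two strings.
joined = str(1) + '1'
print(joined)  # 11

# int() turns text into a whole number, so + performs arithmetic.
added = 1 + int('1')
print(added)  # 2

# float() converts text that contains a decimal part.
height_m = float('1.8')
print(height_m)  # 1.8
```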
Some operations do work with mixed types:
'<>' * 10
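Multiplying a string by an integer repeats the string that many times. As one possible use, the sketch below underlines a title; the variable names are hypothetical.

```python
title = 'Results'

# Repeat '=' once for every character in the title.
underline = '=' * len(title)
print(title)
print(underline)  # =======

# The integer can also come first.
repeated = 3 * 'ab'
print(repeated)  # ababab
```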
What do you think the result of this will be?
type(42 + 42.0)
Maths with both integers and floats results in a float.
Note the decimal part after the resulting number:
42 + 42.0
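To compare the cases side by side, here is a short sketch with our own variable names. It also shows that, in Python 3, the / operator gives a float even when both numbers are integers.

```python
int_total = 42 + 42      # both integers
mixed_total = 42 + 42.0  # an integer and a float
quotient = 10 / 2        # division of two integers

print(int_total, type(int_total))      # 84 <class 'int'>
print(mixed_total, type(mixed_total))  # 84.0 <class 'float'>
print(quotient, type(quotient))        # 5.0 <class 'float'>
```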
FOLLOW-ALONG TUTORIAL BEGINS HERE
pynoon_starter_2.ipynb
A quick recap on variables for those who haven’t reached that part of futurecoder yet. Run the following:
height_cm = 180
The = symbol assigns the name on the left to refer to the value on the right.
Now we can use the variable anywhere we would have used the value:
height_cm
height_cm + 10
Note: The Python convention for naming variables is to use “snake case” (lowercase words separated by underscores).
Note: Variables are case-sensitive.
Variables are useful for giving meaningful names to values:
This is bad:
height_cm / 2.54
This is better:
cm_per_inch = 2.54
height_cm / cm_per_inch
Also note how helpful it is to include the units in the name of the variable.
Variables are also useful for storing the result of an operation for later use:
cm_per_inch = 2.54
height_inch = height_cm / cm_per_inch
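A stored result can then feed into later steps. The sketch below repeats the assignments so it runs on its own, and assumes the built-in round() function to tidy the result; height_inch_rounded is our own name.

```python
height_cm = 180
cm_per_inch = 2.54

# Store the result of the conversion for later use.
height_inch = height_cm / cm_per_inch

# Reuse the stored value, rounding to one decimal place.
height_inch_rounded = round(height_inch, 1)
print(height_inch_rounded)  # 70.9
```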
We can use indexing to get particular characters from a string:
email = 'pynoon@example.com'
email
To get a new string containing only the first character of the email string:
email[0]
Note that Python starts counting indexes at 0, so index 1 is the second character:
email[1]
You can also specify negative indexes to count from the end of the string:
email[-1]
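The indexing rules can be collected into one short sketch. The variable names are our own; the last lines show that index -1 refers to the same character as index len(email) - 1.

```python
email = 'pynoon@example.com'

first = email[0]   # counting starts at 0
second = email[1]  # so index 1 is the second character
last = email[-1]   # negative indexes count from the end

# A negative index counts back from the end, so -1 matches len(email) - 1.
same_as_last = email[len(email) - 1]

print(first, second, last, same_as_last)  # p y m m
```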
We can use slicing to get a new string containing part of the target string. The slice runs from the first index up to, but not including, the second:
email[0:2]
Taking an index or slice does not change the contents of the original string, but returns a copy of part of the string:
email
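A few more slices, as a sketch with our own variable names. Leaving out the start or end index is standard Python shorthand for "from the beginning" or "to the end".

```python
email = 'pynoon@example.com'

first_two = email[0:2]      # characters at indexes 0 and 1
also_first_two = email[:2]  # an omitted start means "from the beginning"
last_three = email[-3:]     # an omitted end means "to the end"

print(first_two)   # py
print(last_three)  # com
print(email)       # pynoon@example.com (the original is unchanged)
```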
Here’s a more concrete example: splitting an email address around its @ sign.
We can use the find() method supported by strings to get the index of the first @ sign:
at_index = email.find('@')
at_index
We can then use that index to slice the string:
email[at_index]
email[0:at_index]
email[at_index:len(email)]
email[(at_index + 1):len(email)]
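Putting the pieces together, here is one way the split could be written. This is a sketch rather than the tutorial's own solution: username and domain are hypothetical names, and the check for -1 relies on find() returning -1 when the text is not found.

```python
email = 'pynoon@example.com'
at_index = email.find('@')

if at_index == -1:
    # find() returns -1 when there is no @ sign at all.
    username = email
    domain = ''
else:
    username = email[:at_index]    # everything before the @ sign
    domain = email[at_index + 1:]  # everything after the @ sign

print(username)  # pynoon
print(domain)    # example.com
```

Omitting the end index, as in email[at_index + 1:], gives the same result as writing len(email) explicitly.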
More generally, indexing and slicing work on any type that is an ordered list of elements.
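For instance, the same syntax works on Python's built-in list and tuple types. The values below are made up purely for illustration.

```python
# A list of hypothetical nightly prices.
prices = [120, 95, 210, 80]

first_price = prices[0]         # indexing, as with strings
last_price = prices[-1]         # negative indexes count from the end
first_two_prices = prices[0:2]  # slicing returns a new, shorter list

# A tuple holding a hypothetical latitude and longitude.
location = (-36.85, 174.76)
latitude = location[0]

print(first_price, last_price)  # 120 80
print(first_two_prices)         # [120, 95]
print(len(prices))              # 4
```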
Let’s try that out with the DataFrame type from the pandas library, which stores tabular data. We look at DataFrames in more detail in the PyNoon Data course.
To import the pandas library as the alias pd:
import pandas as pd
To load a CSV file into a DataFrame:
df = pd.read_csv('http://pynoon.github.io/data/inside_airbnb_listings_nz_2023_09.csv')
Note: Your notebook must not be inside a subfolder for this command to work.
Look at the contents of the DataFrame:
df
Check the type of the DataFrame:
type(df)
Get the first row of the DataFrame by indexing:
df.iloc[0]
Get the first three rows of the DataFrame by slicing:
df.iloc[0:3]
Check the length of the DataFrame:
len(df)
Remember, the type of a value determines how operations behave. Maths operations applied to a DataFrame are applied to every cell individually:
df * 5
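The same operations can be tried without downloading anything by building a tiny DataFrame by hand. This is a sketch under stated assumptions: the listings table and its values are hypothetical stand-ins, not taken from the Airbnb file, and only numeric columns are used to keep the multiplication simple.

```python
import pandas as pd

# Hypothetical stand-in data, not taken from the Airbnb listings file.
listings = pd.DataFrame({
    'price': [100, 150, 80],
    'bedrooms': [1, 2, 1],
})

first_row = listings.iloc[0]         # indexing gives a single row
first_two_rows = listings.iloc[0:2]  # slicing gives a smaller DataFrame
row_count = len(listings)            # number of rows

# Multiplication is applied to every cell individually.
times_five = listings * 5
print(times_five)
```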