This tutorial covers using AI models, loading data from text files, and constructing DataFrames.
pynoon_plus_1.ipynb
Normally we’d have to install transformers and its dependency torch using pip, but Colab already has these installed.
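Outside Colab, it can help to confirm that both packages are importable before running the rest of the notebook. The following is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical and not part of the lesson.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# An empty list means both packages are available.
print(missing_packages(['transformers', 'torch']))
```

If either name is listed, running pip install transformers torch should make it available.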
from transformers import pipeline
classifier = pipeline('zero-shot-classification', model='facebook/bart-large-mnli')
classifier(
    'one day I will see the world',
    candidate_labels=['travel', 'cooking', 'technology'],
)
travel is the best fit, which seems reasonable.
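The classifier returns a dictionary rather than a bare label. As a rough sketch of its shape, the example below uses a hand-written stand-in for the result. The keys sequence, labels and scores match what the zero-shot pipeline returns, with labels sorted from best to worst fit, but the score values here are made up for illustration.

```python
# Hand-written stand-in for a classifier result.
# The scores are illustrative, not real model output.
example_result = {
    'sequence': 'one day I will see the world',
    'labels': ['travel', 'technology', 'cooking'],
    'scores': [0.9, 0.06, 0.04],
}

# Labels are ordered by score, so the first label is the best fit.
best_label = example_result['labels'][0]
best_score = example_result['scores'][0]
print(best_label, best_score)
```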
def classify_text(text_to_classify):
    result = classifier(
        text_to_classify,
        candidate_labels=['travel', 'cooking', 'dancing'],
    )
    # Labels are sorted best-first, so return the first one.
    return result['labels'][0]
classify_text('one day I will see the world')
titles.txt
from: pynoon.github.io/curriculum/lesson_plus_1/titles.txt
Click the folder icon on the left side of the Colab interface, then right-click and select New file
Right-click and select Rename file to name it titles.txt
Double-click the file to open it, and enter the following content:
My weekend in Queenstown
When to plant tomatoes
Recommendations for 2024's best TVs
The fastest ever cookie recipe
Press Ctrl-s to save the file.
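As an alternative to creating the file by hand, the same content could be written from a notebook cell. This is only a sketch: it assumes the working directory is writable, and it overwrites any existing titles.txt.

```python
title_lines = [
    'My weekend in Queenstown',
    'When to plant tomatoes',
    "Recommendations for 2024's best TVs",
    'The fastest ever cookie recipe',
]

# 'w' mode creates the file, or replaces its content if it already exists.
with open('titles.txt', 'w') as titles_file:
    titles_file.write('\n'.join(title_lines) + '\n')
```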
open() to load the file
open() should be used with a with statement so that the file is automatically closed when we’re finished with it:
with open('titles.txt') as titles_file:
    titles = titles_file.readlines()
.readlines() has provided us with a list of strings representing each line in the file:
titles
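Note that each string returned by .readlines() keeps its trailing newline character. If that is unwanted, one option is to read the whole file and split it with .splitlines() instead. The sketch below writes a small throwaway file (the name example_titles.txt is hypothetical) so that it runs on its own:

```python
# Create a small throwaway file so the example is self-contained.
with open('example_titles.txt', 'w') as example_file:
    example_file.write('My weekend in Queenstown\nWhen to plant tomatoes\n')

# .readlines() keeps the '\n' at the end of each line.
with open('example_titles.txt') as example_file:
    raw_titles = example_file.readlines()

# .read().splitlines() returns the same lines without the newlines.
with open('example_titles.txt') as example_file:
    clean_titles = example_file.read().splitlines()

print(raw_titles)
print(clean_titles)
```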
We can use a list comprehension to transform each value in a list:
[classify_text(title) for title in titles]
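To see the pattern without loading the model, here is a small sketch that applies an ordinary function to each item of a list. The function shout is a hypothetical stand-in for classify_text.

```python
def shout(text):
    """Stand-in transformation: upper-case the text."""
    return text.upper()


words = ['travel', 'cooking', 'dancing']

# Equivalent to building a list with a for loop and .append().
shouted = [shout(word) for word in words]
print(shouted)  # ['TRAVEL', 'COOKING', 'DANCING']
```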
We can use a list comprehension to construct a list of dictionaries, where each dictionary contains the title and its label:
title_details = [
    {
        'title': title,
        'label': classify_text(title),
    }
    for title in titles
]
title_details
pd.DataFrame can be used to construct a DataFrame from a list of dictionaries like title_details.
import pandas as pd
title_df = pd.DataFrame(title_details)
title_df
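Each dictionary becomes a row and each key becomes a column. A minimal self-contained sketch, using hand-written labels in place of real classifier output (the labels are illustrative, not results from the model):

```python
import pandas as pd

# Hand-written rows; the labels are placeholders, not model output.
example_details = [
    {'title': 'My weekend in Queenstown', 'label': 'travel'},
    {'title': 'The fastest ever cookie recipe', 'label': 'cooking'},
]

example_df = pd.DataFrame(example_details)

print(list(example_df.columns))  # ['title', 'label']
print(example_df.shape)          # (2, 2)
```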