Pandas Exercises

TASK: Import pandas


TASK: Read in the bank.csv file that is located under the 01-Crash-Course-Pandas folder. Pay close attention to where the .csv file is located!


TASK: Display the first 5 rows of the dataset

   age          job  marital  education  default  balance  housing  loan   contact  day  month  duration  campaign  pdays  previous  poutcome   y
0   30   unemployed  married    primary       no     1787       no    no  cellular   19    oct        79         1     -1         0   unknown  no
1   33     services  married  secondary       no     4789      yes   yes  cellular   11    may       220         1    339         4   failure  no
2   35   management   single   tertiary       no     1350      yes    no  cellular   16    apr       185         1    330         1   failure  no
3   30   management  married   tertiary       no     1476      yes   yes   unknown    3    jun       199         4     -1         0   unknown  no
4   59  blue-collar  married  secondary       no        0      yes    no   unknown    5    may       226         1     -1         0   unknown  no

TASK: What is the average (mean) age of the people in the dataset?


TASK: What is the marital status of the youngest person in the dataset?
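
One possible approach, sketched on a small made-up DataFrame rather than on bank.csv: find the row label of the minimum age with idxmin(), then look up another column for that row with .loc. The name toy and its values are hypothetical and exist only for illustration.

```python
import pandas as pd

# Hypothetical toy data, not taken from bank.csv
toy = pd.DataFrame({
    "age": [41, 23, 35],
    "marital": ["married", "single", "divorced"],
})

# idxmin() returns the index label of the smallest value
# (the first one, if several rows tie for the minimum)
youngest_label = toy["age"].idxmin()

# .loc[row_label, column_name] looks up a single cell
youngest_status = toy.loc[youngest_label, "marital"]
print(youngest_status)
```

If several people share the minimum age, this sketch reports only the first of them.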



TASK: How many unique job categories are there?


TASK: How many people are there per job category? (Take a peek at the expected output)

management       969
blue-collar      946
technician       768
admin.           478
services         417
retired          230
self-employed    183
entrepreneur     168
unemployed       128
housemaid        112
student           84
unknown           38
Name: job, dtype: int64
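
As a minimal sketch of the two counting tasks above, assuming a made-up DataFrame called toy (hypothetical, not the bank data): nunique() counts the distinct categories and value_counts() tallies how often each one appears.

```python
import pandas as pd

# Hypothetical toy data, not taken from bank.csv
toy = pd.DataFrame({"job": ["management", "services", "management", "student"]})

# nunique() reports how many distinct values the column holds
n_jobs = toy["job"].nunique()

# value_counts() tallies each distinct value, most frequent first
job_counts = toy["job"].value_counts()

print(n_jobs)
print(job_counts)
```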

TASK: What percentage of people in the dataset are married?
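
One way to think about this, sketched on hypothetical toy data: comparing a column to a value gives a boolean Series, and the mean of a boolean Series is the fraction of True values. The DataFrame toy below is an assumption made for illustration only.

```python
import pandas as pd

# Hypothetical toy data, not taken from bank.csv
toy = pd.DataFrame({"marital": ["married", "single", "married", "divorced"]})

# The comparison yields True/False per row; True counts as 1 in mean()
is_married = toy["marital"] == "married"
pct_married = is_married.mean() * 100
print(pct_married)
```

Calling value_counts(normalize=True) on the column is another option that returns the fraction for every category at once.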


TASK: There is a column labeled “default”. Use pandas’ .map() method to create a new column called “default code” which contains a 0 if there was no default, or a 1 if there was a default. Then show the head of the dataframe with this new column.

   age          job  marital  education  default  balance  housing  loan   contact  day  month  duration  campaign  pdays  previous  poutcome   y  default code
0   30   unemployed  married    primary       no     1787       no    no  cellular   19    oct        79         1     -1         0   unknown  no             0
1   33     services  married  secondary       no     4789      yes   yes  cellular   11    may       220         1    339         4   failure  no             0
2   35   management   single   tertiary       no     1350      yes    no  cellular   16    apr       185         1    330         1   failure  no             0
3   30   management  married   tertiary       no     1476      yes   yes   unknown    3    jun       199         4     -1         0   unknown  no             0
4   59  blue-collar  married  secondary       no        0      yes    no   unknown    5    may       226         1     -1         0   unknown  no             0
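
A minimal sketch of the .map() idea, using a hypothetical three-row DataFrame named toy instead of the bank data: passing a dictionary to .map() replaces each value with its dictionary entry.

```python
import pandas as pd

# Hypothetical toy data, not taken from bank.csv
toy = pd.DataFrame({"default": ["no", "yes", "no"]})

# Each value in the column is looked up in the dictionary;
# values missing from the dictionary would become NaN
toy["default code"] = toy["default"].map({"no": 0, "yes": 1})
print(toy.head())
```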

TASK: Using pandas’ .apply() method, create a new column called “marital code”. This column should contain only the first letter of each marital status as a shortened code (for example, “m” for “married”, “s” for “single”, etc.). See if you can do this with a lambda expression. There are lots of ways to do this one!

   age          job  marital  education  default  balance  housing  loan   contact  day  month  duration  campaign  pdays  previous  poutcome   y  default code  marital code
0   30   unemployed  married    primary       no     1787       no    no  cellular   19    oct        79         1     -1         0   unknown  no             0             m
1   33     services  married  secondary       no     4789      yes   yes  cellular   11    may       220         1    339         4   failure  no             0             m
2   35   management   single   tertiary       no     1350      yes    no  cellular   16    apr       185         1    330         1   failure  no             0             s
3   30   management  married   tertiary       no     1476      yes   yes   unknown    3    jun       199         4     -1         0   unknown  no             0             m
4   59  blue-collar  married  secondary       no        0      yes    no   unknown    5    may       226         1     -1         0   unknown  no             0             m
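
One of the many possible approaches, sketched on hypothetical toy data: .apply() calls a function once per value, and a lambda that indexes the string at position 0 returns its first letter. The DataFrame toy is an illustrative assumption.

```python
import pandas as pd

# Hypothetical toy data, not taken from bank.csv
toy = pd.DataFrame({"marital": ["married", "single", "divorced"]})

# The lambda receives one string at a time and returns its first character
toy["marital code"] = toy["marital"].apply(lambda status: status[0])
print(toy)
```

The vectorized string accessor, toy["marital"].str[0], gives the same result without a lambda.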

TASK: What is the longest duration recorded in the dataset?


TASK: What is the most common education level for people who are unemployed?

secondary    68
tertiary     32
primary      26
unknown       2
Name: education, dtype: int64

TASK: What is the average (mean) age of the unemployed people in the dataset?
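
The last two tasks both start from a filtered subset. As a rough sketch on hypothetical toy data (the name toy and every value in it are made up), a boolean mask selects the matching rows, and any column of the result can then be counted or averaged.

```python
import pandas as pd

# Hypothetical toy data, not taken from bank.csv
toy = pd.DataFrame({
    "job": ["unemployed", "unemployed", "student", "unemployed"],
    "education": ["secondary", "primary", "secondary", "secondary"],
    "age": [30, 40, 20, 50],
})

# Keep only the rows where the condition is True
unemployed = toy[toy["job"] == "unemployed"]

# Count education levels within the subset; idxmax() names the top one
edu_counts = unemployed["education"].value_counts()
most_common = edu_counts.idxmax()

# Aggregate a numeric column of the same subset
mean_age = unemployed["age"].mean()

print(edu_counts)
print(most_common, mean_age)
```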
