Pandas
Pandas is a powerful data manipulation library for Python. It provides data structures and functions to efficiently manipulate large 2D tabular datasets.
Loading Data
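As a minimal sketch of loading tabular data, the snippet below reads CSV text into a DataFrame with `pd.read_csv`. The alien records and column names (`name`, `size`, `weight`, `lifespan`, `habitat`) are made up for illustration, and an in-memory string stands in for a file on disk so the example runs on its own.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
csv_text = """name,size,weight,lifespan,habitat
Zorg,small,120.0,80,desert
Blip,medium,350.5,150,ocean
Krax,large,900.0,300,forest
Mimi,medium,280.0,120,desert
"""

# pd.read_csv accepts a file path or any file-like object.
aliens = pd.read_csv(io.StringIO(csv_text))
print(aliens)
```

With a real file, you would pass its path to `pd.read_csv` instead of the `StringIO` object.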
Data Exploration
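A few calls are usually enough to get a first impression of a dataset. The sketch below assumes a small, made-up alien table and shows `head`, `info`, and `describe`.

```python
import pandas as pd

# Hypothetical example data; the values are invented for illustration.
aliens = pd.DataFrame({
    "name": ["Zorg", "Blip", "Krax", "Mimi"],
    "size": ["small", "medium", "large", "medium"],
    "weight": [120.0, 350.5, 900.0, 280.0],
    "lifespan": [80, 150, 300, 120],
    "habitat": ["desert", "ocean", "forest", "desert"],
})

first_rows = aliens.head(2)    # first two rows
summary = aliens.describe()    # count, mean, std, min, quartiles, max for numeric columns

print(first_rows)
aliens.info()                  # column names, dtypes, and non-null counts
print(summary)
```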
Handling Missing Values
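Missing values can be counted, dropped, or filled. The following is a sketch on invented data; filling weights with the column mean and habitats with `"unknown"` is only one possible choice, not a recommendation for every dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a missing weight and a missing habitat.
aliens = pd.DataFrame({
    "name": ["Zorg", "Blip", "Krax", "Mimi"],
    "weight": [120.0, np.nan, 900.0, 280.0],
    "habitat": ["desert", "ocean", None, "desert"],
})

# Count missing values per column.
missing_counts = aliens.isna().sum()

# Option 1: drop every row that has at least one missing value.
dropped = aliens.dropna()

# Option 2: fill missing values, using a different rule per column.
filled = aliens.fillna({
    "weight": aliens["weight"].mean(),
    "habitat": "unknown",
})

print(missing_counts)
print(filled)
```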
Checking Unique Values
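To see which distinct values a column holds, `unique`, `nunique`, and `value_counts` are the usual tools. The data below is made up for illustration.

```python
import pandas as pd

# Hypothetical example data.
aliens = pd.DataFrame({
    "name": ["Zorg", "Blip", "Krax", "Mimi"],
    "size": ["small", "medium", "large", "medium"],
})

sizes = aliens["size"].unique()              # distinct values, in order of appearance
n_sizes = aliens["size"].nunique()           # number of distinct values
size_counts = aliens["size"].value_counts()  # how often each value occurs

print(sizes)
print(n_sizes)
print(size_counts)
```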
Selecting Columns
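Selecting with a single column name returns a Series, while selecting with a list of names returns a DataFrame. A short sketch on invented data:

```python
import pandas as pd

# Hypothetical example data.
aliens = pd.DataFrame({
    "name": ["Zorg", "Blip", "Krax", "Mimi"],
    "size": ["small", "medium", "large", "medium"],
    "weight": [120.0, 350.5, 900.0, 280.0],
})

weights = aliens["weight"]           # one column -> Series
subset = aliens[["name", "weight"]]  # list of columns -> DataFrame

print(type(weights).__name__)
print(subset)
```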
Data Filtering
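Rows are typically filtered with a boolean mask. The sketch below uses made-up data and combines two conditions with `&`; each condition needs its own parentheses because `&` binds more tightly than comparison operators.

```python
import pandas as pd

# Hypothetical example data.
aliens = pd.DataFrame({
    "name": ["Zorg", "Blip", "Krax", "Mimi"],
    "lifespan": [80, 150, 300, 120],
    "habitat": ["desert", "ocean", "forest", "desert"],
})

# Single condition: a boolean Series used as a row mask.
desert = aliens[aliens["habitat"] == "desert"]

# Two conditions combined with & (and); use | for "or".
desert_long_lived = aliens[
    (aliens["habitat"] == "desert") & (aliens["lifespan"] > 100)
]

print(desert_long_lived)
```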
Exercise ☕📝
Find all medium-sized aliens that weigh more than 300 kg.
Creating New Columns
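A new column can be derived from existing ones either by assignment, which modifies the DataFrame in place, or with `assign`, which returns a copy. The column names `weight_tonnes` and `is_heavy` and the 300 kg threshold are hypothetical choices for this sketch.

```python
import pandas as pd

# Hypothetical example data; weight is assumed to be in kilograms.
aliens = pd.DataFrame({
    "name": ["Zorg", "Blip", "Krax", "Mimi"],
    "weight": [120.0, 350.5, 900.0, 280.0],
})

# Direct assignment adds the column to the existing DataFrame.
aliens["weight_tonnes"] = aliens["weight"] / 1000

# assign returns a new DataFrame and leaves the original unchanged.
with_flag = aliens.assign(is_heavy=aliens["weight"] > 300)

print(with_flag)
```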
Sorting Data
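Sorting is done with `sort_values`, which accepts one column or a list of columns. A minimal sketch on invented data:

```python
import pandas as pd

# Hypothetical example data.
aliens = pd.DataFrame({
    "name": ["Zorg", "Blip", "Krax", "Mimi"],
    "size": ["small", "medium", "large", "medium"],
    "weight": [120.0, 350.5, 900.0, 280.0],
})

# Heaviest aliens first.
by_weight = aliens.sort_values("weight", ascending=False)

# Sort by size (alphabetically), then by weight within each size.
by_size_then_weight = aliens.sort_values(["size", "weight"])

print(by_weight)
print(by_size_then_weight)
```

Both calls return a new DataFrame; the original row order of `aliens` is unchanged.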
Aggregate by Group
You can find a visualization of this operation here. (Note that the visualization page may take a few seconds to load.)
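Grouped aggregation splits the rows by a key column, applies a function to each group, and combines the results. The sketch below uses made-up data; the output column names `mean_lifespan` and `n_aliens` are hypothetical.

```python
import pandas as pd

# Hypothetical example data.
aliens = pd.DataFrame({
    "name": ["Zorg", "Blip", "Krax", "Mimi"],
    "lifespan": [80, 150, 300, 120],
    "habitat": ["desert", "ocean", "forest", "desert"],
})

# One aggregate: mean lifespan per habitat (returns a Series indexed by habitat).
mean_lifespan = aliens.groupby("habitat")["lifespan"].mean()

# Several aggregates at once, using named aggregation.
summary = aliens.groupby("habitat").agg(
    mean_lifespan=("lifespan", "mean"),
    n_aliens=("name", "count"),
)

print(mean_lifespan)
print(summary)
```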
Exercise ☕📝
Calculate the average weight of aliens by size.
Chain Operations
You can find a step-by-step visualization of this chained operation here. (Note that the visualization page may take a few seconds to load.)
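Because most pandas methods return a new object, several steps can be chained into one expression. The following sketch, on invented data, filters, derives a column, groups, and sorts; wrapping the chain in parentheses lets each step sit on its own line.

```python
import pandas as pd

# Hypothetical example data; weight is assumed to be in kilograms.
aliens = pd.DataFrame({
    "name": ["Zorg", "Blip", "Krax", "Mimi"],
    "size": ["small", "medium", "large", "medium"],
    "weight": [120.0, 350.5, 900.0, 280.0],
})

result = (
    aliens[aliens["weight"] > 200]                          # keep aliens over 200 kg
    .assign(weight_tonnes=lambda df: df["weight"] / 1000)   # derive a new column
    .groupby("size")["weight_tonnes"]                       # group by size
    .mean()                                                 # average per group
    .sort_values(ascending=False)                           # largest average first
)

print(result)
```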
Exercise ☕📝
For all large-sized aliens, what is the longest lifespan in each habitat?
Merging Data
There are many ways to merge datasets. This example covers only one commonly used type, the left merge (or left join). A left merge combines two datasets on a common column, keeping every row from the left dataset and adding columns from the right dataset where the key matches. Left rows with no match get missing values in the right-hand columns, and right rows with no match are dropped.
You can find a visualization of this merge operation here. (Note that the visualization page may take a few seconds to load.)
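A left merge can be sketched with `DataFrame.merge` and `how="left"`. Both tables below are made up for illustration, including the `habitats` table and its `avg_temp_c` column.

```python
import pandas as pd

# Hypothetical left table: one row per alien.
aliens = pd.DataFrame({
    "name": ["Zorg", "Blip", "Krax", "Mimi"],
    "habitat": ["desert", "ocean", "forest", "desert"],
})

# Hypothetical right table: one row per habitat.
# "forest" is missing here, and "swamp" has no matching alien.
habitats = pd.DataFrame({
    "habitat": ["desert", "ocean", "swamp"],
    "avg_temp_c": [40, 15, 25],
})

# Keep every alien; attach habitat details where the key matches.
merged = aliens.merge(habitats, on="habitat", how="left")

print(merged)
```

In the result, Krax keeps its row but has a missing `avg_temp_c`, and the unmatched "swamp" row from the right table does not appear.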