Basics of Pandas

Pandas is one of the most popular open-source libraries for the Python programming language. It is widely used for data science, data analysis, and machine learning applications.

Pandas is a Python library used for working with data sets. It has functions for analyzing, cleaning, exploring, and manipulating data.

The name "Pandas" refers to both "Panel Data" and "Python Data Analysis".

Plain Text
pip install pandas

Pandas Series:

A Pandas Series is like a column in a table. It is a one-dimensional array holding data of any type.

Python
import pandas as pd

product = [1, 7, 2]

# Convert the list to a Series; the default index is 0, 1, 2
product = pd.Series(product)
print(product)
Python
print(type(product))  # <class 'pandas.core.series.Series'>
Python
product = [1, 7, 2]
index = ["x", "y", "z"]

# Label each value with a custom index instead of 0, 1, 2
product = pd.Series(product, index=index)
print(product)
Python
# A dictionary's keys become the index labels of the Series
calories = {"day1": 420, "day2": 380, "day3": 390}
myvar = pd.Series(calories)
print(myvar)
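
Once a Series has labels, individual values can be looked up by label as well as by position. The following is a minimal sketch that reuses the `calories` data from above; the names `by_label`, `by_position`, and `subset` are illustrative and not part of pandas.

```python
import pandas as pd

calories = {"day1": 420, "day2": 380, "day3": 390}
myvar = pd.Series(calories)

# Look up a value by its index label
by_label = myvar["day2"]

# Look up a value by its integer position with .iloc
by_position = myvar.iloc[0]

# Passing index= together with a dictionary keeps only the listed keys
subset = pd.Series(calories, index=["day1", "day2"])

print(by_label)     # 380
print(by_position)  # 420
print(subset)
```

Label-based access is usually the more readable choice when the index carries meaning, as the day names do here.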

DataFrames:

A Pandas DataFrame is a two-dimensional data structure, like a two-dimensional array or a table with rows and columns.

A Series is like a column; a DataFrame is the whole table.

Python
data = {
  "calories": [420, 380, 390],
  "duration": [50, 40, 45]
}

# Load the data into a DataFrame object, using the day names as row labels
df = pd.DataFrame(data, index=["day1", "day2", "day3"])

print(df)
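
To see the column/table relationship concretely, a single column or a single row can be pulled out of the DataFrame, and each comes back as a Series. This is a minimal sketch using the same data; the variable names `calories_col` and `day2_row` are illustrative.

```python
import pandas as pd

data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45],
}
df = pd.DataFrame(data, index=["day1", "day2", "day3"])

# Selecting one column returns a Series indexed by the row labels
calories_col = df["calories"]

# .loc selects a row by its label; a single row is also returned as a Series
day2_row = df.loc["day2"]

print(type(calories_col))
print(day2_row)
print(df.shape)  # (3, 2): three rows, two columns
```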

Read CSV Files:

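Pandas reads CSV files with pd.read_csv, which returns a DataFrame. The example below builds the same kind of table directly from a nested dictionary, with one inner dictionary per column, keyed by row label:

Python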
data = {
  "Duration":{
    "0":60,
    "1":60,
    "2":60,
    "3":45,
    "4":45,
    "5":60
  },
  "Pulse":{
    "0":110,
    "1":117,
    "2":103,
    "3":109,
    "4":117,
    "5":102
  },
  "Maxpulse":{
    "0":130,
    "1":145,
    "2":135,
    "3":175,
    "4":148,
    "5":127
  },
  "Calories":{
    "0":409,
    "1":479,
    "2":340,
    "3":282,
    "4":406,
    "5":300
  }
}

df = pd.DataFrame(data)
print(df) 
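
The same table can also be loaded from CSV text with `pd.read_csv`. The sketch below assumes the data is stored as comma-separated text with a header row; an in-memory string stands in for a hypothetical file such as `data.csv`, and the names `csv_text` and `df_csv` are illustrative.

```python
import io

import pandas as pd

# Inline CSV text standing in for a hypothetical file such as "data.csv"
csv_text = """Duration,Pulse,Maxpulse,Calories
60,110,130,409
60,117,145,479
60,103,135,340
45,109,175,282
45,117,148,406
60,102,127,300
"""

# read_csv accepts a file path or any file-like object
df_csv = pd.read_csv(io.StringIO(csv_text))

print(df_csv)
print(df_csv.shape)
```

With a file on disk, the call would be `pd.read_csv("data.csv")`, where `data.csv` is a placeholder path. Unlike the dictionary version, whose row labels are the strings "0" to "5", `read_csv` assigns an integer index starting at 0 by default.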
