The `apply()` method in Pandas calls a function on every value in a column (a Series) or, with `axis=1`, on every row of a DataFrame, which is useful for transforming data.
import pandas as pd
# Function to double a number
def double(x):
    return 2 * x
# Create a DataFrame
df = pd.DataFrame({'column1': [1, 2, 3], 'column2': [4, 5, 6]})
# Apply the doubling function to 'column1'
df['column1'] = df['column1'].apply(double)
# Apply a lambda function to 'column2'
df['column2'] = df['column2'].apply(lambda x: 3 * x)
# Apply a function to each row
def custom_func(row):
    return row['column1'] * 1.5 + row['column2']
df['newColumn'] = df.apply(custom_func, axis=1)
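For simple arithmetic like the examples above, the same result can usually be obtained by operating on whole columns at once, which tends to be faster than `apply()` on large DataFrames. The following is a minimal sketch comparing the two approaches; the variable names (`doubled_apply`, `row_vectorized`, and so on) are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'column1': [1, 2, 3], 'column2': [4, 5, 6]})

# Element-wise apply() on a single column
doubled_apply = df['column1'].apply(lambda x: 2 * x)

# Vectorized equivalent: the arithmetic is applied to the whole column at once
doubled_vectorized = df['column1'] * 2

# Row-wise apply() versus the same calculation written on whole columns
row_apply = df.apply(lambda row: row['column1'] * 1.5 + row['column2'], axis=1)
row_vectorized = df['column1'] * 1.5 + df['column2']

# Series.equals() checks that both Series hold the same values and dtype
print(doubled_apply.equals(doubled_vectorized))
print(row_apply.equals(row_vectorized))
```

`apply()` remains the more flexible choice when the logic cannot be expressed as column arithmetic, for example when it involves branching or calls to arbitrary Python functions.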
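You can add new columns to a DataFrame by assigning values directly. The assigned value can be a list with one entry per row, a single value that is repeated for all rows, or the result of a calculation on existing columns. Assigning to a column name that already exists overwrites that column.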
import pandas as pd
# Create a DataFrame
df = pd.DataFrame({'oldColumn': [10, 20, 30]})
# Add a new column with specific values
df['newColumn'] = [1, 2, 3]
# Add a new column with the same value for all rows
df['newColumn'] = 1
# Create a new column by calculating from an existing column
df['newColumn'] = df['oldColumn'] * 5
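Direct assignment modifies the DataFrame in place. When the original should stay untouched, `DataFrame.assign()` is one alternative: it returns a new DataFrame with the extra columns. This is a small sketch; the column names `timesFive` and `constant` are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({'oldColumn': [10, 20, 30]})

# assign() returns a new DataFrame and leaves df unchanged
df_extended = df.assign(
    timesFive=df['oldColumn'] * 5,  # calculated from an existing column
    constant=1,                     # same value for every row
)

print(df.columns.tolist())
print(df_extended.columns.tolist())
```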
You can create a Pandas DataFrame from various data sources, including dictionaries, lists, and CSV files. Each method allows you to organize data into a table format.
import pandas as pd
# Create a DataFrame from a dictionary
data = {'name': ['Anthony', 'Maria'], 'age': [30, 28]}
df = pd.DataFrame(data)
# Create a DataFrame from a list of lists
data = [['Tom', 20], ['Jack', 30], ['Meera', 25]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
# Create a DataFrame by reading from a CSV file
df = pd.read_csv('students.csv')
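Another common input shape, not covered above, is a list of dictionaries with one dictionary per row, as often produced by JSON APIs. The sketch below reuses the names and ages from the dictionary example; the variable name `records` is an assumption for illustration.

```python
import pandas as pd

# Each dictionary is one row; the keys become the column names
records = [
    {'name': 'Anthony', 'age': 30},
    {'name': 'Maria', 'age': 28},
]
df = pd.DataFrame(records)

print(df.shape)
print(df.columns.tolist())
```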
A DataFrame is the primary data structure in Pandas. It is a 2D table where you can store and manipulate data in rows and columns.
# Import the Pandas library
import pandas as pd
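To make the rows-and-columns structure concrete, here is a short sketch that builds a small DataFrame and inspects it. The sample data is borrowed from the dictionary example elsewhere in this cheatsheet.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Anthony', 'Maria'], 'age': [30, 28]})

print(df.shape)             # (number of rows, number of columns)
print(df.columns.tolist())  # column labels
print(df.dtypes)            # data type of each column

# Select one row by position and one column by label
first_row = df.iloc[0]
ages = df['age']
```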
The `groupby()` function in Pandas groups data by one or more columns and allows you to perform aggregate operations, such as calculating the average of a column.
import pandas as pd
# Create a DataFrame
df = pd.DataFrame([
    ['Amy', 'Assignment 1', 75],
    ['Amy', 'Assignment 2', 35],
    ['Bob', 'Assignment 1', 99],
    ['Bob', 'Assignment 2', 35]
], columns=['Name', 'Assignment', 'Grade'])
# Group by 'Name' and calculate the mean grade
result = df.groupby('Name').Grade.mean()
print(result)
# Output:
# Name
# Amy 55.0
# Bob 67.0
# Name: Grade, dtype: float64
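Several aggregates can be computed in one pass by passing a list of function names to `agg()`. The following sketch reuses the grades data from the example above; the variable name `summary` is chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame([
    ['Amy', 'Assignment 1', 75],
    ['Amy', 'Assignment 2', 35],
    ['Bob', 'Assignment 1', 99],
    ['Bob', 'Assignment 2', 35]
], columns=['Name', 'Assignment', 'Grade'])

# One row per student, one column per aggregate
summary = df.groupby('Name')['Grade'].agg(['mean', 'min', 'max'])
print(summary)
```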
Pandas provides functions to calculate statistics for each column, such as mean, median, and maximum. These functions help you quickly analyze data in your DataFrame.
import pandas as pd
# Create a DataFrame
df = pd.DataFrame({'columnName': [10, 20, 30, 40]})
# Calculate various statistics
mean = df.columnName.mean() # Average value
std = df.columnName.std() # Standard deviation
median = df.columnName.median() # Median value
max_value = df.columnName.max() # Maximum value
min_value = df.columnName.min() # Minimum value
count = df.columnName.count() # Number of values
unique_count = df.columnName.nunique() # Number of unique values
unique_values = df.columnName.unique() # List of unique values
print(mean, std, median, max_value, min_value, count, unique_count, unique_values)
# Output (standard deviation shown rounded to two decimals; print() displays more digits):
# 25.0 12.91 25.0 40 10 4 4 [10 20 30 40]
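When several of these statistics are needed at once, `describe()` returns the count, mean, standard deviation, minimum, quartiles, and maximum in a single Series. A minimal sketch using the same column as above:

```python
import pandas as pd

df = pd.DataFrame({'columnName': [10, 20, 30, 40]})

# describe() bundles the common summary statistics for a numeric column
summary = df['columnName'].describe()
print(summary)

# Individual statistics can be looked up by label
print(summary['mean'], round(summary['std'], 2), summary['50%'])
```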
Python's `datetime` module helps you manage dates and times. You can create specific dates and times, and work with them in your code.
import datetime
# Create a date
date = datetime.date(year=2019, month=2, day=16)
print(date) # Output: 2019-02-16
# Create a time
time = datetime.time(hour=13, minute=48, second=5)
print(time) # Output: 13:48:05
# Create a datetime
timestamp = datetime.datetime(year=2019, month=2, day=16, hour=13, minute=48, second=5)
print(timestamp) # Output: 2019-02-16 13:48:05
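Working with dates usually also involves arithmetic, formatting, and parsing. The sketch below uses `timedelta`, `strftime()`, and `strptime()` from the same module; the variable names and format strings are chosen for this example.

```python
import datetime

start = datetime.datetime(year=2019, month=2, day=16, hour=13, minute=48, second=5)

# Add a duration to a datetime
later = start + datetime.timedelta(days=1, hours=2)

# Subtracting two datetimes gives a timedelta
elapsed = later - start

# Format a datetime as text, and parse text back into a datetime
formatted = later.strftime('%Y-%m-%d %H:%M')
parsed = datetime.datetime.strptime('2019-02-16 13:48:05', '%Y-%m-%d %H:%M:%S')

print(later)
print(elapsed.total_seconds())
print(formatted)
```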
Pandas can load data from CSV files into a DataFrame, making it easy to work with large datasets from files.
import pandas as pd
# Load data from a CSV file into a DataFrame
df = pd.read_csv('data.csv')
print(df)
# Output: DataFrame created from the CSV file data
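`read_csv()` accepts options that control what is loaded. The sketch below uses `usecols` to keep only some columns and `parse_dates` to convert a column to dates. To stay runnable without a file on disk, it reads from an in-memory string via `io.StringIO`; the CSV content and column names are invented for the example.

```python
import io
import pandas as pd

# Stand-in for a CSV file on disk (hypothetical sample data)
csv_text = "name,age,enrolled\nTom,20,2019-02-16\nJack,30,2019-03-01\n"

df = pd.read_csv(
    io.StringIO(csv_text),
    usecols=['name', 'enrolled'],  # load only these columns
    parse_dates=['enrolled'],      # convert this column to datetimes
)

print(df)
print(df.dtypes)
```

With a real file, the first argument would be the file path, as in the example above.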