The `apply()` method in Pandas runs a function on every element of a column (a Series) or, with `axis=1`, on every row of a DataFrame, which is useful for transforming data.

import pandas as pd

# Function to double a number
def double(x):
    return 2 * x

# Create a DataFrame
df = pd.DataFrame({'column1': [1, 2, 3], 'column2': [4, 5, 6]})

# Apply the doubling function to 'column1'
df['column1'] = df['column1'].apply(double)

# Apply a lambda function to 'column2'
df['column2'] = df['column2'].apply(lambda x: 3 * x)

# Apply a function to each row
def custom_func(row):
    return row['column1'] * 1.5 + row['column2']

df['newColumn'] = df.apply(custom_func, axis=1)

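To make the difference between element-wise and row-wise `apply()` concrete, here is a minimal sketch that uses the same starting values as above. The helper name `weighted_sum` is illustrative only, and the column-arithmetic comparison is an extra for reference, not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({'column1': [1, 2, 3], 'column2': [4, 5, 6]})

# Series.apply calls the function once per element
doubled = df['column1'].apply(lambda x: 2 * x)

# DataFrame.apply with axis=1 calls the function once per row;
# each row arrives as a Series indexed by column name
def weighted_sum(row):  # hypothetical helper name
    return row['column1'] * 1.5 + row['column2']

row_result = df.apply(weighted_sum, axis=1)

# The same calculation written as plain column arithmetic
vectorized = df['column1'] * 1.5 + df['column2']

print(doubled.tolist())     # [2, 4, 6]
print(row_result.tolist())  # [5.5, 8.0, 10.5]
print(vectorized.tolist())  # [5.5, 8.0, 10.5]
```

When a calculation can be expressed as column arithmetic, that form is usually shorter and faster; `apply()` is handy when the logic does not fit that shape.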
You can add a new column to a DataFrame by assigning to a column name that does not exist yet. The assigned value can be a list with one entry per row, a single value repeated for all rows, or a calculation on existing columns.

import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'oldColumn': [10, 20, 30]})

# Add a new column with specific values (one per row)
df['newColumn'] = [1, 2, 3]

# Set the same value for all rows (overwrites the values above)
df['newColumn'] = 1

# Calculate the column from an existing column (overwrites again)
df['newColumn'] = df['oldColumn'] * 5

You can create a Pandas DataFrame from various data sources, including dictionaries, lists, and CSV files. Each method allows you to organize data into a table format.

import pandas as pd

# Create a DataFrame from a dictionary
data = {'name': ['Anthony', 'Maria'], 'age': [30, 28]}
df = pd.DataFrame(data)

# Create a DataFrame from a list of lists
data = [['Tom', 20], ['Jack', 30], ['Meera', 25]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

# Create a DataFrame by reading from a CSV file
df = pd.read_csv('students.csv')

A DataFrame is the primary data structure in Pandas. It is a 2D table where you can store and manipulate data in rows and columns.
import pandas as pd # Importing the Pandas library
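As a small illustration of the rows-and-columns idea, the sketch below builds a two-row DataFrame and inspects its shape, column labels, and individual cells. The sample names and ages are made-up values.

```python
import pandas as pd

# Each dictionary key becomes a column; each list position becomes a row
df = pd.DataFrame({'name': ['Anthony', 'Maria'], 'age': [30, 28]})

print(df.shape)            # (2, 2) -> 2 rows, 2 columns
print(list(df.columns))    # ['name', 'age']
print(df.loc[0, 'name'])   # Anthony (row label 0, column 'name')
print(df['age'].tolist())  # [30, 28] (a whole column)
```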
The `groupby()` method in Pandas groups rows by the values in one or more columns so that you can run aggregate operations on each group, such as calculating the average of a column.

import pandas as pd

# Create a DataFrame
df = pd.DataFrame([
    ['Amy', 'Assignment 1', 75],
    ['Amy', 'Assignment 2', 35],
    ['Bob', 'Assignment 1', 99],
    ['Bob', 'Assignment 2', 35]
], columns=['Name', 'Assignment', 'Grade'])

# Group by 'Name' and calculate the mean grade
result = df.groupby('Name').Grade.mean()
print(result)
# Output:
# Name
# Amy    55.0
# Bob    67.0
# Name: Grade, dtype: float64

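The same grades can also be summarized with several aggregates at once, or grouped by more than one column. The following is a minimal sketch of both; the variable names `summary` and `by_pair` are illustrative.

```python
import pandas as pd

df = pd.DataFrame([
    ['Amy', 'Assignment 1', 75],
    ['Amy', 'Assignment 2', 35],
    ['Bob', 'Assignment 1', 99],
    ['Bob', 'Assignment 2', 35],
], columns=['Name', 'Assignment', 'Grade'])

# Several aggregates per group: one row per name, one column per statistic
summary = df.groupby('Name')['Grade'].agg(['mean', 'max', 'min'])
print(summary.loc['Amy', 'mean'])  # 55.0
print(summary.loc['Bob', 'max'])   # 99

# Grouping by two columns produces a result indexed by (Name, Assignment)
by_pair = df.groupby(['Name', 'Assignment'])['Grade'].mean()
print(by_pair[('Amy', 'Assignment 2')])  # 35.0
```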
Pandas provides functions to calculate statistics for each column, such as mean, median, and maximum. These functions help you quickly analyze data in your DataFrame.

import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'columnName': [10, 20, 30, 40]})

# Calculate various statistics
mean = df.columnName.mean()               # Average value
std = df.columnName.std()                 # Sample standard deviation
median = df.columnName.median()           # Median value
max_value = df.columnName.max()           # Maximum value
min_value = df.columnName.min()           # Minimum value
count = df.columnName.count()             # Number of non-missing values
unique_count = df.columnName.nunique()    # Number of unique values
unique_values = df.columnName.unique()    # NumPy array of unique values

# std is rounded to two decimals for display
print(mean, round(std, 2), median, max_value, min_value, count, unique_count, unique_values)
# Output: 25.0 12.91 25.0 40 10 4 4 [10 20 30 40]

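The same methods can be called on the whole DataFrame to get one result per column. The sketch below assumes a small two-column DataFrame with made-up values; `describe()` collects the common statistics into a single table.

```python
import pandas as pd

df = pd.DataFrame({'a': [10, 20, 30, 40], 'b': [1, 1, 2, 2]})

# Calling a statistic on the DataFrame returns one value per column
col_means = df.mean()
print(col_means['a'], col_means['b'])  # 25.0 1.5

col_max = df.max()
print(col_max['a'], col_max['b'])      # 40 2

# describe() gathers count, mean, std, min, quartiles, and max per column
summary = df.describe()
print(summary.loc['mean', 'a'])        # 25.0
print(summary.loc['50%', 'b'])         # 1.5 (the median)
```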
Python's `datetime` module helps you manage dates and times. You can create specific dates and times, and work with them in your code.

import datetime

# Create a date
date = datetime.date(year=2019, month=2, day=16)
print(date)  # Output: 2019-02-16

# Create a time
time = datetime.time(hour=13, minute=48, second=5)
print(time)  # Output: 13:48:05

# Create a datetime
timestamp = datetime.datetime(year=2019, month=2, day=16, hour=13, minute=48, second=5)
print(timestamp)  # Output: 2019-02-16 13:48:05

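Once a `datetime` exists, typical next steps are shifting it by a duration, measuring the gap between two values, and formatting it as text. The following is a minimal sketch of those three operations; the offset of one day and two hours is an arbitrary example.

```python
import datetime

start = datetime.datetime(year=2019, month=2, day=16, hour=13, minute=48, second=5)

# Add a duration with timedelta
later = start + datetime.timedelta(days=1, hours=2)
print(later)  # 2019-02-17 15:48:05

# Subtracting two datetimes gives a timedelta
gap = later - start
print(gap.total_seconds())  # 93600.0

# Format a datetime as a string
print(start.strftime('%Y/%m/%d %H:%M'))  # 2019/02/16 13:48
```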
Pandas can load data from CSV files into a DataFrame, making it easy to work with large datasets from files.

import pandas as pd

# Load data from a CSV file into a DataFrame
df = pd.read_csv('data.csv')
print(df)  # Prints the rows loaded from data.csv

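The example above needs a `data.csv` file on disk. To try `read_csv()` without one, a runnable sketch can feed it CSV text from memory through `io.StringIO`; the column names and values below are made up for illustration.

```python
import io
import pandas as pd

# CSV content held in a string instead of a file (illustrative data)
csv_text = "name,age\nAnthony,30\nMaria,28\n"

# read_csv accepts any file-like object, so StringIO stands in for a file
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)            # (2, 2)
print(list(df.columns))    # ['name', 'age']
print(df['age'].mean())    # 29.0
```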