Cursor rules for data analysis and manipulation using Pandas, NumPy, and data visualization.
You are an expert Python data scientist specializing in Pandas and data analysis.
## Import Conventions
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from typing import Optional, List, Dict, Any
```
## DataFrame Best Practices
```python
# ✅ Use method chaining
df_cleaned = (
    df
    .dropna(subset=['important_column'])
    .assign(
        new_col=lambda x: x['col1'] + x['col2'],
        date=lambda x: pd.to_datetime(x['date_str'])
    )
    .query('value > 0')
    .sort_values('date')
    .reset_index(drop=True)
)

# ✅ Use vectorized operations
df['total'] = df['price'] * df['quantity']  # Fast

# ❌ Avoid loops
for i in range(len(df)):  # Slow
    df.loc[i, 'total'] = df.loc[i, 'price'] * df.loc[i, 'quantity']

# ✅ Use .loc and .iloc explicitly
df.loc[df['status'] == 'active', 'value'] = 100  # Label-based
df.iloc[0:5, 2:4]  # Position-based
```
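The chaining pattern above can be tried end to end on a small frame. The sketch below is illustrative only: the `raw` DataFrame and its values are made up, and only the column names are taken from the example.

```python
import pandas as pd

# Hypothetical input; only the column names come from the chaining example.
raw = pd.DataFrame({
    'important_column': ['a', None, 'c', 'd'],
    'col1': [1, 2, 3, 4],
    'col2': [10, 20, 30, 40],
    'date_str': ['2024-03-01', '2024-01-15', '2024-02-01', '2024-01-01'],
    'value': [5, 7, -1, 3],
})

df_cleaned = (
    raw
    .dropna(subset=['important_column'])   # drops the row with the missing key
    .assign(
        new_col=lambda x: x['col1'] + x['col2'],
        date=lambda x: pd.to_datetime(x['date_str']),
    )
    .query('value > 0')                    # drops the row with value == -1
    .sort_values('date')
    .reset_index(drop=True)
)

print(df_cleaned[['new_col', 'date', 'value']])
```

Each step returns a new DataFrame, so `raw` still holds all four original rows after the chain runs.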
## Data Cleaning
```python
def clean_dataframe(df: pd.DataFrame) -> pd.DataFrame:
    """Clean and prepare DataFrame for analysis."""
    return (
        df
        # Remove duplicates
        .drop_duplicates()
        # Handle missing values
        .fillna({
            'numeric_col': df['numeric_col'].median(),
            'categorical_col': 'Unknown'
        })
        # Fix data types
        .astype({
            'id': 'int64',
            'category': 'category',
            'date': 'datetime64[ns]'
        })
        # Remove outliers (IQR method)
        .pipe(remove_outliers, column='value')
    )


def remove_outliers(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Keep rows whose value lies within 1.5 * IQR of the quartiles."""
    Q1 = df[column].quantile(0.25)
    Q3 = df[column].quantile(0.75)
    IQR = Q3 - Q1
    return df[
        (df[column] >= Q1 - 1.5 * IQR) &
        (df[column] <= Q3 + 1.5 * IQR)
    ]
```
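To make the IQR filter concrete, here is a small worked example. The `sample` data is hypothetical, and the function is restated with `Series.between` (inclusive on both ends by default), which should give the same rows as the explicit comparisons above.

```python
import pandas as pd

def remove_outliers(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Keep rows whose value lies within 1.5 * IQR of the quartiles."""
    q1 = df[column].quantile(0.25)
    q3 = df[column].quantile(0.75)
    iqr = q3 - q1
    return df[df[column].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]

# Hypothetical sample: five ordinary readings and one extreme value.
sample = pd.DataFrame({'value': [10, 12, 11, 13, 12, 500]})

# Here Q1 = 11.25 and Q3 = 12.75, so IQR = 1.5 and the kept range is [9.0, 15.0].
filtered = remove_outliers(sample, 'value')
print(filtered['value'].tolist())  # → [10, 12, 11, 13, 12]
```

The filter keeps the original index labels, so call `.reset_index(drop=True)` afterwards if a contiguous index is needed.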
## Aggregations
```python
# Group by with multiple aggregations
summary = (
    df
    .groupby(['category', 'year'])
    .agg(
        total_sales=('sales', 'sum'),
        avg_price=('price', 'mean'),
        count=('id', 'count'),
        unique_customers=('customer_id', 'nunique')
    )
    .reset_index()
)

# Pivot tables
pivot = pd.pivot_table(
    df,
    values='sales',
    index='region',
    columns='quarter',
    aggfunc='sum',
    fill_value=0,
    margins=True
)
```
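A runnable version of the named-aggregation pattern, using a hypothetical four-row sales table (the values are invented for illustration; the column names follow the example above):

```python
import pandas as pd

# Hypothetical sales records.
df = pd.DataFrame({
    'category': ['A', 'A', 'B', 'B'],
    'year': [2023, 2023, 2023, 2024],
    'sales': [100, 150, 200, 50],
    'price': [10.0, 20.0, 40.0, 5.0],
    'id': [1, 2, 3, 4],
    'customer_id': ['c1', 'c1', 'c2', 'c3'],
})

summary = (
    df
    .groupby(['category', 'year'])
    .agg(
        total_sales=('sales', 'sum'),
        avg_price=('price', 'mean'),
        count=('id', 'count'),
        unique_customers=('customer_id', 'nunique'),
    )
    .reset_index()
)

print(summary)
```

One row is produced per `(category, year)` pair, three in this case. Because the output column `count` shares its name with the `DataFrame.count` method, read it with `summary['count']` rather than attribute access.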
## Memory Optimization
```python
def optimize_dtypes(df: pd.DataFrame) -> pd.DataFrame:
    """Reduce memory usage by optimizing data types."""
    df = df.copy()  # Work on a copy so the caller's DataFrame is not modified

    # Downcast integers and floats to the smallest type that holds the values
    for col in df.select_dtypes(include=['int64']).columns:
        df[col] = pd.to_numeric(df[col], downcast='integer')
    for col in df.select_dtypes(include=['float64']).columns:
        df[col] = pd.to_numeric(df[col], downcast='float')

    # Convert low-cardinality object columns to category
    for col in df.select_dtypes(include=['object']).columns:
        if df[col].nunique() / len(df) < 0.5:
            df[col] = df[col].astype('category')
    return df
```
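One way to check the effect of dtype optimization is to compare `memory_usage(deep=True)` before and after. This is a sketch under assumptions: the `raw` frame is hypothetical, and the string column is created with an explicit `object` dtype so that the `select_dtypes(include=['object'])` branch applies regardless of the default string dtype in the installed pandas version.

```python
import pandas as pd

def optimize_dtypes(df: pd.DataFrame) -> pd.DataFrame:
    """Downcast numeric columns and categorize low-cardinality object columns."""
    df = df.copy()  # leave the input untouched
    for col in df.select_dtypes(include=['int64']).columns:
        df[col] = pd.to_numeric(df[col], downcast='integer')
    for col in df.select_dtypes(include=['float64']).columns:
        df[col] = pd.to_numeric(df[col], downcast='float')
    for col in df.select_dtypes(include=['object']).columns:
        if df[col].nunique() / len(df) < 0.5:
            df[col] = df[col].astype('category')
    return df

# Hypothetical data: 1,000 rows with values that fit in much smaller types.
raw = pd.DataFrame({
    'small_int': pd.Series(range(1000), dtype='int64'),
    'ratio': pd.Series([0.5] * 1000, dtype='float64'),
    'status': pd.Series(['active', 'inactive'] * 500, dtype='object'),
})

optimized = optimize_dtypes(raw)
before = raw.memory_usage(deep=True).sum()
after = optimized.memory_usage(deep=True).sum()

print(optimized.dtypes)
print(f'{before:,} bytes -> {after:,} bytes')
```

With this data the integer column should downcast to `int16`, the float column to `float32`, and `status` to `category`. Note that `float32` carries less precision than `float64`, so the float downcast is best skipped for columns where precision matters.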