{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# ML Practice Series: Module 18 - Time Series Analysis\n",
"\n",
"Welcome to Module 18! **Time Series Analysis** is the study of data points collected or recorded at specific time intervals. This is crucial for finance, weather forecasting, and inventory management.\n",
"\n",
"### Objectives:\n",
"1. **Datetime Handling**: Converting strings to date objects.\n",
"2. **Resampling & Rolling Windows**: Smoothing data trends.\n",
"3. **Stationarity**: Understanding the Mean and Variance over time.\n",
"4. **Forecasting**: A simple look at the Moving Average model.\n",
"\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Setup\n",
"We will use the **Air Passengers** dataset, which shows monthly totals of international airline passengers from 1949 to 1960."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns\n",
"\n",
"# Load dataset\n",
"url = \"https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passengers.csv\"\n",
"df = pd.read_csv(url, parse_dates=['Month'], index_index=True)\n",
"\n",
"print(\"Dataset head:\")\n",
"print(df.head())\n",
"\n",
"plt.figure(figsize=(12, 6))\n",
"plt.plot(df)\n",
"plt.title('Monthly International Airline Passengers')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Feature Extraction from Time\n",
"\n",
"### Task 1: Component Extraction\n",
"Extract the `Year`, `Month`, and `Day of Week` from the index into new columns.\n",
"\n",
"*Web Reference: [Feature Engineering Guide](https://aashishgarg13.github.io/DataScience/feature-engineering/) (Time features section).*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# YOUR CODE HERE\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"Click to see Solution
\n",
"\n",
"```python\n",
"df['year'] = df.index.year\n",
"df['month'] = df.index.month\n",
"df['day_of_week'] = df.index.dayofweek\n",
"df.head()\n",
"```\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Smoothing Trends\n",
"\n",
"### Task 2: Rolling Mean\n",
"Calculate and plot a 12-month rolling mean to see the yearly trend more clearly."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# YOUR CODE HERE\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"Click to see Solution
\n",
"\n",
"```python\n",
"rolling_mean = df['Passengers'].rolling(window=12).mean()\n",
"\n",
"plt.figure(figsize=(12, 6))\n",
"plt.plot(df['Passengers'], label='Original')\n",
"plt.plot(rolling_mean, color='red', label='12-Month Rolling Mean')\n",
"plt.legend()\n",
"plt.show()\n",
"```\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"--- \n",
"### Excellent Forecast! \n",
"Time Series is a deep field. You've now mastered the basics of handling temporal data.\n",
"Next: **Natural Language Processing (NLP)**."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.7"
}
},
"nbformat": 4,
"nbformat_minor": 4
}