{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# ML Practice Series: Module 18 - Time Series Analysis\n", "\n", "Welcome to Module 18! **Time Series Analysis** is the study of data points collected or recorded at successive points in time, usually at regular intervals. It is central to finance, weather forecasting, and inventory management.\n", "\n", "### Objectives:\n", "1. **Datetime Handling**: Converting strings to date objects.\n", "2. **Resampling & Rolling Windows**: Smoothing data trends.\n", "3. **Stationarity**: Checking whether the mean and variance stay constant over time.\n", "4. **Forecasting**: A simple look at the Moving Average model.\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Setup\n", "We will use the **Air Passengers** dataset, which shows monthly totals of international airline passengers from 1949 to 1960." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "\n", "# Load the dataset, parsing 'Month' as dates and using it as the index\n", "url = \"https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passengers.csv\"\n", "df = pd.read_csv(url, parse_dates=['Month'], index_col='Month')\n", "\n", "print(\"Dataset head:\")\n", "print(df.head())\n", "\n", "plt.figure(figsize=(12, 6))\n", "plt.plot(df)\n", "plt.title('Monthly International Airline Passengers')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Feature Extraction from Time\n", "\n", "### Task 1: Component Extraction\n", "Extract the `Year`, `Month`, and `Day of Week` from the index into new columns.\n", "\n", "*Web Reference: [Feature Engineering Guide](https://aashishgarg13.github.io/DataScience/feature-engineering/) (Time features section).*" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# YOUR CODE HERE\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
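<details>\n", "<summary>Click to see Solution</summary>\n", "\n", "```python\n", "# Pull calendar components out of the DatetimeIndex\n", "df['year'] = df.index.year\n", "df['month'] = df.index.month\n", "# Monday=0 ... Sunday=6; each monthly timestamp falls on the 1st of the month\n", "df['day_of_week'] = df.index.dayofweek\n", "df.head()\n", "```\n", "</details>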
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Smoothing Trends\n", "\n", "### Task 2: Rolling Mean\n", "Calculate and plot a 12-month rolling mean to see the yearly trend more clearly." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# YOUR CODE HERE\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
<details>\n", "<summary>Click to see Solution</summary>\n", "\n", "```python\n", "# 12-month window; the first 11 values are NaN until a full window is available\n", "rolling_mean = df['Passengers'].rolling(window=12).mean()\n", "\n", "plt.figure(figsize=(12, 6))\n", "plt.plot(df['Passengers'], label='Original')\n", "plt.plot(rolling_mean, color='red', label='12-Month Rolling Mean')\n", "plt.legend()\n", "plt.show()\n", "```\n", "</details>
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "--- \n", "### Excellent Forecast! \n", "Time Series is a deep field. You've now mastered the basics of handling temporal data.\n", "Next: **Natural Language Processing (NLP)**." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.7" } }, "nbformat": 4, "nbformat_minor": 4 }