Commit 9ffa007 (unverified, 0 parents) committed by ameythakur

Stock-Trading-RL
.DS_Store ADDED
Binary file (6.15 kB).
 
.gitattributes ADDED
@@ -0,0 +1,41 @@
# Standardized Git Attributes for Scholarly Archiving
# https://git-scm.com/docs/gitattributes
# Auto detect text files and perform LF normalization
* text=auto
# Source code
*.py text diff=python
*.js text
*.css text
*.html text
*.sh text eol=lf
*.json text
# Documentation
*.md text
*.txt text
*.cff text
*.yml text
*.yaml text
# Data & Models
*.csv filter=lfs diff=lfs merge=lfs -text
*.pkl binary
*.pickle binary
*.h5 binary
*.pdf binary
# Images
*.jpg binary
*.jpeg binary
*.png binary
*.gif binary
*.ico binary
*.svg text
# Windows specific
*.bat text eol=crlf
*.ps1 text eol=crlf
# Linguist (GitHub Language Stats)
*.md linguist-documentation
*.js linguist-detectable
*.html linguist-detectable
*.pdf filter=lfs diff=lfs merge=lfs -text
*.png filter=lfs diff=lfs merge=lfs -text
*.jpg filter=lfs diff=lfs merge=lfs -text
*.jpeg filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,161 @@
# Comprehensive .gitignore for Python/Data Science Projects

# Byte-compiled / Optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / Packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# with no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# Model Weights (Large Files)
*.h5
*.hd5
# *.pkl (We are tracking model.pkl for this specific project as it is small/demo)
# *.pickle

# Streamlit
.streamlit/secrets.toml

# OS Generated
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db

# IDEs
.idea/
.vscode/
*.swp
*.swo
CITATION.cff ADDED
@@ -0,0 +1,13 @@
cff-version: 1.2.0
message: "If you use this Data Science internship project or its associated academic materials, please cite them as below."
authors:
  - family-names: "Thakur"
    given-names: "Amey"
    orcid: "https://orcid.org/0000-0001-5644-1575"
  - family-names: "Satish"
    given-names: "Mega"
    orcid: "https://orcid.org/0000-0002-1844-9557"
title: "OPTIMIZING STOCK TRADING STRATEGY WITH REINFORCEMENT LEARNING"
version: 1.0.0
date-released: 2021-09-18
url: "https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING"
Dockerfile ADDED
@@ -0,0 +1,27 @@
# Dockerfile for Hugging Face Spaces (Docker SDK)
FROM python:3.9-slim

# Set working directory
WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    build-essential \
    software-properties-common \
    git \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements (using JSON array format for paths with spaces)
COPY ["Source Code/requirements.txt", "./requirements.txt"]

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the entire Source Code directory (using JSON array format and explicit directory destination)
COPY ["Source Code/", "./"]

# Expose the port that Hugging Face Spaces expects (7860)
EXPOSE 7860

# Define the command to run the application
CMD ["streamlit", "run", "Stock-RL.py", "--server.port=7860", "--server.address=0.0.0.0"]
LICENSE ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2021 Amey Thakur and Mega Satish

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Mega/Filly.jpg ADDED

Git LFS Details

  • SHA256: f399507a2f19ba7319564afa208bb953bbf719b13377e9069649a41ac3c39ea7
  • Pointer size: 131 Bytes
  • Size of remote file: 299 kB
Mega/Mega.png ADDED

Git LFS Details

  • SHA256: 0ef583539af8192a1c656cf1b194e7faaf2135d4369aab21c04839c7c6a1c9d0
  • Pointer size: 131 Bytes
  • Size of remote file: 122 kB
Mega/Mega_Chair.png ADDED

Git LFS Details

  • SHA256: 776ee9d31807c16548a5ef4bf0bf858c1925e5945bfc7b686d8397c9af12083c
  • Pointer size: 132 Bytes
  • Size of remote file: 1.01 MB
Mega/Mega_Dining.jpg ADDED

Git LFS Details

  • SHA256: c31fe99e3d99646e3e70f7d3814d14f099986f56f29ae93ba37b46d0c2b21849
  • Pointer size: 131 Bytes
  • Size of remote file: 316 kB
Mega/Mega_Professional.jpg ADDED

Git LFS Details

  • SHA256: 925ac54a4ca4f6cb5fbb06984dd51ca7560c8580b8e231ce8a9be19bbefecfaf
  • Pointer size: 129 Bytes
  • Size of remote file: 7.9 kB
Mega/Mega_and_Hetvi.png ADDED

Git LFS Details

  • SHA256: 90ef035b18801866be44ae0d0ce295784780c839c03aae530dd2ec5e4d0bcd94
  • Pointer size: 132 Bytes
  • Size of remote file: 1 MB
README.md ADDED
@@ -0,0 +1,299 @@

<div align="center">

<a name="readme-top"></a>
# Optimizing Stock Trading Strategy With Reinforcement Learning

[![License: MIT](https://img.shields.io/badge/License-MIT-lightgrey)](LICENSE)
![Status](https://img.shields.io/badge/Status-Completed-success)
[![Technology](https://img.shields.io/badge/Technology-Python%20%7C%20Reinforcement%20Learning-blueviolet)](https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING)
[![Developed by Amey Thakur & Mega Satish](https://img.shields.io/badge/Developed%20by-Amey%20Thakur%20%26%20Mega%20Satish-blue.svg)](https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING)

A machine learning study demonstrating the application of **Reinforcement Learning (Q-Learning)** algorithms to optimize stock trading strategies and maximize portfolio returns.

**[Source Code](https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING/tree/main/Source%20Code)** &nbsp;·&nbsp; **[Kaggle Notebook](https://www.kaggle.com/ameythakur20/stock-price-prediction-model)** &nbsp;·&nbsp; **[Video Demo](https://youtu.be/Q82a93hjxJE)** &nbsp;·&nbsp; **[Live Demo](https://huggingface.co/spaces/ameythakur/Stock-Trading-RL)**

<br>

<a href="https://youtu.be/Q82a93hjxJE">
<img src="https://img.youtube.com/vi/Q82a93hjxJE/maxresdefault.jpg" alt="Video Demo" width="70%">
</a>

</div>

---

<div align="center">

[Authors](#authors) &nbsp;·&nbsp; [Overview](#overview) &nbsp;·&nbsp; [Features](#features) &nbsp;·&nbsp; [Structure](#project-structure) &nbsp;·&nbsp; [Results](#results) &nbsp;·&nbsp; [Quick Start](#quick-start) &nbsp;·&nbsp; [License](#license) &nbsp;·&nbsp; [About](#about-this-repository) &nbsp;·&nbsp; [Acknowledgments](#acknowledgments)

</div>

---

<!-- AUTHORS -->
<div align="center">

<a name="authors"></a>
## Authors

| <a href="https://github.com/Amey-Thakur"><img src="https://github.com/Amey-Thakur.png" width="150" height="150" alt="Amey Thakur"></a><br>[**Amey Thakur**](https://github.com/Amey-Thakur)<br><br>[![ORCID](https://img.shields.io/badge/ORCID-0000--0001--5644--1575-green.svg)](https://orcid.org/0000-0001-5644-1575) | <a href="https://github.com/msatmod"><img src="Mega/Mega.png" width="150" height="150" alt="Mega Satish"></a><br>[**Mega Satish**](https://github.com/msatmod)<br><br>[![ORCID](https://img.shields.io/badge/ORCID-0000--0002--1844--9557-green.svg)](https://orcid.org/0000-0002-1844-9557) |
| :---: | :---: |

</div>


> [!IMPORTANT]
> ### 🤝🏻 Special Acknowledgement
> *Special thanks to **[Mega Satish](https://github.com/msatmod)** for her meaningful contributions, guidance, and support that helped shape this work.*

---

<!-- OVERVIEW -->
<a name="overview"></a>
## Overview

**Optimizing Stock Trading Strategy With Reinforcement Learning** is a Data Science study conducted as part of the **Internship** at **Technocolabs Software**. The project focuses on the development of an intelligent agent capable of making autonomous trading decisions (Buy, Sell, Hold) to maximize profitability.

By leveraging **Q-Learning**, the system models the market environment where an agent learns optimal strategies based on price movements and moving average crossovers. The model is visualized via a **Streamlit** web application for real-time strategy simulation.

### Computational Objectives
The analysis is governed by strict **exploratory and modeling principles** ensuring algorithmic validity:
* **State Representation**: Utilization of Short-term and Long-term Moving Average crossovers to define market states.
* **Action Space**: Discrete action set (Buy, Sell, Hold) optimized through reward feedback.
* **Policy Optimization**: Implementing an Epsilon-Greedy strategy to balance exploration and exploitation of trading rules.
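
The objectives above can be sketched as a tabular Q-Learning step. This is a minimal illustration only — the hyperparameters (`alpha`, `gamma`, `epsilon`), the reward value, and the (trend, position) state grid are assumed for demonstration and are not the tuned values from `Model.ipynb`:

```python
import numpy as np

# Illustrative only: state = (trend, position), 3 actions (Buy, Sell, Hold).
q_table = np.zeros((2, 2, 3))
alpha, gamma, epsilon = 0.1, 0.95, 0.1  # assumed hyperparameters

def choose_action(state, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return int(rng.integers(3))        # random action (exploration)
    return int(np.argmax(q_table[state]))  # best known action (exploitation)

def q_update(state, action, reward, next_state):
    """Tabular Q-Learning (Bellman) update of one state-action value."""
    best_next = np.max(q_table[next_state])
    q_table[state][action] += alpha * (reward + gamma * best_next - q_table[state][action])

rng = np.random.default_rng(0)
state = (1, 0)                      # e.g. bullish trend, currently holding stock
action = choose_action(state, rng)
q_update(state, action, reward=1.0, next_state=(1, 1))
```

With a reward of 1.0 and an all-zero table, a single update moves the chosen entry by `alpha * reward`; over many episodes these increments converge toward the action values the agent exploits at inference time.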

> [!NOTE]
> ### Research Impact
> This project was published as a research paper and successfully demonstrated the viability of RL agents in simulated trading environments. The work received official recognition from Technocolabs Software including an **Internship Completion Certificate** and **Letter of Recommendation**.
>
> * [ResearchGate](http://dx.doi.org/10.13140/RG.2.2.13054.05440)
> * [Project Completion Letter](https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING/blob/main/Technocolabs/Technocolabs%20Software%20-%20Data%20Scientist%20-%20Project%20Completion%20Letter.pdf)

### Resources

| # | Resource | Description | Date |
| :---: | :--- | :--- | :--- |
| 1 | [**Source Code**](https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING/tree/main/Source%20Code) | Complete production repository and scripts | — |
| 2 | [**Kaggle Notebook**](https://www.kaggle.com/ameythakur20/stock-price-prediction-model) | Interactive Jupyter notebook for model training | — |
| 3 | [**Dataset**](https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING/blob/main/Source%20Code/all_stocks_5yr.csv) | Historical stock market data (5 Years) | — |
| 4 | [**Technical Specification**](docs/SPECIFICATION.md) | System architecture and specifications | — |
| 5 | [**Technical Report**](Technocolabs/PROJECT%20REPORT.pdf) | Comprehensive archival project documentation | September 2021 |
| 6 | [**Blueprint**](https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING/blob/main/Technocolabs/AMEY%20THAKUR%20-%20BLUEPRINT.pdf) | Initial project design and architecture blueprint | September 2021 |


> [!TIP]
> ### Market Adaptation
> The Q-Learning agent's performance relies heavily on the quality of historical data. Regular retraining with recent market data is recommended to adapt the Q-Table's values to shifting market trends and volatility patterns.

---

<!-- FEATURES -->
<a name="features"></a>
## Features

| Component | Technical Description |
|-----------|-----------------------|
| **Data Ingestion** | Automated loading and processing of historical stock data (CSV). |
| **Trend Analysis** | Computation of 5-day and 1-day Moving Averages to identify trend signals. |
| **RL Agent** | **Q-Learning** implementation with state-action mapping for decision autonomy. |
| **Portfolio Logic** | Dynamic tracking of cash, stock holdings, and total net worth over time. |
| **Visualization** | Interactive **Streamlit** dashboard using **Plotly** for financial charting. |

> [!NOTE]
> ### Empirical Context
> Stock markets are stochastic environments. This project simplifies the state space to Moving Average crossovers to demonstrate the foundational capabilities of Reinforcement Learning in financial contexts, prioritizing pedagogical clarity over high-frequency trading complexity.

### Tech Stack
- **Runtime**: Python 3.x
- **Machine Learning**: NumPy, Pandas
- **Visualization**: Streamlit, Plotly, Matplotlib, Seaborn
- **Algorithm**: Q-Learning (Reinforcement Learning)

---

<!-- STRUCTURE -->
<a name="project-structure"></a>
## Project Structure

```text
OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING/
│
├── docs/                              # Technical Documentation
│   └── SPECIFICATION.md               # Architecture & Design Specification
│
├── Mega/                              # Archival Attribution Assets
│   ├── Filly.jpg                      # Companion (Filly)
│   ├── Mega.png                       # Author Profile Image (Mega Satish)
│   └── ...                            # Additional Attribution Files
│
├── screenshots/                       # Application Screenshots
│   ├── 01-landing-page.png            # Home Interface
│   ├── 02-amzn-trend.png              # Stock Trend Visualization
│   ├── 03-portfolio-growth.png        # Portfolio Value Over Time
│   └── 04-alb-trend.png               # Analysis Example
│
├── Source Code/                       # Core Implementation
│   ├── Train_model/                   # Training Notebooks
│   │   └── Model.ipynb                # Q-Learning Implementation
│   │
│   ├── .streamlit/                    # Streamlit Configuration
│   ├── all_stocks_5yr.csv             # Historical Dataset
│   ├── model.pkl                      # Trained Q-Table (Pickle)
│   ├── Procfile                       # Heroku Deployment Config
│   ├── requirements.txt               # Dependencies
│   ├── setup.sh                       # Environment Setup Script
│   └── Stock-RL.py                    # Main Application Script
│
├── Technocolabs/                      # Internship Artifacts
│   ├── AMEY THAKUR - BLUEPRINT.pdf    # Design Blueprint
│   ├── Optimizing Stock Trading...pdf # Research Paper
│   ├── PROJECT REPORT.pdf             # Final Project Report
│   └── ...                            # Internship Completion Documents
│
├── .gitattributes                     # Git configuration
├── .gitignore                         # Repository Filters
├── CITATION.cff                       # Scholarly Citation Metadata
├── codemeta.json                      # Machine-Readable Project Metadata
├── LICENSE                            # MIT License Terms
├── README.md                          # Project Documentation
└── SECURITY.md                        # Security Policy
```

---

<!-- RESULTS -->
<a name="results"></a>
## Results

<div align="center">

<b>1. User Interface: Landing Page</b>
<br>
<i>The Streamlit-based dashboard allows users to select stocks and define investment parameters for real-time strategy optimization.</i>
<br><br>
<img src="screenshots/01-landing-page.png" alt="Landing Page" width="80%">

<br><br>

<b>2. Market Analysis: Stock Trend</b>
<br>
<i>Historical price visualization identifying long-term upward trends suitable for momentum-based trading strategies.</i>
<br><br>
<img src="screenshots/02-amzn-trend.png" alt="Stock Trend" width="80%">

<br><br>

<b>3. Strategy Evaluation: Portfolio Growth</b>
<br>
<i>Simulation of portfolio value over time, demonstrating the cumulative return generated by the agent against the initial capital.</i>
<br><br>
<img src="screenshots/03-portfolio-growth.png" alt="Portfolio Growth" width="80%">

<br><br>

<b>4. Risk Assessment: Volatility Analysis</b>
<br>
<i>Trend analysis highlighting periods of high volatility where the agent adjusts exposure to mitigate risk.</i>
<br><br>
<img src="screenshots/04-alb-trend.png" alt="Volatility Analysis" width="80%">

</div>

---

<!-- QUICK START -->
<a name="quick-start"></a>
## Quick Start

### 1. Prerequisites
- **Python 3.7+**: Required for runtime execution. [Download Python](https://www.python.org/downloads/)
- **Streamlit**: For running the web application locally.

> [!WARNING]
> **Data Consistency**
>
> The Q-Learning agent depends on proper state definitions. Ensure that the input dataset contains the required `date`, `close`, and `Name` columns to correctly compute the Moving Average crossovers used for state discretization.
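
For instance, the state inputs can be derived from those columns as follows — a minimal sketch on a tiny synthetic frame (the real app reads `all_stocks_5yr.csv`; the prices below are invented for illustration):

```python
import pandas as pd

# Tiny synthetic stand-in for all_stocks_5yr.csv (illustration only)
data = pd.DataFrame({
    "date": pd.date_range("2018-01-01", periods=6, freq="D"),
    "close": [10.0, 11.0, 12.0, 11.5, 12.5, 13.0],
    "Name": ["AMZN"] * 6,
})

# Same preparation the app performs: filter one ticker, compute the MAs
df = data[data["Name"] == "AMZN"].reset_index(drop=True)
df["5day_MA"] = df["close"].rolling(5).mean()  # long-term signal
df["1day_MA"] = df["close"].rolling(1).mean()  # short-term signal (the close itself)

# Crossover check used for state discretization: short MA above long MA => bullish
bullish = df["1day_MA"].iloc[-1] > df["5day_MA"].iloc[-1]
```

If any of the three columns is missing or renamed, the filtering and rolling computations above fail, which is why the warning matters.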

### 2. Installation
Establish the local environment by cloning the repository and installing the computational stack:

```bash
# Clone the repository
git clone https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING.git
cd OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING

# Navigate to Source Code directory
cd "Source Code"

# Install dependencies
pip install -r requirements.txt
```

### 3. Execution
Launch the web server to start the prediction application:
```bash
streamlit run Stock-RL.py
```
**Access**: `http://localhost:8501/`

---

<!-- LICENSE -->
<a name="license"></a>
## License

This academic submission, developed for the **Data Science Internship** at **Technocolabs Software**, is made available under the **MIT License**. See the [LICENSE](LICENSE) file for complete terms.

> [!NOTE]
> **Summary**: You are free to share and adapt this content for any purpose, even commercially, as long as you provide appropriate attribution to the original authors.

**Copyright © 2021 Amey Thakur & Mega Satish**

---

<!-- ABOUT -->
<a name="about-this-repository"></a>
## About This Repository

**Created & Maintained by**: [Amey Thakur](https://github.com/Amey-Thakur) & [Mega Satish](https://github.com/msatmod)
**Role**: Data Science Interns
**Program**: Data Science Internship
**Organization**: [Technocolabs Software](https://technocolabs.com/)

This project features **Optimizing Stock Trading Strategy With Reinforcement Learning**, a study conducted as part of an industrial internship. It explores the practical application of Q-Learning in financial economics.

**Connect:** [GitHub](https://github.com/Amey-Thakur) &nbsp;·&nbsp; [LinkedIn](https://www.linkedin.com/in/amey-thakur) &nbsp;·&nbsp; [ORCID](https://orcid.org/0000-0001-5644-1575)

### Acknowledgments

Grateful acknowledgment to [**Mega Satish**](https://github.com/msatmod) for her exceptional collaboration and scholarly partnership during the execution of this data science internship task. Her analytical precision, deep understanding of statistical modeling, and constant support were instrumental in refining the learning algorithms used in this study. Working alongside her was a transformative experience; her thoughtful approach to problem-solving and steady encouragement turned complex challenges into meaningful learning moments. This work reflects the growth and insights gained from our side-by-side academic journey. Thank you, Mega, for everything you shared and taught along the way.

Special thanks to the **mentors at Technocolabs Software** for providing this platform for rapid skill development and industrial exposure.

---

<div align="center">

[↑ Back to Top](#readme-top)

[Authors](#authors) &nbsp;·&nbsp; [Overview](#overview) &nbsp;·&nbsp; [Features](#features) &nbsp;·&nbsp; [Structure](#project-structure) &nbsp;·&nbsp; [Results](#results) &nbsp;·&nbsp; [Quick Start](#quick-start) &nbsp;·&nbsp; [License](#license) &nbsp;·&nbsp; [About](#about-this-repository) &nbsp;·&nbsp; [Acknowledgments](#acknowledgments)

<br>

📈 **[OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING](https://huggingface.co/spaces/ameythakur/Stock-Trading-RL)**

---

### Presented as part of the Data Science Internship @ Technocolabs Software

---

### 🎓 [Computer Engineering Repository](https://github.com/Amey-Thakur/COMPUTER-ENGINEERING)

**Computer Engineering (B.E.) - University of Mumbai**

*Semester-wise curriculum, laboratories, projects, and academic notes.*

</div>
SECURITY.md ADDED
@@ -0,0 +1,41 @@
# Security Policy

## Maintenance Status

This repository is part of a curated collection of academic, engineering, and internship projects and is maintained in a finalized and stable state. The project is preserved as a complete and authoritative record, with its scope and contents intentionally fixed to ensure long-term academic and professional reference.

## Supported Versions

As a finalized internship project, only the version listed below is authoritative:

| Version | Supported |
| ------- | --------- |
| 1.0.0 | Yes |

## Vulnerability Reporting Protocol

In accordance with established academic and professional standards for security disclosure, security-related observations associated with this internship project are documented through formal scholarly channels.

To document a security concern, communication is facilitated with the project curators:
- **Curator**: [Amey Thakur](https://github.com/Amey-Thakur)
- **Collaborator**: [Mega Satish](https://github.com/msatmod)
- **Method**: Reports are submitted via the repository’s [GitHub Issues](https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING/issues) interface to formally record security-related findings.

Submissions include:
1. A precise and technically accurate description of the identified issue.
2. Demonstrable steps or technical evidence sufficient to contextualize the finding.
3. An explanation of the issue’s relevance within the defined scope of the project.

## Implementation Context: Optimizing Stock Trading Strategy With Reinforcement Learning

This project consists of an implementation of a Reinforcement Learning model (Q-Learning) to optimize stock trading strategies, developed as part of a Data Science internship at Technocolabs Software.

- **Scope Limitation**: This policy applies exclusively to the documentation, code, and datasets contained within this repository and does not extend to the execution environment (Python/Streamlit runtime) or third-party libraries (Pandas, NumPy, etc.).

## Technical Integrity Statement

This repository is preserved as a fixed academic, engineering, and internship project. Security-related submissions are recorded for documentation and contextual reference and do not imply active monitoring, response obligations, or subsequent modification of the repository.

---

*This document defines the security posture of a finalized internship project.*
Source Code/.streamlit/config.toml ADDED
@@ -0,0 +1,4 @@
[theme]
base="dark"
primaryColor="#02eaf9"
font="serif"
Source Code/Procfile ADDED
@@ -0,0 +1 @@
web: sh setup.sh && streamlit run Stock-RL.py
Source Code/Stock-RL.py ADDED
@@ -0,0 +1,243 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Project: Optimizing Stock Trading Strategy With Reinforcement Learning
3
+ Authors: Amey Thakur & Mega Satish
4
+ Reference: https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING
5
+ License: MIT
6
+
7
+ Description:
8
+ This script contains the Main Application logic served via Streamlit.
9
+ It loads the pre-trained Q-Learning model (model.pkl), processes user-selected
10
+ stock data, simulates the trading strategy on unseen data, and visualizes
11
+ the portfolio performance using interactive Plotly charts.
12
+ """
13
+
14
+ import numpy as np
15
+ import pandas as pd
16
+ from pandas._libs.missing import NA
17
+ import streamlit as st
18
+ import time
19
+ import plotly.graph_objects as go
20
+ import pickle as pkl
21
+
22
+ # ==========================================
23
+ # 1. Data Processing Logic
24
+ # ==========================================
25
+ # @st.cache(persist=True)
26
+ def data_prep(data, name):
27
+ """
28
+ Prepares the dataset for the selected stock ticker.
29
+
30
+ Args:
31
+ data (pd.DataFrame): The raw dataset.
32
+ name (str): The specific stock name selected by the user.
33
+
34
+ Returns:
35
+ pd.DataFrame: A clean dataframe with computed Moving Averages (5-day & 1-day).
36
+ """
37
+ df = pd.DataFrame(data[data['Name'] == name])
38
+ df.dropna(inplace=True)
39
+ df.reset_index(drop=True, inplace=True)
40
+
41
+ # Calculate Moving Averages (Technical Indicators)
42
+ # These indicators form the basis of the State Space for the RL agent.
43
+ df['5day_MA'] = df['close'].rolling(5).mean()
44
+ df['1day_MA'] = df['close'].rolling(1).mean()
45
+
46
+ # Handle initial NaN values
47
+ df.loc[:4, '5day_MA'] = 0
48
+
49
+ return df
50
+
51
+ # ==========================================
52
+ # 2. Agent Logic (Inference)
53
+ # ==========================================
54
+ # @st.cache(persist=True)
55
+ def get_state(long_ma, short_ma, t):
56
+ """
57
+ Determines the current state of the market based on MA crossovers.
58
+
59
+ Returns a tuple (Trend, Position) matching the Q-Table structure used during training.
60
+ """
61
+ if short_ma < long_ma:
62
+ if t == 1:
63
+ return (0, 1) # Bearish, Cash
64
+ else:
65
+ return (0, 0) # Bearish, Stock
66
+
67
+ elif short_ma > long_ma:
68
+ if t == 1:
69
+ return (1, 1) # Bullish, Cash
70
+ else:
71
+ return (1, 0) # Bullish, Stock
72
+
73
+ return (0, 1) # Default
74
+
75
+ # @st.cache(persist=True)
76
+ def trade_t(num_of_stocks, port_value, current_price):
77
+ """
78
+ Checks if a trade (Buy) is financially feasible.
79
+ """
80
+ if num_of_stocks >= 0:
81
+ if port_value > current_price:
82
+ return 1 # Can Buy
83
+ else: return 0
84
+ else:
85
+ if port_value > current_price:
86
+ return 1
87
+ else: return 0
88
+
89
+ # @st.cache(persist=True)
90
+ def next_act(state, qtable, epsilon, action=3):
91
+ """
92
+ Decides the next action based on the trained Q-Table.
93
+
94
+ During inference (testing), epsilon is typically 0 (pure exploitation),
95
+ meaning the agent always chooses the optimal action learned during training.
96
+ """
97
+ if np.random.rand() < epsilon:
98
+ action = np.random.randint(action)
99
+ else:
100
+ action = np.argmax(qtable[state])
101
+ return action
102
+
103
+
104
+ # @st.cache(persist=True)
105
+ def test_stock(stocks_test, q_table, invest):
106
+ """
107
+ Runs a simulation of the trading strategy on the selected stock.
108
+
109
+ Args:
110
+ stocks_test (pd.DataFrame): The stock data to test on.
111
+ q_table (np.array): The loaded reinforcement learning model.
112
+ invest (int): Initial investment amount.
113
+
114
+ Returns:
115
+ list: A time-series list of net worth values over the simulation period.
116
+ """
117
+ num_stocks = 0
118
+ epsilon = 0 # No exploration during testing/inference
119
+ net_worth = [invest]
120
+ np.random.seed()
121
+
122
+ for dt in range(len(stocks_test)):
123
+ long_ma = stocks_test.iloc[dt]['5day_MA']
124
+ short_ma = stocks_test.iloc[dt]['1day_MA']
125
+ close_price = stocks_test.iloc[dt]['close']
126
+
127
+ # Determine Current State
128
+ t = trade_t(num_stocks, net_worth[-1], close_price)
129
+ state = get_state(long_ma, short_ma, t)
130
+
131
+ # Agent chooses action
132
+ action = next_act(state, q_table, epsilon)
133
+
134
+ if action == 0: # Buy
135
+ num_stocks += 1
136
+ to_append = net_worth[-1] - close_price
137
+ net_worth.append(np.round(to_append, 1))
138
+
139
+ elif action == 1: # Sell
140
+ num_stocks -= 1
141
+ to_append = net_worth[-1] + close_price
142
+ net_worth.append(np.round(to_append, 1))
143
+
144
+ elif action == 2: # Hold
145
+ to_append = net_worth[-1] + close_price # Same simplified accounting as in training
146
+ net_worth.append(np.round(to_append, 1))
147
+
148
+ # Stop at the end of the series (the next-row lookup fails on the last day)
149
+ try:
150
+ next_state = get_state(stocks_test.iloc[dt+1]['5day_MA'], stocks_test.iloc[dt+1]['1day_MA'], t)
151
+ except IndexError:
152
+ break
153
+
154
+ return net_worth
155
+
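One iteration of the loop above can be traced by hand; the sketch below uses a zero Q-table, so `argmax` ties resolve to action 0 (Buy). The state tuple and price are made-up illustrative values:

```python
import numpy as np

q_table = np.zeros((2, 2, 3))   # untrained table: argmax ties resolve to 0 (Buy)
net_worth = [1000.0]
close_price = 25.0

state = (1, 1)                  # bullish trend, holding cash
action = int(np.argmax(q_table[state]))
if action == 0:                 # Buy: cash drops by the close price
    net_worth.append(round(net_worth[-1] - close_price, 1))
print(action, net_worth)        # 0 [1000.0, 975.0]
```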
156
+
157
+ # ==========================================
158
+ # 3. Streamlit Interface
159
+ # ==========================================
160
+ def fun():
161
+ # Reading the Dataset
162
+ # Ensure all_stocks_5yr.csv is in the working directory
163
+ data = pd.read_csv('all_stocks_5yr.csv')
164
+ names = list(data['Name'].unique())
165
+ names.insert(0, "<Select Names>")
166
+
167
+ st.title("Optimizing Stock Trading Strategy With Reinforcement Learning")
168
+
169
+ st.sidebar.title("Choose Stock and Investment")
170
+ st.sidebar.subheader("Choose Company Stocks")
171
+
172
+ # User Input: Select Stock
173
+ stock = st.sidebar.selectbox("(*select one stock only)", names, index=0)
174
+
175
+ if stock != "<Select Names>":
176
+ stock_df = data_prep(data, stock)
177
+
178
+ # Sidebar Checkbox: Plot Data Trend
179
+ if st.sidebar.button("Show Stock Trend", key=1):
180
+ fig = go.Figure()
181
+ fig.add_trace(go.Scatter(
182
+ x=stock_df['date'],
183
+ y=stock_df['close'],
184
+ mode='lines',
185
+ name='Stock_Trend',
186
+ line=dict(color='cyan', width=2)
187
+ ))
188
+ fig.update_layout(
189
+ title='Stock Trend of ' + stock,
190
+ xaxis_title='Date',
191
+ yaxis_title='Price ($) '
192
+ )
193
+ st.plotly_chart(fig, use_container_width=True)
194
+
195
+ # Simple heuristic for trend feedback
196
+ if stock_df.iloc[500]['close'] > stock_df.iloc[0]['close']:
197
+ original_title = '<p style="font-family:Play; color:Cyan; font-size: 20px;">NOTE:<br>Stock is on a solid upward trend. Investing here might be profitable.</p>'
198
+ st.markdown(original_title, unsafe_allow_html=True)
199
+ else:
200
+ original_title = '<p style="font-family:Play; color:Red; font-size: 20px;">NOTE:<br>Stock does not appear to be in a solid uptrend. It may be better to pick a different stock.</p>'
201
+ st.markdown(original_title, unsafe_allow_html=True)
202
+
203
+ # Sidebar Checkbox: Investment Simulation
204
+ st.sidebar.subheader("Enter Your Available Initial Investment Fund")
205
+ invest = st.sidebar.slider('Select a range of values', 1000, 1000000)
206
+
207
+ if st.sidebar.button("Calculate", key=2):
208
+ # Load Pre-trained Model
209
+ try:
210
+ # Load the standardized 'model.pkl' artifact without leaking the file handle
+ with open('model.pkl', 'rb') as f:
+ q_table = pkl.load(f)
212
+ except FileNotFoundError:
213
+ st.error("Model file 'model.pkl' not found. Please ensure the model is trained.")
214
+ return
215
+
216
+ # Run Simulation
217
+ net_worth = test_stock(stock_df, q_table, invest)
218
+ net_worth = pd.DataFrame(net_worth, columns=['value'])
219
+
220
+ # Plot Results
221
+ fig = go.Figure()
222
+ fig.add_trace(go.Scatter(
223
+ x=net_worth.index,
224
+ y=net_worth['value'],
225
+ mode='lines',
226
+ name='Net_Worth_Trend',
227
+ line=dict(color='cyan', width=2)
228
+ ))
229
+ fig.update_layout(
230
+ title='Change in Portfolio Value Day by Day',
231
+ xaxis_title='Number of Days since Feb 2013 ',
232
+ yaxis_title='Value ($) '
233
+ )
234
+ st.plotly_chart(fig, use_container_width=True)
235
+
236
+ original_title = '<p style="font-family:Play; color:Cyan; font-size: 20px;">NOTE:<br>The chart shows how the model decisions change your net worth over time.</p>'
237
+ st.markdown(original_title, unsafe_allow_html=True)
238
+
239
+
240
+ if __name__ == '__main__':
241
+ fun()
242
+ # Dummy chart for layout purposes if needed, otherwise optional
243
+ # chart_data = pd.DataFrame(np.random.randn(20, 3), columns=['a', 'b', 'c'])
Source Code/Train_model/Model.ipynb ADDED
@@ -0,0 +1,407 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "code",
5
+ "execution_count": 23,
6
+ "source": [
7
+ "import pandas as pd\r\n",
8
+ "import numpy as np\r\n",
9
+ "import seaborn as sns\r\n",
10
+ "import matplotlib.pyplot as plt\r\n",
11
+ "import pickle as pk"
12
+ ],
13
+ "outputs": [],
14
+ "metadata": {}
15
+ },
16
+ {
17
+ "cell_type": "code",
18
+ "execution_count": 24,
19
+ "source": [
20
+ "df=pd.read_csv('all_stocks_5yr.csv')\r\n"
21
+ ],
22
+ "outputs": [],
23
+ "metadata": {}
24
+ },
25
+ {
26
+ "cell_type": "code",
27
+ "execution_count": 22,
28
+ "source": [
29
+ "df1=df['close']\r\n",
30
+ "df1.iloc[0]"
31
+ ],
32
+ "outputs": [
33
+ {
34
+ "output_type": "execute_result",
35
+ "data": {
36
+ "text/plain": [
37
+ "14.75"
38
+ ]
39
+ },
40
+ "metadata": {},
41
+ "execution_count": 22
42
+ }
43
+ ],
44
+ "metadata": {}
45
+ },
46
+ {
47
+ "cell_type": "code",
48
+ "execution_count": 25,
49
+ "source": [
50
+ "#Creating Environment Matrix 2x2x3\r\n",
51
+ "env_rows=2\r\n",
52
+ "env_cols=2\r\n",
53
+ "n_action=3\r\n",
54
+ "\r\n",
55
+ "q_table=np.zeros((env_rows,env_cols,n_action))\r\n",
56
+ "np.random.seed()\r\n",
57
+ "pk.dump(q_table,open(\"pickl.pkl\",'wb'))"
58
+ ],
59
+ "outputs": [],
60
+ "metadata": {}
61
+ },
62
+ {
63
+ "cell_type": "code",
64
+ "execution_count": 7,
65
+ "source": [
66
+ "pk.load(open(\"pickl.pkl\",'rb'))"
67
+ ],
68
+ "outputs": [
69
+ {
70
+ "output_type": "execute_result",
71
+ "data": {
72
+ "text/plain": [
73
+ "'hey'"
74
+ ]
75
+ },
76
+ "metadata": {},
77
+ "execution_count": 7
78
+ }
79
+ ],
80
+ "metadata": {}
81
+ },
82
+ {
83
+ "cell_type": "code",
84
+ "execution_count": 26,
85
+ "source": [
86
+ "#Defining Data Preprocessing Function\r\n",
87
+ "\r\n",
88
+ "def data_prep(data,name):\r\n",
89
+ " df=pd.DataFrame(data[data['Name']==name])\r\n",
90
+ " df.dropna(inplace=True)\r\n",
91
+ " df.drop(['high','low','volume','Name'],axis=1,inplace=True)\r\n",
92
+ " df.reset_index(drop=True,inplace=True)\r\n",
93
+ " # Calculating 5 day and 1 day Moving Average for DF\r\n",
94
+ " df['5day_MA']=df['close'].rolling(5).mean()\r\n",
95
+ " df['1day_MA']=df['close'].rolling(1).mean()\r\n",
96
+ " df['5day_MA'][:4]=0\r\n",
97
+ " #Splitting into train and Test data\r\n",
98
+ " train_df=df[:int(len(df)*0.8)]\r\n",
99
+ " test_df=df[int(len(df)*0.8):].reset_index(drop=True)\r\n",
100
+ " return train_df,test_df\r\n",
101
+ "\r\n",
102
+ "# Get the state for datapoint by Moving Average\r\n",
103
+ "def get_state(long_ma,short_ma,t):\r\n",
104
+ " if short_ma<long_ma:\r\n",
105
+ " if t==1:\r\n",
106
+ " return (0,1) #Cash\r\n",
107
+ " else :\r\n",
108
+ " return (0,0) #Stock\r\n",
109
+ " \r\n",
110
+ " elif short_ma>long_ma:\r\n",
111
+ " if t==1:\r\n",
112
+ " return (1,1) #Cash\r\n",
113
+ " else :\r\n",
114
+ " return (1,0) #Stock\r\n",
115
+ "\r\n",
116
+ "\r\n",
117
+ "#Checking if the user can trade or not\r\n",
118
+ "def trade_t(num_of_stocks,port_value,current_price):\r\n",
119
+ " if num_of_stocks>=0:\r\n",
120
+ " if port_value>current_price:\r\n",
121
+ " return 1\r\n",
122
+ " else :return 0\r\n",
123
+ " else:\r\n",
124
+ " if port_value>current_price:\r\n",
125
+ " return 1\r\n",
126
+ " else :return 0\r\n",
127
+ "\r\n",
128
+ "\r\n",
129
+ "\r\n",
130
+ "#Get next action by Epsilon greedy\r\n",
131
+ "def next_act(state,qtable,epsilon,action=3):\r\n",
132
+ " if np.random.rand() < epsilon:\r\n",
133
+ " action=np.random.randint(action)\r\n",
134
+ " else:\r\n",
135
+ " action=np.argmax(qtable[state])\r\n",
136
+ " \r\n",
137
+ " \r\n",
138
+ " return action\r\n",
139
+ "\r\n",
140
+ "\r\n",
141
+ "\r\n",
142
+ "# Immidiate reward Generator based on cummulative wealth \r\n",
143
+ "def get_reward(state,action,current_close,past_close,buy_history):\r\n",
144
+ " if state==(0,0) or state==(1,0): #Stock position\r\n",
145
+ " if action==0:\r\n",
146
+ " return -1000\r\n",
147
+ " elif action==1:\r\n",
148
+ " return (current_close-buy_history)\r\n",
149
+ " elif action==2:\r\n",
150
+ " return (current_close-past_close)\r\n",
151
+ " \r\n",
152
+ " elif state==(0,1) or state==(1,1): #Cash Position\r\n",
153
+ " if action==0:\r\n",
154
+ " return 0\r\n",
155
+ " elif action==1:\r\n",
156
+ " return -1000\r\n",
157
+ " elif action==2:\r\n",
158
+ " return (current_close-past_close)\r\n",
159
+ "\r\n",
160
+ " \r\n",
161
+ " \r\n",
162
+ " \r\n",
163
+ "\r\n"
164
+ ],
165
+ "outputs": [],
166
+ "metadata": {}
167
+ },
168
+ {
169
+ "cell_type": "markdown",
170
+ "source": [
171
+ "<h4>Reading and preprocessing the Dataset"
172
+ ],
173
+ "metadata": {}
174
+ },
175
+ {
176
+ "cell_type": "code",
177
+ "execution_count": 27,
178
+ "source": [
179
+ "stocks=pd.read_csv('all_stocks_5yr.csv')\r\n",
180
+ "stocks_train,stocks_test=data_prep(stocks,'AAPL')"
181
+ ],
182
+ "outputs": [
183
+ {
184
+ "output_type": "stream",
185
+ "name": "stderr",
186
+ "text": [
187
+ "C:\\Users\\mchil\\AppData\\Local\\Temp/ipykernel_12420/1010674326.py:11: SettingWithCopyWarning: \n",
188
+ "A value is trying to be set on a copy of a slice from a DataFrame\n",
189
+ "\n",
190
+ "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
191
+ " df['5day_MA'][:4]=0\n"
192
+ ]
193
+ }
194
+ ],
195
+ "metadata": {}
196
+ },
197
+ {
198
+ "cell_type": "markdown",
199
+ "source": [
200
+ "<h4>Training the Dataset"
201
+ ],
202
+ "metadata": {}
203
+ },
204
+ {
205
+ "cell_type": "code",
206
+ "execution_count": 28,
207
+ "source": [
208
+ "episodes=100\r\n",
209
+ "port_value=1000\r\n",
210
+ "num_stocks=0\r\n",
211
+ "epsilon=1 #Epsilon Greedy\r\n",
212
+ "alpha=0.05 #Learning Rate\r\n",
213
+ "gamma=0.15 #Discount Factor\r\n",
214
+ "buy_history=0\r\n",
215
+ "net_worth=[1000] #Portfolio Value\r\n",
216
+ "np.random.seed()\r\n",
217
+ "for i in range(episodes): #Iteration for each episode\r\n",
218
+ " port_value=1000\r\n",
219
+ " num_stocks=0\r\n",
220
+ " buy_history=0\r\n",
221
+ " net_worth=[1000]\r\n",
222
+ " \r\n",
223
+ "\r\n",
224
+ " for dt in range(len(stocks_train)): #Iteration through each dataset\r\n",
225
+ " long_ma=stocks_train.iloc[dt]['5day_MA']\r\n",
226
+ " short_ma=stocks_train.iloc[dt]['1day_MA']\r\n",
227
+ " close_price=stocks_train.iloc[dt]['close']\r\n",
228
+ " next_close=0\r\n",
229
+ " \r\n",
230
+ " if dt>0:\r\n",
231
+ " past_close=stocks_train.iloc[dt-1]['close']\r\n",
232
+ " else:\r\n",
233
+ " past_close=close_price\r\n",
234
+ " t=trade_t(num_stocks,net_worth[-1],close_price)\r\n",
235
+ " state=get_state(long_ma,short_ma,t)\r\n",
236
+ " action=next_act(state,q_table,epsilon)\r\n",
237
+ "\r\n",
238
+ " if action==0:#Buy\r\n",
239
+ " \r\n",
240
+ " num_stocks+=1\r\n",
241
+ " buy_history=close_price\r\n",
242
+ " to_append=net_worth[-1]-close_price\r\n",
243
+ " net_worth.append(np.round(to_append,1))\r\n",
244
+ " r=0\r\n",
245
+ " \r\n",
246
+ " \r\n",
247
+ " \r\n",
248
+ " elif action==1:#Sell\r\n",
249
+ " # if num_stocks>0:\r\n",
250
+ " num_stocks-=1 \r\n",
251
+ " to_append=net_worth[-1]+close_price\r\n",
252
+ " net_worth.append(np.round(to_append,1))\r\n",
253
+ " # buy_history.pop(0)\r\n",
254
+ " \r\n",
255
+ " elif action==2:#hold\r\n",
256
+ " to_append=net_worth[-1]+close_price\r\n",
257
+ " net_worth.append(np.round(to_append,1))\r\n",
258
+ " \r\n",
259
+ " \r\n",
260
+ " \r\n",
261
+ " \r\n",
262
+ "\r\n",
263
+ " r=get_reward(state,action,close_price,past_close,buy_history) #Getting Reward\r\n",
264
+ " \r\n",
265
+ " try:\r\n",
266
+ " next_state=get_state(stocks_train.iloc[dt+1]['5day_MA'],stocks_train.iloc[dt+1]['1day_MA'],t)\r\n",
267
+ " \r\n",
268
+ " except:\r\n",
269
+ " break\r\n",
270
+ " #Updating Q_table by Bellmen's Equation\r\n",
271
+ " q_table[state][action]=(1.-alpha)*q_table[state][action]+alpha*(r+gamma*np.max(q_table[next_state]))\r\n",
272
+ " \r\n",
273
+ " if (epsilon-0.01)>0.15:\r\n",
274
+ " epsilon-=0.01\r\n",
275
+ "\r\n",
276
+ "print(\"Training Complete\")"
277
+ ],
278
+ "outputs": [
279
+ {
280
+ "output_type": "stream",
281
+ "name": "stdout",
282
+ "text": [
283
+ "Training Complete\n"
284
+ ]
285
+ }
286
+ ],
287
+ "metadata": {}
288
+ },
289
+ {
290
+ "cell_type": "code",
291
+ "execution_count": 38,
292
+ "source": [
293
+ "pk.dump(q_table,open('pickl.pkl','wb'))"
294
+ ],
295
+ "outputs": [],
296
+ "metadata": {}
297
+ },
298
+ {
299
+ "cell_type": "markdown",
300
+ "source": [
301
+ "<h4>Tracking the Portfolio Value "
302
+ ],
303
+ "metadata": {}
304
+ },
305
+ {
306
+ "cell_type": "code",
307
+ "execution_count": null,
308
+ "source": [],
309
+ "outputs": [],
310
+ "metadata": {}
311
+ },
312
+ {
313
+ "cell_type": "code",
314
+ "execution_count": null,
315
+ "source": [],
316
+ "outputs": [],
317
+ "metadata": {}
318
+ },
319
+ {
320
+ "cell_type": "markdown",
321
+ "source": [
322
+ "<h4>Testing the Dataset"
323
+ ],
324
+ "metadata": {}
325
+ },
326
+ {
327
+ "cell_type": "code",
328
+ "execution_count": 8,
329
+ "source": [],
330
+ "outputs": [
331
+ {
332
+ "output_type": "stream",
333
+ "name": "stdout",
334
+ "text": [
335
+ "Test Complete\n"
336
+ ]
337
+ }
338
+ ],
339
+ "metadata": {}
340
+ },
341
+ {
342
+ "cell_type": "markdown",
343
+ "source": [
344
+ "<h4>Plotting the portfolio for the test Dataset "
345
+ ],
346
+ "metadata": {}
347
+ },
348
+ {
349
+ "cell_type": "code",
350
+ "execution_count": null,
351
+ "source": [],
352
+ "outputs": [],
353
+ "metadata": {}
354
+ },
355
+ {
356
+ "cell_type": "code",
357
+ "execution_count": 10,
358
+ "source": [
359
+ "num_stocks"
360
+ ],
361
+ "outputs": [
362
+ {
363
+ "output_type": "execute_result",
364
+ "data": {
365
+ "text/plain": [
366
+ "94"
367
+ ]
368
+ },
369
+ "metadata": {},
370
+ "execution_count": 10
371
+ }
372
+ ],
373
+ "metadata": {}
374
+ },
375
+ {
376
+ "cell_type": "code",
377
+ "execution_count": null,
378
+ "source": [],
379
+ "outputs": [],
380
+ "metadata": {}
381
+ }
382
+ ],
383
+ "metadata": {
384
+ "orig_nbformat": 4,
385
+ "language_info": {
386
+ "name": "python",
387
+ "version": "3.9.7",
388
+ "mimetype": "text/x-python",
389
+ "codemirror_mode": {
390
+ "name": "ipython",
391
+ "version": 3
392
+ },
393
+ "pygments_lexer": "ipython3",
394
+ "nbconvert_exporter": "python",
395
+ "file_extension": ".py"
396
+ },
397
+ "kernelspec": {
398
+ "name": "python3",
399
+ "display_name": "Python 3.9.7 64-bit"
400
+ },
401
+ "interpreter": {
402
+ "hash": "60d8401257a87028599f7501811ce2c94d605f29d0573af229f453e115e13ba6"
403
+ }
404
+ },
405
+ "nbformat": 4,
406
+ "nbformat_minor": 2
407
+ }
Source Code/all_stocks_5yr.csv ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6aea253cd19de60b568143991aaf1fa482456565c389205658d236e595e716cf
3
+ size 29580549
Source Code/model.pkl ADDED
Binary file (247 Bytes). View file
 
Source Code/model_training.py ADDED
@@ -0,0 +1,268 @@
1
+ """
2
+ Project: Optimizing Stock Trading Strategy With Reinforcement Learning
3
+ Authors: Amey Thakur & Mega Satish
4
+ Reference: https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING
5
+ License: MIT
6
+
7
+ Description:
8
+ This script implements the training phase of the Reinforcement Learning agent (Q-Learning).
9
+ It preprocesses historical stock data, defines the market environment as a set of states
10
+ based on Moving Average crossovers, and iteratively updates a Q-Table to learn optimal
11
+ trading actions (Buy, Sell, Hold) that maximize portfolio returns.
12
+ """
13
+
14
+ import pandas as pd
15
+ import numpy as np
16
+ import pickle as pkl
17
+ import os
18
+
19
+ # ==========================================
20
+ # 1. Data Preprocessing
21
+ # ==========================================
22
+ def data_prep(data, name):
23
+ """
24
+ Preprocesses the stock data for a specific company.
25
+
26
+ Args:
27
+ data (pd.DataFrame): The complete dataset containing all stocks.
28
+ name (str): The ticker symbol of the stock to filter (e.g., 'AAPL').
29
+
30
+ Returns:
31
+ tuple: (train_df, test_df) - The split training and testing datasets.
32
+
33
+ Methodology:
34
+ - Filters data by stock name.
35
+ - Computes Technical Indicators: 5-day and 1-day Moving Averages (MA).
36
+ - 5-day MA represents the short-term trend baseline.
37
+ - 1-day MA represents the immediate price action.
38
+ - The interaction between these two MAs serves as the primary signal for state determination.
39
+ """
40
+ df = pd.DataFrame(data[data['Name'] == name])
41
+ df.dropna(inplace=True)
42
+ df.drop(['high', 'low', 'volume', 'Name'], axis=1, inplace=True)
43
+ df.reset_index(drop=True, inplace=True)
44
+
45
+ # Calculating Moving Averages used for State Definition
46
+ df['5day_MA'] = df['close'].rolling(5).mean()
47
+ df['1day_MA'] = df['close'].rolling(1).mean()
48
+
49
+ # Zero out rows 0-3, where the 5-day rolling window is incomplete (NaN)
+ df.loc[:3, '5day_MA'] = 0
51
+
52
+ # Splitting into Train (80%) and Test (20%) sets
53
+ split_idx = int(len(df) * 0.8)
54
+ train_df = df[:split_idx]
55
+ test_df = df[split_idx:].reset_index(drop=True)
56
+
57
+ return train_df, test_df
58
+
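As a quick illustration of the moving-average step above, here is a minimal, self-contained sketch of the same `rolling` computation (the toy prices are made up):

```python
import pandas as pd

# Toy close prices standing in for one stock's history.
prices = pd.DataFrame({'close': [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]})
prices['5day_MA'] = prices['close'].rolling(5).mean()
prices['1day_MA'] = prices['close'].rolling(1).mean()  # identical to 'close'
# Rows 0-3 have no full 5-day window (NaN); zero them as the pipeline does.
prices.loc[:3, '5day_MA'] = 0
print(prices['5day_MA'].tolist())  # [0.0, 0.0, 0.0, 0.0, 12.0, 13.0]
```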
59
+ # ==========================================
60
+ # 2. Environment & State Definitions
61
+ # ==========================================
62
+ def get_state(long_ma, short_ma, t):
63
+ """
64
+ Discretizes continuous market data into a finite set of states.
65
+
66
+ The state space is defined by a tuple (Trend_Signal, Holding_Status).
67
+
68
+ 1. Trend_Signal:
69
+ - 0: short_ma < long_ma (Bearish/Downtrend)
70
+ - 1: short_ma > long_ma (Bullish/Uptrend)
71
+
72
+ 2. Holding_Status (t):
73
+ - 0: Currently holding stock
74
+ - 1: Currently holding cash (no stock)
75
+
76
+ Returns:
77
+ tuple: (trend, holding_status) representing the current environment state.
78
+ """
79
+ if short_ma < long_ma:
80
+ if t == 1:
81
+ return (0, 1) # Bearish Trend, Holding Cash
82
+ else:
83
+ return (0, 0) # Bearish Trend, Holding Stock
84
+
85
+ elif short_ma > long_ma:
86
+ if t == 1:
87
+ return (1, 1) # Bullish Trend, Holding Cash
88
+ else:
89
+ return (1, 0) # Bullish Trend, Holding Stock
90
+
91
+ # Default case (should rarely be hit with floats)
92
+ return (0, 1)
93
+
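The four-state encoding can be spot-checked in isolation; the sketch below mirrors `get_state` as defined above:

```python
def get_state(long_ma, short_ma, t):
    # (trend, position): trend 0 = bearish, 1 = bullish;
    # position 1 = holding cash, 0 = holding stock.
    if short_ma < long_ma:
        return (0, 1) if t == 1 else (0, 0)
    elif short_ma > long_ma:
        return (1, 1) if t == 1 else (1, 0)
    return (0, 1)  # default when the MAs tie

print(get_state(105.0, 100.0, t=1))  # (0, 1): bearish, in cash
print(get_state(100.0, 105.0, t=0))  # (1, 0): bullish, in stock
```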
94
+ def trade_t(num_of_stocks, port_value, current_price):
95
+ """
96
+ Determines the holding capability of the agent.
97
+
98
+ Returns:
99
+ int: 1 if the agent has capital to buy (Cash), 0 if fully invested (Stock).
100
+ """
101
+ # The agent is effectively all-in or all-out, so the status reduces to:
+ # holding stock, or holding cash with enough capital to buy a share.
104
+ if num_of_stocks > 0:
105
+ return 0 # User holds stock
106
+ else:
107
+ if port_value > current_price:
108
+ return 1 # User holds cash and can afford stock
109
+ else:
110
+ return 0 # User is broke/cannot buy
111
+
112
+ # ==========================================
113
+ # 3. Q-Learning Agent Logic
114
+ # ==========================================
115
+ def next_act(state, qtable, epsilon, action_space=3):
116
+ """
117
+ Selects the next action using the Epsilon-Greedy Policy.
118
+
119
+ Args:
120
+ state (tuple): The current state of the environment.
121
+ qtable (np.array): The Q-Table storing action-values.
122
+ epsilon (float): Exploration rate (probability of random action).
123
+
124
+ Returns:
125
+ int: The selected action index.
126
+ 0: Buy
127
+ 1: Sell
128
+ 2: Hold
129
+ """
130
+ if np.random.rand() < epsilon:
131
+ # Exploration: Random action
132
+ action = np.random.randint(action_space)
133
+ else:
134
+ # Exploitation: Best known action from Q-Table
135
+ action = np.argmax(qtable[state])
136
+
137
+ return action
138
+
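Setting `epsilon = 0` makes the policy deterministic, which is exactly how inference uses it; a minimal sketch with a hand-filled Q-table entry (the values are made up):

```python
import numpy as np

def next_act(state, qtable, epsilon, action_space=3):
    # Explore with probability epsilon, otherwise exploit the argmax.
    if np.random.rand() < epsilon:
        return int(np.random.randint(action_space))
    return int(np.argmax(qtable[state]))

q = np.zeros((2, 2, 3))
q[1, 1] = [0.2, -0.5, 0.9]               # Hold (index 2) is best in this state
print(next_act((1, 1), q, epsilon=0.0))  # 2 -- pure exploitation
```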
139
+ def get_reward(state, action, current_close, past_close, buy_history):
140
+ """
141
+ Calculates the immediate reward for a given state-action pair.
142
+
143
+ The Reward Function is crucial for guiding the agent:
144
+ - Penalize invalid moves (e.g., Buying when already holding).
145
+ - Reward profit generation (Selling higher than bought).
146
+ - Reward capital preservation (Holding during downturns).
147
+ """
148
+ if state == (0, 0) or state == (1, 0): # State: Holding Stock
149
+ if action == 0: # Try to Buy again
150
+ return -1000 # Heavy Penalty for illegal move
151
+ elif action == 1: # Sell
152
+ return (current_close - buy_history) # Reward is the realized PnL
153
+ elif action == 2: # Hold
154
+ return (current_close - past_close) # Reward is the unrealized daily change
155
+
156
+ elif state == (0, 1) or state == (1, 1): # State: Holding Cash
157
+ if action == 0: # Buy
158
+ return 0 # Neutral reward for entering position
159
+ elif action == 1: # Try to Sell again
160
+ return -1000 # Heavy Penalty for illegal move
161
+ elif action == 2: # Hold (Wait)
162
+ return (current_close - past_close) # Opportunity cost/benefit tracking
163
+
164
+ return 0
165
+
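The reward table above collapses to a handful of cases; restated here as a compact, standalone sketch (the price values in the checks are made up):

```python
def get_reward(state, action, current_close, past_close, buy_history):
    holding_stock = state in ((0, 0), (1, 0))
    if holding_stock:
        if action == 0:                      # illegal re-buy
            return -1000
        if action == 1:                      # sell: realized PnL
            return current_close - buy_history
        return current_close - past_close    # hold: daily price change
    if action == 1:                          # illegal sell from cash
        return -1000
    if action == 0:                          # buy: neutral entry
        return 0
    return current_close - past_close        # wait: opportunity tracking

print(get_reward((1, 0), 1, 110.0, 108.0, 100.0))  # 10.0 -- sold above cost
print(get_reward((0, 1), 1, 110.0, 108.0, 100.0))  # -1000 -- cannot sell from cash
```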
166
+ # ==========================================
167
+ # 4. Main Training Loop
168
+ # ==========================================
169
+ def train_model():
170
+ print("Initializing Training Process...")
171
+
172
+ # 4.1 Initialize Q-Table
173
+ # Dimensions: 2 (Trend States) x 2 (Holding States) x 3 (Actions)
174
+ env_rows = 2
175
+ env_cols = 2
176
+ n_action = 3
177
+ q_table = np.zeros((env_rows, env_cols, n_action))
178
+
179
+ # 4.2 Load Data
180
+ try:
181
+ stocks = pd.read_csv('all_stocks_5yr.csv')
182
+ # We train primarily on AAPL as the representative asset for this strategy
183
+ stocks_train, _ = data_prep(stocks, 'AAPL')
184
+ except FileNotFoundError:
185
+ print("Error: 'all_stocks_5yr.csv' not found.")
186
+ return
187
+
188
+ # 4.3 Hyperparameters
189
+ episodes = 100 # Number of times to iterate over the dataset
190
+ epsilon = 1.0 # Initial Exploration Rate (100% random)
191
+ alpha = 0.05 # Learning Rate (Impact of new information)
192
+ gamma = 0.15 # Discount Factor (Importance of future rewards)
193
+
194
+ print(f"Starting Training for {episodes} episodes...")
195
+
196
+ for i in range(episodes):
197
+ # Reset Episode Variables
198
+ port_value = 1000
199
+ num_stocks = 0
200
+ buy_history = 0
201
+ net_worth = [1000]
202
+
203
+ # Iterate over the time-series
204
+ for dt in range(len(stocks_train)):
205
+ long_ma = stocks_train.iloc[dt]['5day_MA']
206
+ short_ma = stocks_train.iloc[dt]['1day_MA']
207
+ close_price = stocks_train.iloc[dt]['close']
208
+
209
+ # Get Previous Close for Reward Calc
210
+ if dt > 0:
211
+ past_close = stocks_train.iloc[dt-1]['close']
212
+ else:
213
+ past_close = close_price
214
+
215
+ # Determine Current State
216
+ t = trade_t(num_stocks, net_worth[-1], close_price)
217
+ state = get_state(long_ma, short_ma, t)
218
+
219
+ # Select Action
220
+ action = next_act(state, q_table, epsilon)
221
+
222
+ # Execute Action & Update Portfolio Logic
223
+ if action == 0: # Buy
224
+ num_stocks += 1
225
+ buy_history = close_price
226
+ net_worth.append(np.round(net_worth[-1] - close_price, 1))
227
+ r = 0 # Placeholder; overwritten by get_reward() below
228
+
229
+ elif action == 1: # Sell
230
+ num_stocks -= 1
231
+ net_worth.append(np.round(net_worth[-1] + close_price, 1))
232
+ # buy_history handled in reward
233
+
234
+ elif action == 2: # Hold
235
+ net_worth.append(np.round(net_worth[-1] + close_price, 1)) # Simplified tracking
236
+
237
+ # Compute Reward
238
+ r = get_reward(state, action, close_price, past_close, buy_history)
239
+
240
+ # Observe Next State
241
+ try:
242
+ next_long = stocks_train.iloc[dt+1]['5day_MA']
243
+ next_short = stocks_train.iloc[dt+1]['1day_MA']
244
+ next_state = get_state(next_long, next_short, t)
245
+ except IndexError:
246
+ # End of data
247
+ break
248
+
249
+ # Update Q-Value using Bellman Equation
250
+ # Q(s,a) = (1-alpha) * Q(s,a) + alpha * (reward + gamma * max(Q(s', a')))
251
+ q_table[state][action] = (1. - alpha) * q_table[state][action] + alpha * (r + gamma * np.max(q_table[next_state]))
252
+
253
+ # Decay Epsilon to reduce exploration over time
254
+ if (epsilon - 0.01) > 0.15:
255
+ epsilon -= 0.01
256
+
257
+ if (i + 1) % 10 == 0:
258
+ print(f"Episode {i+1}/{episodes} complete. Epsilon: {epsilon:.2f}")
259
+
260
+ print("Training Complete.")
261
+
262
+ # 4.4 Save the Trained Model
263
+ with open('model.pkl', 'wb') as f:
264
+ pkl.dump(q_table, f)
265
+ print("Model saved to 'model.pkl'.")
266
+
267
+ if __name__ == "__main__":
268
+ train_model()
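To make the update rule concrete, here is one step of the Bellman equation with the hyperparameters above; the reward value is a made-up illustration:

```python
import numpy as np

alpha, gamma = 0.05, 0.15                     # same hyperparameters as the script
q = np.zeros((2, 2, 3))                       # same 2x2x3 table shape
state, action, next_state = (1, 1), 0, (1, 0)
r = 10.0                                      # hypothetical reward

# Q(s,a) <- (1-alpha)*Q(s,a) + alpha*(r + gamma*max_a' Q(s',a'))
q[state][action] = (1. - alpha) * q[state][action] + alpha * (r + gamma * np.max(q[next_state]))
print(q[state][action])  # 0.5 = 0.05 * 10 on a zero-initialized table
```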
Source Code/requirements.txt ADDED
@@ -0,0 +1,4 @@
1
+ plotly==5.3.1
2
+ numpy==1.21.2
3
+ streamlit==0.88.0
4
+ pandas==1.3.2
Source Code/setup.sh ADDED
@@ -0,0 +1,8 @@
1
+ mkdir -p ~/.streamlit/
2
+ echo "\
3
+ [server]\n\
4
+ headless = true\n\
5
+ port = $PORT\n\
6
+ enableCORS = false\n\
7
+ \n\
8
+ " > ~/.streamlit/config.toml
Technocolabs/AMEY THAKUR - BLUEPRINT.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c3a27191ddee99cfe755f89dc8e0969b8bbd7ac23ec434abddaf8c0aa28a334c
3
+ size 51084
Technocolabs/Optimizing Stock Trading Strategy With Reinforcement Learning.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e282cd349014ab77bb883975a1fdb98fd3f83011d767e8496de1d65ac12a2571
3
+ size 2348727
Technocolabs/PROJECT REPORT.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:193080af8207444469c3ba503f5b521b86baead5dce61f5e75fac67746b4b787
3
+ size 2347221
Technocolabs/Technocolabs Software - Data Scientist - Internship Completion Letter.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:850927a3c60791a4582d0cbeff15f3be453cd56e289e837486f5dd9b671cd7b5
3
+ size 171716
Technocolabs/Technocolabs Software - Data Scientist - Internship Offer Letter.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:dc9883d92c762771afdd790072ad79bd896ae36b9207f4b4b769673a2266df46
3
+ size 71201
Technocolabs/Technocolabs Software - Data Scientist - Letter of Recommendation.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2527b413b3e0eec237aaa592ead956c839e35fa91cda1c845f7c3601a4a84547
3
+ size 247718
Technocolabs/Technocolabs Software - Data Scientist - Project Completion Letter.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:cdb1e8c9513a65afe7118e82286b633a65b4cf4d4a48d1c0e1f273c033156be1
3
+ size 193938
codemeta.json ADDED
@@ -0,0 +1,43 @@
1
+ {
2
+ "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
3
+ "@type": "SoftwareSourceCode",
4
+ "name": "OPTIMIZING STOCK TRADING STRATEGY WITH REINFORCEMENT LEARNING",
5
+ "description": "Data Science Internship at Technocolabs Software. Task: To optimize stock trading strategy using Reinforcement Learning. The solution implements Q-Learning and a Streamlit-based web interface for real-time strategy visualization.",
6
+ "identifier": "OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING",
7
+ "license": "https://spdx.org/licenses/MIT.html",
8
+ "programmingLanguage": [
9
+ "Python"
10
+ ],
11
+ "author": [
12
+ {
13
+ "@type": "Person",
14
+ "givenName": "Amey",
15
+ "familyName": "Thakur",
16
+ "id": "https://orcid.org/0000-0001-5644-1575"
17
+ },
18
+ {
19
+ "@type": "Person",
20
+ "givenName": "Mega",
21
+ "familyName": "Satish",
22
+ "id": "https://orcid.org/0000-0002-1844-9557"
23
+ }
24
+ ],
25
+ "dateReleased": "2021-09-18",
26
+ "codeRepository": "https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING",
27
+ "developmentStatus": "complete",
28
+ "applicationCategory": "Data Science / Reinforcement Learning",
29
+ "keywords": [
30
+ "Technocolabs Software",
31
+ "Data Science",
32
+ "Stock Trading",
33
+ "Reinforcement Learning",
34
+ "Q-Learning",
35
+ "Python3",
36
+ "Pandas",
37
+ "Numpy",
38
+ "Streamlit"
39
+ ],
40
+ "relatedLink": [
41
+ "https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING"
42
+ ]
43
+ }
docs/SPECIFICATION.md ADDED
@@ -0,0 +1,46 @@
1
+ # Technical Specification: Optimizing Stock Trading Strategy
2
+
3
+ ## Architectural Overview
4
+
5
+ **Optimizing Stock Trading Strategy With Reinforcement Learning** is a predictive modeling study demonstrating the application of Q-Learning to trading decisions. The project explores machine learning heuristics for financial markets and was developed during a Data Science internship at Technocolabs Software.
6
+
7
+ ### Analytics Pipeline
8
+
9
+ ```mermaid
10
+ graph TD
11
+ Start["Stock Data (CSV)"] --> Load["Data Ingestion (Pandas)"]
12
+ Load --> Feature["Feature Engineering (Moving Averages)"]
13
+ Feature --> Agent["Q-Learning Agent"]
14
+ Agent --> State["State Definition (MA Crossover + Trend)"]
15
+ State --> Action["Action Selection (Buy/Sell/Hold)"]
16
+ Action --> Portfolio["Portfolio Update"]
17
+ Portfolio --> Visualize["Streamlit Visualization"]
18
+ ```
19
+
20
+ ---
21
+
22
+ ## Technical Implementations
23
+
24
+ ### 1. Modeling Architecture
25
+ - **Core**: Built on **NumPy** and **Pandas**, utilizing custom Q-Learning logic for decision making.
26
+ - **Estimation Logic**: Establishing a relationship between market states (Moving Averages) and optimal actions to maximize portfolio value.
27
+
28
+ ### 2. Evaluation & Validation
29
+ - **Metrics**: Evaluates performance based on net worth accumulation over a 5-year period compared to a buy-and-hold strategy.
30
+ - **Reproducibility**: Utilizes historical stock data to promote consistent testing environments.
31
+ - **Heuristics**: Decision logic is encapsulated in a Python script that drives the real-time simulation.
32
+
33
+ ### 3. Developmental Infrastructure
34
+ - **Notebook Runtime**: The primary research was conducted in **Jupyter Notebook**, exploring state representation and reward functions.
35
+ - **Source Production**: The analytical kernel is deployed as a **Streamlit App**, bridging statistical modeling and an interactive end-user application.
36
+
37
+ ---
38
+
39
+ ## Technical Prerequisites
40
+
41
+ - **Runtime**: Python 3.7+ environment (Local or Cloud-based).
42
+ - **Dependencies**: `pandas`, `numpy`, `streamlit`, and `plotly` libraries.
43
+
44
+ ---
45
+
46
+ *Technical Specification | Data Science | Version 1.0*
screenshots/01-landing-page.png ADDED

Git LFS Details

  • SHA256: c19eb11e86dac95a4826629cbd01510f1d55acc3676e243e0f7dbf3a8a2d9af9
  • Pointer size: 130 Bytes
  • Size of remote file: 48.2 kB
screenshots/02-amzn-trend.png ADDED

Git LFS Details

  • SHA256: ba682341788869786696bf6f3994f1af474a79c042f40436148246acbae1767b
  • Pointer size: 130 Bytes
  • Size of remote file: 70.1 kB
screenshots/03-portfolio-growth.png ADDED

Git LFS Details

  • SHA256: 7049d2446494aa5ded5e402f293db340d3e02416cb8cb99ebd43c3ec49be604e
  • Pointer size: 130 Bytes
  • Size of remote file: 71.7 kB
screenshots/04-alb-trend.png ADDED

Git LFS Details

  • SHA256: 61423f49f2f0d6f5d6f50addf5d16a7496abcf7756d35babd9316410f7c6055b
  • Pointer size: 130 Bytes
  • Size of remote file: 74.5 kB