Commit 9ffa007 (unverified, 0 parents) committed by ameythakur

Stock-Trading-RL
.DS_Store ADDED
Binary file (6.15 kB).
 
.gitattributes ADDED
@@ -0,0 +1,41 @@
# Standardized Git Attributes for Scholarly Archiving
# https://git-scm.com/docs/gitattributes
# Auto detect text files and perform LF normalization
* text=auto
# Source code
*.py text diff=python
*.js text
*.css text
*.html text
*.sh text eol=lf
*.json text
# Documentation
*.md text
*.txt text
*.cff text
*.yml text
*.yaml text
# Data & Models
*.csv filter=lfs diff=lfs merge=lfs -text
*.pkl binary
*.pickle binary
*.h5 binary
*.pdf binary
# Images
*.jpg binary
*.jpeg binary
*.png binary
*.gif binary
*.ico binary
*.svg text
# Windows specific
*.bat text eol=crlf
*.ps1 text eol=crlf
# Linguist (GitHub Language Stats)
*.md linguist-documentation
*.js linguist-detectable
*.html linguist-detectable
*.pdf filter=lfs diff=lfs merge=lfs -text
*.png filter=lfs diff=lfs merge=lfs -text
*.jpg filter=lfs diff=lfs merge=lfs -text
*.jpeg filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,161 @@
# Comprehensive .gitignore for Python/Data Science Projects

# Byte-compiled / Optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / Packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# with no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# Model Weights (Large Files)
*.h5
*.hd5
# *.pkl (We are tracking model.pkl for this specific project as it is small/demo)
# *.pickle

# Streamlit
.streamlit/secrets.toml

# OS Generated
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db

# IDEs
.idea/
.vscode/
*.swp
*.swo
CITATION.cff ADDED
@@ -0,0 +1,13 @@
cff-version: 1.2.0
message: "If you use this Data Science internship project or its associated academic materials, please cite them as below."
authors:
  - family-names: "Thakur"
    given-names: "Amey"
    orcid: "https://orcid.org/0000-0001-5644-1575"
  - family-names: "Satish"
    given-names: "Mega"
    orcid: "https://orcid.org/0000-0002-1844-9557"
title: "OPTIMIZING STOCK TRADING STRATEGY WITH REINFORCEMENT LEARNING"
version: 1.0.0
date-released: 2021-09-18
url: "https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING"
Dockerfile ADDED
@@ -0,0 +1,27 @@
# Dockerfile for Hugging Face Spaces (Docker SDK)
FROM python:3.9-slim

# Set working directory
WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    build-essential \
    software-properties-common \
    git \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements (using JSON array format for paths with spaces)
COPY ["Source Code/requirements.txt", "./requirements.txt"]

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the entire Source Code directory (using JSON array format and explicit directory destination)
COPY ["Source Code/", "./"]

# Expose the port that Hugging Face Spaces expects (7860)
EXPOSE 7860

# Define the command to run the application
CMD ["streamlit", "run", "Stock-RL.py", "--server.port=7860", "--server.address=0.0.0.0"]
LICENSE ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2021 Amey Thakur and Mega Satish

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Mega/Filly.jpg ADDED

Git LFS Details

  • SHA256: f399507a2f19ba7319564afa208bb953bbf719b13377e9069649a41ac3c39ea7
  • Pointer size: 131 Bytes
  • Size of remote file: 299 kB
Mega/Mega.png ADDED

Git LFS Details

  • SHA256: 0ef583539af8192a1c656cf1b194e7faaf2135d4369aab21c04839c7c6a1c9d0
  • Pointer size: 131 Bytes
  • Size of remote file: 122 kB
Mega/Mega_Chair.png ADDED

Git LFS Details

  • SHA256: 776ee9d31807c16548a5ef4bf0bf858c1925e5945bfc7b686d8397c9af12083c
  • Pointer size: 132 Bytes
  • Size of remote file: 1.01 MB
Mega/Mega_Dining.jpg ADDED

Git LFS Details

  • SHA256: c31fe99e3d99646e3e70f7d3814d14f099986f56f29ae93ba37b46d0c2b21849
  • Pointer size: 131 Bytes
  • Size of remote file: 316 kB
Mega/Mega_Professional.jpg ADDED

Git LFS Details

  • SHA256: 925ac54a4ca4f6cb5fbb06984dd51ca7560c8580b8e231ce8a9be19bbefecfaf
  • Pointer size: 129 Bytes
  • Size of remote file: 7.9 kB
Mega/Mega_and_Hetvi.png ADDED

Git LFS Details

  • SHA256: 90ef035b18801866be44ae0d0ce295784780c839c03aae530dd2ec5e4d0bcd94
  • Pointer size: 132 Bytes
  • Size of remote file: 1 MB
README.md ADDED
@@ -0,0 +1,299 @@

<div align="center">

<a name="readme-top"></a>
# Optimizing Stock Trading Strategy With Reinforcement Learning

[![License: MIT](https://img.shields.io/badge/License-MIT-lightgrey)](LICENSE)
![Status](https://img.shields.io/badge/Status-Completed-success)
[![Technology](https://img.shields.io/badge/Technology-Python%20%7C%20Reinforcement%20Learning-blueviolet)](https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING)
[![Developed by Amey Thakur & Mega Satish](https://img.shields.io/badge/Developed%20by-Amey%20Thakur%20%26%20Mega%20Satish-blue.svg)](https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING)

A machine learning study demonstrating the application of **Reinforcement Learning (Q-Learning)** algorithms to optimize stock trading strategies and maximize portfolio returns.

**[Source Code](https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING/tree/main/Source%20Code)** &nbsp;·&nbsp; **[Kaggle Notebook](https://www.kaggle.com/ameythakur20/stock-price-prediction-model)** &nbsp;·&nbsp; **[Video Demo](https://youtu.be/Q82a93hjxJE)** &nbsp;·&nbsp; **[Live Demo](https://huggingface.co/spaces/ameythakur/Stock-Trading-RL)**

<br>

<a href="https://youtu.be/Q82a93hjxJE">
<img src="https://img.youtube.com/vi/Q82a93hjxJE/maxresdefault.jpg" alt="Video Demo" width="70%">
</a>

</div>

---

<div align="center">

[Authors](#authors) &nbsp;·&nbsp; [Overview](#overview) &nbsp;·&nbsp; [Features](#features) &nbsp;·&nbsp; [Structure](#project-structure) &nbsp;·&nbsp; [Results](#results) &nbsp;·&nbsp; [Quick Start](#quick-start) &nbsp;·&nbsp; [License](#license) &nbsp;·&nbsp; [About](#about-this-repository) &nbsp;·&nbsp; [Acknowledgments](#acknowledgments)

</div>

---

<!-- AUTHORS -->
<div align="center">

<a name="authors"></a>
## Authors

| <a href="https://github.com/Amey-Thakur"><img src="https://github.com/Amey-Thakur.png" width="150" height="150" alt="Amey Thakur"></a><br>[**Amey Thakur**](https://github.com/Amey-Thakur)<br><br>[![ORCID](https://img.shields.io/badge/ORCID-0000--0001--5644--1575-green.svg)](https://orcid.org/0000-0001-5644-1575) | <a href="https://github.com/msatmod"><img src="Mega/Mega.png" width="150" height="150" alt="Mega Satish"></a><br>[**Mega Satish**](https://github.com/msatmod)<br><br>[![ORCID](https://img.shields.io/badge/ORCID-0000--0002--1844--9557-green.svg)](https://orcid.org/0000-0002-1844-9557) |
| :---: | :---: |

</div>


> [!IMPORTANT]
> ### 🤝🏻 Special Acknowledgement
> *Special thanks to **[Mega Satish](https://github.com/msatmod)** for her meaningful contributions, guidance, and support that helped shape this work.*

---

<!-- OVERVIEW -->
<a name="overview"></a>
## Overview

**Optimizing Stock Trading Strategy With Reinforcement Learning** is a Data Science study conducted as part of the **Internship** at **Technocolabs Software**. The project focuses on the development of an intelligent agent capable of making autonomous trading decisions (Buy, Sell, Hold) to maximize profitability.

By leveraging **Q-Learning**, the system models the market environment where an agent learns optimal strategies based on price movements and moving average crossovers. The model is visualized via a **Streamlit** web application for real-time strategy simulation.

### Computational Objectives
The analysis is governed by strict **exploratory and modeling principles** ensuring algorithmic validity:
* **State Representation**: Utilization of Short-term and Long-term Moving Average crossovers to define market states.
* **Action Space**: Discrete action set (Buy, Sell, Hold) optimized through reward feedback.
* **Policy Optimization**: Implementing an Epsilon-Greedy strategy to balance exploration and exploitation of trading rules.
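
The objectives above can be sketched as a tabular Q-Learning step. This is a minimal illustration only — the hyperparameters (`alpha`, `gamma`, `epsilon`), the reward value, and the (trend, position) state grid are assumed for demonstration and are not the tuned values from `Model.ipynb`:

```python
import numpy as np

# Illustrative only: state = (trend, position), 3 actions (Buy, Sell, Hold).
q_table = np.zeros((2, 2, 3))
alpha, gamma, epsilon = 0.1, 0.95, 0.1  # assumed hyperparameters

def choose_action(state, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return int(rng.integers(3))        # random action (exploration)
    return int(np.argmax(q_table[state]))  # best known action (exploitation)

def q_update(state, action, reward, next_state):
    """Tabular Q-Learning (Bellman) update of one state-action value."""
    best_next = np.max(q_table[next_state])
    q_table[state][action] += alpha * (reward + gamma * best_next - q_table[state][action])

rng = np.random.default_rng(0)
state = (1, 0)                      # e.g. bullish trend, currently holding stock
action = choose_action(state, rng)
q_update(state, action, reward=1.0, next_state=(1, 1))
```

With a reward of 1.0 and an all-zero table, a single update moves the chosen entry by `alpha * reward`; over many episodes these increments converge toward the action values the agent exploits at inference time.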

> [!NOTE]
> ### Research Impact
> This project was published as a research paper and successfully demonstrated the viability of RL agents in simulated trading environments. The work received official recognition from Technocolabs Software including an **Internship Completion Certificate** and **Letter of Recommendation**.
>
> * [ResearchGate](http://dx.doi.org/10.13140/RG.2.2.13054.05440)
> * [Project Completion Letter](https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING/blob/main/Technocolabs/Technocolabs%20Software%20-%20Data%20Scientist%20-%20Project%20Completion%20Letter.pdf)

### Resources

| # | Resource | Description | Date |
| :---: | :--- | :--- | :--- |
| 1 | [**Source Code**](https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING/tree/main/Source%20Code) | Complete production repository and scripts | — |
| 2 | [**Kaggle Notebook**](https://www.kaggle.com/ameythakur20/stock-price-prediction-model) | Interactive Jupyter notebook for model training | — |
| 3 | [**Dataset**](https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING/blob/main/Source%20Code/all_stocks_5yr.csv) | Historical stock market data (5 Years) | — |
| 4 | [**Technical Specification**](docs/SPECIFICATION.md) | System architecture and specifications | — |
| 5 | [**Technical Report**](Technocolabs/PROJECT%20REPORT.pdf) | Comprehensive archival project documentation | September 2021 |
| 6 | [**Blueprint**](https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING/blob/main/Technocolabs/AMEY%20THAKUR%20-%20BLUEPRINT.pdf) | Initial project design and architecture blueprint | September 2021 |


> [!TIP]
> ### Market Adaptation
> The Q-Learning agent's performance relies heavily on the quality of historical data. Regular retraining with recent market data is recommended to adapt the Q-Table's values to shifting market trends and volatility patterns.

---

<!-- FEATURES -->
<a name="features"></a>
## Features

| Component | Technical Description |
|-----------|-----------------------|
| **Data Ingestion** | Automated loading and processing of historical stock data (CSV). |
| **Trend Analysis** | Computation of 5-day and 1-day Moving Averages to identify trend signals. |
| **RL Agent** | **Q-Learning** implementation with state-action mapping for decision autonomy. |
| **Portfolio Logic** | Dynamic tracking of cash, stock holdings, and total net worth over time. |
| **Visualization** | Interactive **Streamlit** dashboard using **Plotly** for financial charting. |

> [!NOTE]
> ### Empirical Context
> Stock markets are stochastic environments. This project simplifies the state space to Moving Average crossovers to demonstrate the foundational capabilities of Reinforcement Learning in financial contexts, prioritizing pedagogical clarity over high-frequency trading complexity.

### Tech Stack
- **Runtime**: Python 3.x
- **Machine Learning**: NumPy, Pandas
- **Visualization**: Streamlit, Plotly, Matplotlib, Seaborn
- **Algorithm**: Q-Learning (Reinforcement Learning)

---

<!-- STRUCTURE -->
<a name="project-structure"></a>
## Project Structure

```text
OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING/
│
├── docs/                              # Technical Documentation
│   └── SPECIFICATION.md               # Architecture & Design Specification
│
├── Mega/                              # Archival Attribution Assets
│   ├── Filly.jpg                      # Companion (Filly)
│   ├── Mega.png                       # Author Profile Image (Mega Satish)
│   └── ...                            # Additional Attribution Files
│
├── screenshots/                       # Application Screenshots
│   ├── 01-landing-page.png            # Home Interface
│   ├── 02-amzn-trend.png              # Stock Trend Visualization
│   ├── 03-portfolio-growth.png        # Portfolio Value Over Time
│   └── 04-alb-trend.png               # Analysis Example
│
├── Source Code/                       # Core Implementation
│   ├── Train_model/                   # Training Notebooks
│   │   └── Model.ipynb                # Q-Learning Implementation
│   │
│   ├── .streamlit/                    # Streamlit Configuration
│   ├── all_stocks_5yr.csv             # Historical Dataset
│   ├── model.pkl                      # Trained Q-Table (Pickle)
│   ├── Procfile                       # Heroku Deployment Config
│   ├── requirements.txt               # Dependencies
│   ├── setup.sh                       # Environment Setup Script
│   └── Stock-RL.py                    # Main Application Script
│
├── Technocolabs/                      # Internship Artifacts
│   ├── AMEY THAKUR - BLUEPRINT.pdf    # Design Blueprint
│   ├── Optimizing Stock Trading...pdf # Research Paper
│   ├── PROJECT REPORT.pdf             # Final Project Report
│   └── ...                            # Internship Completion Documents
│
├── .gitattributes                     # Git configuration
├── .gitignore                         # Repository Filters
├── CITATION.cff                       # Scholarly Citation Metadata
├── codemeta.json                      # Machine-Readable Project Metadata
├── LICENSE                            # MIT License Terms
├── README.md                          # Project Documentation
└── SECURITY.md                        # Security Policy
```

---

<!-- RESULTS -->
<a name="results"></a>
## Results

<div align="center">

<b>1. User Interface: Landing Page</b>
<br>
<i>The Streamlit-based dashboard allows users to select stocks and define investment parameters for real-time strategy optimization.</i>
<br><br>
<img src="screenshots/01-landing-page.png" alt="Landing Page" width="80%">

<br><br>

<b>2. Market Analysis: Stock Trend</b>
<br>
<i>Historical price visualization identifying long-term upward trends suitable for momentum-based trading strategies.</i>
<br><br>
<img src="screenshots/02-amzn-trend.png" alt="Stock Trend" width="80%">

<br><br>

<b>3. Strategy Evaluation: Portfolio Growth</b>
<br>
<i>Simulation of portfolio value over time, demonstrating the cumulative return generated by the agent against the initial capital.</i>
<br><br>
<img src="screenshots/03-portfolio-growth.png" alt="Portfolio Growth" width="80%">

<br><br>

<b>4. Risk Assessment: Volatility Analysis</b>
<br>
<i>Trend analysis highlighting periods of high volatility where the agent adjusts exposure to mitigate risk.</i>
<br><br>
<img src="screenshots/04-alb-trend.png" alt="Volatility Analysis" width="80%">

</div>

---

<!-- QUICK START -->
<a name="quick-start"></a>
## Quick Start

### 1. Prerequisites
- **Python 3.7+**: Required for runtime execution. [Download Python](https://www.python.org/downloads/)
- **Streamlit**: For running the web application locally.

> [!WARNING]
> **Data Consistency**
>
> The Q-Learning agent depends on proper state definitions. Ensure that the input dataset contains the required `date`, `close`, and `Name` columns to correctly compute the Moving Average crossovers used for state discretization.
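
For instance, the state inputs can be derived from those columns as follows — a minimal sketch on a tiny synthetic frame (the real app reads `all_stocks_5yr.csv`; the prices below are invented for illustration):

```python
import pandas as pd

# Tiny synthetic stand-in for all_stocks_5yr.csv (illustration only)
data = pd.DataFrame({
    "date": pd.date_range("2018-01-01", periods=6, freq="D"),
    "close": [10.0, 11.0, 12.0, 11.5, 12.5, 13.0],
    "Name": ["AMZN"] * 6,
})

# Same preparation the app performs: filter one ticker, compute the MAs
df = data[data["Name"] == "AMZN"].reset_index(drop=True)
df["5day_MA"] = df["close"].rolling(5).mean()  # long-term signal
df["1day_MA"] = df["close"].rolling(1).mean()  # short-term signal (the close itself)

# Crossover check used for state discretization: short MA above long MA => bullish
bullish = df["1day_MA"].iloc[-1] > df["5day_MA"].iloc[-1]
```

If any of the three columns is missing or renamed, the filtering and rolling computations above fail, which is why the warning matters.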

### 2. Installation
Establish the local environment by cloning the repository and installing the computational stack:

```bash
# Clone the repository
git clone https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING.git
cd OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING

# Navigate to Source Code directory
cd "Source Code"

# Install dependencies
pip install -r requirements.txt
```

### 3. Execution
Launch the web server to start the prediction application:
```bash
streamlit run Stock-RL.py
```
**Access**: `http://localhost:8501/`

---

<!-- LICENSE -->
<a name="license"></a>
## License

This academic submission, developed for the **Data Science Internship** at **Technocolabs Software**, is made available under the **MIT License**. See the [LICENSE](LICENSE) file for complete terms.

> [!NOTE]
> **Summary**: You are free to share and adapt this content for any purpose, even commercially, as long as you provide appropriate attribution to the original authors.

**Copyright © 2021 Amey Thakur & Mega Satish**

---

<!-- ABOUT -->
<a name="about-this-repository"></a>
## About This Repository

**Created & Maintained by**: [Amey Thakur](https://github.com/Amey-Thakur) & [Mega Satish](https://github.com/msatmod)
**Role**: Data Science Interns
**Program**: Data Science Internship
**Organization**: [Technocolabs Software](https://technocolabs.com/)

This project features **Optimizing Stock Trading Strategy With Reinforcement Learning**, a study conducted as part of an industrial internship. It explores the practical application of Q-Learning in financial economics.

**Connect:** [GitHub](https://github.com/Amey-Thakur) &nbsp;·&nbsp; [LinkedIn](https://www.linkedin.com/in/amey-thakur) &nbsp;·&nbsp; [ORCID](https://orcid.org/0000-0001-5644-1575)

### Acknowledgments

Grateful acknowledgment to [**Mega Satish**](https://github.com/msatmod) for her exceptional collaboration and scholarly partnership during the execution of this data science internship task. Her analytical precision, deep understanding of statistical modeling, and constant support were instrumental in refining the learning algorithms used in this study. Working alongside her was a transformative experience; her thoughtful approach to problem-solving and steady encouragement turned complex challenges into meaningful learning moments. This work reflects the growth and insights gained from our side-by-side academic journey. Thank you, Mega, for everything you shared and taught along the way.

Special thanks to the **mentors at Technocolabs Software** for providing this platform for rapid skill development and industrial exposure.

---

<div align="center">

[↑ Back to Top](#readme-top)

[Authors](#authors) &nbsp;·&nbsp; [Overview](#overview) &nbsp;·&nbsp; [Features](#features) &nbsp;·&nbsp; [Structure](#project-structure) &nbsp;·&nbsp; [Results](#results) &nbsp;·&nbsp; [Quick Start](#quick-start) &nbsp;·&nbsp; [License](#license) &nbsp;·&nbsp; [About](#about-this-repository) &nbsp;·&nbsp; [Acknowledgments](#acknowledgments)

<br>

📈 **[OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING](https://huggingface.co/spaces/ameythakur/Stock-Trading-RL)**

---

### Presented as part of the Data Science Internship @ Technocolabs Software

---

### 🎓 [Computer Engineering Repository](https://github.com/Amey-Thakur/COMPUTER-ENGINEERING)

**Computer Engineering (B.E.) - University of Mumbai**

*Semester-wise curriculum, laboratories, projects, and academic notes.*

</div>
SECURITY.md ADDED
@@ -0,0 +1,41 @@
# Security Policy

## Maintenance Status

This repository is part of a curated collection of academic, engineering, and internship projects and is maintained in a finalized and stable state. The project is preserved as a complete and authoritative record, with its scope and contents intentionally fixed to ensure long-term academic and professional reference.

## Supported Versions

As a finalized internship project, only the version listed below is authoritative:

| Version | Supported |
| ------- | --------- |
| 1.0.0 | Yes |

## Vulnerability Reporting Protocol

In accordance with established academic and professional standards for security disclosure, security-related observations associated with this internship project are documented through formal scholarly channels.

To document a security concern, communication is facilitated with the project curators:
- **Curator**: [Amey Thakur](https://github.com/Amey-Thakur)
- **Collaborator**: [Mega Satish](https://github.com/msatmod)
- **Method**: Reports are submitted via the repository’s [GitHub Issues](https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING/issues) interface to formally record security-related findings.

Submissions include:
1. A precise and technically accurate description of the identified issue.
2. Demonstrable steps or technical evidence sufficient to contextualize the finding.
3. An explanation of the issue’s relevance within the defined scope of the project.

## Implementation Context: Optimizing Stock Trading Strategy With Reinforcement Learning

This project consists of an implementation of a Reinforcement Learning model (Q-Learning) to optimize stock trading strategies, developed as part of a Data Science internship at Technocolabs Software.

- **Scope Limitation**: This policy applies exclusively to the documentation, code, and datasets contained within this repository and does not extend to the execution environment (Python/Streamlit runtime) or third-party libraries (Pandas, NumPy, etc.).

## Technical Integrity Statement

This repository is preserved as a fixed academic, engineering, and internship project. Security-related submissions are recorded for documentation and contextual reference and do not imply active monitoring, response obligations, or subsequent modification of the repository.

---

*This document defines the security posture of a finalized internship project.*
Source Code/.streamlit/config.toml ADDED
@@ -0,0 +1,4 @@
[theme]
base="dark"
primaryColor="#02eaf9"
font="serif"
Source Code/Procfile ADDED
@@ -0,0 +1 @@
web: sh setup.sh && streamlit run Stock-RL.py
Source Code/Stock-RL.py ADDED
@@ -0,0 +1,243 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Project: Optimizing Stock Trading Strategy With Reinforcement Learning
3
+ Authors: Amey Thakur & Mega Satish
4
+ Reference: https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING
5
+ License: MIT
6
+
7
+ Description:
8
+ This script contains the Main Application logic served via Streamlit.
9
+ It loads the pre-trained Q-Learning model (model.pkl), processes user-selected
10
+ stock data, simulates the trading strategy on unseen data, and visualizes
11
+ the portfolio performance using interactive Plotly charts.
12
+ """
13
+
14
+ import numpy as np
15
+ import pandas as pd
16
+ from pandas._libs.missing import NA
17
+ import streamlit as st
18
+ import time
19
+ import plotly.graph_objects as go
20
+ import pickle as pkl
21
+
22
+ # ==========================================
23
+ # 1. Data Processing Logic
24
+ # ==========================================
25
+ # @st.cache(persist=True)
26
+ def data_prep(data, name):
27
+ """
28
+ Prepares the dataset for the selected stock ticker.
29
+
30
+ Args:
31
+ data (pd.DataFrame): The raw dataset.
32
+ name (str): The specific stock name selected by the user.
33
+
34
+ Returns:
35
+ pd.DataFrame: A clean dataframe with computed Moving Averages (5-day & 1-day).
36
+ """
37
+ df = pd.DataFrame(data[data['Name'] == name])
38
+ df.dropna(inplace=True)
39
+ df.reset_index(drop=True, inplace=True)
40
+
41
+ # Calculate Moving Averages (Technical Indicators)
42
+ # These indicators form the basis of the State Space for the RL agent.
43
+ df['5day_MA'] = df['close'].rolling(5).mean()
44
+ df['1day_MA'] = df['close'].rolling(1).mean()
45
+
46
+ # Handle initial NaN values
47
+ df.loc[:4, '5day_MA'] = 0
48
+
49
+ return df
50
+
51
+ # ==========================================
52
+ # 2. Agent Logic (Inference)
53
+ # ==========================================
54
+ # @st.cache(persist=True)
55
+ def get_state(long_ma, short_ma, t):
56
+ """
57
+ Determines the current state of the market based on MA crossovers.
58
+
59
+ Returns a tuple (Trend, Position) matching the Q-Table structure used during training.
60
+ """
61
+ if short_ma < long_ma:
62
+ if t == 1:
63
+ return (0, 1) # Bearish, Cash
64
+ else:
65
+ return (0, 0) # Bearish, Stock
66
+
67
+ elif short_ma > long_ma:
68
+ if t == 1:
69
+ return (1, 1) # Bullish, Cash
70
+ else:
71
+ return (1, 0) # Bullish, Stock
72
+
73
+ return (0, 1) # Default
74
+
75
+ # @st.cache(persist=True)
76
+ def trade_t(num_of_stocks, port_value, current_price):
77
+ """
78
+ Checks if a trade (Buy) is financially feasible.
79
+ """
80
+ if num_of_stocks >= 0:
81
+ if port_value > current_price:
82
+ return 1 # Can Buy
83
+ else: return 0
84
+ else:
85
+ if port_value > current_price:
86
+ return 1
87
+ else: return 0
88
+
89
+ # @st.cache(persist=True)
90
+ def next_act(state, qtable, epsilon, action=3):
91
+ """
92
+ Decides the next action based on the trained Q-Table.
93
+
94
+ During inference (testing), epsilon is typically 0 (pure exploitation),
95
+ meaning the agent always chooses the optimal action learned during training.
96
+ """
97
+ if np.random.rand() < epsilon:
98
+ action = np.random.randint(action)
99
+ else:
100
+ action = np.argmax(qtable[state])
101
+ return action
102
+
103
+
104
+ # @st.cache(persist=True)
105
+ def test_stock(stocks_test, q_table, invest):
106
+ """
107
+ Runs a simulation of the trading strategy on the selected stock.
108
+
109
+ Args:
110
+ stocks_test (pd.DataFrame): The stock data to test on.
111
+ q_table (np.array): The loaded reinforcement learning model.
112
+ invest (int): Initial investment amount.
113
+
114
+ Returns:
115
+ list: A time-series list of net worth values over the simulation period.
116
+ """
117
+ num_stocks = 0
118
+ epsilon = 0 # No exploration during testing/inference
119
+ net_worth = [invest]
120
+ np.random.seed()
121
+
122
+ for dt in range(len(stocks_test)):
123
+ long_ma = stocks_test.iloc[dt]['5day_MA']
124
+ short_ma = stocks_test.iloc[dt]['1day_MA']
125
+ close_price = stocks_test.iloc[dt]['close']
126
+
127
+ # Determine Current State
128
+ t = trade_t(num_stocks, net_worth[-1], close_price)
129
+ state = get_state(long_ma, short_ma, t)
130
+
131
+ # Agent chooses action
132
+ action = next_act(state, q_table, epsilon)
133
+
134
+ if action == 0: # Buy
135
+ num_stocks += 1
136
+ to_append = net_worth[-1] - close_price
137
+ net_worth.append(np.round(to_append, 1))
138
+
139
+ elif action == 1: # Sell
140
+ num_stocks -= 1
141
+ to_append = net_worth[-1] + close_price
142
+ net_worth.append(np.round(to_append, 1))
143
+
144
+ elif action == 2: # Hold
145
+ to_append = net_worth[-1] + close_price # Same simplified accounting as in training
146
+ net_worth.append(np.round(to_append, 1))
147
+
148
+ # Stop at the end of the series (the next-row lookup fails on the last day)
149
+ try:
150
+ next_state = get_state(stocks_test.iloc[dt+1]['5day_MA'], stocks_test.iloc[dt+1]['1day_MA'], t)
151
+ except IndexError:
152
+ break
153
+
154
+ return net_worth
155
+
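One iteration of the loop above can be traced by hand; the sketch below uses a zero Q-table, so `argmax` ties resolve to action 0 (Buy). The state tuple and price are made-up illustrative values:

```python
import numpy as np

q_table = np.zeros((2, 2, 3))   # untrained table: argmax ties resolve to 0 (Buy)
net_worth = [1000.0]
close_price = 25.0

state = (1, 1)                  # bullish trend, holding cash
action = int(np.argmax(q_table[state]))
if action == 0:                 # Buy: cash drops by the close price
    net_worth.append(round(net_worth[-1] - close_price, 1))
print(action, net_worth)        # 0 [1000.0, 975.0]
```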
156
+
157
+ # ==========================================
158
+ # 3. Streamlit Interface
159
+ # ==========================================
160
+ def fun():
161
+ # Reading the Dataset
162
+ # Ensure all_stocks_5yr.csv is in the working directory
163
+ data = pd.read_csv('all_stocks_5yr.csv')
164
+ names = list(data['Name'].unique())
165
+ names.insert(0, "<Select Names>")
166
+
167
+ st.title("Optimizing Stock Trading Strategy With Reinforcement Learning")
168
+
169
+ st.sidebar.title("Choose Stock and Investment")
170
+ st.sidebar.subheader("Choose Company Stocks")
171
+
172
+ # User Input: Select Stock
173
+ stock = st.sidebar.selectbox("(*select one stock only)", names, index=0)
174
+
175
+ if stock != "<Select Names>":
176
+ stock_df = data_prep(data, stock)
177
+
178
+ # Sidebar Checkbox: Plot Data Trend
179
+ if st.sidebar.button("Show Stock Trend", key=1):
180
+ fig = go.Figure()
181
+ fig.add_trace(go.Scatter(
182
+ x=stock_df['date'],
183
+ y=stock_df['close'],
184
+ mode='lines',
185
+ name='Stock_Trend',
186
+ line=dict(color='cyan', width=2)
187
+ ))
188
+ fig.update_layout(
189
+ title='Stock Trend of ' + stock,
190
+ xaxis_title='Date',
191
+ yaxis_title='Price ($) '
192
+ )
193
+ st.plotly_chart(fig, use_container_width=True)
194
+
195
+ # Simple heuristic for trend feedback
196
+ if stock_df.iloc[500]['close'] > stock_df.iloc[0]['close']:
197
+ original_title = '<p style="font-family:Play; color:Cyan; font-size: 20px;">NOTE:<br>Stock is on a solid upward trend. Investing here might be profitable.</p>'
198
+ st.markdown(original_title, unsafe_allow_html=True)
199
+ else:
200
+ original_title = '<p style="font-family:Play; color:Red; font-size: 20px;">NOTE:<br>Stock does not appear to be in a solid uptrend. It may be better to pick a different stock.</p>'
201
+ st.markdown(original_title, unsafe_allow_html=True)
202
+
203
+ # Sidebar Checkbox: Investment Simulation
204
+ st.sidebar.subheader("Enter Your Available Initial Investment Fund")
205
+ invest = st.sidebar.slider('Select a range of values', 1000, 1000000)
206
+
207
+ if st.sidebar.button("Calculate", key=2):
208
+ # Load Pre-trained Model
209
+ try:
210
+ # Load the standardized 'model.pkl' artifact without leaking the file handle
+ with open('model.pkl', 'rb') as f:
+ q_table = pkl.load(f)
212
+ except FileNotFoundError:
213
+ st.error("Model file 'model.pkl' not found. Please ensure the model is trained.")
214
+ return
215
+
216
+ # Run Simulation
217
+ net_worth = test_stock(stock_df, q_table, invest)
218
+ net_worth = pd.DataFrame(net_worth, columns=['value'])
219
+
220
+ # Plot Results
221
+ fig = go.Figure()
222
+ fig.add_trace(go.Scatter(
223
+ x=net_worth.index,
224
+ y=net_worth['value'],
225
+ mode='lines',
226
+ name='Net_Worth_Trend',
227
+ line=dict(color='cyan', width=2)
228
+ ))
229
+ fig.update_layout(
230
+ title='Change in Portfolio Value Day by Day',
231
+ xaxis_title='Number of Days since Feb 2013 ',
232
+ yaxis_title='Value ($) '
233
+ )
234
+ st.plotly_chart(fig, use_container_width=True)
235
+
236
+ original_title = '<p style="font-family:Play; color:Cyan; font-size: 20px;">NOTE:<br>The chart shows how the model decisions change your net worth over time.</p>'
237
+ st.markdown(original_title, unsafe_allow_html=True)
238
+
239
+
240
+ if __name__ == '__main__':
241
+ fun()
242
+ # Dummy chart for layout purposes if needed, otherwise optional
243
+ # chart_data = pd.DataFrame(np.random.randn(20, 3), columns=['a', 'b', 'c'])
Source Code/Train_model/Model.ipynb ADDED
@@ -0,0 +1,407 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "code",
5
+ "execution_count": 23,
6
+ "source": [
7
+ "import pandas as pd\r\n",
8
+ "import numpy as np\r\n",
9
+ "import seaborn as sns\r\n",
10
+ "import matplotlib.pyplot as plt\r\n",
11
+ "import pickle as pk"
12
+ ],
13
+ "outputs": [],
14
+ "metadata": {}
15
+ },
16
+ {
17
+ "cell_type": "code",
18
+ "execution_count": 24,
19
+ "source": [
20
+ "df=pd.read_csv('all_stocks_5yr.csv')\r\n"
21
+ ],
22
+ "outputs": [],
23
+ "metadata": {}
24
+ },
25
+ {
26
+ "cell_type": "code",
27
+ "execution_count": 22,
28
+ "source": [
29
+ "df1=df['close']\r\n",
30
+ "df1.iloc[0]"
31
+ ],
32
+ "outputs": [
33
+ {
34
+ "output_type": "execute_result",
35
+ "data": {
36
+ "text/plain": [
37
+ "14.75"
38
+ ]
39
+ },
40
+ "metadata": {},
41
+ "execution_count": 22
42
+ }
43
+ ],
44
+ "metadata": {}
45
+ },
46
+ {
47
+ "cell_type": "code",
48
+ "execution_count": 25,
49
+ "source": [
50
+ "#Creating Environment Matrix 2x2x3\r\n",
51
+ "env_rows=2\r\n",
52
+ "env_cols=2\r\n",
53
+ "n_action=3\r\n",
54
+ "\r\n",
55
+ "q_table=np.zeros((env_rows,env_cols,n_action))\r\n",
56
+ "np.random.seed()\r\n",
57
+ "pk.dump(q_table,open(\"pickl.pkl\",'wb'))"
58
+ ],
59
+ "outputs": [],
60
+ "metadata": {}
61
+ },
62
+ {
63
+ "cell_type": "code",
64
+ "execution_count": 7,
65
+ "source": [
66
+ "pk.load(open(\"pickl.pkl\",'rb'))"
67
+ ],
68
+ "outputs": [
69
+ {
70
+ "output_type": "execute_result",
71
+ "data": {
72
+ "text/plain": [
73
+ "'hey'"
74
+ ]
75
+ },
76
+ "metadata": {},
77
+ "execution_count": 7
78
+ }
79
+ ],
80
+ "metadata": {}
81
+ },
82
+ {
83
+ "cell_type": "code",
84
+ "execution_count": 26,
85
+ "source": [
86
+ "#Defining Data Preprocessing Function\r\n",
87
+ "\r\n",
88
+ "def data_prep(data,name):\r\n",
89
+ " df=pd.DataFrame(data[data['Name']==name])\r\n",
90
+ " df.dropna(inplace=True)\r\n",
91
+ " df.drop(['high','low','volume','Name'],axis=1,inplace=True)\r\n",
92
+ " df.reset_index(drop=True,inplace=True)\r\n",
93
+ " # Calculating 5 day and 1 day Moving Average for DF\r\n",
94
+ " df['5day_MA']=df['close'].rolling(5).mean()\r\n",
95
+ " df['1day_MA']=df['close'].rolling(1).mean()\r\n",
96
+ " df['5day_MA'][:4]=0\r\n",
97
+ " #Splitting into train and Test data\r\n",
98
+ " train_df=df[:int(len(df)*0.8)]\r\n",
99
+ " test_df=df[int(len(df)*0.8):].reset_index(drop=True)\r\n",
100
+ " return train_df,test_df\r\n",
101
+ "\r\n",
102
+ "# Get the state for datapoint by Moving Average\r\n",
103
+ "def get_state(long_ma,short_ma,t):\r\n",
104
+ " if short_ma<long_ma:\r\n",
105
+ " if t==1:\r\n",
106
+ " return (0,1) #Cash\r\n",
107
+ " else :\r\n",
108
+ " return (0,0) #Stock\r\n",
109
+ " \r\n",
110
+ " elif short_ma>long_ma:\r\n",
111
+ " if t==1:\r\n",
112
+ " return (1,1) #Cash\r\n",
113
+ " else :\r\n",
114
+ " return (1,0) #Stock\r\n",
115
+ "\r\n",
116
+ "\r\n",
117
+ "#Checking if the user can trade or not\r\n",
118
+ "def trade_t(num_of_stocks,port_value,current_price):\r\n",
119
+ " if num_of_stocks>=0:\r\n",
120
+ " if port_value>current_price:\r\n",
121
+ " return 1\r\n",
122
+ " else :return 0\r\n",
123
+ " else:\r\n",
124
+ " if port_value>current_price:\r\n",
125
+ " return 1\r\n",
126
+ " else :return 0\r\n",
127
+ "\r\n",
128
+ "\r\n",
129
+ "\r\n",
130
+ "#Get next action by Epsilon greedy\r\n",
131
+ "def next_act(state,qtable,epsilon,action=3):\r\n",
132
+ " if np.random.rand() < epsilon:\r\n",
133
+ " action=np.random.randint(action)\r\n",
134
+ " else:\r\n",
135
+ " action=np.argmax(qtable[state])\r\n",
136
+ " \r\n",
137
+ " \r\n",
138
+ " return action\r\n",
139
+ "\r\n",
140
+ "\r\n",
141
+ "\r\n",
142
+ "# Immidiate reward Generator based on cummulative wealth \r\n",
143
+ "def get_reward(state,action,current_close,past_close,buy_history):\r\n",
144
+ " if state==(0,0) or state==(1,0): #Stock position\r\n",
145
+ " if action==0:\r\n",
146
+ " return -1000\r\n",
147
+ " elif action==1:\r\n",
148
+ " return (current_close-buy_history)\r\n",
149
+ " elif action==2:\r\n",
150
+ " return (current_close-past_close)\r\n",
151
+ " \r\n",
152
+ " elif state==(0,1) or state==(1,1): #Cash Position\r\n",
153
+ " if action==0:\r\n",
154
+ " return 0\r\n",
155
+ " elif action==1:\r\n",
156
+ " return -1000\r\n",
157
+ " elif action==2:\r\n",
158
+ " return (current_close-past_close)\r\n",
159
+ "\r\n",
160
+ " \r\n",
161
+ " \r\n",
162
+ " \r\n",
163
+ "\r\n"
164
+ ],
165
+ "outputs": [],
166
+ "metadata": {}
167
+ },
168
+ {
169
+ "cell_type": "markdown",
170
+ "source": [
171
+ "<h4>Reading and preprocessing the Dataset"
172
+ ],
173
+ "metadata": {}
174
+ },
175
+ {
176
+ "cell_type": "code",
177
+ "execution_count": 27,
178
+ "source": [
179
+ "stocks=pd.read_csv('all_stocks_5yr.csv')\r\n",
180
+ "stocks_train,stocks_test=data_prep(stocks,'AAPL')"
181
+ ],
182
+ "outputs": [
183
+ {
184
+ "output_type": "stream",
185
+ "name": "stderr",
186
+ "text": [
187
+ "C:\\Users\\mchil\\AppData\\Local\\Temp/ipykernel_12420/1010674326.py:11: SettingWithCopyWarning: \n",
188
+ "A value is trying to be set on a copy of a slice from a DataFrame\n",
189
+ "\n",
190
+ "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
191
+ " df['5day_MA'][:4]=0\n"
192
+ ]
193
+ }
194
+ ],
195
+ "metadata": {}
196
+ },
197
+ {
198
+ "cell_type": "markdown",
199
+ "source": [
200
+ "<h4>Training the Dataset"
201
+ ],
202
+ "metadata": {}
203
+ },
204
+ {
205
+ "cell_type": "code",
206
+ "execution_count": 28,
207
+ "source": [
208
+ "episodes=100\r\n",
209
+ "port_value=1000\r\n",
210
+ "num_stocks=0\r\n",
211
+ "epsilon=1 #Epsilon Greedy\r\n",
212
+ "alpha=0.05 #Learning Rate\r\n",
213
+ "gamma=0.15 #Discount Factor\r\n",
214
+ "buy_history=0\r\n",
215
+ "net_worth=[1000] #Portfolio Value\r\n",
216
+ "np.random.seed()\r\n",
217
+ "for i in range(episodes): #Iteration for each episode\r\n",
218
+ " port_value=1000\r\n",
219
+ " num_stocks=0\r\n",
220
+ " buy_history=0\r\n",
221
+ " net_worth=[1000]\r\n",
222
+ " \r\n",
223
+ "\r\n",
224
+ " for dt in range(len(stocks_train)): #Iteration through each dataset\r\n",
225
+ " long_ma=stocks_train.iloc[dt]['5day_MA']\r\n",
226
+ " short_ma=stocks_train.iloc[dt]['1day_MA']\r\n",
227
+ " close_price=stocks_train.iloc[dt]['close']\r\n",
228
+ " next_close=0\r\n",
229
+ " \r\n",
230
+ " if dt>0:\r\n",
231
+ " past_close=stocks_train.iloc[dt-1]['close']\r\n",
232
+ " else:\r\n",
233
+ " past_close=close_price\r\n",
234
+ " t=trade_t(num_stocks,net_worth[-1],close_price)\r\n",
235
+ " state=get_state(long_ma,short_ma,t)\r\n",
236
+ " action=next_act(state,q_table,epsilon)\r\n",
237
+ "\r\n",
238
+ " if action==0:#Buy\r\n",
239
+ " \r\n",
240
+ " num_stocks+=1\r\n",
241
+ " buy_history=close_price\r\n",
242
+ " to_append=net_worth[-1]-close_price\r\n",
243
+ " net_worth.append(np.round(to_append,1))\r\n",
244
+ " r=0\r\n",
245
+ " \r\n",
246
+ " \r\n",
247
+ " \r\n",
248
+ " elif action==1:#Sell\r\n",
249
+ " # if num_stocks>0:\r\n",
250
+ " num_stocks-=1 \r\n",
251
+ " to_append=net_worth[-1]+close_price\r\n",
252
+ " net_worth.append(np.round(to_append,1))\r\n",
253
+ " # buy_history.pop(0)\r\n",
254
+ " \r\n",
255
+ " elif action==2:#hold\r\n",
256
+ " to_append=net_worth[-1]+close_price\r\n",
257
+ " net_worth.append(np.round(to_append,1))\r\n",
258
+ " \r\n",
259
+ " \r\n",
260
+ " \r\n",
261
+ " \r\n",
262
+ "\r\n",
263
+ " r=get_reward(state,action,close_price,past_close,buy_history) #Getting Reward\r\n",
264
+ " \r\n",
265
+ " try:\r\n",
266
+ " next_state=get_state(stocks_train.iloc[dt+1]['5day_MA'],stocks_train.iloc[dt+1]['1day_MA'],t)\r\n",
267
+ " \r\n",
268
+ " except:\r\n",
269
+ " break\r\n",
270
+ " #Updating Q_table by Bellmen's Equation\r\n",
271
+ " q_table[state][action]=(1.-alpha)*q_table[state][action]+alpha*(r+gamma*np.max(q_table[next_state]))\r\n",
272
+ " \r\n",
273
+ " if (epsilon-0.01)>0.15:\r\n",
274
+ " epsilon-=0.01\r\n",
275
+ "\r\n",
276
+ "print(\"Training Complete\")"
277
+ ],
278
+ "outputs": [
279
+ {
280
+ "output_type": "stream",
281
+ "name": "stdout",
282
+ "text": [
283
+ "Training Complete\n"
284
+ ]
285
+ }
286
+ ],
287
+ "metadata": {}
288
+ },
289
+ {
290
+ "cell_type": "code",
291
+ "execution_count": 38,
292
+ "source": [
293
+ "pk.dump(q_table,open('pickl.pkl','wb'))"
294
+ ],
295
+ "outputs": [],
296
+ "metadata": {}
297
+ },
298
+ {
299
+ "cell_type": "markdown",
300
+ "source": [
301
+ "<h4>Tracking the Portfolio Value "
302
+ ],
303
+ "metadata": {}
304
+ },
305
+ {
306
+ "cell_type": "code",
307
+ "execution_count": null,
308
+ "source": [],
309
+ "outputs": [],
310
+ "metadata": {}
311
+ },
312
+ {
313
+ "cell_type": "code",
314
+ "execution_count": null,
315
+ "source": [],
316
+ "outputs": [],
317
+ "metadata": {}
318
+ },
319
+ {
320
+ "cell_type": "markdown",
321
+ "source": [
322
+ "<h4>Testing the Dataset"
323
+ ],
324
+ "metadata": {}
325
+ },
326
+ {
327
+ "cell_type": "code",
328
+ "execution_count": 8,
329
+ "source": [],
330
+ "outputs": [
331
+ {
332
+ "output_type": "stream",
333
+ "name": "stdout",
334
+ "text": [
335
+ "Test Complete\n"
336
+ ]
337
+ }
338
+ ],
339
+ "metadata": {}
340
+ },
341
+ {
342
+ "cell_type": "markdown",
343
+ "source": [
344
+ "<h4>Plotting the portfolio for the test Dataset "
345
+ ],
346
+ "metadata": {}
347
+ },
348
+ {
349
+ "cell_type": "code",
350
+ "execution_count": null,
351
+ "source": [],
352
+ "outputs": [],
353
+ "metadata": {}
354
+ },
355
+ {
356
+ "cell_type": "code",
357
+ "execution_count": 10,
358
+ "source": [
359
+ "num_stocks"
360
+ ],
361
+ "outputs": [
362
+ {
363
+ "output_type": "execute_result",
364
+ "data": {
365
+ "text/plain": [
366
+ "94"
367
+ ]
368
+ },
369
+ "metadata": {},
370
+ "execution_count": 10
371
+ }
372
+ ],
373
+ "metadata": {}
374
+ },
375
+ {
376
+ "cell_type": "code",
377
+ "execution_count": null,
378
+ "source": [],
379
+ "outputs": [],
380
+ "metadata": {}
381
+ }
382
+ ],
383
+ "metadata": {
384
+ "orig_nbformat": 4,
385
+ "language_info": {
386
+ "name": "python",
387
+ "version": "3.9.7",
388
+ "mimetype": "text/x-python",
389
+ "codemirror_mode": {
390
+ "name": "ipython",
391
+ "version": 3
392
+ },
393
+ "pygments_lexer": "ipython3",
394
+ "nbconvert_exporter": "python",
395
+ "file_extension": ".py"
396
+ },
397
+ "kernelspec": {
398
+ "name": "python3",
399
+ "display_name": "Python 3.9.7 64-bit"
400
+ },
401
+ "interpreter": {
402
+ "hash": "60d8401257a87028599f7501811ce2c94d605f29d0573af229f453e115e13ba6"
403
+ }
404
+ },
405
+ "nbformat": 4,
406
+ "nbformat_minor": 2
407
+ }
Source Code/all_stocks_5yr.csv ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6aea253cd19de60b568143991aaf1fa482456565c389205658d236e595e716cf
3
+ size 29580549
Source Code/model.pkl ADDED
Binary file (247 Bytes). View file
 
Source Code/model_training.py ADDED
@@ -0,0 +1,268 @@
1
+ """
2
+ Project: Optimizing Stock Trading Strategy With Reinforcement Learning
3
+ Authors: Amey Thakur & Mega Satish
4
+ Reference: https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING
5
+ License: MIT
6
+
7
+ Description:
8
+ This script implements the training phase of the Reinforcement Learning agent (Q-Learning).
9
+ It preprocesses historical stock data, defines the market environment as a set of states
10
+ based on Moving Average crossovers, and iteratively updates a Q-Table to learn optimal
11
+ trading actions (Buy, Sell, Hold) that maximize portfolio returns.
12
+ """
13
+
14
+ import pandas as pd
15
+ import numpy as np
16
+ import pickle as pkl
17
+ import os
18
+
19
+ # ==========================================
20
+ # 1. Data Preprocessing
21
+ # ==========================================
22
+ def data_prep(data, name):
23
+ """
24
+ Preprocesses the stock data for a specific company.
25
+
26
+ Args:
27
+ data (pd.DataFrame): The complete dataset containing all stocks.
28
+ name (str): The ticker symbol of the stock to filter (e.g., 'AAPL').
29
+
30
+ Returns:
31
+ tuple: (train_df, test_df) - The split training and testing datasets.
32
+
33
+ Methodology:
34
+ - Filters data by stock name.
35
+ - Computes Technical Indicators: 5-day and 1-day Moving Averages (MA).
36
+ - 5-day MA represents the short-term trend baseline.
37
+ - 1-day MA represents the immediate price action.
38
+ - The interaction between these two MAs serves as the primary signal for state determination.
39
+ """
40
+ df = pd.DataFrame(data[data['Name'] == name])
41
+ df.dropna(inplace=True)
42
+ df.drop(['high', 'low', 'volume', 'Name'], axis=1, inplace=True)
43
+ df.reset_index(drop=True, inplace=True)
44
+
45
+ # Calculating Moving Averages used for State Definition
46
+ df['5day_MA'] = df['close'].rolling(5).mean()
47
+ df['1day_MA'] = df['close'].rolling(1).mean()
48
+
49
+ # Zero out rows 0-3, where the 5-day rolling window is incomplete (NaN)
+ df.loc[:3, '5day_MA'] = 0
51
+
52
+ # Splitting into Train (80%) and Test (20%) sets
53
+ split_idx = int(len(df) * 0.8)
54
+ train_df = df[:split_idx]
55
+ test_df = df[split_idx:].reset_index(drop=True)
56
+
57
+ return train_df, test_df
58
+
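As a quick illustration of the moving-average step above, here is a minimal, self-contained sketch of the same `rolling` computation (the toy prices are made up):

```python
import pandas as pd

# Toy close prices standing in for one stock's history.
prices = pd.DataFrame({'close': [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]})
prices['5day_MA'] = prices['close'].rolling(5).mean()
prices['1day_MA'] = prices['close'].rolling(1).mean()  # identical to 'close'
# Rows 0-3 have no full 5-day window (NaN); zero them as the pipeline does.
prices.loc[:3, '5day_MA'] = 0
print(prices['5day_MA'].tolist())  # [0.0, 0.0, 0.0, 0.0, 12.0, 13.0]
```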
59
+ # ==========================================
60
+ # 2. Environment & State Definitions
61
+ # ==========================================
62
+ def get_state(long_ma, short_ma, t):
63
+ """
64
+ Discretizes continuous market data into a finite set of states.
65
+
66
+ The state space is defined by a tuple (Trend_Signal, Holding_Status).
67
+
68
+ 1. Trend_Signal:
69
+ - 0: short_ma < long_ma (Bearish/Downtrend)
70
+ - 1: short_ma > long_ma (Bullish/Uptrend)
71
+
72
+ 2. Holding_Status (t):
73
+ - 0: Currently holding stock
74
+ - 1: Currently holding cash (no stock)
75
+
76
+ Returns:
77
+ tuple: (trend, holding_status) representing the current environment state.
78
+ """
79
+ if short_ma < long_ma:
80
+ if t == 1:
81
+ return (0, 1) # Bearish Trend, Holding Cash
82
+ else:
83
+ return (0, 0) # Bearish Trend, Holding Stock
84
+
85
+ elif short_ma > long_ma:
86
+ if t == 1:
87
+ return (1, 1) # Bullish Trend, Holding Cash
88
+ else:
89
+ return (1, 0) # Bullish Trend, Holding Stock
90
+
91
+ # Default case (should rarely be hit with floats)
92
+ return (0, 1)
93
+
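The four-state encoding can be spot-checked in isolation; the sketch below mirrors `get_state` as defined above:

```python
def get_state(long_ma, short_ma, t):
    # (trend, position): trend 0 = bearish, 1 = bullish;
    # position 1 = holding cash, 0 = holding stock.
    if short_ma < long_ma:
        return (0, 1) if t == 1 else (0, 0)
    elif short_ma > long_ma:
        return (1, 1) if t == 1 else (1, 0)
    return (0, 1)  # default when the MAs tie

print(get_state(105.0, 100.0, t=1))  # (0, 1): bearish, in cash
print(get_state(100.0, 105.0, t=0))  # (1, 0): bullish, in stock
```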
94
+ def trade_t(num_of_stocks, port_value, current_price):
95
+ """
96
+ Determines the holding capability of the agent.
97
+
98
+ Returns:
99
+ int: 1 if the agent has capital to buy (Cash), 0 if fully invested (Stock).
100
+ """
101
+ # The agent is effectively all-in or all-out, so the status reduces to:
+ # holding stock, or holding cash with enough capital to buy a share.
104
+ if num_of_stocks > 0:
105
+ return 0 # User holds stock
106
+ else:
107
+ if port_value > current_price:
108
+ return 1 # User holds cash and can afford stock
109
+ else:
110
+ return 0 # User is broke/cannot buy
111
+
112
+ # ==========================================
113
+ # 3. Q-Learning Agent Logic
114
+ # ==========================================
115
+ def next_act(state, qtable, epsilon, action_space=3):
116
+ """
117
+ Selects the next action using the Epsilon-Greedy Policy.
118
+
119
+ Args:
120
+ state (tuple): The current state of the environment.
121
+ qtable (np.array): The Q-Table storing action-values.
122
+ epsilon (float): Exploration rate (probability of random action).
123
+
124
+ Returns:
125
+ int: The selected action index.
126
+ 0: Buy
127
+ 1: Sell
128
+ 2: Hold
129
+ """
130
+ if np.random.rand() < epsilon:
131
+ # Exploration: Random action
132
+ action = np.random.randint(action_space)
133
+ else:
134
+ # Exploitation: Best known action from Q-Table
135
+ action = np.argmax(qtable[state])
136
+
137
+ return action
138
+
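Setting `epsilon = 0` makes the policy deterministic, which is exactly how inference uses it; a minimal sketch with a hand-filled Q-table entry (the values are made up):

```python
import numpy as np

def next_act(state, qtable, epsilon, action_space=3):
    # Explore with probability epsilon, otherwise exploit the argmax.
    if np.random.rand() < epsilon:
        return int(np.random.randint(action_space))
    return int(np.argmax(qtable[state]))

q = np.zeros((2, 2, 3))
q[1, 1] = [0.2, -0.5, 0.9]               # Hold (index 2) is best in this state
print(next_act((1, 1), q, epsilon=0.0))  # 2 -- pure exploitation
```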
139
+ def get_reward(state, action, current_close, past_close, buy_history):
140
+ """
141
+ Calculates the immediate reward for a given state-action pair.
142
+
143
+ The Reward Function is crucial for guiding the agent:
144
+ - Penalize invalid moves (e.g., Buying when already holding).
145
+ - Reward profit generation (Selling higher than bought).
146
+ - Reward capital preservation (Holding during downturns).
147
+ """
148
+ if state == (0, 0) or state == (1, 0): # State: Holding Stock
149
+ if action == 0: # Try to Buy again
150
+ return -1000 # Heavy Penalty for illegal move
151
+ elif action == 1: # Sell
152
+ return (current_close - buy_history) # Reward is the realized PnL
153
+ elif action == 2: # Hold
154
+ return (current_close - past_close) # Reward is the unrealized daily change
155
+
156
+ elif state == (0, 1) or state == (1, 1): # State: Holding Cash
157
+ if action == 0: # Buy
158
+ return 0 # Neutral reward for entering position
159
+ elif action == 1: # Try to Sell again
160
+ return -1000 # Heavy Penalty for illegal move
161
+ elif action == 2: # Hold (Wait)
162
+ return (current_close - past_close) # Opportunity cost/benefit tracking
163
+
164
+ return 0
165
+
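The reward table above collapses to a handful of cases; restated here as a compact, standalone sketch (the price values in the checks are made up):

```python
def get_reward(state, action, current_close, past_close, buy_history):
    holding_stock = state in ((0, 0), (1, 0))
    if holding_stock:
        if action == 0:                      # illegal re-buy
            return -1000
        if action == 1:                      # sell: realized PnL
            return current_close - buy_history
        return current_close - past_close    # hold: daily price change
    if action == 1:                          # illegal sell from cash
        return -1000
    if action == 0:                          # buy: neutral entry
        return 0
    return current_close - past_close        # wait: opportunity tracking

print(get_reward((1, 0), 1, 110.0, 108.0, 100.0))  # 10.0 -- sold above cost
print(get_reward((0, 1), 1, 110.0, 108.0, 100.0))  # -1000 -- cannot sell from cash
```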
166
+ # ==========================================
167
+ # 4. Main Training Loop
168
+ # ==========================================
169
+ def train_model():
170
+ print("Initializing Training Process...")
171
+
172
+ # 4.1 Initialize Q-Table
173
+ # Dimensions: 2 (Trend States) x 2 (Holding States) x 3 (Actions)
174
+ env_rows = 2
175
+ env_cols = 2
176
+ n_action = 3
177
+ q_table = np.zeros((env_rows, env_cols, n_action))
178
+
179
+ # 4.2 Load Data
180
+ try:
181
+ stocks = pd.read_csv('all_stocks_5yr.csv')
182
+ # We train primarily on AAPL as the representative asset for this strategy
183
+ stocks_train, _ = data_prep(stocks, 'AAPL')
184
+ except FileNotFoundError:
185
+ print("Error: 'all_stocks_5yr.csv' not found.")
186
+ return
187
+
188
+ # 4.3 Hyperparameters
189
+ episodes = 100 # Number of times to iterate over the dataset
190
+ epsilon = 1.0 # Initial Exploration Rate (100% random)
191
+ alpha = 0.05 # Learning Rate (Impact of new information)
192
+ gamma = 0.15 # Discount Factor (Importance of future rewards)
193
+
194
+ print(f"Starting Training for {episodes} episodes...")
195
+
196
+ for i in range(episodes):
197
+ # Reset Episode Variables
198
+ port_value = 1000
199
+ num_stocks = 0
200
+ buy_history = 0
201
+ net_worth = [1000]
202
+
203
+ # Iterate over the time-series
204
+ for dt in range(len(stocks_train)):
205
+ long_ma = stocks_train.iloc[dt]['5day_MA']
206
+ short_ma = stocks_train.iloc[dt]['1day_MA']
207
+ close_price = stocks_train.iloc[dt]['close']
208
+
209
+ # Get Previous Close for Reward Calc
210
+ if dt > 0:
211
+ past_close = stocks_train.iloc[dt-1]['close']
212
+ else:
213
+ past_close = close_price
214
+
215
+ # Determine Current State
216
+ t = trade_t(num_stocks, net_worth[-1], close_price)
217
+ state = get_state(long_ma, short_ma, t)
218
+
219
+ # Select Action
220
+ action = next_act(state, q_table, epsilon)
221
+
222
+ # Execute Action & Update Portfolio Logic
223
+ if action == 0: # Buy
224
+ num_stocks += 1
225
+ buy_history = close_price
226
+ net_worth.append(np.round(net_worth[-1] - close_price, 1))
227
+ r = 0 # Placeholder; overwritten by get_reward() below
228
+
229
+ elif action == 1: # Sell
230
+ num_stocks -= 1
231
+ net_worth.append(np.round(net_worth[-1] + close_price, 1))
232
+ # buy_history handled in reward
233
+
234
+ elif action == 2: # Hold
235
+ net_worth.append(np.round(net_worth[-1] + close_price, 1)) # Simplified tracking
236
+
237
+ # Compute Reward
238
+ r = get_reward(state, action, close_price, past_close, buy_history)
239
+
240
+ # Observe Next State
241
+ try:
242
+ next_long = stocks_train.iloc[dt+1]['5day_MA']
243
+ next_short = stocks_train.iloc[dt+1]['1day_MA']
244
+ next_state = get_state(next_long, next_short, t)
245
+ except IndexError:
246
+ # End of data
247
+ break
248
+
249
+ # Update Q-Value using Bellman Equation
250
+ # Q(s,a) = (1-alpha) * Q(s,a) + alpha * (reward + gamma * max(Q(s', a')))
251
+ q_table[state][action] = (1. - alpha) * q_table[state][action] + alpha * (r + gamma * np.max(q_table[next_state]))
252
+
253
+ # Decay Epsilon to reduce exploration over time
254
+ if (epsilon - 0.01) > 0.15:
255
+ epsilon -= 0.01
256
+
257
+ if (i + 1) % 10 == 0:
258
+ print(f"Episode {i+1}/{episodes} complete. Epsilon: {epsilon:.2f}")
259
+
260
+ print("Training Complete.")
261
+
262
+ # 4.4 Save the Trained Model
263
+ with open('model.pkl', 'wb') as f:
264
+ pkl.dump(q_table, f)
265
+ print("Model saved to 'model.pkl'.")
266
+
267
+ if __name__ == "__main__":
268
+ train_model()
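To make the update rule concrete, here is one step of the Bellman equation with the hyperparameters above; the reward value is a made-up illustration:

```python
import numpy as np

alpha, gamma = 0.05, 0.15                     # same hyperparameters as the script
q = np.zeros((2, 2, 3))                       # same 2x2x3 table shape
state, action, next_state = (1, 1), 0, (1, 0)
r = 10.0                                      # hypothetical reward

# Q(s,a) <- (1-alpha)*Q(s,a) + alpha*(r + gamma*max_a' Q(s',a'))
q[state][action] = (1. - alpha) * q[state][action] + alpha * (r + gamma * np.max(q[next_state]))
print(q[state][action])  # 0.5 = 0.05 * 10 on a zero-initialized table
```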
Source Code/requirements.txt ADDED
@@ -0,0 +1,4 @@
1
+ plotly==5.3.1
2
+ numpy==1.21.2
3
+ streamlit==0.88.0
4
+ pandas==1.3.2
Source Code/setup.sh ADDED
@@ -0,0 +1,8 @@
1
+ mkdir -p ~/.streamlit/
2
+ echo "\
3
+ [server]\n\
4
+ headless = true\n\
5
+ port = $PORT\n\
6
+ enableCORS = false\n\
7
+ \n\
8
+ " > ~/.streamlit/config.toml
Technocolabs/AMEY THAKUR - BLUEPRINT.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c3a27191ddee99cfe755f89dc8e0969b8bbd7ac23ec434abddaf8c0aa28a334c
3
+ size 51084
Technocolabs/Optimizing Stock Trading Strategy With Reinforcement Learning.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e282cd349014ab77bb883975a1fdb98fd3f83011d767e8496de1d65ac12a2571
3
+ size 2348727
Technocolabs/PROJECT REPORT.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:193080af8207444469c3ba503f5b521b86baead5dce61f5e75fac67746b4b787
3
+ size 2347221
Technocolabs/Technocolabs Software - Data Scientist - Internship Completion Letter.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:850927a3c60791a4582d0cbeff15f3be453cd56e289e837486f5dd9b671cd7b5
3
+ size 171716
Technocolabs/Technocolabs Software - Data Scientist - Internship Offer Letter.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:dc9883d92c762771afdd790072ad79bd896ae36b9207f4b4b769673a2266df46
3
+ size 71201
Technocolabs/Technocolabs Software - Data Scientist - Letter of Recommendation.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2527b413b3e0eec237aaa592ead956c839e35fa91cda1c845f7c3601a4a84547
3
+ size 247718
Technocolabs/Technocolabs Software - Data Scientist - Project Completion Letter.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:cdb1e8c9513a65afe7118e82286b633a65b4cf4d4a48d1c0e1f273c033156be1
3
+ size 193938
codemeta.json ADDED
@@ -0,0 +1,43 @@
1
+ {
2
+ "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
3
+ "@type": "SoftwareSourceCode",
4
+ "name": "OPTIMIZING STOCK TRADING STRATEGY WITH REINFORCEMENT LEARNING",
5
+ "description": "Data Science Internship at Technocolabs Software. Task: To optimize stock trading strategy using Reinforcement Learning. The solution implements Q-Learning and a Streamlit-based web interface for real-time strategy visualization.",
6
+ "identifier": "OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING",
7
+ "license": "https://spdx.org/licenses/MIT.html",
8
+ "programmingLanguage": [
9
+ "Python"
10
+ ],
11
+ "author": [
12
+ {
13
+ "@type": "Person",
14
+ "givenName": "Amey",
15
+ "familyName": "Thakur",
16
+ "id": "https://orcid.org/0000-0001-5644-1575"
17
+ },
18
+ {
19
+ "@type": "Person",
20
+ "givenName": "Mega",
21
+ "familyName": "Satish",
22
+ "id": "https://orcid.org/0000-0002-1844-9557"
23
+ }
24
+ ],
25
+ "dateReleased": "2021-09-18",
26
+ "codeRepository": "https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING",
27
+ "developmentStatus": "complete",
28
+ "applicationCategory": "Data Science / Reinforcement Learning",
29
+ "keywords": [
30
+ "Technocolabs Software",
31
+ "Data Science",
32
+ "Stock Trading",
33
+ "Reinforcement Learning",
34
+ "Q-Learning",
35
+ "Python3",
36
+ "Pandas",
37
+ "Numpy",
38
+ "Streamlit"
39
+ ],
40
+ "relatedLink": [
41
+ "https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING"
42
+ ]
43
+ }
docs/SPECIFICATION.md ADDED
@@ -0,0 +1,46 @@
1
+ # Technical Specification: Optimizing Stock Trading Strategy
2
+
3
+ ## Architectural Overview
4
+
5
+ **Optimizing Stock Trading Strategy With Reinforcement Learning** is a predictive modeling study demonstrating the application of Q-Learning to trading decisions. The project explores machine learning heuristics for financial markets and was developed during a Data Science internship at Technocolabs Software.
6
+
7
+ ### Analytics Pipeline
8
+
9
+ ```mermaid
10
+ graph TD
11
+ Start["Stock Data (CSV)"] --> Load["Data Ingestion (Pandas)"]
12
+ Load --> Feature["Feature Engineering (Moving Averages)"]
13
+ Feature --> Agent["Q-Learning Agent"]
14
+ Agent --> State["State Definition (MA Crossover + Trend)"]
15
+ State --> Action["Action Selection (Buy/Sell/Hold)"]
16
+ Action --> Portfolio["Portfolio Update"]
17
+ Portfolio --> Visualize["Streamlit Visualization"]
18
+ ```
19
+
20
+ ---
21
+
22
+ ## Technical Implementations
23
+
24
+ ### 1. Modeling Architecture
25
+ - **Core**: Built on **NumPy** and **Pandas**, utilizing custom Q-Learning logic for decision making.
26
+ - **Estimation Logic**: Establishing a relationship between market states (Moving Averages) and optimal actions to maximize portfolio value.
27
+
28
+ ### 2. Evaluation & Validation
29
+ - **Metrics**: Evaluates performance based on net worth accumulation over a 5-year period compared to a buy-and-hold strategy.
30
+ - **Reproducibility**: Utilizes historical stock data to promote consistent testing environments.
31
+ - **Heuristics**: Decision logic is encapsulated in a Python script that drives the real-time simulation.
32
+
33
+ ### 3. Developmental Infrastructure
34
+ - **Notebook Runtime**: The primary research was conducted in **Jupyter Notebook**, exploring state representation and reward functions.
35
+ - **Source Production**: The analytical kernel is deployed as a **Streamlit App**, bridging statistical modeling and an interactive end-user application.
36
+
37
+ ---
38
+
39
+ ## Technical Prerequisites
40
+
41
+ - **Runtime**: Python 3.7+ environment (Local or Cloud-based).
42
+ - **Dependencies**: `pandas`, `numpy`, `streamlit`, and `plotly` libraries.
43
+
44
+ ---
45
+
46
+ *Technical Specification | Data Science | Version 1.0*
screenshots/01-landing-page.png ADDED

Git LFS Details

  • SHA256: c19eb11e86dac95a4826629cbd01510f1d55acc3676e243e0f7dbf3a8a2d9af9
  • Pointer size: 130 Bytes
  • Size of remote file: 48.2 kB
screenshots/02-amzn-trend.png ADDED

Git LFS Details

  • SHA256: ba682341788869786696bf6f3994f1af474a79c042f40436148246acbae1767b
  • Pointer size: 130 Bytes
  • Size of remote file: 70.1 kB
screenshots/03-portfolio-growth.png ADDED

Git LFS Details

  • SHA256: 7049d2446494aa5ded5e402f293db340d3e02416cb8cb99ebd43c3ec49be604e
  • Pointer size: 130 Bytes
  • Size of remote file: 71.7 kB
screenshots/04-alb-trend.png ADDED

Git LFS Details

  • SHA256: 61423f49f2f0d6f5d6f50addf5d16a7496abcf7756d35babd9316410f7c6055b
  • Pointer size: 130 Bytes
  • Size of remote file: 74.5 kB