---
title: Word Sense Disambiguation
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
---
# 👋 Hi, I'm Gunjankumar Nitin Choudhari
[![LinkedIn](https://img.shields.io/badge/LinkedIn-Connect-blue)](https://linkedin.com/in/gunjankumarchoudhari)
[![Portfolio](https://img.shields.io/badge/Portfolio-Visit-orange)](https://gunjankumar55.github.io/Gunjan_Portfolio/)
[![Email](https://img.shields.io/badge/Email-Contact-red)](mailto:gunjankumarchoudhari@gmail.com)
[![YouTube](https://img.shields.io/badge/YouTube-Code%20Spirit-red)](https://youtube.com/@codespirit)
## 🎯 About Me
I'm a B.Tech student at Ramrao Adik Institute of Technology, specializing in Computer Engineering with a focus on Data Science. With a strong CGPA of 9.40/10, I'm passionate about building intelligent systems and contributing to technological innovation.
## 🎓 Education
- **B.Tech in Computer Engineering (Data Science)**
Ramrao Adik Institute of Technology, D. Y. Patil Deemed to be University
CGPA: 9.40/10 | 2022 - Present
## 🛠️ Tech Stack
- **Languages**: Python, Java, C, SQL, NoSQL, HTML, CSS, JavaScript
- **ML/AI**: TensorFlow, Scikit-learn, LLMs, Generative AI
- **Web**: Flask, JSP, Servlets
- **Mobile**: Flutter, Dart
- **Tools**: Git, Power BI, VS Code, Jupyter, LaTeX, Figma
- **Cloud**: Oracle Cloud, AWS
- **Databases**: MySQL, MongoDB
## 🏆 Certifications
- Oracle Cloud Infrastructure 2024 Generative AI Certified Professional
- Intel Unnati Training on AI
- Data Science for Engineers (NPTEL)
- Programming in Python (NPTEL)
- Alteryx Machine Learning Fundamentals
- Data Structure and Algorithms – Internshala
- PowerBI for Beginners - Simplilearn
## 🚀 Featured Projects
### 1. [AgrowAssist](https://github.com/Gunjankumar55/Agroassist---Smart-Crop-Recommendation-using-ML)
Smart Agricultural Assistance System with ML-powered crop recommendations
- 35% improved prediction accuracy
- Flutter mobile app with TensorFlow Lite
- Real-time disease detection
- Power BI data visualization
### 2. [AskDB](https://github.com/Gunjankumar55/askDB---Smart-text-to-sql-)
Natural Language to SQL Query Converter
- LLM-powered query conversion
- Flask backend with MySQL integration
- Responsive Bootstrap UI
- Efficient API communication
### 3. [LESK BERT WSD](https://github.com/Gunjankumar55/LESK_BERT_WSD)
Advanced Word Sense Disambiguation System
- BERT embeddings integration
- Enhanced Lesk algorithm
- Real-time feedback system
- Flask web interface
## 💼 Professional Experience
### Python Data Analyst Intern @ SPRINGBOARD
- Implemented ML algorithms improving prediction accuracy by 25%
- Created interactive Power BI dashboards
- Enhanced data visualization efficiency by 30%
### Java Development Trainee @ IT-Networkz Infosystem
- Developed MVC architecture web applications
- Optimized database operations with JDBC
- Improved performance by 30%
## 🎯 Leadership & Activities
- Head of Design at CSI, RAIT
- Co-Head Design Officer at CSI, RAIT
- Co-Chief Design Officer at Social Wing, RAIT
- NSS Volunteer (2023-2024)
- Technical Content Creator on YouTube
## 📊 GitHub Stats
![GitHub stats for Gunjankumar55](https://github-readme-stats.vercel.app/api?username=Gunjankumar55&show_icons=true&theme=radical)
## 🌱 Currently Learning
- Advanced Generative AI
- Cloud Architecture
- System Design
- DevOps practices
## 🎯 Goals
- Contribute to open-source projects
- Master full-stack development
- Build impactful AI applications
- Learn cloud technologies
## ⚑ Fun Fact
When I'm not coding, you'll find me playing football, cricket, or creating graphic designs!
---
⭐️ From [Gunjankumar55](https://github.com/Gunjankumar55)
# LESK BERT WSD - Advanced Word Sense Disambiguation
[![Python](https://img.shields.io/badge/Python-3.9-blue.svg)](https://www.python.org/)
[![Flask](https://img.shields.io/badge/Flask-2.0.1-green.svg)](https://flask.palletsprojects.com/)
[![NLTK](https://img.shields.io/badge/NLTK-3.8.1-orange.svg)](https://www.nltk.org/)
[![Transformers](https://img.shields.io/badge/Transformers-4.30.2-yellow.svg)](https://huggingface.co/transformers/)
An advanced Word Sense Disambiguation (WSD) system that combines the Lesk algorithm with BERT embeddings for improved accuracy in determining word meanings from context.
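As a rough illustration of how Lesk-style gloss overlap and embedding similarity can be blended into one score, here is a minimal sketch. It is not the code in `app.py`: the two glosses are made up for the example, the two-dimensional vectors stand in for BERT embeddings, and the names `combined_score`, `disambiguate`, and the weight `alpha` are hypothetical.

```python
import math
import re

# Hypothetical toy sense inventory; the real project would draw glosses from WordNet.
SENSES = {
    "bank.finance": "an institution that accepts deposits and lends money",
    "bank.river": "sloping land beside a river or stream of water",
}

STOPWORDS = {"a", "an", "and", "at", "by", "of", "or", "that", "the", "to"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stopwords."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def lesk_overlap(context, gloss):
    """Simplified Lesk: count content words shared by the context and a gloss."""
    return len(tokenize(context) & tokenize(gloss))


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def combined_score(context, gloss, context_vec, gloss_vec, alpha=0.5):
    """Blend normalized overlap with embedding similarity; alpha is an assumed weight."""
    gloss_size = len(tokenize(gloss)) or 1
    overlap = lesk_overlap(context, gloss) / gloss_size
    return alpha * overlap + (1 - alpha) * cosine(context_vec, gloss_vec)


def disambiguate(context, context_vec, gloss_vecs, senses=SENSES, alpha=0.5):
    """Return the best-scoring sense key and the full score table."""
    scores = {
        key: combined_score(context, gloss, context_vec, gloss_vecs[key], alpha)
        for key, gloss in senses.items()
    }
    return max(scores, key=scores.get), scores


# Placeholder vectors standing in for BERT embeddings of the context and glosses.
context = "She sat on the bank of the river and watched the water."
context_vec = [0.1, 0.9]
gloss_vecs = {"bank.finance": [0.9, 0.1], "bank.river": [0.2, 0.8]}
best, scores = disambiguate(context, context_vec, gloss_vecs)
print(best)
```

With these toy inputs the river sense wins on both signals: it shares "river" and "water" with the context, and its placeholder vector points in nearly the same direction as the context vector.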
## 🚀 Features
- **Enhanced Lesk Algorithm**: Improved version of the traditional Lesk algorithm
- **BERT Integration**: Uses BERT embeddings for better context understanding
- **Interactive Web Interface**: User-friendly Flask-based web application
- **Real-time Feedback System**: Learns from user corrections to improve accuracy
- **Context-Aware Processing**: Considers surrounding words and phrases
- **Multiple Sense Support**: Handles words with multiple meanings effectively
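The README does not say how the feedback system stores or applies corrections. One simple way such a mechanism could work is sketched below: count user-confirmed senses per word and add a small, capped bonus to the matching sense score. The class name `FeedbackStore`, the `weight` and `cap` values, and the in-memory storage are all assumptions made for illustration.

```python
from collections import Counter


class FeedbackStore:
    """Hypothetical in-memory record of user-confirmed senses per word."""

    def __init__(self):
        self._counts = Counter()

    def record(self, word, sense):
        """Register one user correction choosing `sense` for `word`."""
        self._counts[(word.lower(), sense)] += 1

    def bonus(self, word, sense, weight=0.1, cap=0.5):
        """Additive bonus that grows with confirmations and is capped at `cap`."""
        return min(cap, weight * self._counts[(word.lower(), sense)])

    def rerank(self, word, scores):
        """Return (sense, adjusted_score) pairs sorted from best to worst."""
        adjusted = {s: v + self.bonus(word, s) for s, v in scores.items()}
        return sorted(adjusted.items(), key=lambda kv: kv[1], reverse=True)


store = FeedbackStore()
raw_scores = {"bank.finance": 0.50, "bank.river": 0.45}
before = store.rerank("bank", raw_scores)

# Two users correct the system toward the river sense.
store.record("bank", "bank.river")
store.record("Bank", "bank.river")
after = store.rerank("bank", raw_scores)
print(before[0][0], "->", after[0][0])
```

Capping the bonus keeps a handful of corrections from permanently overriding the model's own scores. A deployed version would also need persistent storage, which this sketch omits.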
## 🛠️ Technical Stack
- **Backend**: Flask, Python 3.9
- **NLP**: NLTK, BERT Transformers
- **Frontend**: HTML, CSS, JavaScript
- **Deployment**: Docker, Hugging Face Spaces
- **Version Control**: Git
## 📋 Prerequisites
- Python 3.9+
- pip (Python package manager)
- Docker (for containerization)
## 🚀 Quick Start
1. Clone the repository:
```bash
git clone https://github.com/Gunjankumar55/LESK_BERT_WSD.git
cd LESK_BERT_WSD
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Run the application:
```bash
python app.py
```
4. Access the web interface at `http://localhost:5000`
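To confirm from a script that the server started in step 3 is reachable, a small standard-library check like the following may help. The helper name `is_up` is hypothetical, and the default URL simply mirrors the address given above. Any HTTP response, including an error status, is treated as "up".

```python
import urllib.error
import urllib.request

# Bypass any configured proxy, since the target is a local development server.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_up(url="http://localhost:5000", timeout=2.0):
    """Return True if something answers HTTP at `url`, False otherwise."""
    try:
        with _opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False
```

Calling `is_up()` after `python app.py` should return `True` once Flask is listening. It returns `False` if nothing is bound to that port.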
## 💡 Usage
1. Enter a sentence containing an ambiguous word.
2. The system automatically detects potentially ambiguous words.
3. Select a word to disambiguate.
4. View detailed results, including:
   - Word definitions
   - Example usage
   - Confidence scores
   - Alternative meanings
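How the confidence scores are computed is not documented here. A common approach, shown below purely as an assumption, is to pass the raw per-sense scores through a softmax so they sum to 1, then report the top sense alongside the ranked alternatives. The function names and the `temperature` parameter are hypothetical.

```python
import math


def softmax_confidence(scores, temperature=1.0):
    """Map raw sense scores to positive values that sum to 1."""
    if not scores:
        return {}
    top = max(scores.values())
    # Subtract the maximum before exponentiating for numerical stability.
    exps = {s: math.exp((v - top) / temperature) for s, v in scores.items()}
    total = sum(exps.values())
    return {s: e / total for s, e in exps.items()}


def rank_senses(scores, temperature=1.0):
    """Return ((best_sense, confidence), [(alternative, confidence), ...])."""
    conf = softmax_confidence(scores, temperature)
    ranked = sorted(conf.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return None, []
    return ranked[0], ranked[1:]


# Made-up raw scores for three senses of "bank".
raw = {"bank.river": 2.0, "bank.finance": 1.0, "bank.tilt": 0.0}
best, alternatives = rank_senses(raw)
print(best[0], round(best[1], 3))           # bank.river 0.665
print([name for name, _ in alternatives])   # ['bank.finance', 'bank.tilt']
```

A lower `temperature` sharpens the distribution toward the top sense, and a higher one flattens it. Which setting gives useful confidence values depends on the scale of the underlying scores.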
## 🎯 Project Structure
```
LESK_BERT_WSD/
β”œβ”€β”€ app.py # Main application file
β”œβ”€β”€ requirements.txt # Project dependencies
β”œβ”€β”€ Dockerfile # Docker configuration
β”œβ”€β”€ templates/ # HTML templates
└── static/ # Static assets
```
## 🀝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## 📝 License
This project is licensed under the MIT License - see the LICENSE file for details.
## 👨‍💻 Author
**Gunjankumar Choudhari**
- GitHub: [@Gunjankumar55](https://github.com/Gunjankumar55)
- LinkedIn: [Gunjankumar Choudhari](https://linkedin.com/in/gunjankumarchoudhari)
- Portfolio: [Gunjan Portfolio](https://gunjankumar55.github.io/Gunjan_Portfolio/)
## 🙏 Acknowledgments
- NLTK team for the excellent NLP tools
- Hugging Face for the BERT implementation
- Flask team for the web framework