---
title: NLP Intelligence
emoji: π€
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---
# NLP Intelligence: Social Monitoring Web Application
Hexagonal (Ports & Adapters) architecture for Mongolian social media content analysis.
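As a rough illustration of what the Ports & Adapters split means in practice, the sketch below defines a port as a `typing.Protocol` and plugs a toy adapter into a core function. Every name in it (`Entity`, `EntityRecognizer`, `KeywordRecognizer`, `analyze`) is hypothetical and is not taken from this repository.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Entity:
    text: str
    label: str


class EntityRecognizer(Protocol):
    """Port: the interface the domain core depends on (hypothetical name)."""

    def recognize(self, text: str) -> list[Entity]: ...


class KeywordRecognizer:
    """Toy adapter: matches a fixed keyword table instead of running a model."""

    def __init__(self, keywords: dict[str, str]) -> None:
        self._keywords = keywords

    def recognize(self, text: str) -> list[Entity]:
        return [
            Entity(word, label)
            for word, label in self._keywords.items()
            if word in text
        ]


def analyze(text: str, recognizer: EntityRecognizer) -> dict:
    """Core logic: depends only on the port, never on a concrete adapter."""
    entities = recognizer.recognize(text)
    return {"text": text, "entities": [(e.text, e.label) for e in entities]}
```

Because `analyze` only sees the port, a model-backed adapter could replace `KeywordRecognizer` without touching the core function.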
## Repository Structure
```
NLP-intelligence/
├── nlp_core/              # Domain Core: NER, sentiment, topic modeling, preprocessing (pure Python)
├── adapters/
│   ├── api/               # FastAPI REST adapter (routers, schemas, services)
│   ├── ner_mongolian/     # Fine-tuned NER model config/tokenizer (weights on HF Hub)
│   └── sumbee/            # Future Sumbee.mn integration
├── frontend/              # Next.js dashboard & admin panel
├── Data/                  # Training data & reference datasets (NOT used at runtime)
│   ├── data/              # CoNLL-format training/validation/test files (v1 pipeline)
│   ├── datav2/            # JSONL character-offset training data + scripts (v2 pipeline)
│   └── NER-dataset/       # Reference data (locations.json, abbreviations, names)
├── eval/                  # Model evaluation scripts
├── Dockerfile             # Multi-stage production build
├── nginx.conf             # Reverse proxy config (port 7860)
├── start.sh               # Docker entrypoint
└── requirements.txt
```
**Production code:** `nlp_core/`, `adapters/api/`, `frontend/` (included in the Docker image).
**ML development:** `Data/`, `eval/` (excluded from the Docker image). See [Data/README.md](Data/README.md) for details.
## Model
The NER model is hosted on the Hugging Face Hub as `Nomio4640/ner-mongolian`. It is downloaded automatically during the Docker build, and again at runtime if it is not cached locally. Model weights are NOT stored in git.
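The cache-then-download behaviour could look roughly like the sketch below. It is not the project's actual loader. It assumes that a `config.json` file marks a usable local copy and that the caller supplies the target directory. `snapshot_download` is the real `huggingface_hub` function, while `resolve_model_dir` and `_hub_download` are hypothetical helper names.

```python
from pathlib import Path
from typing import Callable

MODEL_REPO = "Nomio4640/ner-mongolian"  # repo id given in this README


def _hub_download(repo_id: str, target: Path) -> str:
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=str(target))


def resolve_model_dir(
    local_dir: Path,
    download: Callable[[str, Path], str] = _hub_download,
) -> Path:
    """Return local_dir if it already holds a model config, else download into it."""
    # Assumption: the presence of config.json means the model is cached locally.
    if (local_dir / "config.json").is_file():
        return local_dir
    return Path(download(MODEL_REPO, local_dir))
```

Passing the downloader as a parameter keeps the cache check testable without network access.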
To version a new model after training, tag the commit with its score and training set:
```bash
git tag model-v1.0 -m "F1: 0.XX, trained on train_final.conll"
```
## Quick Start
### Local Development
```bash
# Backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cd adapters/api
PYTHONPATH=../../ uvicorn main:app --reload --host 0.0.0.0 --port 8000
```
API docs: http://localhost:8000/docs
```bash
# Frontend
cd frontend
npm install
npm run dev
```
Dashboard: http://localhost:3000
### Docker
```bash
docker build -t nlp-intelligence .
docker run -p 7860:7860 nlp-intelligence
```
App: http://localhost:7860
### Usage
1. Open http://localhost:3000 (or http://localhost:7860 when running the Docker image)
2. Upload a CSV file with a `text` or `Text` column
3. View NER, sentiment, and network analysis results
4. Go to `/admin` to manage the knowledge base, labels, and stopwords
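Before uploading, it can help to confirm that a file actually has one of the accepted columns. The sketch below is a client-side pre-check using only the standard library. How the backend itself validates uploads is not documented here, and `read_texts` is a hypothetical helper name.

```python
import csv
import io

ACCEPTED_COLUMNS = ("text", "Text")  # column names listed in the usage steps


def read_texts(csv_content: str) -> list[str]:
    """Return the non-empty values of the first accepted text column."""
    reader = csv.DictReader(io.StringIO(csv_content))
    fieldnames = reader.fieldnames or []
    column = next((name for name in ACCEPTED_COLUMNS if name in fieldnames), None)
    if column is None:
        raise ValueError(f"CSV needs one of these columns: {ACCEPTED_COLUMNS}")
    # Skip missing or whitespace-only cells so they are not sent for analysis.
    return [
        row[column].strip()
        for row in reader
        if row[column] and row[column].strip()
    ]
```

For a file on disk, `read_texts(Path("posts.csv").read_text(encoding="utf-8"))` would be one way to call it (with `posts.csv` as a placeholder name and `Path` imported from `pathlib`).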
## API Endpoints
| Method | Path | Description |
|--------|------|-------------|
| POST | /api/upload | Upload CSV for analysis |
| POST | /api/analyze | Analyze single text |
| POST | /api/analyze/batch | Analyze batch of texts |
| POST | /api/network | Get network graph data |
| POST | /api/insights | Get analysis insights |
| GET/POST | /api/admin/knowledge | Knowledge base CRUD |
| GET/POST | /api/admin/labels | Custom label mapping |
| GET/POST/DELETE | /api/admin/stopwords | Stopword management |
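A minimal client for the single-text endpoint might look like the sketch below. The request body shape (`{"text": ...}`) is an assumption, since the table does not document the schema; the interactive docs at http://localhost:8000/docs should show the real one. `build_analyze_request` and `analyze_text` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # local backend address from Quick Start


def build_analyze_request(text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST /api/analyze request; the {"text": ...} body is an assumption."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/analyze",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze_text(text: str, base_url: str = BASE_URL) -> dict:
    """Send the request and decode the JSON response (needs a running backend)."""
    request = build_analyze_request(text, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.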