	---
	tags:
	- ml-intern
	---
	# 🤖 AI-Powered HR Task Optimizer

	<p align="center">
	<img src="https://img.shields.io/badge/Next.js-14-black?style=for-the-badge&logo=next.js" />
	<img src="https://img.shields.io/badge/Node.js-20-green?style=for-the-badge&logo=node.js" />
	<img src="https://img.shields.io/badge/Python-3.11-blue?style=for-the-badge&logo=python" />
	<img src="https://img.shields.io/badge/FastAPI-0.100+-009688?style=for-the-badge&logo=fastapi" />
	<img src="https://img.shields.io/badge/PostgreSQL-15-336791?style=for-the-badge&logo=postgresql" />
	<img src="https://img.shields.io/badge/Redis-7-DC382D?style=for-the-badge&logo=redis" />
	<img src="https://img.shields.io/badge/Tailwind-3.4-38B2AC?style=for-the-badge&logo=tailwind-css" />
	</p>

	<p align="center">
	<b>Production-grade AI recruitment platform</b> — automate resume screening, rank candidates with embeddings + LLM reranking, prioritize recruiter tasks with ML, and manage the entire hiring pipeline.
	</p>

	---

	## 🚀 What This Is

	A startup-grade SaaS ATS (Applicant Tracking System) built for modern HR teams. It combines:

	- Semantic resume parsing with LLM extraction
	- AI candidate ranking using sentence embeddings + GPT-4o reranking
	- Intelligent task prioritization with LightGBM risk prediction
	- Interview scheduling with Google Calendar integration
	- AI email assistant with human-in-the-loop approval
	- Real-time analytics for hiring velocity and pipeline bottlenecks

	Live Demo: [Coming Soon]
	Architecture Deep Dive: [See ADRs](./docs/adr/)

	---

	## 🏗️ Architecture

	```
	┌─────────────────────────────────────────────────────────────┐
	│                        CLIENT LAYER                         │
	│  Next.js 14 (App Router) ──► Vercel Edge / Serverless       │
	│  - SSR Dashboards (SEO + performance)                       │
	│  - React Server Components for data-heavy tables            │
	└──────────────────────┬──────────────────────────────────────┘
	                       │ HTTPS / JWT
	┌──────────────────────▼──────────────────────────────────────┐
	│                    API GATEWAY (Node.js)                    │
	│  Express.js + Helmet + Rate Limiter + Request Validator     │
	│  - Auth middleware (JWT + OAuth passthrough)                │
	│  - Router: /api/v1/* → Main API                             │
	│            /ai/v1/*  → AI Service Proxy (internal mTLS)     │
	└──────┬─────────────────────┬────────────────────────────────┘
	       │                     │
	┌──────▼──────┐      ┌───────▼────────┐
	│  CORE API   │      │   AI SERVICE   │
	│  Node.js    │      │   FastAPI      │
	│  Express    │      │   (GPU/CPU)    │
	│  PostgreSQL │      │   Sentence-    │
	│  Redis      │      │   Transformers │
	│  BullMQ     │      │   OpenAI SDK   │
	└──────┬──────┘      └───────┬────────┘
	       │                     │
	   ┌───▼────┐            ┌───▼────┐
	   │  AWS   │            │  AWS   │
	   │   S3   │            │ SQS /  │
	   │(Resumes│            │ Redis  │
	   │ PDFs)  │            │ (Queue)│
	   └────────┘            └────────┘
	```

	Pattern: BFF (Backend-for-Frontend) + AI Microservice

	- Node.js Core API handles I/O concurrency (auth, CRUD, notifications, calendar APIs)
	- FastAPI AI Service isolates the ML lifecycle (torch, transformers, CUDA dependencies)
	- Next.js App Router uses Server Components for dashboard data and Server Actions for mutations
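
The gateway's prefix routing from the diagram can be illustrated with a small dispatch function. This is only a sketch, written in Python for readability even though the gateway itself is Express.js. The name `resolve_upstream` is hypothetical, and the upstream URLs are assumed to be the local development ports listed under Quick Start.

```python
from typing import Optional

# Hypothetical sketch of the gateway's prefix routing; not the project's actual code.
UPSTREAMS = {
    "/api/v1/": "http://localhost:4000",  # Core API (local dev port)
    "/ai/v1/": "http://localhost:8000",   # AI service (local dev port)
}


def resolve_upstream(path: str) -> Optional[str]:
    """Return the upstream base URL for a request path, or None if unrouted."""
    for prefix, base_url in UPSTREAMS.items():
        if path.startswith(prefix):
            return base_url
    return None


print(resolve_upstream("/api/v1/candidates"))  # http://localhost:4000
print(resolve_upstream("/ai/v1/rank"))         # http://localhost:8000
print(resolve_upstream("/healthz"))            # None
```

Returning `None` for unknown prefixes lets the caller decide how to respond (for example with a 404) instead of silently forwarding the request.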

	---

	## ✨ Features

	| Feature | Description | AI/ML Component |
	|---------|-------------|-----------------|
	| 🤖 AI Task Prioritization | Dynamically ranks recruiter tasks by urgency, deadline, candidate quality, and workload | LightGBM risk model + heuristic blend |
	| 📄 Resume Screening | Upload PDFs, extract structured data (skills, experience, education) | unstructured.io + GPT-4o extraction |
	| 🎯 Smart Candidate Ranking | Semantic similarity scoring + LLM reranking for precision | Sentence Transformers + GPT-4o |
	| 📅 Interview Scheduler | Auto-manage slots, calendar sync, reminders, multi-stage workflow | Google Calendar API + BullMQ cron |
	| 📊 Recruitment Dashboard | Pipeline analytics, hiring progress, task monitoring | PostgreSQL aggregations + Recharts |
	| ✉️ AI Email Assistant | Generate follow-ups, invites, rejections with human approval | GPT-4o with few-shot prompting |
	| 📈 Productivity Analytics | Time-to-hire, conversion rates, recruiter efficiency, bottlenecks | Survival analysis + funnel metrics |
	| 🔔 Notification System | Smart alerts, deadline reminders, candidate inactivity | SSE + Redis Pub/Sub |
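
The two-stage idea behind Smart Candidate Ranking (similarity shortlist, then rerank) can be sketched in plain Python. This is a minimal illustration, not the project's implementation: the toy 3-dimensional vectors stand in for real sentence embeddings, the `rerank` callable is a placeholder for the LLM step, and the function names are hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(
    job_vec: Sequence[float],
    candidate_vecs: Dict[str, Sequence[float]],
    rerank: Callable[[List[str]], List[str]],
    shortlist_size: int = 20,
) -> List[str]:
    """Stage 1: shortlist by cosine similarity. Stage 2: reorder the shortlist with `rerank`."""
    scored = sorted(
        candidate_vecs.items(),
        key=lambda item: cosine_similarity(job_vec, item[1]),
        reverse=True,
    )
    shortlist = [candidate_id for candidate_id, _ in scored[:shortlist_size]]
    return rerank(shortlist)


# Toy vectors stand in for real embeddings of a job description and three resumes.
job = [1.0, 0.0, 1.0]
candidates = {"alice": [0.9, 0.1, 0.8], "bob": [0.0, 1.0, 0.0], "carol": [0.5, 0.5, 0.5]}

# Identity reranker as a placeholder for an LLM call.
print(rank_candidates(job, candidates, rerank=lambda ids: ids, shortlist_size=2))
# ['alice', 'carol']
```

Keeping the reranker as an injected callable means the comparatively slow LLM step only ever sees the shortlist, never the full candidate pool.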

	---

	## 🛠️ Tech Stack

	### Frontend
	- Next.js 14 (App Router, Server Components, Server Actions)
	- Tailwind CSS + shadcn/ui primitives
	- TanStack Query for client-side data fetching
	- Zustand for lightweight global state
	- React Hook Form + Zod for validation
	- Recharts / Tremor for analytics

	### Backend
	- Node.js + Express (Core API)
	- Python + FastAPI (AI Microservice)
	- PostgreSQL 15 (primary database + pgvector for semantic search)
	- Redis 7 (caching, sessions, BullMQ job queues)
	- BullMQ (background job processing)

	### AI/ML
	- Sentence Transformers (`all-MiniLM-L6-v2` for embeddings)
	- OpenAI GPT-4o (resume extraction, email generation, reranking)
	- LightGBM (task prioritization model)
	- scikit-learn (scoring ensembles)
	- unstructured.io + pdfplumber (PDF parsing)
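
One way to picture the "risk model + heuristic blend" used for task prioritization is a weighted mix of a rule-based score and a model-predicted risk. The sketch below is an assumption-heavy illustration: the `Task` fields, every weight, and the `alpha` mixing factor are invented for the example, and `risk` is simply passed in as a number where a trained model's predicted probability would normally go.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    hours_to_deadline: float   # time left before the task is due
    candidate_quality: float   # 0..1, e.g. from the ranking stage
    recruiter_open_tasks: int  # current workload of the assignee


def heuristic_score(task: Task) -> float:
    """Rule-based urgency in [0, 1]; all weights here are illustrative assumptions."""
    deadline_pressure = 1.0 / (1.0 + max(task.hours_to_deadline, 0.0) / 24.0)
    workload_pressure = min(task.recruiter_open_tasks / 20.0, 1.0)
    return 0.5 * deadline_pressure + 0.3 * task.candidate_quality + 0.2 * workload_pressure


def blended_priority(task: Task, risk: float, alpha: float = 0.6) -> float:
    """Mix the model's risk estimate (0..1) with the heuristic; alpha weights the model."""
    return alpha * risk + (1.0 - alpha) * heuristic_score(task)


# Each pair is (task, risk), where risk stands in for a model's predicted probability.
tasks = [
    (Task("Screen new applicant", 72, 0.4, 12), 0.2),
    (Task("Send offer letter", 6, 0.9, 12), 0.8),
]
for task, risk in sorted(tasks, key=lambda pair: blended_priority(*pair), reverse=True):
    print(f"{blended_priority(task, risk):.2f}  {task.name}")
```

Blending keeps the ordering sensible when the model is uncertain or poorly calibrated, since the heuristic term still reflects deadlines and workload.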

	### Auth & Deployment
	- JWT + OAuth 2.0 (Google, GitHub)
	- Vercel (frontend)
	- Railway / Render (backend + AI service)
	- AWS S3 (resume storage)
	- SendGrid / AWS SES (transactional email)

	---

	## 📁 Monorepo Structure

	```
	hr-task-optimizer/
	├── apps/
	│   ├── web/                    # Next.js 14 App Router
	│   │   ├── app/                # Route groups, Server Components
	│   │   ├── components/         # UI primitives + domain composites
	│   │   └── lib/                # API wrappers, utilities
	│   ├── api/                    # Node.js Core API
	│   │   ├── src/modules/        # Domain modules (auth, jobs, candidates, tasks)
	│   │   ├── src/workers/        # BullMQ job processors
	│   │   └── Dockerfile
	│   └── ai-service/             # Python FastAPI
	│       ├── app/routers/        # Embeddings, screening, generation, ranking
	│       ├── services/           # Model singletons, LLM clients
	│       └── Dockerfile.gpu
	├── packages/
	│   ├── shared-types/           # Zod schemas → TS + Pydantic
	│   ├── ui/                     # shadcn/ui base config
	│   └── eslint-config/
	├── infra/
	│   ├── docker-compose.yml      # Local dev stack
	│   ├── k8s/                    # Kubernetes manifests
	│   └── terraform/              # AWS/GCP provisioning
	├── docs/
	│   └── adr/                    # Architecture Decision Records
	└── turbo.json

	---

	## 🚀 Quick Start

	### Prerequisites
	- Docker + Docker Compose
	- Node.js 20+ and pnpm
	- Python 3.11+

	### 1. Clone & Install

	```bash
	git clone https://github.com/plplpl183/ai-powered-hr-task-optimizer.git
	cd ai-powered-hr-task-optimizer
	pnpm install
	```

	### 2. Environment Setup

	```bash
	# Copy env files
	cp apps/web/.env.example apps/web/.env.local
	cp apps/api/.env.example apps/api/.env
	cp apps/ai-service/.env.example apps/ai-service/.env

	# Fill in your credentials (OpenAI, Google OAuth, AWS S3, etc.)
	```

	### 3. Start Local Stack

	```bash
	# Start PostgreSQL, Redis, MinIO (S3 mock)
	docker-compose -f infra/docker-compose.yml up -d

	# Run database migrations
	pnpm db:migrate

	# Start all apps in dev mode
	pnpm dev
	```

	Services will be available at:
	- Web: http://localhost:3000
	- Core API: http://localhost:4000
	- AI Service: http://localhost:8000
	- PostgreSQL: localhost:5432
	- Redis: localhost:6379
	- MinIO (S3): http://localhost:9000
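
To confirm the stack actually came up, a quick TCP check against the ports listed above can help. The script below is a small sketch and not part of the repository: it only tests that each port accepts a connection and says nothing about whether the service behind it is healthy.

```python
import socket

# Host/port pairs taken from the service list above.
SERVICES = {
    "Web": ("localhost", 3000),
    "Core API": ("localhost", 4000),
    "AI Service": ("localhost", 8000),
    "PostgreSQL": ("localhost", 5432),
    "Redis": ("localhost", 6379),
    "MinIO": ("localhost", 9000),
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, (host, port) in SERVICES.items():
        status = "up" if is_port_open(host, port) else "DOWN"
        print(f"{name:<12} {host}:{port:<5} {status}")
```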

	---

	## 🧪 Testing

	```bash
	# Unit tests
	pnpm test

	# Integration tests (requires local stack)
	pnpm test:integration

	# AI service tests
	pnpm test:ai
	```

	---

	## 📊 Performance & Scale

	| Metric | Target | Implementation |
	|--------|--------|----------------|
	| Resume parsing | <5s per PDF | Async BullMQ worker + model singleton |
	| Candidate ranking | <200ms for top-20 | pgvector cosine similarity + LLM reranker |
	| Task prioritization | <100ms | LightGBM inference + Redis caching |
	| Dashboard load | <1s TTFB | Next.js Server Components + ISR |
	| Concurrent users | 1000+ | Horizontal scaling via K8s / Railway |
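
The "model singleton" mentioned in the table refers to loading an expensive model once per process and reusing it across requests. A minimal way to sketch that in Python is `functools.lru_cache` on a factory function. `DummyModel` and `get_model` are hypothetical names, and the sleep only simulates slow weight loading.

```python
import time
from functools import lru_cache


class DummyModel:
    """Stand-in for an expensive model (e.g. an embedding model); hypothetical."""

    def __init__(self) -> None:
        time.sleep(0.1)  # simulate slow weight loading

    def predict(self, text: str) -> int:
        # Placeholder "inference": count whitespace-separated tokens.
        return len(text.split())


@lru_cache(maxsize=1)
def get_model() -> DummyModel:
    """Build the model on first call; later calls return the same cached instance."""
    return DummyModel()


first = get_model()
second = get_model()
print(first is second)  # True
```

Only the first call pays the loading cost, which is what keeps per-request latency independent of model start-up time.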

	---

	## 🔐 Security

	- HTTP-only cookies with `SameSite=Lax` for refresh tokens
	- Rate limiting by IP + user (Redis-backed)
	- File upload validation via magic numbers + size limits
	- Parameterized queries (Drizzle ORM) to guard against SQL injection
	- CORS restricted to known origins
	- Helmet.js security headers
	- Human-in-the-loop approval for all AI-generated emails
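
As an illustration of the magic-number and size checks on uploads, the sketch below validates raw bytes before any parsing happens. It is an example under stated assumptions: the function name and the 10 MiB limit are invented here, and a real service would likely layer further checks (such as content scanning) on top.

```python
PDF_MAGIC = b"%PDF-"                 # PDF files start with these bytes
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # 10 MiB; an assumed limit, not from the project


def validate_resume_upload(data: bytes) -> None:
    """Raise ValueError if the upload is too large or does not look like a PDF."""
    if len(data) > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds size limit")
    if not data.startswith(PDF_MAGIC):
        raise ValueError("file is not a PDF (bad magic number)")


validate_resume_upload(b"%PDF-1.7\n...")  # passes silently

try:
    # An executable renamed to resume.pdf still starts with "MZ", not "%PDF-".
    validate_resume_upload(b"MZ\x90\x00 not really a pdf")
except ValueError as err:
    print(err)  # file is not a PDF (bad magic number)
```

Checking the leading bytes rather than the file extension or the client-supplied MIME type means a renamed file is rejected regardless of what the browser reports.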

	---

	## 📖 Documentation

	- [Architecture Decision Records](./docs/adr/)
	  - [001: Why PostgreSQL over MongoDB](./docs/adr/001-why-postgres.md)
	  - [002: Why FastAPI for AI Service](./docs/adr/002-why-fastapi-for-ai.md)
	  - [003: Task Prioritization Heuristic vs ML](./docs/adr/003-task-prioritization.md)
	- [API Documentation](https://api.hr-task-optimizer.dev/docs) (OpenAPI/Swagger)
	- [Contributing Guidelines](./CONTRIBUTING.md)

	---

	## 🤝 Contributing

	We use [Conventional Commits](https://www.conventionalcommits.org/):

	```
	feat: add AI email generation endpoint
	fix: resolve race condition in task prioritization
	docs: update API documentation
	refactor: extract resume parser into service class
	test: add integration tests for interview scheduler
	```

	See [CONTRIBUTING.md](./CONTRIBUTING.md) for details.
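
A commit subject can be checked against the Conventional Commits shape with a short regular expression. The sketch below is not a tool shipped with this repository. The list of allowed types is a common convention assumed for the example (the specification itself only defines `feat` and `fix`), and `is_conventional` is a hypothetical helper name.

```python
import re

# type, optional (scope), optional "!" for breaking changes, then ": description".
# The allowed types listed here are a common convention, assumed for this sketch.
COMMIT_RE = re.compile(
    r"^(feat|fix|docs|refactor|test|chore|perf|ci|build|style|revert)"
    r"(\([a-z0-9-]+\))?!?: \S.*$"
)


def is_conventional(subject: str) -> bool:
    """Return True if the commit subject line matches the pattern above."""
    return COMMIT_RE.match(subject) is not None


print(is_conventional("feat: add AI email generation endpoint"))  # True
print(is_conventional("fix(tasks): resolve race condition"))       # True
print(is_conventional("Updated stuff"))                            # False
```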

	---

	## 📄 License

	MIT License — see [LICENSE](./LICENSE) for details.

	---

	## 🙏 Acknowledgments

	- [sentence-transformers](https://www.sbert.net/) for embedding models
	- [unstructured.io](https://unstructured.io/) for PDF parsing
	- [shadcn/ui](https://ui.shadcn.com/) for UI primitives
	- [BullMQ](https://docs.bullmq.io/) for job queues

	---

	<p align="center">
	Built with ❤️ by <a href="https://github.com/plplpl183">@plplpl183</a>
	</p>

	<!-- ml-intern-provenance -->
	## Generated by ML Intern

	This model repository was generated by [ML Intern](https://github.com/huggingface/ml-intern), an agent for machine learning research and development on the Hugging Face Hub.

	- Try ML Intern: https://smolagents-ml-intern.hf.space
	- Source code: https://github.com/huggingface/ml-intern

	## Usage

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_id = "plplpl183/ai-powered-hr-task-optimizer"
	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForCausalLM.from_pretrained(model_id)
	```

	For non-causal architectures, replace `AutoModelForCausalLM` with the appropriate `AutoModel` class.