---
title: LangChain Chat API
emoji: 💬
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
app_port: 7860
---

# LangChain Chat API

A production-ready Chat API powered by **LangChain** and **Hugging Face Inference Endpoints**. This lightweight FastAPI application proxies user queries to open-source large language models (such as `DeepSeek-R1-0528` or `Qwen3`) through the `HuggingFaceEndpoint` class, so no local GPU is needed for inference.

## Features
- **LangChain Integration:** Uses `PromptTemplate` and `ChatHuggingFace` with structured outputs to cleanly format and return model responses.
- **Lightweight Architecture:** Eliminates the need for massive PyTorch or Transformers dependencies by offloading inference to Hugging Face.
- **FastAPI Backend:** Provides robust, asynchronous request handling with CORS middleware and global exception management.
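
The structured-output flow in the first bullet can be illustrated without LangChain. This is a minimal offline sketch, not the app's code: `string.Template` stands in for `PromptTemplate`, a stub function stands in for the `ChatHuggingFace` call, and the prompt wording plus the names `ChatAnswer`, `parse_structured`, `ask`, and `fake_model` are all hypothetical. The two fields mirror the response documented for the chat endpoint.

```python
import json
from dataclasses import dataclass
from string import Template
from typing import Callable

# Hypothetical prompt; the app's real PromptTemplate text is not shown in this README.
PROMPT = Template(
    "Answer the question below. Reply with a JSON object that has exactly "
    'two string keys, "answer" and "justification".\n\n'
    "Question: $question"
)


@dataclass(frozen=True)
class ChatAnswer:
    """The two fields the chat endpoint is documented to return."""
    answer: str
    justification: str


def parse_structured(raw: str) -> ChatAnswer:
    """Validate the model's raw text as the two-field JSON object."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key in ("answer", "justification"):
        if not isinstance(data.get(key), str):
            raise ValueError(f"missing or non-string field: {key}")
    return ChatAnswer(data["answer"], data["justification"])


def ask(question: str, model: Callable[[str], str]) -> ChatAnswer:
    """Format the prompt, call the model, and parse its reply."""
    return parse_structured(model(PROMPT.substitute(question=question)))


def fake_model(prompt: str) -> str:
    # Stand-in for the remote LLM call so the sketch runs offline.
    return json.dumps({"answer": "42", "justification": "Stubbed reply."})
```

Calling `ask("What is LangChain?", fake_model)` returns a `ChatAnswer` built from the stubbed reply. In the real app the model callable would be the remote endpoint, and LangChain's structured-output support would take the place of `parse_structured`.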

---

## Setup and Execution

### 1. Requirements

Because the application offloads inference to the Hugging Face API, you must provide a Hugging Face API token.

1. Get a Hugging Face API Token from your [Hugging Face Settings](https://huggingface.co/settings/tokens).
2. Set it as an environment variable (or as a Secret if deploying to Hugging Face Spaces):
```bash
export HUGGINGFACEHUB_API_TOKEN="hf_your_token_here"
```
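
A missing token otherwise tends to surface only when the first request fails, so a startup check can help. The following is a small sketch of such a check, assuming the variable name shown above; the helper `require_hf_token` and its error message are hypothetical and not taken from the app.

```python
import os
from typing import Mapping

TOKEN_VAR = "HUGGINGFACEHUB_API_TOKEN"


def require_hf_token(env: Mapping[str, str] = os.environ) -> str:
    """Return the token, or raise a clear error if it is missing or blank."""
    token = env.get(TOKEN_VAR, "").strip()
    if not token:
        raise RuntimeError(
            f"{TOKEN_VAR} is not set; export it or add it as a Space secret."
        )
    return token
```

Calling `require_hf_token()` once at startup makes the process fail fast with a readable message instead of failing on the first chat request.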

### 2. Running with Docker 

The provided `Dockerfile` builds a slim Python image.

```bash
docker build -t chat-api .
docker run -p 7860:7860 -e HUGGINGFACEHUB_API_TOKEN="hf_your_token_here" chat-api
```

### 3. Running Locally

```bash
# After cloning the repository, navigate to the project directory
cd multimodal-rag

# Install Python dependencies
pip install -r requirements.txt

# Run the FastAPI application
python app/main.py
```

---

## Example API Usage

### 1. Root / UI Check
When the Space URL is opened directly in a browser, the application returns a status payload that points to the available endpoints.
```bash
curl -X GET "http://localhost:7860/"
```

### 2. Health Check
```bash
curl -X GET "http://localhost:7860/api/health" 
```
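
After `docker run`, the container may need a moment before it accepts requests. The sketch below polls the health endpoint from Python using only the standard library. It assumes a healthy service answers with HTTP 200; the response body is not documented here, so it is ignored. The function names, retry count, and delays are illustrative choices.

```python
import time
import urllib.request


def is_healthy(base_url: str = "http://localhost:7860", timeout: float = 5.0) -> bool:
    """Return True if GET /api/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, refused connections, and timeouts are all OSError subclasses.
        return False


def wait_until_healthy(
    base_url: str = "http://localhost:7860",
    attempts: int = 10,
    delay: float = 1.0,
    timeout: float = 5.0,
) -> bool:
    """Poll the health endpoint until it responds or the attempts run out."""
    for _ in range(attempts):
        if is_healthy(base_url, timeout=timeout):
            return True
        time.sleep(delay)
    return False
```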

### 3. Chat Endpoint
Submit a question to the LLM. The endpoint returns a JSON object with two fields: `answer` and `justification`.

```bash
curl -X POST "http://localhost:7860/api/chat" \
     -H "Content-Type: application/json" \
     -d '{
           "question": "What is LangChain?"
         }'
```

**Response Example:**
```json
{
  "answer": "LangChain is an open-source framework designed to simplify the creation of applications using large language models.",
  "justification": "The user asked for a definition of LangChain, which this provides concisely."
}
```
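
The same request can be sent from Python. This is a minimal client sketch using only the standard library; it assumes the request and response shapes shown above. The helper names `build_chat_request` and `chat` are hypothetical, and the 60-second timeout is an arbitrary allowance for slow model responses.

```python
import json
import urllib.request


def build_chat_request(
    question: str, base_url: str = "http://localhost:7860"
) -> urllib.request.Request:
    """Build the POST /api/chat request with a JSON body."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(
    question: str, base_url: str = "http://localhost:7860", timeout: float = 60.0
) -> dict:
    """Send the question and return the decoded JSON response."""
    request = build_chat_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `chat("What is LangChain?")` should return a dictionary whose `answer` and `justification` keys match the response example.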