---
title: React
emoji: 🌍
colorFrom: yellow
colorTo: gray
sdk: gradio
sdk_version: 6.8.0
app_file: app.py
pinned: false
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# Nanbeige4.1-3B Inference Server

Lightweight remote LLM inference service for Enterprise ReAct Agent systems.

## Overview
This Hugging Face Space hosts the Nanbeige4.1-3B model as a remote inference API, designed to work with local agent orchestration systems. The model runs entirely in this Space, while all agent logic, tools, and memory systems run on the user's local machine.
## Model Information
- Model: Nanbeige/Nanbeige4.1-3B
- Parameters: 3B
- Context Window: 8K tokens
- Capabilities: Tool calling, reasoning, 500+ tool invocation rounds
- License: Apache 2.0
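The tool-calling capability listed above uses OpenAI-style function schemas passed in the request's `tools` array. A minimal sketch of such a schema — the `get_weather` tool is purely illustrative and not provided by this Space:

```python
# Hypothetical tool definition in the OpenAI function-calling schema.
# "get_weather" is an illustrative example, not a tool shipped with this Space.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

# This list would go into the "tools" field of the POST /chat request body.
tools = [weather_tool]
```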
## API Endpoints

### POST /chat

Main chat completion endpoint (OpenAI-compatible).
**Request:**

```json
{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "tools": [...],
  "stream": false,
  "max_tokens": 2048,
  "temperature": 0.6,
  "top_p": 0.95
}
```
**Response:**

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "Nanbeige/Nanbeige4.1-3B",
  "choices": [...],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 50,
    "total_tokens": 70
  }
}
```
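Assuming the response follows the OpenAI chat-completion shape shown above, the assistant's reply — or a requested tool call — lives under `choices[0].message`. A sketch against a mock response (all values illustrative):

```python
# Mock response in the OpenAI chat-completion shape; values are illustrative.
response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! How can I help?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 20, "completion_tokens": 50, "total_tokens": 70},
}

message = response["choices"][0]["message"]

# When the model decides to call a tool, OpenAI-style responses carry a
# "tool_calls" list on the message instead of plain content; check it first.
if message.get("tool_calls"):
    for call in message["tool_calls"]:
        print("tool requested:", call["function"]["name"])
else:
    print(message["content"])
```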
### GET /chat

Web interface for testing.

### GET /health

Health check endpoint.
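A local orchestrator can probe this endpoint before issuing chat requests. A minimal sketch, assuming `/health` simply returns HTTP 200 when the Space is ready (the response body is not specified here):

```python
import urllib.error
import urllib.request


def is_healthy(base_url, timeout=10):
    """Return True if GET /health answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, TimeoutError):
        # Connection refused, DNS failure, or timeout: treat as not ready.
        return False
```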
## Usage with Local Agent

```python
import requests

response = requests.post(
    "https://your-space.hf.space/chat",
    json={
        "messages": [{"role": "user", "content": "Hello!"}],
        "temperature": 0.6
    }
)
result = response.json()
```
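In practice the call benefits from a timeout and a couple of retries, since Spaces can cold-start. A sketch under those assumptions — the Space URL is a placeholder, and the `max_retries`/`timeout` values are illustrative, not requirements of this API:

```python
import time

import requests


def build_payload(messages, **params):
    """Assemble the JSON body for POST /chat: messages plus sampling params."""
    return {"messages": messages, **params}


def chat(base_url, messages, max_retries=3, timeout=120, **params):
    """POST to /chat with a timeout and simple retries for cold starts."""
    for attempt in range(max_retries):
        try:
            resp = requests.post(
                f"{base_url}/chat",
                json=build_payload(messages, **params),
                timeout=timeout,
            )
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException:
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)  # simple exponential backoff: 1s, 2s, ...


if __name__ == "__main__":
    result = chat(
        "https://your-space.hf.space",  # placeholder: substitute your Space URL
        [{"role": "user", "content": "Hello!"}],
        temperature=0.6,
    )
    print(result["choices"][0]["message"]["content"])
```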
## Hardware Requirements
- GPU: Recommended (CUDA-compatible)
- CPU: Fallback supported
- Memory: ~8GB RAM minimum
## Local Agent Repository

For the complete local agent system that connects to this Space, see the companion repository.