Ashok75 committed · Commit ff281ca · verified · 1 parent: 0c3338f

Upload README.md

Files changed (1): README.md (+82 −12)
README.md CHANGED
@@ -1,12 +1,82 @@
- ---
- title: React
- emoji: 🌍
- colorFrom: yellow
- colorTo: gray
- sdk: gradio
- sdk_version: 6.8.0
- app_file: app.py
- pinned: false
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# Nanbeige4.1-3B Inference Server

Lightweight remote LLM inference service for Enterprise ReAct Agent systems.

## Overview

This Hugging Face Space hosts the **Nanbeige4.1-3B** model as a remote inference API, designed to work with local agent orchestration systems. The model runs entirely in this Space, while all agent logic, tools, and memory systems run on the user's local machine.

## Model Information

- **Model**: [Nanbeige/Nanbeige4.1-3B](https://huggingface.co/Nanbeige/Nanbeige4.1-3B)
- **Parameters**: 3B
- **Context Window**: 8K tokens
- **Capabilities**: Tool calling, reasoning, 500+ tool-invocation rounds
- **License**: Apache 2.0

## API Endpoints

### POST /chat

Main chat completion endpoint (OpenAI-compatible).

**Request:**

```json
{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "tools": [...],
  "stream": false,
  "max_tokens": 2048,
  "temperature": 0.6,
  "top_p": 0.95
}
```

**Response:**

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "Nanbeige/Nanbeige4.1-3B",
  "choices": [...],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 50,
    "total_tokens": 70
  }
}
```
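
Since the endpoint is described as OpenAI-compatible, the `"tools"` field can presumably be filled with OpenAI-style function definitions. The sketch below builds such a payload; the exact tool schema the server accepts, and the `get_weather` tool itself, are assumptions for illustration, not part of this Space's documented API.

```python
import json

def make_tool(name: str, description: str, parameters: dict) -> dict:
    """Build one entry for the request's "tools" array.

    Assumes the OpenAI-style function-tool schema, based on the
    "OpenAI-compatible" claim above; adjust if this server differs.
    """
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": parameters,
        },
    }

# Hypothetical tool, for illustration only.
weather_tool = make_tool(
    "get_weather",
    "Look up the current weather for a city.",
    {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
)

payload = {
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": [weather_tool],
    "stream": False,
    "max_tokens": 2048,
    "temperature": 0.6,
    "top_p": 0.95,
}
print(json.dumps(payload, indent=2))
```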

### GET /chat

Web interface for testing.

### GET /health

Health check endpoint.
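
A local agent can probe `/health` before sending chat traffic. A minimal sketch using only the standard library is below; it assumes any 2xx status means healthy, since the response body's format is not documented here.

```python
from urllib.request import urlopen
from urllib.error import URLError

def is_healthy(status_code: int) -> bool:
    """Treat any 2xx response from GET /health as healthy (assumption)."""
    return 200 <= status_code < 300

def check(base_url: str, timeout: float = 5.0) -> bool:
    """Probe the Space's /health endpoint; False on any network error."""
    try:
        with urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return is_healthy(resp.status)
    except (URLError, OSError):
        return False
```

Gating requests on `check("https://your-space.hf.space")` avoids long hangs when the Space is sleeping or rebuilding.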

## Usage with Local Agent

```python
import requests

# Point this at your own Space URL.
response = requests.post(
    "https://your-space.hf.space/chat",
    json={
        "messages": [{"role": "user", "content": "Hello!"}],
        "temperature": 0.6,
    },
    timeout=120,  # generation can be slow, especially on CPU fallback
)
response.raise_for_status()
result = response.json()
```
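
Given the OpenAI-style response shape documented above, the assistant's text can be pulled out of `choices`. The helper below assumes each choice carries a `message` object with a `content` string, matching OpenAI's chat-completion layout; verify against the server's actual output.

```python
def extract_reply(result: dict) -> str:
    """Pull the assistant text out of an OpenAI-style /chat response.

    Assumes choices[0] holds a "message" dict with a "content" string,
    per the OpenAI chat-completion layout (an assumption here).
    """
    choices = result.get("choices") or []
    if not choices:
        raise ValueError("response contained no choices")
    return choices[0]["message"]["content"]

# Exercising it against the documented response shape:
sample = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "model": "Nanbeige/Nanbeige4.1-3B",
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": "Hi!"}}
    ],
    "usage": {"prompt_tokens": 20, "completion_tokens": 50, "total_tokens": 70},
}
print(extract_reply(sample))  # → Hi!
```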

## Hardware Requirements

- **GPU**: Recommended (CUDA-compatible)
- **CPU**: Fallback supported
- **Memory**: ~8GB RAM minimum

## Local Agent Repository

For the complete local agent system that connects to this Space, see the companion repository.