turtle170 committed on
Commit c9c4656 · verified · 1 Parent(s): 85c62fd

Update README.md

Files changed (1)
  1. README.md +7 -13
README.md CHANGED
@@ -1,7 +1,7 @@
 ---
 title: ZeroEngine V0.1
 emoji: 🚀
-colorFrom: blue
+colorFrom: gray
 colorTo: gray
 sdk: gradio
 sdk_version: 6.5.0
@@ -10,16 +10,10 @@ pinned: false
 license: apache-2.0
 ---
 
-# ZeroEngine System Kernel
-A specialized inference engine optimized for low-resource Hugging Face Spaces (2 vCPUs / 16GB RAM).
+# ZeroEngine V0.1 (Kernel)
+High-performance inference engine for 2-vCPU / 16GB RAM constraints.
 
-## Key Features
-- **Deterministic Partitioning**: Strictly splits 2 vCPUs between two concurrent users.
-- **Resource Gatekeeper**: Prevents OOM crashes with a strict 50% RAM model limit and 200MB system buffer.
-- **Ghosting Queue**: Enables pre-typing and background prompt preparation for queued users.
-- **Persistence Layer**: Tracks model popularity by pushing telemetry JSONs to the HF Hub via `HF_TOKEN`.
-
-## Hardware Specifications
-- **CPU**: 2 vCPUs (shared)
-- **RAM**: 16 GB (Shared)
-- **Optimization**: `llama-cpp` with mmap and single-core pinning per slot.
+## Optimizations
+- **KV-Cache Stitching**: Asynchronous pre-evaluation of queue inputs.
+- **Hard Partitioning**: Dedicated core assignment per concurrent user.
+- **Memory Mapping**: weights mapped via `mmap` to preserve RAM for context.
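
The README's "Hard Partitioning" item (dedicated core assignment per concurrent user) could look roughly like the sketch below. This is not taken from the ZeroEngine source: `CoreSlots` and `pin_current_process` are hypothetical names, and the two-core default only mirrors the 2-vCPU figure quoted in the README. `os.sched_setaffinity` is a real standard-library call, but it is only available on Linux, so the helper returns `False` elsewhere.

```python
import os
from typing import Dict, Optional


class CoreSlots:
    """Hand out one dedicated CPU core index per concurrent user (hypothetical helper)."""

    def __init__(self, cores: int = 2) -> None:
        self._free = list(range(cores))   # core indices not yet assigned
        self._owner: Dict[str, int] = {}  # user id -> core index

    def acquire(self, user: str) -> Optional[int]:
        """Return the user's core, or None when every core is already taken."""
        if user in self._owner:
            return self._owner[user]
        if not self._free:
            return None
        core = self._free.pop(0)
        self._owner[user] = core
        return core

    def release(self, user: str) -> None:
        """Give the user's core back to the pool; unknown users are ignored."""
        core = self._owner.pop(user, None)
        if core is not None:
            self._free.append(core)
            self._free.sort()


def pin_current_process(core: int) -> bool:
    """Pin the calling process to a single core where the OS supports it."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # e.g. macOS or Windows: no affinity API in the os module
    os.sched_setaffinity(0, {core})  # pid 0 means the calling process
    return True
```

With two cores, a third user gets `None` from `acquire` and would have to wait in the queue until `release` frees a slot; a worker serving a user would call `pin_current_process` with the core it was given.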