### 1. Purpose
The **NexaSci Agent Kit** is a self-contained, local-first agent stack built around:
- **NexaSci Assistant** — a 10B post-trained scientific reasoning model
- **SPECTER (or similar)** — a scientific paper embedding model
- **Tool Server** — FastAPI-based tool-calling backend
- **Sandbox Environment** — controlled Python execution + scientific libraries
- **Simple Web UI** — local interface for interactive use
The kit is designed to:
- Let technical users run the **full scientific agent locally** (on their own GPU)
- Provide a **reusable template** for future agents (e.g., SWE, bio, materials)
- Integrate **reasoning, retrieval, code, and scientific tools** in one place
- Avoid any requirement for hosted services / managed SaaS
----------
### 2. High-Level Architecture
**Components:**
1. **LLM: NexaSci Assistant**
- 10B model
- Post-trained for:
- tool calling (JSON ToolCall / ToolResult protocol)
- structured scientific outputs (hypothesis, methodology, limitations, etc.)
- paper usage + citations
- self-assessment (“I’m not sure → call tools”)
2. **Embedding Model: SPECTER (or similar)**
- Scientific document embedding model
- Used to:
- embed paper abstracts / sections
- perform semantic search over a local corpus
- support similarity queries for the agent
- Runs on CPU or GPU (optional acceleration)
3. **Tool Server (FastAPI)**
- Exposes tools to NexaSci:
- `python.run`: sandboxed Python executor
- `papers.search`: query external APIs or local index
- `papers.fetch`: get metadata/abstracts
- `papers.search_corpus`: query SPECTER-based local corpus (optional)
- Can be extended with:
- chemistry engines (e.g., RDKit-ish workflows)
    - PDE solvers (e.g., FEniCS-like wrappers)
- quantum simulation stubs
4. **Agent Controller**
- Orchestrates the agent loop:
- send user prompt + history to LLM
- parse tool calls
- call tool server
- feed back results
- stop on `final` message
- Stateless, minimal, and reusable across agents
5. **Web UI**
- Lightweight, local-only UI
- Provides:
- input box
- streaming output
- optional view of tool traces
- Built with something simple (e.g. FastAPI + HTML/JS, or Gradio/Streamlit)
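The components above talk to each other through a small JSON protocol (ToolCall out, ToolResult back). As a sketch of that round trip (the field names `tool`, `args`, `ok`, `stdout` are illustrative assumptions, not fixed by this spec):

```python
import json

# Hypothetical wire format for one tool-calling round trip.
# Field names ("tool", "args", "ok", ...) are illustrative assumptions.
tool_call = {
    "tool": "python.run",
    "args": {"code": "print(2 + 2)", "timeout_s": 5},
}

tool_result = {
    "tool": "python.run",
    "ok": True,
    "stdout": "4\n",
    "stderr": "",
    "artifacts": [],
}

# The controller serializes the call for the tool server and appends
# the result back into the conversation as a `tool` message.
payload = json.dumps(tool_call)
roundtrip = json.loads(payload)
```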
----------
### 3. Repository Layout
Proposed repo structure:
```
nexa-sci-agent-kit/
  SPEC.md
  README.md
  docker/
    Dockerfile            # GPU-accelerated base image
    docker-compose.yml    # optional, for combined agent+tools+ui
  agent/
    controller.py         # agent loop (LLM ↔ tools)
    client_llm.py         # NexaSci loading + chat interface (transformers/vLLM)
    tool_client.py        # HTTP client for FastAPI tools
    config.yaml           # model + server config (ports, endpoints, HF repo)
  tools/
    server.py             # FastAPI app exposing tools
    schemas.py            # Pydantic models for ToolCall/ToolResult
    python_sandbox.py     # sandboxing helpers
    paper_sources/
      arxiv_client.py
      pubmed_client.py
      corpus_search.py    # SPECTER-based local search
  webui/
    app.py                # minimal web server (can be Gradio/Streamlit/FastAPI)
    static/               # JS/CSS assets (if needed)
    templates/            # optional HTML templates
  examples/
    run_local_agent.py    # CLI demo (no UI)
    sample_prompts.md     # curated example prompts
  scripts/
    download_models.py    # pull NexaSci + SPECTER weights
    init_corpus.py        # optional: build local paper index
  install.sh              # convenience installer
  requirements.txt
```
This layout is **reusable**: swap `client_llm.py` + tools, and you have a SWE agent kit.
----------
### 4. Models
#### 4.1 NexaSci Assistant (LLM)
- **Weights:** hosted on Hugging Face (e.g. `darkstar/nexa-sci-10b`)
- **Form:** merged distilled + tool-calling QLoRA
- **Capabilities:**
- Hypothesis + methodology generation
- Tool calling (Python, paper search)
- Structured JSON final reports
- Uncertainty detection → calls tools when unsure
**Load options:**
- **Transformers** (`AutoModelForCausalLM`) for simplicity
- **vLLM** for GPU-accelerated inference with long contexts / parallel requests
Config in `agent/config.yaml`:
```yaml
model_repo: "darkstar/nexa-sci-10b"
backend: "vllm"            # or "transformers"
max_tokens: 1024
temperature: 0.3
top_p: 0.9
tool_prefix: "~~~toolcall"
tool_suffix: "~~~"
final_prefix: "~~~final"
final_suffix: "~~~"
```
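Given those sentinel strings, the controller can detect ToolCall and Final blocks with a simple scan. A minimal sketch (the function name and error handling are assumptions):

```python
import json

TOOL_PREFIX, TOOL_SUFFIX = "~~~toolcall", "~~~"
FINAL_PREFIX, FINAL_SUFFIX = "~~~final", "~~~"

def extract_block(text: str, prefix: str, suffix: str):
    """Return the JSON payload between prefix and suffix, or None if absent."""
    start = text.find(prefix)
    if start == -1:
        return None
    start += len(prefix)
    end = text.find(suffix, start)
    if end == -1:
        return None
    return json.loads(text[start:end])

# Example model output containing one tool call.
out = 'Let me check. ~~~toolcall {"tool": "python.run", "args": {"code": "print(1)"}} ~~~'
call = extract_block(out, TOOL_PREFIX, TOOL_SUFFIX)
```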
#### 4.2 Embedding Model (SPECTER or similar)
- **Weights:** e.g. a SPECTER HF repo
- **Use:**
- embed titles/abstracts/sections
- populate FAISS / similar index
- support `papers.search_corpus` tool
Config in `agent/config.yaml`:
```yaml
embedding_model_repo: "allenai/specter2_base"   # example
embedding_device: "cuda"                        # or "cpu"
```
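At its core, corpus search is nearest-neighbour lookup over embedding vectors. A dependency-free sketch of the ranking step (in practice the vectors would come from SPECTER and live in a FAISS index; the toy 3-d vectors here are placeholders):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search_corpus(query_vec, corpus, top_k=2):
    """Rank papers by cosine similarity of their (precomputed) embeddings."""
    scored = [(cosine(query_vec, vec), pid) for pid, vec in corpus.items()]
    scored.sort(reverse=True)
    return [pid for _, pid in scored[:top_k]]

# Toy 3-d "embeddings" standing in for SPECTER outputs.
corpus = {
    "paper_a": [1.0, 0.0, 0.0],
    "paper_b": [0.9, 0.1, 0.0],
    "paper_c": [0.0, 1.0, 0.0],
}
hits = search_corpus([1.0, 0.05, 0.0], corpus)
```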
----------
### 5. Tool Server & Sandbox
#### 5.1 FastAPI Tool Server
`tools/server.py`:
- Endpoint examples:
- `POST /tools/python.run`
- Input: `{ "code": "...", "timeout_s": 5 }`
- Output: `{ "stdout": "...", "stderr": "...", "artifacts": [] }`
- `POST /tools/papers.search`
- Input: `{ "query": "...", "top_k": 10 }`
- Output: `[ { "title": "...", "abstract": "...", "doi": "...", "year": 2020 } ]`
- `POST /tools/papers.fetch`
- Input: `{ "doi": "10.XXXX/..." }`
- Output: `{ "title": "...", "abstract": "...", "bibtex": "...", ... }`
- `POST /tools/papers.search_corpus` (optional, embedding-based)
- Input: `{ "query": "...", "top_k": 20 }`
- Output: `[ { "paper_id": "...", "title": "...", "abstract": "...", "score": 0.87 } ]`
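Behind these endpoints sits a simple name→handler dispatch. A framework-free sketch of the pattern (`server.py` would wrap each handler in a FastAPI route; the registry and handler bodies here are assumptions):

```python
TOOLS = {}

def tool(name):
    """Register a handler under a tool name (e.g. 'papers.search')."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("papers.search")
def papers_search(query: str, top_k: int = 10):
    # Real implementation would hit arXiv/PubMed or the local index.
    return [{"title": f"stub result for {query!r}", "year": 2020}][:top_k]

def dispatch(name, args):
    """Route a ToolCall to its handler; unknown tools become error results."""
    if name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    return TOOLS[name](**args)

result = dispatch("papers.search", {"query": "protein folding", "top_k": 1})
```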
#### 5.2 Python Sandbox
`tools/python_sandbox.py` handles:
- Execution in a restricted namespace:
- `numpy`, `scipy`, `pandas`, `matplotlib` available
- optional domain libs: `sympy`, `rdkit`, `ase`, simple PDE solvers
- Constraints:
- time limit (e.g. 5–10 seconds)
- memory limit (via resource module)
- no file system access outside a temp dir
- no network
- Returns:
- stdout / stderr
- optional artifact paths (e.g. plots in `/tmp/artifacts`)
This gives the agent a **safe-ish** playground for:
- simple chemistry calcs
- ODE/PDE toy simulations
- statistical summaries
- plotting
(Domain-heavy engines can be added as specialized tools later.)
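A minimal version of the sandbox runner, using a fresh subprocess with a wall-clock timeout (the memory limits and temp-dir isolation listed above are omitted for brevity; on POSIX they could be added via the `resource` module in a `preexec_fn`):

```python
import subprocess
import sys

def run_sandboxed(code: str, timeout_s: float = 5.0):
    """Execute `code` in a fresh Python process and capture stdout/stderr."""
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
        return {"stdout": proc.stdout, "stderr": proc.stderr, "artifacts": []}
    except subprocess.TimeoutExpired:
        return {"stdout": "", "stderr": f"timed out after {timeout_s}s", "artifacts": []}

out = run_sandboxed("print(sum(range(10)))")
```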
----------
### 6. Agent Controller
`agent/controller.py` implements the core loop:
1. Initialize messages with:
- system prompt (scientific assistant, tool protocol)
- user prompt
2. Call `client_llm.generate(messages)`
3. Parse output:
- If it contains a ToolCall block → parse JSON → dispatch via `tool_client.py`
- Append a `tool` message with the tool result
4. Repeat until a Final block is produced
5. Return final JSON + pretty-render (for UI)
Design goals:
- Keep controller **stateless** and **minimal**
- Use a small set of message roles: `system`, `user`, `assistant`, `tool`
- Make it trivial to plug in a different LLM backend
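With stubs in place of the real model and tool server, the loop above fits in a few lines. A sketch (the `generate`/`call_tool` callables stand in for `client_llm.generate` and `tool_client`; the reply shape is an assumption):

```python
def run_agent(user_prompt, generate, call_tool, max_steps=8):
    """Minimal agent loop: LLM -> tool call -> tool result -> LLM, until final."""
    messages = [
        {"role": "system", "content": "You are a scientific assistant."},
        {"role": "user", "content": user_prompt},
    ]
    for _ in range(max_steps):
        reply = generate(messages)  # dict with either "tool_call" or "final"
        messages.append({"role": "assistant", "content": reply})
        if "final" in reply:
            return reply["final"]
        result = call_tool(reply["tool_call"])
        messages.append({"role": "tool", "content": result})
    return None  # safety valve: give up after max_steps

# Stub LLM: call one tool, then answer from the tool result.
def fake_generate(messages):
    if messages[-1]["role"] == "tool":
        return {"final": "4"}
    return {"tool_call": {"tool": "python.run", "args": {"code": "print(2+2)"}}}

answer = run_agent("What is 2+2?", fake_generate, lambda call: {"stdout": "4\n"})
```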
----------
### 7. Web UI
`webui/app.py`:
- Provides a local web interface:
- text area for prompt
- dropdown for “mode” (e.g. “Explain paper”, “Design experiment”, “Run simulation”)
- button to run agent
- area to show:
- final answer
- optional tool trace (expandable)
- Implementation options:
- **Gradio**: fastest way to get a web UI
- **Streamlit**: also easy, nice for scientists
- Or a simple HTML/JS frontend served via FastAPI
This is _local-only_ by default.
----------
### 8. Docker & GPU Acceleration
#### 8.1 Dockerfile
`docker/Dockerfile` (conceptual spec):
- Base image: `nvidia/cuda:12.x-cudnn-runtime-ubuntu20.04`
- Install:
- Python 3.10+
- `pip`, `uv` or `conda` (your call)
- `torch` + CUDA
- `transformers`, `vllm` (optional)
- `fastapi`, `uvicorn`
- `sentence-transformers` or `specter` deps
- `numpy`, `scipy`, `pandas`, `matplotlib`
- any light scientific deps you want in v1
- Copy repo
- `pip install -r requirements.txt`
- Default `CMD`:
- either start tool server OR start web UI
- Docker Compose can spin up both
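A hedged sketch of what `docker/Dockerfile` could look like (the image tag and package set are illustrative assumptions; check current NVIDIA/PyTorch compatibility before pinning anything):

```dockerfile
# Illustrative only -- tags and versions are assumptions, not tested pins.
FROM nvidia/cuda:12.2.0-cudnn8-runtime-ubuntu22.04

RUN apt-get update && apt-get install -y --no-install-recommends \
        python3 python3-pip && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt

COPY . .

# 8000: tool server, 7860: web UI
EXPOSE 8000 7860

# Default: start the tool server; override CMD (or use docker-compose) for the UI.
CMD ["uvicorn", "tools.server:app", "--host", "0.0.0.0", "--port", "8000"]
```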
#### 8.2 Usage
Example:
```bash
docker build -t nexa-sci-agent-kit -f docker/Dockerfile .
docker run --gpus all -p 8000:8000 -p 7860:7860 nexa-sci-agent-kit
```
This should bring up:
- tool server on port 8000
- web UI on port 7860
----------
### 9. Reusability / Template Design
The kit is meant to be cloned as:
- `nexa-sci-agent-kit` → scientific agent
- `nexa-swe-agent-kit` → SWE/debugging agent
- etc.
To create a new agent kit, you:
- swap `model_repo` in `config.yaml`
- swap or extend tools in `tools/server.py`
- adjust system prompt in `agent/client_llm.py`
- optionally adjust UI text
Everything else stays the same.
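For example, retargeting the kit as an SWE agent might start with nothing more than a config change (the repo and tool names below are hypothetical):

```yaml
# agent/config.yaml for a hypothetical SWE variant
model_repo: "darkstar/nexa-swe-10b"   # hypothetical HF repo
backend: "vllm"
# tools/server.py would register e.g. git.diff / tests.run instead of papers.*
```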