NexaSci Agent Kit — Specification

1. Purpose

The NexaSci Agent Kit is a self-contained, local-first agent stack built around:

  • NexaSci Assistant — a 10B-parameter, post-trained scientific reasoning model

  • SPECTER (or similar) — a scientific paper embedding model

  • Tool Server — FastAPI-based tool-calling backend

  • Sandbox Environment — controlled Python execution + scientific libraries

  • Simple Web UI — local interface for interactive use

The kit is designed to:

  • Let technical users run the full scientific agent locally (on their own GPU)

  • Provide a reusable template for future agents (e.g., SWE, bio, materials)

  • Integrate reasoning, retrieval, code, and scientific tools in one place

  • Avoid any requirement for hosted services / managed SaaS


2. High-Level Architecture

Components:

  1. LLM: NexaSci Assistant

    • 10B model

    • Post-trained for:

      • tool calling (JSON ToolCall / ToolResult protocol)

      • structured scientific outputs (hypothesis, methodology, limitations, etc.)

      • paper usage + citations

      • self-assessment (“I’m not sure → call tools”)

  2. Embedding Model: SPECTER (or similar)

    • Scientific document embedding model

    • Used to:

      • embed paper abstracts / sections

      • perform semantic search over a local corpus

      • support similarity queries for the agent

    • Runs on CPU or GPU (optional acceleration)

  3. Tool Server (FastAPI)

    • Exposes tools to NexaSci:

      • python.run: sandboxed Python executor

      • papers.search: query external APIs or local index

      • papers.fetch: get metadata/abstracts

      • papers.search_corpus: query SPECTER-based local corpus (optional)

    • Can be extended with:

      • chemistry engines (e.g., RDKit-ish workflows)

      • PDE solvers (e.g., FEniCS-like wrappers)

      • quantum simulation stubs

  4. Agent Controller

    • Orchestrates the agent loop:

      • send user prompt + history to LLM

      • parse tool calls

      • call tool server

      • feed back results

      • stop on final message

    • Stateless, minimal, and reusable across agents

  5. Web UI

    • Lightweight, local-only UI

    • Provides:

      • input box

      • streaming output

      • optional view of tool traces

    • Built with something simple (e.g. FastAPI + HTML/JS, or Gradio/Streamlit)
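The JSON ToolCall / ToolResult protocol mentioned for the LLM above can be as small as one object per direction. The field names below are illustrative assumptions, not fixed by this spec — a ToolCall emitted by the model:

```json
{
  "tool": "papers.search",
  "arguments": { "query": "graphene thermal conductivity", "top_k": 5 }
}
```

and the matching ToolResult fed back to it:

```json
{
  "tool": "papers.search",
  "ok": true,
  "result": [ { "title": "…", "doi": "…", "year": 2020 } ]
}
```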


3. Repository Layout

Proposed repo structure:

```
nexa-sci-agent-kit/
├── SPEC.md
├── README.md
├── docker/
│   ├── Dockerfile            # GPU-accelerated base image
│   └── docker-compose.yml    # optional, for combined agent+tools+ui
├── agent/
│   ├── controller.py         # agent loop (LLM ↔ tools)
│   ├── client_llm.py         # NexaSci loading + chat interface (transformers/vLLM)
│   ├── tool_client.py        # HTTP client for FastAPI tools
│   └── config.yaml           # model + server config (ports, endpoints, HF repo)
├── tools/
│   ├── server.py             # FastAPI app exposing tools
│   ├── schemas.py            # Pydantic models for ToolCall/ToolResult
│   ├── python_sandbox.py     # sandboxing helpers
│   └── paper_sources/
│       ├── arxiv_client.py
│       ├── pubmed_client.py
│       └── corpus_search.py  # SPECTER-based local search
├── webui/
│   ├── app.py                # minimal web server (can be Gradio/Streamlit/FastAPI)
│   ├── static/               # JS/CSS assets (if needed)
│   └── templates/            # optional HTML templates
├── examples/
│   ├── run_local_agent.py    # CLI demo (no UI)
│   └── sample_prompts.md     # curated example prompts
├── scripts/
│   ├── download_models.py    # pull NexaSci + SPECTER weights
│   ├── init_corpus.py        # optional: build local paper index
│   └── install.sh            # convenience installer
└── requirements.txt
```

This layout is reusable: swap client_llm.py + tools, and you have a SWE agent kit.


4. Models

4.1 NexaSci Assistant (LLM)

  • Weights: hosted on Hugging Face (e.g. darkstar/nexa-sci-10b)

  • Form: merged distilled + tool-calling QLoRA

  • Capabilities:

    • Hypothesis + methodology generation

    • Tool calling (Python, paper search)

    • Structured JSON final reports

    • Uncertainty detection → calls tools when unsure

Load options:

  • Transformers (AutoModelForCausalLM) for simplicity

  • vLLM for GPU-accelerated inference with long contexts / parallel requests

Config in agent/config.yaml:

```yaml
model_repo: "darkstar/nexa-sci-10b"
backend: "vllm"          # or "transformers"
max_tokens: 1024
temperature: 0.3
top_p: 0.9
tool_prefix: "~~~toolcall"
tool_suffix: "~~~"
final_prefix: "~~~final"
final_suffix: "~~~"
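The `tool_prefix` / `final_prefix` delimiters imply that the controller must classify each model turn as a tool call, a final answer, or plain text. A minimal sketch, assuming a block grammar of prefix, JSON body, closing `~~~` fence (the function name `parse_model_output` is an assumption, not part of the kit):

```python
import json
import re

# Patterns matching the tool_prefix/tool_suffix and final_prefix/final_suffix
# values from agent/config.yaml above.
TOOL_RE = re.compile(r"~~~toolcall\s*(\{.*?\})\s*~~~", re.DOTALL)
FINAL_RE = re.compile(r"~~~final\s*(\{.*?\})\s*~~~", re.DOTALL)

def parse_model_output(text: str):
    """Classify raw model output as a tool call, a final answer, or plain text."""
    if (m := TOOL_RE.search(text)):
        return ("toolcall", json.loads(m.group(1)))
    if (m := FINAL_RE.search(text)):
        return ("final", json.loads(m.group(1)))
    return ("text", text)
```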

4.2 Embedding Model (SPECTER or similar)

  • Weights: e.g. a SPECTER HF repo

  • Use:

    • embed titles/abstracts/sections

    • populate FAISS / similar index

    • support papers.search_corpus tool

Config in agent/config.yaml:

```yaml
embedding_model_repo: "allenai/specter2_base"  # example
embedding_device: "cuda"                       # or "cpu"
```
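Under the hood, papers.search_corpus is nearest-neighbour search over embedding vectors. The real kit would use SPECTER embeddings and a FAISS (or similar) index; the dependency-free sketch below substitutes toy vectors and brute-force cosine similarity to show the shape of the ranking step (function and field names are assumptions):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search_corpus(query_vec, corpus, top_k=2):
    """Rank corpus entries by cosine similarity to the query embedding.

    corpus: list of {"paper_id": ..., "vector": [...]} dicts. With SPECTER
    the vectors would be 768-dimensional; toy 3-d vectors suffice here.
    """
    scored = [
        {"paper_id": doc["paper_id"], "score": cosine(query_vec, doc["vector"])}
        for doc in corpus
    ]
    return sorted(scored, key=lambda d: d["score"], reverse=True)[:top_k]
```

In the kit itself, brute-force search would be replaced by a FAISS index built by scripts/init_corpus.py, but the returned shape (paper_id plus score) matches the papers.search_corpus output described in section 5.1.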


5. Tool Server & Sandbox

5.1 FastAPI Tool Server

tools/server.py:

  • Endpoint examples:

    • POST /tools/python.run

      • Input: { "code": "...", "timeout_s": 5 }

      • Output: { "stdout": "...", "stderr": "...", "artifacts": [] }

    • POST /tools/papers.search

      • Input: { "query": "...", "top_k": 10 }

      • Output: [ { "title": "...", "abstract": "...", "doi": "...", "year": 2020 } ]

    • POST /tools/papers.fetch

      • Input: { "doi": "10.XXXX/..." }

      • Output: { "title": "...", "abstract": "...", "bibtex": "...", ... }

    • POST /tools/papers.search_corpus (optional, embedding-based)

      • Input: { "query": "...", "top_k": 20 }

      • Output: [ { "paper_id": "...", "title": "...", "abstract": "...", "score": 0.87 } ]
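Inside tools/server.py, the endpoint surface above can route through a small name-to-handler dispatch table, which keeps the tool set easy to extend. A sketch with stub handlers (the real handlers would call the sandbox and paper clients; all names here are illustrative):

```python
def run_python(args: dict) -> dict:
    # Stub: the real handler would invoke tools/python_sandbox.py.
    return {"stdout": "", "stderr": "", "artifacts": []}

def search_papers(args: dict) -> dict:
    # Stub: the real handler would query arXiv/PubMed or the local index.
    return {"results": [], "query": args.get("query", "")}

TOOL_HANDLERS = {
    "python.run": run_python,
    "papers.search": search_papers,
}

def dispatch(tool_name: str, args: dict) -> dict:
    """Route a ToolCall to its handler; unknown tools become structured errors."""
    handler = TOOL_HANDLERS.get(tool_name)
    if handler is None:
        return {"error": f"unknown tool: {tool_name}"}
    return handler(args)
```

Adding a chemistry engine or PDE solver is then one new handler plus one new entry in the table.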

5.2 Python Sandbox

tools/python_sandbox.py handles:

  • Execution in a restricted namespace:

    • numpy, scipy, pandas, matplotlib available

    • optional domain libs: sympy, rdkit, ase, simple PDE solvers

  • Constraints:

    • time limit (e.g. 5–10 seconds)

    • memory limit (via resource module)

    • no file system access outside a temp dir

    • no network

  • Returns:

    • stdout / stderr

    • optional artifact paths (e.g. plots in /tmp/artifacts)

This gives the agent a safe-ish playground for:

  • simple chemistry calcs

  • ODE/PDE toy simulations

  • statistical summaries

  • plotting

(Domain-heavy engines can be added as specialized tools later.)
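One minimal way to get the time limit and process isolation described above is to run each snippet in a fresh interpreter via subprocess; the memory cap and filesystem/network restrictions need additional layers (e.g. the resource module, or an OS-level sandbox). A sketch, not the kit's actual python_sandbox.py:

```python
import subprocess
import sys

def run_sandboxed(code: str, timeout_s: int = 5) -> dict:
    """Execute code in a fresh, isolated Python process with a wall-clock limit.

    -I runs Python in isolated mode (no user site-packages, environment
    variables ignored at startup). This is NOT a full sandbox: memory,
    filesystem, and network restrictions must be layered on separately.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],
            capture_output=True, text=True, timeout=timeout_s,
        )
        return {"stdout": proc.stdout, "stderr": proc.stderr, "artifacts": []}
    except subprocess.TimeoutExpired:
        return {"stdout": "", "stderr": f"timeout after {timeout_s}s", "artifacts": []}
```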


6. Agent Controller

agent/controller.py implements the core loop:

  1. Initialize messages with:

    • system prompt (scientific assistant, tool protocol)

    • user prompt

  2. Call client_llm.generate(messages)

  3. Parse output:

    • If it contains a ToolCall block → parse JSON → dispatch via tool_client.py

    • Append a tool message with the tool result

  4. Repeat until a Final block is produced

  5. Return final JSON + pretty-render (for UI)

Design goals:

  • Keep controller stateless and minimal

  • Use a small set of message roles: system, user, assistant, tool

  • Make it trivial to plug in a different LLM backend
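Steps 1–5 reduce to a short loop. The sketch below stubs the LLM and the tool dispatch with plain callables, and assumes each LLM turn is already classified as a tool call or a final answer (the real controller would parse the ~~~toolcall / ~~~final blocks itself); `run_agent` and its signature are illustrative, not the kit's API:

```python
def run_agent(user_prompt, generate, call_tool,
              system_prompt="You are a scientific assistant.", max_turns=8):
    """Minimal stateless agent loop: LLM <-> tools until a final answer.

    generate(messages) -> ("toolcall", {"tool": ..., "arguments": ...})
                       or ("final", payload)
    call_tool(name, arguments) -> tool result (any JSON-serializable value)
    """
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]
    for _ in range(max_turns):
        kind, payload = generate(messages)
        if kind == "final":
            return payload, messages
        # Record the tool call, run it, and feed the result back as a tool message.
        messages.append({"role": "assistant", "content": payload})
        result = call_tool(payload["tool"], payload["arguments"])
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("agent did not finish within max_turns")
```

Because the loop holds no state of its own and only sees the four message roles, swapping in a different LLM backend means swapping only the `generate` callable.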


7. Web UI

webui/app.py:

  • Provides a local web interface:

    • text area for prompt

    • dropdown for “mode” (e.g. “Explain paper”, “Design experiment”, “Run simulation”)

    • button to run agent

    • area to show:

      • final answer

      • optional tool trace (expandable)

  • Implementation options:

    • Gradio: fastest way to get a web UI

    • Streamlit: also easy, nice for scientists

    • Or a simple HTML/JS frontend served via FastAPI

This is local-only by default.


8. Docker & GPU Acceleration

8.1 Dockerfile

docker/Dockerfile (conceptual spec):

  • Base image: an NVIDIA CUDA runtime image, e.g. nvidia/cuda:12.4.1-cudnn-runtime-ubuntu22.04 (CUDA 12.x images target Ubuntu 22.04, not 20.04)

  • Install:

    • Python 3.10+

    • pip, uv or conda (your call)

    • torch + CUDA

    • transformers, vllm (optional)

    • fastapi, uvicorn

    • sentence-transformers or specter deps

    • numpy, scipy, pandas, matplotlib

    • any light scientific deps you want in v1

  • Copy repo

  • pip install -r requirements.txt

  • Default CMD:

    • either start tool server OR start web UI

    • Docker Compose can spin both

8.2 Usage

Example:

```
docker build -t nexa-sci-agent-kit -f docker/Dockerfile .
docker run --gpus all -p 8000:8000 -p 7860:7860 nexa-sci-agent-kit
```

This should bring up:

  • tool server on port 8000

  • web UI on port 7860
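The two-service setup above maps naturally onto docker/docker-compose.yml; a sketch under the assumption that the tool server is a uvicorn app and the UI is started by webui/app.py (service names and commands are assumptions):

```yaml
services:
  tools:
    build: { context: ., dockerfile: docker/Dockerfile }
    command: uvicorn tools.server:app --host 0.0.0.0 --port 8000
    ports: ["8000:8000"]
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
  webui:
    build: { context: ., dockerfile: docker/Dockerfile }
    command: python webui/app.py
    ports: ["7860:7860"]
    depends_on: [tools]
```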


9. Reusability / Template Design

The kit is meant to be cloned as:

  • nexa-sci-agent-kit → scientific agent

  • nexa-swe-agent-kit → SWE/debugging agent

  • etc.

To create a new agent kit, you:

  • swap model_repo in config.yaml

  • swap or extend tools in tools/server.py

  • adjust system prompt in agent/client_llm.py

  • optionally adjust UI text

Everything else stays the same.
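For example, turning the kit into a SWE agent might touch only a few lines of agent/config.yaml (the repo name below is hypothetical):

```yaml
model_repo: "darkstar/nexa-swe-10b"   # hypothetical SWE model
backend: "vllm"
# tools/server.py gains e.g. repo-search / test-runner endpoints;
# the controller, sandbox, and UI are unchanged.
```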