Spaces:

build-small-hackathon
/

trace-field-notes

Running on Zero

App Files Files Community

trace-field-notes / README.md

JacobLinCool

docs: list team hf username

326ac0d verified 17 days ago

preview code

Raw

History Blame Contribute Delete

6.14 kB

A newer version of the Gradio SDK is available: 6.19.0

Upgrade

metadata

title: Trace Field Notes
emoji: 🧭
colorFrom: green
colorTo: gray
sdk: gradio
sdk_version: 6.16.0
app_file: app.py
pinned: false
license: mit
short_description: Qualitative field reports for coding-agent session traces.
tags:
  - build-small
  - track:backyard
  - sponsor:openbmb
  - sponsor:openai
  - sponsor:nvidia
  - achievement:offbrand
  - achievement:fieldnotes
  - gradio-server
  - zerogpu
  - coding-agents
models:
  - openbmb/MiniCPM5-1B
  - nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
  - openai/privacy-filter

Trace Field Notes

Trace Field Notes turns long coding-agent session logs into qualitative field reports: where the agent got stuck, how it detoured, what it tried, how it recovered, and whether its final claim matched its own evidence.

Most agent traces are too long to read after the fact. Tool telemetry is noisy, private, and often the wrong level of detail. This app focuses on a narrower question: what did the agent say about its own work while it was solving a task? The answer becomes a field notebook, not a benchmark.

Team

HF username: @JacobLinCool

Who it is for

Trace Field Notes is for developers, researchers, and hackathon builders who use Codex, Claude Code, Pi Agent, or similar coding agents and want to understand the session narrative after the code is written:

Was the agent blocked, or just exploring?
Did it change strategy for a good reason?
Did a detour produce a better route?
Did the closeout claim overstate what was verified?
What can the next run learn from this one?

The app does not claim to inspect hidden reasoning or prove that the final code is correct. It reports the visible narrative the agent wrote.

How to use it

Find a local coding-agent session log.
Review and redact anything sensitive before upload.
Upload .jsonl, .json, .txt, or .log.
Choose the analysis engine:
- Quick analysis: openbmb/MiniCPM5-1B
- Deeper analysis: nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
- Rule-based: deterministic codebook, no model
Choose GPU for the Hugging Face ZeroGPU path or CPU for a no-quota run.
Read the report: verdict, trail map, episode detail, terrain groups, detour analysis, closeout audit, and redacted narrative export.

Common local trace locations:

# Codex
ls ~/.codex/sessions

# Claude Code
ls ~/.claude/projects

# Pi Agent
ls ~/.pi/agent/sessions

Technology

The frontend is a custom React field-notebook UI served through gradio.Server. It deliberately avoids the default Gradio component look so the report feels like a qualitative trail map rather than a form.

The backend pipeline is:

parser.py loads Codex, Claude Code, Pi Agent, JSONL, JSON, text, and log files into visible narrative messages.
redaction.py applies deterministic secret and PII patterns.
privacy_filter.py optionally adds openai/privacy-filter on the Space GPU.
analyzer.py identifies difficulty episodes and classifies them with a deterministic codebook.
model_runtime.py optionally asks MiniCPM5 1B or Nemotron 3 Nano 30B-A3B to rewrite the analysis into a richer structured field report.
view_model.py adapts the result into the JSON shape rendered by the UI.
profiling.py logs per-stage timing and resource snapshots to server logs.

The app streams real progress events so long runs do not look frozen: upload, extract, redact, chart, classify, synthesize, and model analysis.

Build Small fit

Trace Field Notes targets the Backyard AI track: it solves a specific, practical problem for people already using coding agents.

It also targets these Build Small prizes / badges:

Best Use of Codex: Codex helped develop, debug, package, document, and produce the demo video. The connected GitHub history includes Codex-attributed commits.
Best MiniCPM Build: Quick analysis uses openbmb/MiniCPM5-1B.
Nemotron Hardware Prize: Deeper analysis uses nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16.
Off Brand: the app uses gradio.Server with a custom React trail-map UI, not stock Gradio blocks.
Best Demo: the repo includes a polished demo video, article draft, and public X post.

It does not target Tiny Titan because the optional Nemotron path is 30B, and it does not target Best Use of Modal because the runtime is Hugging Face ZeroGPU / CPU, not Modal.

Privacy posture

Agent traces can include prompts, tool inputs, command output, local paths, screenshots, secrets, private source code, and personal data. Review and redact before uploading or sharing.

By default, Trace Field Notes:

ignores raw tool-call contents;
analyzes only visible assistant narrative messages plus optional user context;
runs deterministic secret redaction;
can run openai/privacy-filter for a second PII pass;
exports only redacted narrative text.

Local development

python3.11 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python app.py

Run tests:

python3.11 -m unittest discover -s tests

Optional environment settings are listed in .env.example.

Codex contribution

Codex assisted with repository inspection, implementation debugging, test verification, privacy/README hardening, Hugging Face deployment preparation, demo-video scripting, voiceover generation, video composition, frame/ASR verification, and hackathon submission packaging.