---
title: Trace Field Notes
emoji: 🧭
colorFrom: green
colorTo: gray
sdk: gradio
sdk_version: 6.16.0
app_file: app.py
pinned: false
license: mit
short_description: Qualitative field reports for coding-agent session traces.
tags:
- build-small
- track:backyard
- sponsor:openbmb
- sponsor:openai
- sponsor:nvidia
- achievement:offbrand
- achievement:fieldnotes
- gradio-server
- zerogpu
- coding-agents
models:
- openbmb/MiniCPM5-1B
- nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
- openai/privacy-filter
---
# Trace Field Notes
Trace Field Notes turns long coding-agent session logs into qualitative field
reports: where the agent got stuck, how it detoured, what it tried, how it
recovered, and whether its final claim matched its own evidence.

Most agent traces are too long to read after the fact. Tool telemetry is noisy,
private, and often the wrong level of detail. This app focuses on a narrower
question: what did the agent *say* about its own work while it was solving a
task? The answer becomes a field notebook, not a benchmark.
## Links
- Live Space: https://huggingface.co/spaces/build-small-hackathon/trace-field-notes
- App runtime: https://build-small-hackathon-trace-field-notes.hf.space/
- GitHub: https://github.com/JacobLinCool/trace-field-notes
- Demo video: https://youtu.be/1QNZlqkl8zo
- Demo MP4 asset: https://huggingface.co/spaces/build-small-hackathon/trace-field-notes/resolve/main/assets/trace-field-notes-demo.mp4
- Article draft: [`docs/article.md`](docs/article.md)
- Social post draft: [`docs/social-post.md`](docs/social-post.md)
- Public X post: https://x.com/JacobLinCool/status/2066160425952334155
## Team
- HF username: [@JacobLinCool](https://huggingface.co/JacobLinCool)
## Who it is for
Trace Field Notes is for developers, researchers, and hackathon builders who use
Codex, Claude Code, Pi Agent, or similar coding agents and want to understand
the session narrative after the code is written:
- Was the agent blocked, or just exploring?
- Did it change strategy for a good reason?
- Did a detour produce a better route?
- Did the closeout claim overstate what was verified?
- What can the next run learn from this one?

The app does **not** claim to inspect hidden reasoning or prove that the final
code is correct. It reports the visible narrative the agent wrote.
## How to use it
1. Find a local coding-agent session log.
2. Review and redact anything sensitive before upload.
3. Upload `.jsonl`, `.json`, `.txt`, or `.log`.
4. Choose the analysis engine:
   - **Quick analysis**: `openbmb/MiniCPM5-1B`
   - **Deeper analysis**: `nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16`
   - **Rule-based**: deterministic codebook, no model
5. Choose **GPU** for the Hugging Face ZeroGPU path or **CPU** for a no-quota
   run.
6. Read the report: verdict, trail map, episode detail, terrain groups, detour
   analysis, closeout audit, and redacted narrative export.

Common local trace locations:
```bash
# Codex
ls ~/.codex/sessions
# Claude Code
ls ~/.claude/projects
# Pi Agent
ls ~/.pi/agent/sessions
```
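
If those directories hold many sessions, a short script can surface the most recent candidates. This is a minimal sketch, not part of the app: the directories and upload extensions come from this README, while `newest_traces` and the recursive search are assumptions about how the files are laid out.

```python
from pathlib import Path

# Directories taken from the listing above. Their internal layout is not
# documented here, so the search is recursive and extension-based.
TRACE_DIRS = {
    "Codex": Path.home() / ".codex" / "sessions",
    "Claude Code": Path.home() / ".claude" / "projects",
    "Pi Agent": Path.home() / ".pi" / "agent" / "sessions",
}
UPLOADABLE = {".jsonl", ".json", ".txt", ".log"}


def newest_traces(root: Path, limit: int = 5) -> list[Path]:
    """Return up to `limit` uploadable files under `root`, newest first."""
    if not root.is_dir():
        return []
    files = [
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in UPLOADABLE
    ]
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)[:limit]


if __name__ == "__main__":
    for agent, root in TRACE_DIRS.items():
        print(f"{agent}: {root}")
        for path in newest_traces(root):
            print(f"  {path}")
```

Review and redact whichever file you pick before uploading it.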
## Technology
The frontend is a custom React field-notebook UI served through `gradio.Server`.
It deliberately avoids the default Gradio component look so the report feels
like a qualitative trail map rather than a form.

The backend pipeline is:
1. `parser.py` loads Codex, Claude Code, and Pi Agent sessions, as well as
generic JSONL, JSON, text, and log files, and extracts the visible narrative messages.
2. `redaction.py` applies deterministic secret and PII patterns.
3. `privacy_filter.py` optionally adds `openai/privacy-filter` on the Space GPU.
4. `analyzer.py` identifies difficulty episodes and classifies them with a
deterministic codebook.
5. `model_runtime.py` optionally asks MiniCPM5 1B or Nemotron 3 Nano 30B-A3B to
rewrite the analysis into a richer structured field report.
6. `view_model.py` adapts the result into the JSON shape rendered by the UI.
7. `profiling.py` logs per-stage timing and resource snapshots to server logs.
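
To make the "deterministic codebook" idea in step 4 concrete, here is a minimal sketch of keyword-based episode tagging. It is not the code in `analyzer.py`: the labels, cue phrases, `Episode` class, and `classify` function are all hypothetical and only illustrate a rule-based pass that needs no model.

```python
import re
from dataclasses import dataclass

# Hypothetical codebook: labels and cue phrases are illustrative only and
# are not taken from analyzer.py.
CODEBOOK = {
    "blocked": [r"\bstuck\b", r"\bstill failing\b", r"\bcannot\b", r"\bcan't\b"],
    "detour": [r"\binstead\b", r"\bdifferent approach\b", r"\bworkaround\b"],
    "recovery": [r"\bnow passes\b", r"\bfixed\b", r"\bresolved\b"],
}


@dataclass
class Episode:
    index: int  # position of the message in the session
    label: str  # codebook label that matched
    text: str   # the narrative message itself


def classify(messages: list[str]) -> list[Episode]:
    """Tag each message with the first codebook label whose cue matches."""
    episodes = []
    for i, text in enumerate(messages):
        for label, cues in CODEBOOK.items():
            if any(re.search(cue, text, re.IGNORECASE) for cue in cues):
                episodes.append(Episode(i, label, text))
                break  # first match wins, so the same input gives the same output
    return episodes


if __name__ == "__main__":
    session = [
        "Reading the config loader.",
        "The test is still failing and I am stuck.",
        "I'll try a different approach instead.",
        "The suite now passes.",
    ]
    for episode in classify(session):
        print(episode.index, episode.label)
```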
The app streams real progress events so long runs do not look frozen: upload,
extract, redact, chart, classify, synthesize, and model analysis.
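
One way to produce such events is a generator that reports each stage as it starts and finishes. The sketch below is an assumption about the pattern, not the app's code: the stage names come from the list above, while `run_with_progress` and the event fields are hypothetical.

```python
import json
import time
from typing import Callable, Iterator

# Stage names follow the list above; the event shape is an assumption.
STAGES = [
    "upload", "extract", "redact", "chart",
    "classify", "synthesize", "model analysis",
]


def run_with_progress(steps: dict[str, Callable[[], None]]) -> Iterator[dict]:
    """Run each stage in order and yield an event before and after it."""
    total = len(STAGES)
    for position, stage in enumerate(STAGES, start=1):
        yield {"stage": stage, "status": "started", "step": position, "total": total}
        started = time.perf_counter()
        steps.get(stage, lambda: None)()  # stages without a handler are no-ops
        elapsed = time.perf_counter() - started
        yield {
            "stage": stage, "status": "done", "step": position,
            "total": total, "seconds": round(elapsed, 3),
        }


if __name__ == "__main__":
    for event in run_with_progress({"redact": lambda: time.sleep(0.1)}):
        print(json.dumps(event))  # one JSON line per event, ready to stream
```

Because the generator yields before each stage runs, a client can show the current stage immediately instead of waiting for the whole pipeline to return.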
## Build Small fit
Trace Field Notes targets the **Backyard AI** track: it solves a specific,
practical problem for people already using coding agents.

It also targets these Build Small prizes / badges:
- **Best Use of Codex**: Codex helped develop, debug, package, document, and
produce the demo video. The connected GitHub history includes Codex-attributed
commits.
- **Best MiniCPM Build**: Quick analysis uses `openbmb/MiniCPM5-1B`.
- **Nemotron Hardware Prize**: Deeper analysis uses
`nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16`.
- **Off Brand**: the app uses `gradio.Server` with a custom React trail-map UI,
not stock Gradio blocks.
- **Best Demo**: the repo includes a polished demo video, article draft, and
  public X post.

It does **not** target Tiny Titan because the optional Nemotron path uses a
30B-parameter model, and it does **not** target Best Use of Modal because the
runtime is Hugging Face ZeroGPU / CPU, not Modal.
## Privacy posture
Agent traces can include prompts, tool inputs, command output, local paths,
screenshots, secrets, private source code, and personal data. Review and redact
before uploading or sharing.

By default, Trace Field Notes:
- ignores raw tool-call contents;
- analyzes only visible assistant narrative messages plus optional user context;
- runs deterministic secret redaction;
- can run `openai/privacy-filter` for a second PII pass;
- exports only redacted narrative text.
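
The first three defaults can be pictured with a short sketch. It is illustrative only: the patterns are not the rules in `redaction.py`, the `role` / `content` record shape is an assumed simplification (real Codex, Claude Code, and Pi Agent logs differ), and `redact` and `visible_narrative` are hypothetical names.

```python
import re

# Illustrative patterns only; the actual rules in redaction.py are not shown here.
PATTERNS = [
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("HF_TOKEN", re.compile(r"\bhf_[A-Za-z0-9]{20,}\b")),
    ("AWS_KEY_ID", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("HOME_PATH", re.compile(r"/(?:home|Users)/[^/\s]+")),
]


def redact(text: str) -> str:
    """Replace every match with a fixed placeholder such as [EMAIL]."""
    for name, pattern in PATTERNS:
        text = pattern.sub(f"[{name}]", text)
    return text


def visible_narrative(records: list[dict], include_user: bool = False) -> list[str]:
    """Keep assistant text (and optionally user text); drop tool records."""
    roles = {"assistant", "user"} if include_user else {"assistant"}
    return [
        redact(r["content"])
        for r in records
        # Non-string content (for example structured tool calls) is skipped.
        if r.get("role") in roles and isinstance(r.get("content"), str)
    ]
```

Pattern-based redaction only catches what its rules anticipate, which is why a manual review before upload still matters.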
## Local development
```bash
python3.11 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python app.py
```
Run tests:
```bash
python3.11 -m unittest discover -s tests
```
Optional environment settings are listed in [`.env.example`](.env.example).
## Codex contribution
Codex assisted with repository inspection, implementation debugging, test
verification, privacy/README hardening, Hugging Face deployment preparation,
demo-video scripting, voiceover generation, video composition, frame/ASR
verification, and hackathon submission packaging.