Spaces: build-small-hackathon / MiniCPM5-1B-Agent

Commit History

Add tags apply from https://build-small-hackathon-field-guide.hf.space/submit

adf4aac

verified

Nekochu committed 16 days ago

Add Color readme

26c8f12
verified

Nekochu committed 18 days ago

Add live demo x16 in README

785a8da
verified

Nekochu committed 18 days ago

Add live demo MiniCPM5-1B-Agent speed-up x16

8a263fa
verified

Nekochu committed 18 days ago

Add to clarify qualifies evaluation (No, don't have X to share)

f163bb0
verified

Nekochu committed 19 days ago

live-stream: separate <think> from the answer. Backend marks the think->answer boundary; UI labels the live text 'thinking' WHILE reasoning, then collapses it to a 'reasoned' tag and shows the answer distinctly - so the streaming think is never mistaken for the final answer.

52dd24c
verified

Nekochu committed 21 days ago
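
The think/answer separation described in 52dd24c can be illustrated with a small sketch. It assumes the model wraps its reasoning in `<think>...</think>` and that the UI re-renders from the accumulated stream buffer; `StreamView` and `split_stream` are hypothetical names, not taken from the Space's code.

```python
from dataclasses import dataclass

THINK_OPEN = "<think>"
THINK_CLOSE = "</think>"  # assumed boundary marker between reasoning and answer


@dataclass
class StreamView:
    phase: str     # "thinking" while reasoning, "answer" once the boundary is seen
    thinking: str  # reasoning text (shown live, then collapsed by the UI)
    answer: str    # final answer text (empty until the boundary arrives)


def split_stream(buffer: str) -> StreamView:
    """Split the text streamed so far into reasoning and answer parts."""
    text = buffer.lstrip().removeprefix(THINK_OPEN)
    if THINK_CLOSE not in text:
        # Boundary not reached yet: everything so far is still reasoning.
        return StreamView("thinking", text, "")
    thinking, answer = text.split(THINK_CLOSE, 1)
    return StreamView("answer", thinking.strip(), answer.lstrip())
```

Calling `split_stream` on every new chunk lets the UI label the text as thinking until the phase flips, so partial reasoning is never rendered as the answer.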

DRY sampler on the action phase (CODEAGENT_DRY=0.8): penalizes repeated token SEQUENCES to break the 'Example: ... Example: ...' ramble, without the flat repeat_penalty that garbles code (single-token repeats like indentation stay free)

2e41126
verified

Nekochu committed 21 days ago
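
To make the idea in 2e41126 concrete, here is a simplified sketch of a DRY-style penalty. It is not the sampler the Space uses: it assumes `CODEAGENT_DRY=0.8` acts as the penalty multiplier, picks `base` and `allowed_length` for illustration, and omits refinements such as sequence breakers. A token is penalized only if emitting it would extend a sequence that already occurred in the context.

```python
def dry_penalties(context, multiplier=0.8, base=1.75, allowed_length=2):
    """Return {token: penalty} for tokens that would continue a repeated sequence.

    For each earlier position i, measure how many tokens immediately before i
    match the tokens at the end of the context. If that match is at least
    allowed_length long, emitting context[i] again would extend the repeat,
    so it receives a penalty that grows exponentially with the match length.
    """
    n = len(context)
    penalties = {}
    for i in range(1, n):
        match = 0
        while match < i and context[i - 1 - match] == context[n - 1 - match]:
            match += 1
        if match >= allowed_length:
            penalty = multiplier * base ** (match - allowed_length)
            token = context[i]
            penalties[token] = max(penalties.get(token, 0.0), penalty)
    return penalties


def apply_dry(logits, context, **kwargs):
    """Subtract the penalties from a {token: logit} mapping (returns a new dict)."""
    penalties = dry_penalties(context, **kwargs)
    return {tok: score - penalties.get(tok, 0.0) for tok, score in logits.items()}
```

With the context `["a", "b", "c", "x", "a", "b"]`, only `"c"` is penalized, because it is the token that followed `"a", "b"` earlier. Tokens that merely appeared before, such as `"x"`, are left alone, which is the contrast with a flat repeat penalty.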

force-answer after web cap: count BOTH web_search+web_fetch toward cap (4), then inject a strong no-tools 'answer now' directive; if it still tool-calls, stop with what it has. Fixes the 'stops but never answers' case.

79d9cfd
verified

Nekochu committed 21 days ago
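
The escalation in 79d9cfd (run, then force an answer, then stop) might look like the following sketch. The tool names and the cap of 4 come from the commit message; the `WebBudget` class, its return values, and the directive wording are assumptions made for illustration.

```python
WEB_TOOLS = {"web_search", "web_fetch"}  # both count toward the same cap
WEB_CAP = 4

# Hypothetical wording; the real directive text is not shown in the commit.
ANSWER_NOW = "Web lookups are exhausted. Do not call any tool. Answer now."


class WebBudget:
    """Tracks web tool usage and decides what the agent loop should do next."""

    def __init__(self, cap=WEB_CAP):
        self.cap = cap
        self.used = 0
        self.directive_sent = False

    def decide(self, tool_name):
        """Return "run", "force_answer", or "stop" for a requested tool call."""
        if tool_name not in WEB_TOOLS:
            return "run"
        if self.used < self.cap:
            self.used += 1
            return "run"
        if not self.directive_sent:
            # First call past the cap: inject ANSWER_NOW instead of running it.
            self.directive_sent = True
            return "force_answer"
        # The model ignored the directive: end the loop with what it has.
        return "stop"
```

On `"force_answer"` the loop would append `ANSWER_NOW` to the conversation and request one more completion; on `"stop"` it would return the best partial result instead of staying silent.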

revert think temp to greedy (0.9 made the 1B over-create - built an HTML app for a math question); keep the web-search cap/dedup which is the real fix. CODEAGENT_THINK_TEMP env still lets you try 0.9.

be25da0
verified

Nekochu committed 21 days ago
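
A minimal sketch of the opt-in described in be25da0, assuming `CODEAGENT_THINK_TEMP` holds a plain float and that a temperature of 0 means greedy decoding; the function name and the fallback behavior are hypothetical.

```python
import os


def think_temperature(env=os.environ) -> float:
    """Read the think-phase temperature; default to greedy (0.0)."""
    raw = env.get("CODEAGENT_THINK_TEMP", "")
    try:
        temp = float(raw)
    except ValueError:
        # Unset or malformed value: keep the greedy default.
        return 0.0
    return max(temp, 0.0)
```

Setting `CODEAGENT_THINK_TEMP=0.9` would then restore the sampled reasoning that the commit reverted, without a code change.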

stop web-search rabbit-hole: dedup identical queries (warn, don't re-run) + hard cap 4 real searches + break if it keeps searching past the cap; use OpenBMB-recommended think sampling (temp 0.9 / top_p 0.95) instead of greedy (greedy think fed the loop)

b953e2b
verified

Nekochu committed 21 days ago
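
The dedup-plus-cap rule from b953e2b can be sketched as below. The cap of 4 and the "warn, don't re-run" behavior are from the commit message; the `SearchGuard` class and the normalization (lowercasing and collapsing whitespace) are assumptions about what counts as an identical query.

```python
class SearchGuard:
    """Skips duplicate queries and caps the number of real searches."""

    def __init__(self, cap=4):
        self.cap = cap
        self.seen = set()

    @staticmethod
    def _normalize(query):
        # Assumed notion of "identical": same words, ignoring case and spacing.
        return " ".join(query.lower().split())

    def check(self, query):
        """Return "run", "duplicate" (warn, do not re-run), or "capped"."""
        key = self._normalize(query)
        if key in self.seen:
            return "duplicate"
        if len(self.seen) >= self.cap:
            return "capped"
        self.seen.add(key)
        return "run"
```

Duplicates do not consume the budget in this sketch, so only distinct searches count toward the cap.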

token-level streaming: complete() streams via SSE + on_token callback; run_agent pushes the in-progress generation (think then action) to the UI live; run() renders it typing out. Eval stays non-streaming.

2040de4
verified

Nekochu committed 21 days ago
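
As an illustration of the SSE plus `on_token` pattern in 2040de4, the sketch below consumes already-received event lines. It assumes an OpenAI-compatible chunk layout (`choices[0].delta.content`) and a `[DONE]` sentinel, neither of which is confirmed by the commit; `consume_sse` is a hypothetical stand-in for the streaming part of `complete()`.

```python
import json
from typing import Callable, Iterable


def consume_sse(lines: Iterable[str], on_token: Callable[[str], None]) -> str:
    """Feed each streamed token to on_token and return the full text."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alives and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        token = chunk["choices"][0]["delta"].get("content") or ""
        if token:
            parts.append(token)
            on_token(token)
    return "".join(parts)
```

A non-streaming caller, such as an eval run, can pass a no-op callback and use only the returned string.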

step-level streaming (live tool-call trajectory) + repeat-loop guard (re-steer when the 1B degenerates into repetition) + README clean tags/Output examples

0b557a3
verified

Nekochu committed 21 days ago
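
One simple way to detect the degeneration that the repeat-loop guard in 0b557a3 targets is to check whether the output ends with the same chunk several times in a row. This is a sketch of that heuristic only; the thresholds and the function name are assumptions, and the Space may detect repetition differently.

```python
def is_degenerate(text, min_len=20, min_repeats=3):
    """True if text ends with one chunk (>= min_len chars) repeated min_repeats times."""
    n = len(text)
    for size in range(min_len, n // min_repeats + 1):
        chunk = text[-size:]
        if text.endswith(chunk * min_repeats):
            return True
    return False
```

When this returns true, the agent loop could discard the step and re-steer the model with a corrective message rather than showing the repeated text.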

README: clean tag block (fix best-agent typo, drop the 3 non-Field-Guide tags into a descriptive group) + add Output examples (real Modal-run outputs) (#2)

59b5431

Nekochu committed 21 days ago

Update README.md

b1abeb7
verified

Nekochu committed 21 days ago

point refs to final public homes + drop placeholder (#1)

a399cf8

Nekochu committed 21 days ago

fix double-box user bubble: Gradio nests .message.user > [data-testid=user]; my selector styled BOTH -> two stacked boxes. Now the OUTER is the single bubble, the INNER is flattened (transparent/no box). Verified via live CSS injection.

f746311
verified

Nekochu committed 21 days ago

fix chart blank: 1B was WRITING chart.png as TEXT (623-byte bogus png -> 0x0 render). write tool now REFUSES image-ext text-writes + steers to matplotlib savefig; _render_media_file skips files without a real image magic-number (no blank bubble).

9af691a
verified

Nekochu committed 21 days ago
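
Both checks from 9af691a can be sketched in a few lines: refusing text writes to image extensions, and verifying a file's leading bytes before rendering it. The signatures below are the standard PNG, JPEG, GIF, and WebP magic numbers; the function names, the extension list, and the refusal message are hypothetical.

```python
from pathlib import Path
from typing import Optional

IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}

SIGNATURES = (
    b"\x89PNG\r\n\x1a\n",  # PNG
    b"\xff\xd8\xff",       # JPEG
    b"GIF87a",             # GIF (1987 variant)
    b"GIF89a",             # GIF (1989 variant)
)


def looks_like_image(data: bytes) -> bool:
    """True if the bytes start with a known image signature."""
    if data.startswith(SIGNATURES):
        return True
    # WebP is a RIFF container: "RIFF" + 4 size bytes + "WEBP".
    return data[:4] == b"RIFF" and data[8:12] == b"WEBP"


def check_text_write(path: str) -> Optional[str]:
    """Return a refusal message for image paths, or None if the write is allowed."""
    if Path(path).suffix.lower() in IMAGE_EXTS:
        return (
            f"Refusing to write text to {path}: write a script that calls "
            "matplotlib's savefig and run it instead."
        )
    return None
```

The first check stops the bogus file from being created; the second keeps any file that slips through from rendering as a blank bubble.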

fix REGRESSION: prompt hardening ('write then final answer is one sentence') made the 1B SKIP the bash run + hallucinate 'saved chart.png' without running -> no image. Restore: after writing a runnable script you MUST run it with bash + read output before answering; never claim you ran code you did not.

a4918c4
verified

Nekochu committed 21 days ago

fix chart/media not displaying: the 1B savefig's to an absolute /workspace/ path inside the script (bash rewrite can't reach in-code paths), so the PNG landed outside the scanned sandbox. Now: ensure /workspace exists at startup + _extra_media scans it this-turn-only (no cross-session leak; concurrency_limit=1). Refactored shared _render_media_file.

60a8729
verified

Nekochu committed 21 days ago
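
The this-turn-only scan in 60a8729 could be approximated by filtering on modification time, as in the sketch below. Using `st_mtime` against a recorded turn start is an assumption about how "this turn" is determined; `ensure_workspace` and `new_media_since` are hypothetical names, not the Space's `_extra_media`.

```python
from pathlib import Path

MEDIA_EXTS = {".png", ".jpg", ".jpeg", ".gif"}


def ensure_workspace(root) -> Path:
    """Create the workspace directory at startup if it is missing."""
    path = Path(root)
    path.mkdir(parents=True, exist_ok=True)
    return path


def new_media_since(root, turn_start: float):
    """Return media files under root modified at or after turn_start (epoch seconds)."""
    root = Path(root)
    if not root.is_dir():
        return []
    found = [
        p for p in root.rglob("*")
        if p.is_file()
        and p.suffix.lower() in MEDIA_EXTS
        and p.stat().st_mtime >= turn_start
    ]
    return sorted(found)
```

Recording `time.time()` at the start of each turn and passing it as `turn_start` keeps files from earlier turns out of the current response.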

initial commit

fee4923

Nekochu committed 21 days ago
