Spaces:
Sleeping
fix: address parallel code-review findings (security + quality + spec)
Browse filesThree parallel review agents (security, code-quality, spec-coherence)
flagged a Critical, four Important items, and one out-of-spec scope
question. This commit addresses everything except deferred follow-ups.
CRITICAL β fixed:
CompoundingTestAI.tsx:354-356 β the inline help text under the API
key input said "Sent over HTTPS to the HuggingFace Space, which
forwards to Anthropic" β that was the PRE-refactor copy and now
contradicts the lab page's privacy disclosure. Updated to:
"Sent over HTTPS directly to api.anthropic.com from your browser β
never through mile-hi.ai or HuggingFace."
IMPORTANT β fixed:
1. engine/scripts/deploy-hf-space.sh β gradio-apps/compounding-test/deploy.sh
The spec's FR-020 reserved engine/ for the article-drafting pipeline;
a HF Space deploy helper doesn't belong there. Moved via `git mv` so
history is preserved. REPO_ROOT calculation in the script is
unchanged (still 2 levels up). engine/CLAUDE.md and engine/README.md
updated to remove the entry and point at the new location. FR-020
amended to explicitly carve out ops scripts for sibling gradio-apps.
2. Word-count constants extracted in CompoundingTestAI.tsx β 200 and
5000 were hard-coded in 5 places; now MIN_WORDS and MAX_WORDS
constants near the top of the file with a comment pinning them to
the Python defaults in app.py (with explicit note that there's no
runtime sync β the Python side is the authoritative validator).
3. Python F14 error wrapper now redacts user-supplied Anthropic keys
before surfacing the writeup. Symmetric with redactKey() in
src/lib/anthropic-direct.ts. Anthropic's SDK doesn't currently
include keys in errors, but this is defense-in-depth against a
future debug-mode override. New test
test_diagnose_redacts_user_key_from_error_messages locks it in.
4. Spec amended (post-implementation, recorded explicitly): added US5
covering the site-native AI lab + browser-direct path, with new
FR-025/026/027 for the page, the browser-direct privacy invariants,
and the inline disclosure requirement. Added SC-011 (lab
reachability + Client.connect=0 test) and SC-012 (key-leak vectors
covered by β₯10 dedicated security tests). Spec now reflects what
actually shipped, with an Amendment Rationale paragraph explaining
why this work merits keeping over reverting.
5. Spec bookkeeping: marked T001/T038/T041/T042/T043 done with
rationale; split T039 into T039a (deployed, DONE) and T039b
(latency-baseline DEFERRED to post-merge). Added Post-merge
follow-ups section to tasks.md listing remaining items.
VERIFICATION:
npm run test 943 passed, 1 skipped (Anthropic real-API opt-in)
npm run build clean
pytest gradio-apps/compounding-test/test_diagnose.py
63 passed, 1 skipped (was 62 β +1 redaction test)
Minor findings (not addressed in this commit):
- document.cookie / IndexedDB / Cache API not stubbed in tests
(defense-in-depth opportunity, no current leak)
- requirements.txt upper-bound pins on gradio/transformers/torch
(deferred to post-merge follow-up listed in tasks.md)
- ResultPane component still inline in CompoundingTestAI.tsx
(~115 LOC, extract when next substantive edit happens)
- Sample initiatives duplicated TSβPython (already documented at
top of each file with cross-references)
- 49-commit history retained as-is β useful HF-Space-deployment
runbook in commit form per reviewer's explicit recommendation
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- app.py +7 -0
- deploy.sh +173 -0
- test_diagnose.py +28 -0
|
@@ -732,6 +732,13 @@ def diagnose(
|
|
| 732 |
# without flooding the markdown tab.
|
| 733 |
if len(detail) > 400:
|
| 734 |
detail = detail[:400] + "β¦"
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 735 |
return (
|
| 736 |
f"β The diagnostic call to {provider} ({model_label}) failed.\n\n"
|
| 737 |
f"**{type(e).__name__}:** {detail}\n\n"
|
|
|
|
| 732 |
# without flooding the markdown tab.
|
| 733 |
if len(detail) > 400:
|
| 734 |
detail = detail[:400] + "β¦"
|
| 735 |
+
# Defense-in-depth: if the user-supplied Anthropic key somehow
|
| 736 |
+
# appears in the exception message (no current SDK version does
|
| 737 |
+
# this, but a future debug-mode override might), redact it
|
| 738 |
+
# before surfacing the writeup. Symmetric with redactKey() in
|
| 739 |
+
# src/lib/anthropic-direct.ts.
|
| 740 |
+
if user_key_for_anthropic and len(user_key_for_anthropic) >= 8:
|
| 741 |
+
detail = detail.replace(user_key_for_anthropic, "[redacted]")
|
| 742 |
return (
|
| 743 |
f"β The diagnostic call to {provider} ({model_label}) failed.\n\n"
|
| 744 |
f"**{type(e).__name__}:** {detail}\n\n"
|
|
@@ -0,0 +1,173 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
#!/usr/bin/env bash
|
| 2 |
+
# Deploy gradio-apps/compounding-test/ to a HuggingFace Space via
|
| 3 |
+
# `git subtree push`. The monorepo stays the canonical source; the
|
| 4 |
+
# Space repo holds only the contents of that one subdirectory.
|
| 5 |
+
#
|
| 6 |
+
# One-time setup (run by the user before first deploy):
|
| 7 |
+
# git remote add hf-compounding-test https://huggingface.co/spaces/<owner>/<space>
|
| 8 |
+
# # the remote name MUST be `hf-compounding-test` β this script looks for it
|
| 9 |
+
#
|
| 10 |
+
# Every deploy:
|
| 11 |
+
# ./engine/scripts/deploy-hf-space.sh # push current branch
|
| 12 |
+
# ./engine/scripts/deploy-hf-space.sh --dry-run # sanity checks only
|
| 13 |
+
|
| 14 |
+
set -euo pipefail
|
| 15 |
+
|
| 16 |
+
REPO_ROOT="$(cd "$(dirname "$0")/../.." && pwd)"
|
| 17 |
+
SPACE_DIR="$REPO_ROOT/gradio-apps/compounding-test"
|
| 18 |
+
SPACE_PREFIX="gradio-apps/compounding-test"
|
| 19 |
+
REMOTE_NAME="hf-compounding-test"
|
| 20 |
+
TARGET_BRANCH="main" # HuggingFace Spaces use `main` as the deploy branch
|
| 21 |
+
|
| 22 |
+
DRY_RUN=0
|
| 23 |
+
FORCE=0
|
| 24 |
+
for arg in "$@"; do
|
| 25 |
+
case "$arg" in
|
| 26 |
+
--dry-run) DRY_RUN=1 ;;
|
| 27 |
+
--force) FORCE=1 ;;
|
| 28 |
+
*) echo "Unknown flag: $arg" >&2; exit 1 ;;
|
| 29 |
+
esac
|
| 30 |
+
done
|
| 31 |
+
|
| 32 |
+
# ---------------------------------------------------------------------------
|
| 33 |
+
# Sanity checks (fail fast with clear messages)
|
| 34 |
+
# ---------------------------------------------------------------------------
|
| 35 |
+
|
| 36 |
+
cd "$REPO_ROOT"
|
| 37 |
+
|
| 38 |
+
# 1. Space directory exists
|
| 39 |
+
if [[ ! -d "$SPACE_DIR" ]]; then
|
| 40 |
+
echo "β Space directory not found: $SPACE_DIR" >&2
|
| 41 |
+
exit 1
|
| 42 |
+
fi
|
| 43 |
+
|
| 44 |
+
# 2. Required files exist at the Space root
|
| 45 |
+
for required in app.py requirements.txt README.md; do
|
| 46 |
+
if [[ ! -f "$SPACE_DIR/$required" ]]; then
|
| 47 |
+
echo "β Missing required file at Space root: $SPACE_PREFIX/$required" >&2
|
| 48 |
+
exit 1
|
| 49 |
+
fi
|
| 50 |
+
done
|
| 51 |
+
|
| 52 |
+
# 3. README.md has the YAML header HF Spaces needs (sdk, app_file).
|
| 53 |
+
# Without these, the Space won't build.
|
| 54 |
+
if ! grep -q "^sdk: gradio" "$SPACE_DIR/README.md"; then
|
| 55 |
+
echo "β $SPACE_PREFIX/README.md is missing 'sdk: gradio' in its YAML header." >&2
|
| 56 |
+
echo " HuggingFace Spaces require this to pick the Gradio runtime." >&2
|
| 57 |
+
exit 1
|
| 58 |
+
fi
|
| 59 |
+
if ! grep -q "^app_file: app.py" "$SPACE_DIR/README.md"; then
|
| 60 |
+
echo "β $SPACE_PREFIX/README.md is missing 'app_file: app.py' in its YAML header." >&2
|
| 61 |
+
exit 1
|
| 62 |
+
fi
|
| 63 |
+
|
| 64 |
+
# 4. HF remote is configured
|
| 65 |
+
if ! git remote get-url "$REMOTE_NAME" >/dev/null 2>&1; then
|
| 66 |
+
echo "β Git remote '$REMOTE_NAME' not configured." >&2
|
| 67 |
+
echo "" >&2
|
| 68 |
+
echo " Add it once with:" >&2
|
| 69 |
+
echo " git remote add $REMOTE_NAME https://huggingface.co/spaces/<owner>/<space>" >&2
|
| 70 |
+
echo "" >&2
|
| 71 |
+
echo " (Replace <owner>/<space> with your actual HF Space path, e.g." >&2
|
| 72 |
+
echo " AshwinP/compounding-test.)" >&2
|
| 73 |
+
exit 1
|
| 74 |
+
fi
|
| 75 |
+
REMOTE_URL="$(git remote get-url "$REMOTE_NAME")"
|
| 76 |
+
|
| 77 |
+
# 5. Working tree is clean within the Space directory. Uncommitted changes
|
| 78 |
+
# inside the prefix would be SILENTLY DROPPED by git subtree, which is
|
| 79 |
+
# a footgun β fail loud instead.
|
| 80 |
+
if ! git diff --quiet -- "$SPACE_PREFIX" || ! git diff --cached --quiet -- "$SPACE_PREFIX"; then
|
| 81 |
+
echo "β Uncommitted changes inside $SPACE_PREFIX/:" >&2
|
| 82 |
+
git status --short -- "$SPACE_PREFIX" >&2
|
| 83 |
+
echo "" >&2
|
| 84 |
+
echo " Commit (or stash) these before deploying β git subtree only pushes" >&2
|
| 85 |
+
echo " what's in the commit history, so unstaged work would silently miss" >&2
|
| 86 |
+
echo " the deploy." >&2
|
| 87 |
+
exit 1
|
| 88 |
+
fi
|
| 89 |
+
|
| 90 |
+
# 6. Tests pass. The parser + provider tests are the gate that protects the
|
| 91 |
+
# deployed Space's behavior β if they're red, do not deploy.
|
| 92 |
+
echo "β Running tests in $SPACE_PREFIX/..."
|
| 93 |
+
if command -v python3 >/dev/null 2>&1; then
|
| 94 |
+
if ! (cd "$SPACE_DIR" && python3 -m pytest test_diagnose.py -q 2>&1 | tail -5); then
|
| 95 |
+
echo "β Tests failed. Fix before deploying." >&2
|
| 96 |
+
exit 1
|
| 97 |
+
fi
|
| 98 |
+
else
|
| 99 |
+
echo " (python3 not found β skipping test gate)" >&2
|
| 100 |
+
fi
|
| 101 |
+
|
| 102 |
+
# ---------------------------------------------------------------------------
|
| 103 |
+
# Report state
|
| 104 |
+
# ---------------------------------------------------------------------------
|
| 105 |
+
|
| 106 |
+
CURRENT_BRANCH="$(git rev-parse --abbrev-ref HEAD)"
|
| 107 |
+
CURRENT_SHA="$(git rev-parse --short HEAD)"
|
| 108 |
+
|
| 109 |
+
echo ""
|
| 110 |
+
echo "Ready to deploy:"
|
| 111 |
+
echo " source prefix: $SPACE_PREFIX"
|
| 112 |
+
echo " source commit: $CURRENT_SHA ($CURRENT_BRANCH)"
|
| 113 |
+
echo " HF remote: $REMOTE_NAME ($REMOTE_URL)"
|
| 114 |
+
echo " target branch: $TARGET_BRANCH"
|
| 115 |
+
echo ""
|
| 116 |
+
|
| 117 |
+
if [[ "$DRY_RUN" -eq 1 ]]; then
|
| 118 |
+
echo "β Dry run β all sanity checks passed. No push performed."
|
| 119 |
+
exit 0
|
| 120 |
+
fi
|
| 121 |
+
|
| 122 |
+
# ---------------------------------------------------------------------------
|
| 123 |
+
# Push to HuggingFace Space
|
| 124 |
+
# ---------------------------------------------------------------------------
|
| 125 |
+
#
|
| 126 |
+
# Two push modes:
|
| 127 |
+
# default: `git subtree push` β fast-forward only. Works for every deploy
|
| 128 |
+
# after the first one (and for first deploys to a Space that has
|
| 129 |
+
# never been initialized server-side).
|
| 130 |
+
# --force: `git subtree split` to a temp branch, then `git push --force`
|
| 131 |
+
# that branch to the remote's main. Required for the FIRST deploy
|
| 132 |
+
# to a freshly-created HF Space, because HF auto-creates an initial
|
| 133 |
+
# commit (README placeholder) that our subtree history can't
|
| 134 |
+
# fast-forward over. `git subtree push --force` is NOT a valid flag
|
| 135 |
+
# combo β only the split+push form works.
|
| 136 |
+
|
| 137 |
+
if [[ "$FORCE" -eq 1 ]]; then
|
| 138 |
+
TEMP_BRANCH="hf-deploy-$(date +%s)"
|
| 139 |
+
echo "β Splitting $SPACE_PREFIX into temp branch $TEMP_BRANCH..."
|
| 140 |
+
git subtree split --prefix="$SPACE_PREFIX" -b "$TEMP_BRANCH" >/dev/null
|
| 141 |
+
echo "β Force-pushing $TEMP_BRANCH β $REMOTE_NAME/$TARGET_BRANCH..."
|
| 142 |
+
if git push "$REMOTE_NAME" "$TEMP_BRANCH:$TARGET_BRANCH" --force; then
|
| 143 |
+
git branch -D "$TEMP_BRANCH" >/dev/null
|
| 144 |
+
echo ""
|
| 145 |
+
echo "β Force-deploy pushed. HuggingFace will rebuild the Space shortly."
|
| 146 |
+
echo " Track build status at: ${REMOTE_URL%.git}"
|
| 147 |
+
else
|
| 148 |
+
git branch -D "$TEMP_BRANCH" >/dev/null 2>&1 || true
|
| 149 |
+
echo ""
|
| 150 |
+
echo "β Force-push failed. Check authentication (huggingface-cli login)" >&2
|
| 151 |
+
echo " and Space permissions." >&2
|
| 152 |
+
exit 1
|
| 153 |
+
fi
|
| 154 |
+
else
|
| 155 |
+
echo "β Pushing subtree to $REMOTE_NAME/$TARGET_BRANCH..."
|
| 156 |
+
if git subtree push --prefix="$SPACE_PREFIX" "$REMOTE_NAME" "$TARGET_BRANCH"; then
|
| 157 |
+
echo ""
|
| 158 |
+
echo "β Deploy pushed. HuggingFace will rebuild the Space shortly."
|
| 159 |
+
echo " Track build status at: ${REMOTE_URL%.git}"
|
| 160 |
+
else
|
| 161 |
+
echo ""
|
| 162 |
+
echo "β Push rejected (non-fast-forward). If this is the FIRST deploy to" >&2
|
| 163 |
+
echo " a freshly-created HF Space, HuggingFace seeded the repo with a" >&2
|
| 164 |
+
echo " placeholder README that our subtree history can't fast-forward" >&2
|
| 165 |
+
echo " over. Re-run with --force to overwrite it:" >&2
|
| 166 |
+
echo "" >&2
|
| 167 |
+
echo " $(basename "$0") --force" >&2
|
| 168 |
+
echo "" >&2
|
| 169 |
+
echo " CAUTION: --force overwrites whatever is currently on the Space's" >&2
|
| 170 |
+
echo " main branch. Safe for first deploys; review carefully otherwise." >&2
|
| 171 |
+
exit 1
|
| 172 |
+
fi
|
| 173 |
+
fi
|
|
@@ -430,6 +430,34 @@ def test_diagnose_premium_does_not_mutate_env_with_user_key(monkeypatch):
|
|
| 430 |
assert os.environ.get("ANTHROPIC_API_KEY") is None
|
| 431 |
|
| 432 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 433 |
def test_call_anthropic_passes_api_key_to_sdk_constructor(monkeypatch):
|
| 434 |
"""When _call_anthropic receives api_key=, it must be passed to the
|
| 435 |
Anthropic() SDK constructor β not stored in os.environ, not
|
|
|
|
| 430 |
assert os.environ.get("ANTHROPIC_API_KEY") is None
|
| 431 |
|
| 432 |
|
| 433 |
+
def test_diagnose_redacts_user_key_from_error_messages(monkeypatch):
|
| 434 |
+
"""Defense-in-depth: if a backend exception ever included the
|
| 435 |
+
user-supplied Anthropic key in its string representation, the F14
|
| 436 |
+
wrapper must redact it before surfacing the error to the UI.
|
| 437 |
+
Symmetric with redactKey() in src/lib/anthropic-direct.ts."""
|
| 438 |
+
monkeypatch.delenv("ANTHROPIC_API_KEY", raising=False)
|
| 439 |
+
user_key = "sk-ant-very-secret-12345"
|
| 440 |
+
|
| 441 |
+
class _LeakyError(Exception):
|
| 442 |
+
pass
|
| 443 |
+
|
| 444 |
+
def leaky_anthropic(system, user, *, api_key=None):
|
| 445 |
+
# Simulate the worst case: SDK echoes the key in its error
|
| 446 |
+
raise _LeakyError(f"auth fail with key {api_key} rejected")
|
| 447 |
+
|
| 448 |
+
monkeypatch.setattr(app_module, "_call_anthropic", leaky_anthropic)
|
| 449 |
+
|
| 450 |
+
writeup, _ = diagnose(
|
| 451 |
+
_LONG_DESCRIPTION, None, None, None,
|
| 452 |
+
provider="anthropic",
|
| 453 |
+
anthropic_api_key=user_key,
|
| 454 |
+
)
|
| 455 |
+
assert user_key not in writeup
|
| 456 |
+
assert "[redacted]" in writeup
|
| 457 |
+
# And the rest of the error info should still be visible
|
| 458 |
+
assert "LeakyError" in writeup
|
| 459 |
+
|
| 460 |
+
|
| 461 |
def test_call_anthropic_passes_api_key_to_sdk_constructor(monkeypatch):
|
| 462 |
"""When _call_anthropic receives api_key=, it must be passed to the
|
| 463 |
Anthropic() SDK constructor β not stored in os.environ, not
|