[ { "repo_id": "build-small-hackathon/small-talk", "title": "Small Talk", "summary": "No summary available yet.", "url": "https://huggingface.co/spaces/build-small-hackathon/small-talk", "zone": "Builder Workshop", "track": "Backyard AI", "tags": [ "achievement", "sponsor", "talk", "summary", "available", "yet." ], "difficulty": "Builder-Focused", "quest": "🛠 Builder Workshop: Inspect how the app uses models, traces, or datasets.", "likes": 89, "sdk": "gradio", "status": "unknown", "last_modified": "", "emoji": "🛠", "readme_text": "---\ntitle: Small Talk\nemoji: 🎙️\ncolorFrom: indigo\ncolorTo: green\nsdk: gradio\nsdk_version: 6.17.3\napp_file: app.py\npinned: true\nshort_description: An AI-to-AI robot podcast hosted by Reachy Minis\ntags:\n - reachy_mini\n - livekit\n - webrtc\n - three.js\n - track:wood\n - sponsor:nvidia\n - sponsor:modal\n - achievement:offbrand\n - achievement:llama\n - achievement:fieldnotes\n - achievement:offgrid\n - badge-tiny-titan\n---\n\n# Small Talk\n\nAn AI-to-AI podcast hosted by Reachy Mini robots. They join a live WebRTC call,\neach with its own personality and voice, and talk it out while you watch a\nMeet-style grid of their 3D digital twins moving in sync with the conversation.\nGive them a topic and they write the script, design their own voices, dress\nthemselves, and go live. Own a Reachy Mini? 
It can join a show as a real cast\nmember and speak its lines through the actual robot.\n\n**Team:** [GauravGosain](https://huggingface.co/GauravGosain) and [nkapila6](https://huggingface.co/nkapila6).\n\n- **Demo video:** https://youtu.be/obP4C1eH77I\n- **Build write-ups:** [Small Talk on the Hugging Face blog](https://huggingface.co/blog/build-small-hackathon/small-talk) · [nkapila.me](https://nkapila.me/posts/small-talk)\n- **Launch posts:** [@_GauravGosain on X](https://x.com/_GauravGosain/status/2066013749304344915) · [Nikhil Kapila on LinkedIn](https://www.linkedin.com/posts/nikhilkapila_buildsmall-huggingface-modal-ugcPost-7471791298199408640-OBzS/)\n\n## What you can do\n\n- **Watch a live generated show.** Pick a topic. One structured Nemotron call\n writes the cast and the full speaker-to-dialogue script, Qwen3-TTS voices each\n line, and the next line renders while the current one plays. Subtitles, a\n pre-show \"writers' room\", and rolling continuations keep it going.\n- **Set the cast.** A slider picks 2 to 5 hosts, or how many simulated co-hosts\n fill in around your physical robots.\n- **Design a robot.** Choose a name, personality, voice, shell colour, and props.\n The same Nemotron brain styles its wardrobe from your description.\n- **Tune into Reachy FM.** A radio station of AI-written songs with synced\n karaoke lyrics, a spinning vinyl deck, an audio-reactive visualizer, and a DJ\n robot in headphones that does mic breaks and bops to the beat.\n- **Bring your own Reachy.** A single Go binary turns a physical Reachy Mini into\n a cast member that speaks its own lines, head and antennas moving with the\n speech.\n\n## How it is built\n\nThe whole app is served by `gradio.Server`, a FastAPI host with Gradio's backend\nwhere custom routes take priority, so the visitor only ever sees a hand-built\nthree.js frontend. There is no default Gradio component anywhere in the product.\n\n```mermaid\nflowchart LR\n topic([Topic]) --> llm[\"NVIDIA Nemotron 4B<br/>llama.cpp on Modal\"]\n llm -->|one structured call| script[[\"Cast plus script<br/>(JSON)\"]]\n script --> tts[\"Qwen3-TTS<br/>on Modal\"]\n tts -->|\"line N+1 renders<br/>while line N plays\"| pub[ReachyPublisher]\n pub --> sfu{{\"LiveKit SFU<br/>(WebRTC)\"}}\n sfu --> web[\"Browser:<br/>3D twins + subtitles\"]\n sfu --> robot[\"Physical Reachy<br/>(Go companion)\"]\n```\n\n- **Brain.** NVIDIA Nemotron Nano (4B) served through llama.cpp on Modal. A single\n constrained, structured call returns the full cast and script as JSON. We found\n constrained structured output far more reliable than chaining calls.\n- **Voice.** Qwen3-TTS VoiceDesign on Modal, one consistent character voice per\n host, generated as a cascade so there is no dead air between lines.\n- **Realtime.** A self-hosted LiveKit SFU carries the audio over WebRTC. Subtitles\n and show status ride LiveKit data messages.\n- **Twins.** The official Reachy Mini URDF and meshes in three.js, with head and\n antenna motion blending a speech-reactive envelope and the real recorded Reachy\n emotions and dances.\n\nThe Space itself runs CPU-only. All inference is delegated to Modal serverless GPUs.\nThe Modal serving code for the Nemotron (llama.cpp) and Qwen3-TTS endpoints lives in\n[nkapila6/llama-modal-serve](https://github.com/nkapila6/llama-modal-serve).\n\n## Built for the Build Small Hackathon\n\nEverything runs on models well under the 32B cap, and most of the work is done by\na single 4B model. Small Talk is in the running for:\n\n| Category | Why it qualifies |\n|---|---|\n| **Thousand Token Wood** | A whimsical, AI-native entertainment platform. |\n| **NVIDIA** | The brain is NVIDIA Nemotron. |\n| **Modal** | The LLM and the TTS both run on Modal at runtime. |\n| **Off Brand** | A fully custom three.js UI built on `gradio.Server`. |\n| **Tiny Titan** | The reasoning brain is a 4B model. |\n| **Llama Champion** | Nemotron is served through the llama.cpp runtime. |\n| **Off the Grid** | No proprietary or closed model APIs. Every model (Nemotron, Qwen3-TTS) is open-weight and self-hosted via llama.cpp; Modal provides the compute, not the model. |\n| **Field Notes** | A full build write-up is published on the HF blog. 
|\n| **Bonus Quest Champion** | T" }, { "repo_id": "build-small-hackathon/claim-ready", "title": "Claim Ready", "summary": "No summary available yet.", "url": "https://huggingface.co/spaces/build-small-hackathon/claim-ready", "zone": "Language Village", "track": "An Adventure in Thousand Token Wood", "tags": [ "achievement", "claim", "ready", "summary", "available", "yet." ], "difficulty": "Experimental", "quest": "🌍 Language Village: Find one use case where this helps communication.", "likes": 67, "sdk": "gradio", "status": "unknown", "last_modified": "", "emoji": "🌍", "readme_text": "---\ntitle: ClaimReady — Claim Submission Check\nemoji: 🏥\ncolorFrom: blue\ncolorTo: green\nsdk: gradio\napp_file: app.py\npinned: false\nshort_description: Catch claim errors before you submit — open model\ntags:\n - build-small-hackathon\n - document-ai\n - vision-language\n - multilingual\n - healthcare\n - track:backyard\n - achievement:offgrid\n - achievement:offbrand\n---\n\n# 🏥 ClaimReady\n\n**An assistive pre-check for hospital health-insurance claims — built for hospital administration\nstaff.** ClaimReady helps hospitals **identify potential compliance issues before submitting claim\ndocuments to insurance providers**. 
It analyses uploaded documents against package-specific\ntreatment guidelines and provides **actionable feedback on missing, incomplete, or non-compliant\ndocuments**, allowing hospitals to correct issues early in the process.\n\nIt runs **entirely inside the Space on a small open model — Gemma 3 12B (≤ 32B)** — with **no cloud\ninference API**.\n\n## 🎥 Demo & Links\n\n- **▶️ Demo video:** https://youtu.be/A4PCHp89bsQ\n- **📣 Social post:** https://www.linkedin.com/posts/ullivinaybabu_build-huggingface-gradio-share-7472141735138119681-80Th/\n- **💻 Source code (GitHub):** https://github.com/vinayulli/claim_ready\n- **🚀 Live app:** this Space — open the **\"Load a sample claim\"** buttons at the top to try it in one tap.\n\n## 📋 Overview\n\nClaimReady is used by **hospital administration / billing staff**. For this prototype it uses the\n**PMJAY (Pradhan Mantri Jan Arogya Yojana)** scheme as an example, where hospitals must follow\n**Standard Treatment Guidelines (STGs)** when submitting claims. 
Because these guidelines vary\nacross treatment packages, the application evaluates each claim against the **selected package's\nrequirements**.\n\nThe goal is simple: **reduce claim rejections, minimise processing delays, and improve operational\nefficiency** for hospitals and claim teams.\n\n## 🚩 Problem Statement\n\nHealth-insurance claims — for example under India's **Ayushman Bharat / PMJAY** scheme — require a\nspecific set of **supporting documents** that must satisfy the applicable **clinical / treatment\nguidelines** for each procedure and stage (pre-authorization / claim).\n\n- Every claim must include **all mandatory documents** and meet defined **content conditions** for the procedure and stage.\n- A missing document — or a value that doesn't meet a condition — can lead to **claim rejection**, delays and rework.\n\n## ✅ Solution — What ClaimReady Offers\n\n- 📄 Reads every uploaded document with **on-device OCR** — images *and* PDFs.\n- ✅ Verifies the set against the **required-document checklist** for the selected package and stage.\n- 🔎 Evaluates **content rules** (thresholds, conditions) against the values it *actually reads*.\n- 🌐 Handles **mixed-language** documents — e.g. **English + Hindi / Telugu** in the same record.\n- 🖼️ Built-in **document viewer** — preview every page (images and PDF pages) before checking.\n- ⚠️ Surfaces **missing, incomplete, or non-compliant documents early**, with supporting evidence — as an **assistive** pre-check.\n\n## 🔧 How It Works\n\n1. **Select a package + stage** → the app loads the required-document checklist and content rules (data-driven from `packages.json`).\n2. **Upload documents (or click a sample)** → PDFs and images are rendered to page images (PyMuPDF) and shown in the viewer.\n3. **Run the check** → the document images + the checklist + a strict review prompt go to **Gemma 3 12B**, which **OCRs and reasons in a single pass** and returns structured JSON.\n4. 
**Review** → rendered as a clear **✅ / ❌ / ⚠️** assistive review with evidence and an action list. (Low-confidence \"present\" documents are flagged as gaps, to stay conservative.)\n\n## 📈 Scope & Scalability\n\n- Currently supports **four treatment packages** as a proof of concept.\n- The logic is **data-driven** (`packages.json`), so the architecture is **scalable** and can easily\n be extended to support many more **packages, insurance providers, and compliance frameworks**.\n\n## 🧱 Built Small\n\n| | |\n|---|---|\n| **Model** | `google/gemma-3-12b-it` — a small, open **≤ 32B** model |\n| **Runtime** | Hugging Face **Transformers on ZeroGPU** — runs inside the Space, **no external / cloud inference API** |\n| **Stack** | Gradio · PyMuPDF · Pillow · 🤗 Transformers |\n\n## 🩺 Note\n\nClaimReady is a **decision-support** tool — it highlights likely gaps for review and does **not**\napprove or reject claims. The sample claims in the app are **synthetic** (fictional patients) —\n**no real patient data**.\n\n---\n\n*Built by [**@vinaybabu**](https://huggingface.co/vinaybabu) for the **HuggingFace Build Small Hackathon** · Backyard AI track.*\n" }, { "repo_id": "build-small-hackathon/PITCHFIGHT_AI", "title": "Pitchfight_Ai", "summary": "No summary available yet.", "url": "https://huggingface.co/spaces/build-small-hackathon/PITCHFIGHT_AI", "zone": "Experiment Lab", "track": "An Adventure in Thousand Token Wood", "tags": [ "achievement", "nvidia", "pitchfight_ai", "summary", "available", "yet." 
], "difficulty": "Experimental", "quest": "🧪 Experiment Lab: Try the demo and identify its core idea.", "likes": 100, "sdk": "gradio", "status": "unknown", "last_modified": "", "emoji": "🧪", "readme_text": "---\ntitle: PitchFight AI\nemoji: ⚔️\ncolorFrom: red\ncolorTo: yellow\nsdk: gradio\nsdk_version: 6.17.3\npython_version: 3.12\napp_file: app.py\npinned: true\nlicense: mit\nshort_description: An AI founder pressure arena for practicing startup pitches.\ntags:\n - gradio\n - build-small-hackathon\n - backyard-ai\n - startup\n - pitch-practice\n - founder-tooling\n - voice-ai\n - nvidia\n - nemotron\n - best-demo\n - community-choice\n - track:backyard\n - sponsor:nvidia\n - achievement:offbrand\n - achievement:sharing\n - achievement:fieldnotes\n---\n\n
\n\n# ⚔️ PitchFight AI\n\n### Your first tough pitch should not be in front of a real judge.\n\n**An AI founder pressure arena where student builders practice startup pitches, survive judge-style questions, enter a deal round, and leave with a scorecard that shows exactly what to fix.**\n\n**Try it:** [Live Space](https://huggingface.co/spaces/build-small-hackathon/PITCHFIGHT_AI) · [Demo Video](https://www.youtube.com/watch?v=s4_BzIBhqxc) · [Hugging Face Blog](https://huggingface.co/blog/prakhar811/pitchfight-ai-practice-the-pitch-before-the-real-r) · [GitHub](https://github.com/prakhar811/PitchFight/) · [LinkedIn Post](https://www.linkedin.com/feed/update/urn:li:activity:7472261677472927744/)\n\n**⚔️ Click the arena above to try PitchFight AI.**\n\n
\n\n---\n\n## TL;DR for Judges\n\n- **Backyard AI:** PitchFight AI helps student founders practice before hackathons, demo days, mentor reviews, and pitch rooms.\n- **NVIDIA Nemotron Quest:** The judge reasoning is powered by NVIDIA Nemotron through the backend API.\n- **Off Brand:** The app uses a custom cinematic frontend instead of default Gradio components.\n- **Best Demo:** The submission includes a live Space, demo video, and Hugging Face Blog write-up.\n- **Field Notes:** `FIELD_NOTES.md` and the Hugging Face Blog explain the product idea, build decisions, and technical approach.\n- **Community Choice:** The public Space is available for users to try and upvote.\n- **Judges' Wildcard:** PitchFight AI combines founder tooling, coaching, voice, and pressure simulation in one focused demo.\n\n---\n\n## Why this exists\n\nMost student founders do not lose because their idea is bad.\n\nThey lose because the first hard question comes too late.\n\n> \"What is your moat?\" \n> \"Who exactly pays?\" \n> \"Why now?\" \n> \"What happens if a bigger player copies this?\"\n\nPitchFight AI gives student builders a private practice room before the real room. 
It turns pitch practice into a pressure battle so founders can sharpen their story, defend their idea, and build confidence before facing real judges.\n\n---\n\n## What you can do\n\n- **Enter your startup idea.** Start with a raw pitch or load a demo founder.\n- **Get a founder briefing.** The system structures your idea into problem, solution, users, traction, competitors, and ask.\n- **Choose your opponent.** Face a Skeptical VC, Technical Judge, or Hackathon Judge.\n- **Pick the pressure level.** Practice Mode, Judge Mode, and Investor Mode change the intensity.\n- **Survive pitch rounds.** Answer realistic follow-up questions based on your pitch.\n- **Use voice mode.** Practice pitching and answering out loud.\n- **Enter the deal phase.** Defend your ask in a negotiation-style pressure round.\n- **Get a scorecard.** See what landed, what broke, and what to improve next.\n\n---\n\n## Demo\n\n**Watch the demo:** [https://www.youtube.com/watch?v=s4_BzIBhqxc](https://www.youtube.com/watch?v=s4_BzIBhqxc)\n\n**Live Space:** [https://huggingface.co/spaces/build-small-hackathon/PITCHFIGHT_AI](https://huggingface.co/spaces/build-small-hackathon/PITCHFIGHT_AI) \n**Build write-up:** [https://huggingface.co/blog/prakhar811/pitchfight-ai-practice-the-pitch-before-the-real-r](https://huggingface.co/blog/prakhar811/pitchfight-ai-practice-the-pitch-before-the-real-r) \n**GitHub Repository:** [https://github.com/prakhar811/PitchFight/](https://github.com/prakhar811/PitchFight/)\n\n---\n\n## How it is built\n\nPitchFight AI runs as a Hugging Face Gradio Space, but the user experience is built as a custom frontend rather than a default Gradio interface.\n\nThe frontend talks to backend routes in `app.py`. The backend handles pitch structuring, judge personas, battle state, voice mode, deal rounds, and scorecard generation. 
NVIDIA Nemotron is used as the core reasoning model for the AI judge.\n\n```mermaid\nflowchart LR\n founder[Student Founder] --> ui[Custom PitchFight Frontend]\n ui --> api[Gradio Backend / app.py]\n api --> router[Model Router]\n router --> nemotron[NVIDIA Nemotron API]" }, { "repo_id": "build-small-hackathon/field-guide", "title": "Field Guide", "summary": "No summary available yet.", "url": "https://huggingface.co/spaces/build-small-hackathon/field-guide", "zone": "Experiment Lab", "track": "An Adventure in Thousand Token Wood", "tags": [ "field", "guide", "summary", "available", "yet.", "docker" ], "difficulty": "Experimental", "quest": "🧪 Experiment Lab: Explore the Space and write one improvement idea.", "likes": 28, "sdk": "docker", "status": "unknown", "last_modified": "", "emoji": "🧪", "readme_text": "---\ntitle: Build Small\nemoji: 🛠️\ncolorFrom: green\ncolorTo: gray\nsdk: docker\napp_port: 7860\npinned: false\n---\n\n# Build Small · Hackathon Field Guide\n\nThe field guide and partner directory for the Build Small hackathon — a SvelteKit\nsite listing each sponsor's models, capabilities, prizes, starter Spaces and\nsupport channels.\n\n> Configuration reference for the Spaces metadata above:\n> https://huggingface.co/docs/hub/spaces-config-reference\n\n## Deployment (Hugging Face Spaces · Docker)\n\nThis Space runs as a **Docker SDK** Space. On every push, Hugging Face builds the\n[`Dockerfile`](./Dockerfile) and runs the resulting container, which serves the\napp on the port declared by `app_port` (`7860`).\n\nThe image is a multi-stage build:\n\n1. **build stage** — installs dependencies with `pnpm` and runs `pnpm run build`.\n The app uses [`@sveltejs/adapter-node`](https://svelte.dev/docs/kit/adapter-node),\n which emits a standalone Node server at `build/index.js`.\n2. 
**run stage** — installs production dependencies only, copies the built\n server, and launches it as a non-root user (UID 1000, as Spaces requires).\n\nThe server reads `PORT` and `HOST` from the environment; the Dockerfile sets\n`PORT=7860` and `HOST=0.0.0.0` so it binds correctly inside the Space.\n\nNote: the site is fully prerendered (`prerender = true`), so the Node server is\nmostly serving static HTML today. Docker + adapter-node leaves room to add\nserver-side routes or SSR later without changing the deploy path.\n\n## Local development\n\n```sh\npnpm install\npnpm run dev\n```\n\n## Production build\n\n```sh\npnpm run build # outputs build/ via adapter-node\nnode build/index.js # runs the server (defaults to PORT=3000)\n```\n\n## Build the container locally\n\n```sh\ndocker build -t build-small .\ndocker run --rm -p 7860:7860 build-small\n# open http://localhost:7860\n```\n" }, { "repo_id": "build-small-hackathon/kirana-saathi", "title": "Kirana Saathi", "summary": "No summary available yet.", "url": "https://huggingface.co/spaces/build-small-hackathon/kirana-saathi", "zone": "Builder Workshop", "track": "Backyard AI", "tags": [ "achievement", "sponsor", "kirana", "saathi", "summary", "available" ], "difficulty": "Builder-Focused", "quest": "🛠 Builder Workshop: Check what small-model technique this project demonstrates.", "likes": 20, "sdk": "docker", "status": "unknown", "last_modified": "", "emoji": "🛠", "readme_text": "---\ntitle: Kirana Saathi\nemoji: 🏪\ncolorFrom: yellow\ncolorTo: red\nsdk: docker\napp_port: 7860\npinned: false\nshort_description: Hindi voice + photo inventory & udhaar ledger for kiranas\ntags:\n - track:backyard\n - sponsor:modal\n - sponsor:openbmb\n - achievement:offbrand\n - achievement:llama\n - achievement:sharing\n - achievement:fieldnotes\n---\n\n# Kirana Saathi\n\nA Hindi-first assistant for a small kirana (corner) shop. 
The owner just talks, or shows a photo of a bill, and the app keeps the two things that used to live on paper: the stock, and the *udhaar*, the running credit customers pay back later. It also watches the shelf, flags what is about to expire or is not selling, suggests a discount to clear slow stock, and reminds the owner to restock before a festival.\n\nBuilt for the **Build Small Hackathon**, Backyard AI track. Everything runs on open-weight models, each well under 32B. Understanding and vision come from **MiniCPM-V-4.6** (~5-6B) by OpenBMB. No proprietary frontier API touches the loop. For this demo the model is reached through OpenBMB's hosted MiniCPM API, while speech runs on Modal and the bill OCR pre-pass runs on llama.cpp (see Hosting).\n\n- Live Space: https://huggingface.co/spaces/build-small-hackathon/kirana-saathi\n- Blog (Field Notes): https://build-small-hackathon-kirana-saathi.hf.space/blog\n- Demo video: https://www.loom.com/share/7a675a1918bf4233bd44c8e049f84c97\n- Social post: https://x.com/ps_1506/status/2066625864482349310\n- Code: https://github.com/PrathamSingla15/kirana-saathi\n\n## Architecture\n\n![Kirana Saathi turns a spoken line or a bill photo into a finished ledger entry, in five steps](figures/kirana_final.png)\n\nOne shopkeeper turn runs through five steps:\n\n1. **Hear and read.** faster-whisper turns Hindi speech into text. For a bill or *khata* photo, Surya OCR does a first pass over the page, then MiniCPM-V-4.6 reads it.\n2. **Decide.** MiniCPM-V-4.6 (~5-6B, with vision) runs as a guided-JSON tool router. Instead of native tool-calling, the model is constrained to emit a small JSON object that names the tool and its arguments, so the choice is always well-formed.\n3. **Act.** Tools split in two: writes (add stock, record a sale, note credit, record a payment) and reads (the day's dashboard, a customer's dues, why an item is not moving).\n4. **Confirm.** Anything that would write is read back as a yes/no question. 
Nothing is saved until the owner says yes.\n5. **Reply.** The answer shows up in the Bahi-Khata ledger screen. When the owner taps the speaker, Veena (with a SNAC decoder) reads it back in one steady Hindi voice.\n\nThe router keeps the two guarantees that matter: confirm-before-write, and a visible tool-call trace under every reply.\n\nSee [`design.md`](design.md) for the full design.\n\n## What it does\n\n- **Voice credit book.** *\"Sharma ji ne 200 ka udhaar liya\"* stages an entry; *\"kiska kitna baaki hai?\"* returns a ranked list of who owes what.\n- **Receive stock by photo.** Hold up the supplier bill. The lines come into a table you can fix before anything is saved, with an estimated expiry when the bill prints none.\n- **Expiry and FEFO.** Sells the oldest stock first and warns before items go off.\n- **Festival nudges.** Restock reminders before demand jumps, not after.\n- **\"Why isn't X selling?\"** Reasons over the sales trend, the stock, and the price, then gives a plain answer and a fix, such as a small clearance discount.\n- **Polite reminders.** Drafts a Hindi collection message for overdue credit, and never sends it on its own.\n- **Money at a glance.** Cost, price, and margin per item, the value of everything on the shelf, and the day's takings.\n\nTwo rails keep it safe: it never writes to the books without a yes, and it never sells stock the shop does not have.\n\n## Stack (open-weight models, ≤32B)\n\n| Layer | Choice |\n|---|---|\n| LLM + vision | **MiniCPM-V-4.6** (~5-6B) by **OpenBMB**, via OpenBMB's OpenAI-compatible API |\n| Agent | guided-JSON tool routing (`guided_json` / `guided_choice`) over MiniCPM |\n| Bill OCR pre-pass | **Surya**, via **llama.cpp** |\n| Speech to text | **faster-whisper** large-v3 (Hindi) |\n| Text to speech | **Veena** (Hindi / Hinglish) with a **SNAC** decoder |\n| Database | two **SQLite** files, `inventory.db` + `transactions.db`, read together via `ATTACH` |\n| Frontend | custom **Gradio** \"Bahi-Khata\" 
single-screen app |\n\n## Hosting (open-weight, no proprietary frontier API)\n\nEvery model here is open-weight. MiniCPM-V-4.6 is small enough to self-host on a single GPU, and the speech models are smaller still, so nothing forces a hosted dependency. To be clear about what the live demo actually does:\n\n- **LLM + vision** is served through **OpenBMB's hosted MiniCPM API** (OpenAI-compatible). The Space calls it over `KIRANA_LLM_BASE_URL` with `KIRANA_LLM_API_KEY` and `KIRANA_LLM_MODEL`.\n- **Speech** runs on **Modal**: faster-whisper for STT and Veena + SNAC for TTS, on a warm L4 GPU, at `/stt` and `/tts`.\n- **Bill OCR pre-pass** runs **Surya on lla" }, { "repo_id": "build-small-hackathon/dukaan-saathi", "title": "Dukaan Saathi", "summary": "No summary available yet.", "url": "https://huggingface.co/spaces/build-small-hackathon/dukaan-saathi", "zone": "Builder Workshop", "track": "Backyard AI", "tags": [ "achievement", "dukaan", "saathi", "summary", "available", "yet." ], "difficulty": "Builder-Focused", "quest": "🛠 Builder Workshop: Inspect how the app uses models, traces, or datasets.", "likes": 18, "sdk": "docker", "status": "unknown", "last_modified": "", "emoji": "🛠", "readme_text": "---\ntitle: Dukaan Saathi\nemoji: 🏪\ncolorFrom: yellow\ncolorTo: red\nsdk: docker\napp_port: 7860\npinned: false\nshort_description: Hindi voice + photo inventory & udhaar ledger for kiranas\ntags:\n - track:backyard\n - sponsor:modal\n - achievement:offbrand\n - achievement:llama\n - achievement:sharing\n - achievement:fieldnotes\n---\n\n# Dukaan Saathi\n\nA Hindi-first assistant for a small kirana (corner) shop. The owner just talks, or shows a photo of a bill, and the app keeps the two things that used to live on paper: the stock, and the *udhaar*, the running credit customers pay back later. 
It also watches the shelf, flags what is about to expire or is not selling, suggests a discount to clear slow stock, and reminds the owner to restock before a festival.\n\nBuilt for the **Build Small Hackathon**, Backyard AI track. Everything runs on open-weight models, each well under 32B. For the demo those models are self-hosted on Modal, with no proprietary cloud AI anywhere in the loop.\n\n- Live Space: https://huggingface.co/spaces/build-small-hackathon/dukaan-saathi\n- Blog (Field Notes): https://build-small-hackathon-dukaan-saathi.hf.space/blog\n- Demo video: https://www.loom.com/share/7a675a1918bf4233bd44c8e049f84c97\n- Social post: https://x.com/ps_1506/status/2066625864482349310\n- Code: https://github.com/PrathamSingla15/dukaan-saathi\n\n## Architecture\n\n![Dukaan Saathi turns a spoken line or a bill photo into a finished ledger entry, in five steps](figures/dukaan_final.png)\n\nOne shopkeeper turn runs through five steps:\n\n1. **Hear and read.** faster-whisper turns Hindi speech into text. For a bill or *khata* photo, Surya OCR does a first pass over the page, then Gemma reads it.\n2. **Decide.** Gemma 4 (12B, with vision) runs as a deepagents loop and picks the right tool.\n3. **Act.** Tools split in two: writes (add stock, record a sale, note credit, record a payment) and reads (the day's dashboard, a customer's dues, why an item is not moving).\n4. **Confirm.** Anything that would write is read back as a yes/no question. Nothing is saved until the owner says yes.\n5. **Reply.** The answer shows up in the Bahi-Khata ledger screen. When the owner taps the speaker, Veena (with a SNAC decoder) reads it back in one steady Hindi voice.\n\nSee [`design.md`](design.md) for the full design.\n\n## What it does\n\n- **Voice credit book.** *\"Sharma ji ne 200 ka udhaar liya\"* stages an entry; *\"kiska kitna baaki hai?\"* returns a ranked list of who owes what.\n- **Receive stock by photo.** Hold up the supplier bill. 
The lines come into a table you can fix before anything is saved, with an estimated expiry when the bill prints none.\n- **Expiry and FEFO.** Sells the oldest stock first and warns before items go off.\n- **Festival nudges.** Restock reminders before demand jumps, not after.\n- **\"Why isn't X selling?\"** Reasons over the sales trend, the stock, and the price, then gives a plain answer and a fix, such as a small clearance discount.\n- **Polite reminders.** Drafts a Hindi collection message for overdue credit, and never sends it on its own.\n- **Money at a glance.** Cost, price, and margin per item, the value of everything on the shelf, and the day's takings.\n\nTwo rails keep it safe: it never writes to the books without a yes, and it never sells stock the shop does not have.\n\n## Stack (open-weight models, ≤32B)\n\n| Layer | Choice |\n|---|---|\n| LLM + vision | **Gemma 4 (12B)**, Q4_K_M GGUF, via **llama.cpp** (`llama-server`, OpenAI-compatible `/v1`) |\n| Bill OCR pre-pass | **Surya** |\n| Agent | **deepagents** (LangChain) driving the local model |\n| Speech to text | **faster-whisper** large-v3 (Hindi) |\n| Text to speech | **Veena** (Hindi / Hinglish) with a **SNAC** decoder |\n| Database | two **SQLite** files, `inventory.db` + `transactions.db`, read together via `ATTACH` |\n| Frontend | custom **Gradio** \"Bahi-Khata\" single-screen app |\n\n## Hosting (Modal + HF Space)\n\nThe Hugging Face Space runs the Gradio UI on CPU; all GPU work is on Modal, in one app (`dukaan-llm`, `scripts/modal_app.py`) split across two warm L4 GPUs so neither model starves:\n\n- **GPU 1 (L4)** runs the LLM + vision/OCR: llama.cpp `llama-server` with Gemma 4 (12B) GGUF, served OpenAI-compatible at `/v1`.\n- **GPU 2 (L4)** runs speech: faster-whisper for STT and Veena + SNAC for TTS, at `/stt` and `/tts`.\n\nBoth stay warm with `min_containers=1`. 
Deploy with `MODAL_PROFILE=projects-ps MIN_CONTAINERS=1 PYTHONPATH=\"$PWD\" modal deploy scripts/modal_app.py`, then point the Space at the two URLs through the secrets `DUKAAN_LLM_BASE_URL` / `DUKAAN_STT_BASE_URL` / `DUKAAN_TTS_BASE_URL`, plus `HF_TOKEN` for the gated Veena weights.\n\nA config-only swap to **MiniCPM-V 4.6** (≤4B) runs the same app on a smaller vision stack.\n\n## Tracks and badges\n\n| Badge | Evidence |\n|---|---|\n| 🏡 **Backyard AI** | A real kirana owner's daily problem: voice *udhaar*, bill OCR, expiry/FEFO, festival nudges, run on his own books (demo video). |\n| 🟢 **Modal** | LLM + vision/OCR and STT + TTS hosted on Modal acros" }, { "repo_id": "build-small-hackathon/OpenMythos", "title": "Openmythos", "summary": "No summary available yet.", "url": "https://huggingface.co/spaces/build-small-hackathon/OpenMythos", "zone": "Experiment Lab", "track": "An Adventure in Thousand Token Wood", "tags": [ "achievement", "openmythos", "summary", "available", "yet.", "gradio" ], "difficulty": "Experimental", "quest": "🧪 Experiment Lab: Find one way this prototype could become more useful.", "likes": 16, "sdk": "gradio", "status": "unknown", "last_modified": "", "emoji": "🧪", "readme_text": "---\ntitle: OpenMythos\nemoji: 🛡️\ncolorFrom: gray\ncolorTo: indigo\nsdk: gradio\nsdk_version: 6.18.0\npython_version: '3.13'\napp_file: app.py\npinned: true\nshort_description: An Open Source Cyber Security Agent\nlicense: apache-2.0\ntags:\n - gradio\n - build-small-hackathon\n - backyard-ai\n - track:backyard\n - sponsor:modal\n - achievement:offbrand\n - achievement:welltuned\n - achievement:fieldnotes\n - achievement:offgrid\n---\n\n# OpenMythos 🌌\n\n**Paste your codebase. 
Our AI security agent audits the repository** and returns a multi-level vulnerability analysis, a visual dependency risk path, and a declared threat level, then generates an instant, verifiable hotfix patch before threat actors can exploit it.\n\nBuilt during the **Hugging Face Small Gradio Hackathon**, OpenMythos democratizes cutting-edge security auditing. It bridges an immersive retro terminal interface with the elite agentic reasoning and long-context preservation architecture of a fine-tuned dense model.\n\n> ⚠️ **Proactive Defense.** This platform is engineered for defensive security intelligence. It aims to discover flaws, memory leaks, security misconfigurations, and input bugs instantly, empowering software engineering teams to deploy hotfixes long before a threat vector is weaponized.\n\n---\n\n## ▶️ See it in action\n\n- **Demo Video:** https://www.youtube.com/watch?v=EQyHawWfyZ0\n- **Social Post:** [X](https://x.com/kingnish24/status/2066602499356889493) [LinkedIn](https://www.linkedin.com/feed/update/urn:li:activity:7472370083663765504) [Reddit](https://www.reddit.com/r/LocalLLaMA/comments/1u6qw5b/we_trained_a_cybersecurityfocused_mythos_like_llm/)\n- **Blog Post:** https://huggingface.co/blog/KingNish/openmythos\n\n---\n\n## 🏕️ Hackathon Categories\n\n| Category | Why OpenMythos Qualifies |\n|:---------|:-------------------------|\n| **Main Track: Backyard AI** | Solves a real, specific problem for real people: software teams need instant security auditing. The person is every developer who ships code and wants to catch vulnerabilities before attackers do. |\n| **🔌 Off the Grid** | **100% Local & Privacy-First.** The entire pipeline runs with zero cloud API dependencies, just a local model endpoint. Your code never leaves your machine. |\n| **🎯 Well-Tuned** | Built on a **Qwen3.6-27B** base fine-tuned via SFT on a cybersecurity dataset. 
The fine-tuned model: https://huggingface.co/build-small-hackathon/OpenMythos |\n| **🎨 Off-Brand** | Fully custom terminal-inspired UI, pushing far past the default Gradio look. |\n\n### Bonus Quests\n\n| Badge | Status | Notes |\n|:------|:-------|:------|\n| 🔌 Off the Grid | ✅ **Earned** | Local-first by design |\n| 🎯 Well-Tuned | ✅ **Earned** | SFT on cybersecurity data; model to be published |\n| 🎨 Off-Brand | ✅ **Earned** | Custom CSS, SVG, terminal theme |\n| 📓 Field Notes | ✅ **Earned** | Blog post: https://huggingface.co/blog/KingNish/openmythos |\n\n## Why it's worth a look\n\n- 🔌 **100% Local & Privacy-First.** Designed as a fully open-source alternative to proprietary security intelligence layers (like Claude's Mythos model). It can be run entirely locally, requiring zero internet connectivity or external dependencies to operate.\n\n## How it works\n\nA multi-stage engineering pipeline built around aggregated, industry-standard security sources:\n\n| Stage | Role | Source Data / Methodology |\n|:-----:|------|---------------------------|\n| **1** | **Data Prep & Aggregation** | Incident reports, GitHub Advisory, VulnHub, and papers. Rigorously trained on BigVul-Filtered and Arvix-Filtered sets. |\n| **2** | **Initial Fine-Tuning (SFT)** | Supervised Fine-Tuning on cybersecurity tasks. Qwen3.6-27B Base (Up to 32K+ token context window). |\n\nThe entire pipeline leverages highly specialized weights to ensure an elite vulnerability discovery rate. No massive API dependencies anywhere: a clever chain of targeted engineering delivers the whole security suite.\n\n## 🤝 Project Contributors\n\nDeveloped with ❤️ during the **Hugging Face Small Gradio Hackathon** by:\n\n- **KingNish** – [HuggingFace Profile](https://huggingface.co/KingNish)\n- **Himanshu** – [HuggingFace Profile](https://huggingface.co/himanshu17HF)\n\n*Built for the Build Small Hackathon. 
Model: [OpenMythos](https://huggingface.co/build-small-hackathon/OpenMythos) · Dataset: [CVE Vulnerabilities Detailed](https://huggingface.co/datasets/build-small-hackathon/CVE_Vulnerailities_Detailed) · [ArXiv cs.CR Filtered](https://huggingface.co/datasets/himanshu17HF/ArvixImport-Filtered-Final) · Space: [OpenMythos](https://huggingface.co/spaces/build-small-hackathon/OpenMythos)*" }, { "repo_id": "build-small-hackathon/LocalDuo", "title": "Localduo", "summary": "No summary available yet.", "url": "https://huggingface.co/spaces/build-small-hackathon/LocalDuo", "zone": "Experiment Lab", "track": "An Adventure in Thousand Token Wood", "tags": [ "achievement", "localduo", "summary", "available", "yet.", "gradio" ], "difficulty": "Experimental", "quest": "🧪 Experiment Lab: Try the demo and identify its core idea.", "likes": 20, "sdk": "gradio", "status": "unknown", "last_modified": "", "emoji": "🧪", "readme_text": "---\ntitle: LocalDuo\nemoji: 🔥\ncolorFrom: green\ncolorTo: pink\nsdk: gradio\nsdk_version: 6.16.0\npython_version: '3.12'\napp_file: app.py\npinned: true\nshort_description: 🇰🇷✨ LocalDuo - Learn Korean from Documents\npreload_from_hub:\n - Qwen/Qwen3.5-2B\nmodels:\n - Qwen/Qwen3.5-2B\n - CohereLabs/cohere-transcribe-03-2026\n - Supertone/supertonic-3\nthumbnail: >-\n https://raw.githubusercontent.com/ShayekhBinIslam/file-host/main/thumbnail.png\n\ntags:\n - track:backyard\n - achievement:offgrid\n - achievement:fieldnotes\n---\n\n# LocalDuo — Build Small Hackathon Field Notes\n\n**Author:** Shayekh Bin Islam, KAIST, South Korea \n**Date:** June 2026 \n**Stack:** Gradio · Qwen 3.5-9B VLM · Cohere ASR · Supertonic TTS · HuggingFace Spaces (ZeroGPU)\n\n**Live Demo:** https://huggingface.co/spaces/build-small-hackathon/LocalDuo/ \n**Recorded Demo:** https://youtu.be/PoZs9ltbdik \n**Social:** https://www.linkedin.com/posts/shayekhbinislam_hi-everyone-i-have-built-this-app-localduo-share-7472275977369210880--Q6i/ \n**Field Note:** 
https://huggingface.co/blog/build-small-hackathon/localduo \n\n---\n\n## What I Built\n\n**LocalDuo** is an end-to-end Korean language learning application that takes *any* Korean-language content — a PDF textbook, a live website, an audio recording, or a YouTube video — and automatically transforms it into interactive vocabulary flashcards with native-quality audio pronunciation.\n\nThe core idea: **instead of studying from generic word lists, learn vocabulary from content you actually care about.** Upload a chapter from your Korean textbook, paste a BBC Korean news article, or drop in a K-drama YouTube clip, and the app extracts the most useful Korean vocabulary, transliterates it into your native script, explains the grammar, generates TTS pronunciation audio, and packages everything into swipeable flashcards with a built-in quiz mode.\n\n### Feature Overview\n\n| Feature | Description |\n|---|---|\n| **Multi-Source Input** | Website URLs, PDF uploads, audio file uploads, YouTube links, and pre-saved deck imports — five distinct input pipelines unified into one interface |\n| **Vision-Language Extraction** | Qwen 3.5-9B processes both text *and* page images simultaneously, enabling vocabulary extraction from visual content (handwritten notes, textbook diagrams, infographics) |\n| **Speech-to-Text Pipeline** | Cohere ASR (`cohere-transcribe-03-2026`) transcribes Korean audio from YouTube videos and uploaded audio files, with Korean-only filtering to strip English artifacts |\n| **Text-to-Speech Pronunciation** | Supertonic-3 TTS generates natural Korean pronunciation for every extracted word, embedded as base64 audio data URIs directly in the flashcard HTML |\n| **Interactive Flashcard SPA** | A full single-page application embedded via `