Secured

Sleeping

App Files Files Community

Secured / docs /huggingface-article.md

gowtham0992

Add X post to article source

847abcc verified 18 days ago

preview code

Raw

History Blame Contribute Delete

7.69 kB

	# Jawbreaker: Private Scam Defense for Someone You Love

	<p align="center">
	<img src="https://huggingface.co/spaces/build-small-hackathon/jawbreaker/resolve/main/jawbreaker_logo.png" alt="Jawbreaker logo" width="140" />
	</p>

	Most scams do not begin with malware. They begin with a moment of pressure.

	A package is "held." A bank account is "locked." A recruiter offers too much money. A family member says they lost their phone and needs help right now.

	Jawbreaker is built for that moment before someone replies, clicks, shares a code, or sends money.

	It is a small-model scam defense app that turns a suspicious text, email, or DM into a plain safety card:

	- what the risk is
	- who the sender is pretending to be
	- how the message is pressuring you
	- what the sender wants
	- what could happen
	- what to do next
	- a note you can copy to someone you trust

	Live app:

	https://huggingface.co/spaces/build-small-hackathon/jawbreaker

	Watch the demo:

	<p align="center">
	<a href="https://www.youtube.com/watch?v=oh0GRKYXvGM">
	<img src="https://img.youtube.com/vi/oh0GRKYXvGM/hqdefault.jpg" alt="Watch the Jawbreaker demo" width="720" />
	</a>
	</p>

	[Open the demo on YouTube](https://www.youtube.com/watch?v=oh0GRKYXvGM)

	## Why We Built It

	The motivating user was a friend's grandmother who had already been affected by scam messages. We are not publishing private names, phone numbers, timestamps, or personal details, but that family story shaped the product.

	Jawbreaker is not a generic spam classifier for security experts. It is for the person who asks:

	> "Should I click this, reply to this, call this number, or ask someone I trust first?"

	That is why the app avoids a chatbot shape. Scammers already create urgency and back-and-forth. Jawbreaker tries to reduce decisions instead:

	1. Paste the message.
	2. Read the verdict.
	3. Check the warning signs.
	4. Follow one safe next step.
	5. Copy a note to someone trusted if you want a second opinion.

	The goal is not fear. The goal is a pause.

	## What The App Does

	Jawbreaker takes one message and renders a safety card.

	For example, a message like this:

	```text
	Hi Grandma, I lost my phone. This is my new number. Can you send $800 for rent today? Please don't tell Mom.
	```

	should not just be labeled "spam." The important part is the structure:

	- it pretends to be family
	- it asks for money
	- it uses secrecy
	- it creates urgency
	- it gives the user a reason not to verify

	Jawbreaker names those tactics and gives a safe action, such as not sending money or codes and checking through a known phone number or official channel.

	It also handles the opposite case. A normal dentist reminder should not be treated like a scam just because it asks for a reply. False alarms matter because people stop trusting tools that panic too often.

	## The Small Model

	Jawbreaker runs on MiniCPM5-1B with a custom Jawbreaker LoRA adapter.

	Base model:

	https://huggingface.co/openbmb/MiniCPM5-1B

	Final adapter:

	https://huggingface.co/build-small-hackathon/jawbreaker-minicpm5-1b-lora-v8

	The 1B model is deliberate. This is a narrow task, not a general assistant. We wanted a model that could produce reliable structured output for one safety workflow:

	- classify risk
	- explain scam DNA
	- recommend one safe action
	- stay inside a strict JSON schema

	The public Space runs the model directly through Transformers on Hugging Face ZeroGPU. The scam-analysis path does not call OpenAI, Anthropic, hosted MiniCPM APIs, or any other external LLM API.

	## Training And Eval

	We trained the adapter with PEFT/LoRA on Modal A100 and used Modal again for guarded eval runs.

	The data is synthetic and sanitized. It focuses on scam patterns that show up in everyday messages:

	- package phishing
	- bank and payment phishing
	- family impersonation
	- tech support scams
	- fake job and recruiter scams
	- prize and lottery scams
	- wrong-number investment grooming
	- legitimate reminders that should not be overcalled

	Public dataset and eval bundle:

	https://huggingface.co/datasets/build-small-hackathon/jawbreaker-scam-defense-data

	The final model decision was eval-gated. We did not pick the model only by top-line accuracy. For this product, the critical failure is marking a dangerous message as safe.

	<p align="center">
	<img src="https://huggingface.co/spaces/build-small-hackathon/jawbreaker/resolve/main/docs/model-selection-chart.svg" alt="Jawbreaker model selection chart" width="720" />
	</p>

	The chart shows why we chose the 1B v8 adapter for the final app: not the prettiest accuracy number, but the broadest completed safety gate with zero dangerous undercalls, zero unsafe actions, zero invalid JSON, and zero model errors.

	Final guarded Modal A100 eval on the 632-case hard v8 suite:

	\| Metric \| Result \|
	\| --- \| ---: \|
	\| Cases \| 632 \|
	\| Risk accuracy \| 579/632, 91.61% \|
	\| Scam type accuracy \| 561/632, 88.77% \|
	\| Mean tactic recall \| 90.69% \|
	\| Dangerous as safe \| 0 \|
	\| Dangerous as needs_check \| 0 \|
	\| Safe as dangerous or suspicious \| 0 \|
	\| Unsafe action violations \| 0 \|
	\| Invalid predictions \| 0 \|
	\| Model errors \| 0 \|

	That does not mean the model is perfect or exhaustive. It means the final submitted adapter cleared our hardest completed eval with zero dangerous undercalls and zero unsafe actions.

	## Defense In Depth

	Jawbreaker does not trust raw model text directly.

	The runtime pipeline is:

	1. MiniCPM5-1B + Jawbreaker LoRA produces structured JSON.
	2. The app parses the response.
	3. The schema validator checks every required field.
	4. A deterministic safety guard catches obvious high-risk patterns.
	5. If the model output is invalid or weak, the app falls back to a safer deterministic analysis.

	The UI also tells users to remove passwords, account numbers, ID numbers, addresses, and personal codes before pasting. Jawbreaker is a safety aid, not legal, financial, or cybersecurity advice.

	## What Made It Work

	The key product decision was to stop building a "spam detector" and build a family safety workflow.

	Spam detection says:

	> "This is spam."

	Jawbreaker says:

	> "This looks dangerous because it asks for money, uses secrecy, and pretends to be family. Do not send money or codes. Check through a known number. Here is a note you can send to someone you trust."

	That small shift made the model target, eval suite, UI, and demo much clearer.

	## Built With

	- Hugging Face Spaces
	- Gradio / Gradio Server
	- OpenBMB MiniCPM5-1B
	- PEFT/LoRA
	- Modal A100 training and eval
	- OpenAI Codex for implementation, eval design, UI iteration, and submission documentation

	Codex evidence:

	https://github.com/gowtham0992/jawbreaker/blob/main/CODEX_JUDGE_EVIDENCE.md

	Field notes:

	https://github.com/gowtham0992/jawbreaker/blob/main/FIELD_NOTES.md

	## Try It

	Space:

	https://huggingface.co/spaces/build-small-hackathon/jawbreaker

	Submission collection:

	https://huggingface.co/collections/build-small-hackathon/jawbreaker-6a263632dcd0b6d41ca914ff

	GitHub:

	https://github.com/gowtham0992/jawbreaker

	Social posts:

	https://www.reddit.com/r/huggingface/comments/1u48yt6/finetuned_a_1b_model_that_helps_families_check/

	https://www.linkedin.com/posts/gsarveswaran_jawbreaker-private-scam-defense-for-someone-share-7471351364137164800-cv__/

	https://x.com/GothamSarves/status/2065649294623813925?s=20

	Demo inputs:

	```text
	USPS: Your package is held due to an unpaid fee. Verify now: http://usps-track-secure.example
	```

	```text
	Hi Grandma, I lost my phone. This is my new number. Can you send $800 for rent today? Please don't tell Mom.
	```

	```text
	Your dentist appointment is confirmed for Tuesday at 2:00 PM. Reply C to confirm or R to reschedule.
	```

	Paste first. Act after.