5. **Canon Layers Help**: The depthwise causal convolutions (Canon layers) improve factuality and reasoning with only 0.13% parameter overhead.
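
The Canon layer described above can be sketched in a few lines. The version below is a hypothetical NumPy illustration (the function name `canon_layer` and the kernel shapes are assumptions, not the model's actual implementation): each channel gets its own small filter (depthwise), so the added parameter count is only `hidden_dim * k`, and the input is left-padded so position `t` never sees future positions (causal).

```python
import numpy as np

def canon_layer(x, kernels):
    """Hypothetical sketch of a depthwise causal convolution ("Canon") layer.

    x: (seq_len, hidden_dim) activations.
    kernels: (hidden_dim, k) -- one small filter per channel (depthwise),
    so the parameter overhead is just hidden_dim * k.
    """
    seq_len, dim = x.shape
    k = kernels.shape[1]
    # Causal: left-pad with k-1 zeros so position t sees only t-k+1 .. t.
    xp = np.vstack([np.zeros((k - 1, dim)), x])
    out = np.zeros_like(x)
    for t in range(seq_len):
        window = xp[t : t + k]  # (k, dim) window ending at position t
        # Per-channel dot product: out[t, d] = sum_j window[j, d] * kernels[d, j]
        out[t] = np.einsum("kd,dk->d", window, kernels)
    return out
```

Because the convolution is depthwise and the kernel is short, the sketch makes the sub-1% parameter overhead plausible: a hidden size of 4096 with `k = 4` adds only 16K parameters per layer.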

## When to Use Dhara

**Choose Dhara when:**

- Batch generation throughput matters
- Factual accuracy is critical
- You have an existing AR checkpoint to convert

**Choose AR models when:**

- Interactive latency is critical
- Sequential reasoning is important (math, coding)
- Memory is constrained

## Limitations

- Lower performance on sequential reasoning tasks (GSM8K: 0.00%)