Commit 60f8b77 (parent 47076af): readme · README.md changed
---
title: Dowser
emoji: ⏱️
colorFrom: red
colorTo: yellow
sdk: docker
pinned: false
---
## Problem

AI teams are data constrained, not model constrained, and waste millions retraining models on data that has little or even negative impact.

They spend most of their budget collecting, processing, and labeling data without knowing what actually improves performance.

This leads to repeated failed retraining cycles, wasted GPU runs, and slow iteration, because teams lack insight into which datasets improve the model and which degrade it.
|
| 17 |
+
|
| 18 |
+
## Solution
|
| 19 |
+
|
| 20 |
+
Influence guided training has been shown to halve the convergence time. [*Dowser by Durinn](http://durinn.ai/)* tells AI teams which training data improves model performance and which data hurts it, democratizing what big model providers are doing.
## Product

[*Dowser*](https://durinn-concept-explorer.azurewebsites.net/) doesn't just recommend data or provide infrastructure: it directly benchmarks models to produce confident influence scores, with sub-**2-minute** cached results and **10–30 minute** fresh evaluations across **100 open-source datasets** on an 8 GB RAM, 2 vCPU host.
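As a rough illustration of the idea behind influence scores, here is a minimal sketch (with invented loss numbers; not Dowser's actual method) that scores each candidate dataset by how much it lowers validation loss relative to a baseline:

```python
# Toy influence scoring: a dataset's score is the drop in validation
# loss after fine-tuning on it (positive = helpful, negative = harmful).
# All numbers below are made up for illustration.

def influence_score(baseline_loss: float, finetuned_loss: float) -> float:
    """Positive score means the dataset improved the model."""
    return baseline_loss - finetuned_loss

def rank_datasets(baseline_loss: float, finetuned_losses: dict) -> list:
    """Rank candidate datasets by influence score, most helpful first."""
    scores = {name: influence_score(baseline_loss, loss)
              for name, loss in finetuned_losses.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

ranking = rank_datasets(
    baseline_loss=2.10,
    finetuned_losses={"code_qa": 1.85, "web_crawl": 2.25, "math_word": 1.95},
)
# "code_qa" ranks first; "web_crawl" gets a negative score (it hurts).
```

In practice the losses would come from benchmarking cheap proxy models, which is what keeps evaluations fast enough to run in minutes rather than full training runs.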
## How it works

Teams define a target capability or task → *Dowser* identifies high-impact datasets from [Hugging Face](https://huggingface.co/) and suggests optimized training directions.
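The flow above can be sketched as a small pipeline. Everything here is illustrative: the function names, the stubbed dataset catalog, and the loss numbers are assumptions, not Dowser's actual API.

```python
# Illustrative pipeline: target task -> candidate datasets -> keep only
# those whose (stubbed) fine-tuned loss beats the baseline.

def find_candidate_datasets(task: str) -> list:
    # A real system would search the Hugging Face Hub here; stubbed catalog.
    catalog = {
        "math": ["gsm8k-style", "arith-drills"],
        "code": ["code-qa", "bug-fixes"],
    }
    return catalog.get(task, [])

def evaluate(dataset: str) -> float:
    # Stub: pretend we fine-tuned a proxy model on `dataset` and
    # measured validation loss on the target task (numbers invented).
    fake_losses = {"gsm8k-style": 1.9, "arith-drills": 2.3,
                   "code-qa": 1.7, "bug-fixes": 2.0}
    return fake_losses[dataset]

def suggest_datasets(task: str, baseline_loss: float) -> list:
    """Return candidate datasets that beat the baseline, best first."""
    helpful = [d for d in find_candidate_datasets(task)
               if evaluate(d) < baseline_loss]
    return sorted(helpful, key=evaluate)

suggestion = suggest_datasets("math", baseline_loss=2.1)
# Only "gsm8k-style" beats the 2.1 baseline in this toy catalog.
```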
## Why now?

- Training costs are exploding while performance gains are flattening
- Synthetic data is increasingly contaminating training pipelines
- Teams need precision, not more data
- Influence methods are now viable via proxy models and distillation
## Market

- Every company training or fine-tuning LLMs
- 59% of AI budgets go to training data
- 40% of firms spend over 70% of their AI budget on data
- Initial wedge: small and mid-sized model teams