aimonp committed on Mar 18

Commit 1c1f573 (verified) · 1 Parent(s): 89d9cd4

Update README.md

Files changed (1)

README.md +46 -41

README.md CHANGED

@@ -1,73 +1,78 @@
 ---
 tags:
-- evaluation
 - testing
 - privacy
-- llm
-- enterprise
-- anonymization
-- data-augmentation
-- simulation
 - regulated-industries
-- insurance
 pretty_name: DataFramer
 license: other
 ---
 # DataFramer
-**Realistic, privacy-safe data for AI testing, evals, and fine-tuning.**
-DataFramer helps AI teams create realistic, diverse datasets for **testing, evaluations, and fine-tuning** without exposing sensitive production data.
-Built for **complex enterprise workflows**, DataFramer supports document-heavy, multi-file, structured, and unstructured data so teams can validate AI systems against real-world variability, not just clean demo cases.
 ## Why teams use DataFramer
-AI projects often stall because the data needed for testing and evaluation is:
-- too sensitive to use directly
-- too limited to cover edge cases
-- too messy to recreate by hand
-- too unrealistic when manually mocked
-DataFramer helps teams generate better test and eval data so models can be assessed against the kinds of variation they will actually face in production.
-## Best-fit use cases
-- **LLM and AI evaluations**
-  Build eval datasets with stronger coverage across common cases, rare cases, and edge cases.
-- **Privacy-safe testing**
-  Work with realistic data for testing and iteration without exposing sensitive production records.
-- **Complex workflow validation**
-  Test systems that depend on long documents, multi-file inputs, nested structures, and business-specific constraints.
-- **Fine-tuning and dataset expansion**
-  Expand sparse datasets with more realistic variation while preserving the patterns your models depend on.
-## Built for enterprise data
-DataFramer is designed for workflows involving:
 - long-form documents and PDFs
-- structured and semi-structured data
-- nested and hierarchical records
-- multi-file samples
-- high-variability real-world business inputs
-## Who it is for
-DataFramer is especially useful for teams in **regulated and data-sensitive environments**, including:
-- insurance
-- financial services
-- healthcare
-- enterprise AI teams working with restricted or hard-to-access data
-## Learn more
-See product examples, use cases, and request access at:
-**https://www.dataframer.ai**

 ---
 tags:
+- synthetic-data
+- data-generation
+- data-anonymization
+- simulation
+- llm-evaluation
+- fine-tuning
 - testing
 - privacy
+- enterprise-ai
 - regulated-industries
 pretty_name: DataFramer
 license: other
 ---
 # DataFramer
+**Generate, anonymize, and simulate diverse datasets from your own data for testing, evals, and fine-tuning.**
+DataFramer helps AI teams take their own data further — creating realistic, privacy-safe datasets for **testing, evaluation, and post-training** without exposing sensitive production records.
+**DataFramer works from your data**, adding diversity while preserving the **structure, distributions, and constraints** your models depend on.
 ## Why teams use DataFramer
+AI teams often get blocked because:
+- **their seed data isn’t enough**
+  Generate diverse, scaled datasets without starting from scratch.
+- **their real data is off-limits**
+  Anonymize sensitive records while keeping structure intact.
+- **their data doesn’t cover what models will face in production**
+  Simulate edge cases, rare scenarios, and real-world variation missing from existing samples.
+## How it works
+DataFramer supports a seed-based workflow for enterprise AI data readiness:
+1. **Seed input** from manual samples or production data
+2. **Anonymize** sensitive records when needed
+3. **Analyze** schema, structure, distributions, and patterns
+4. **Configure** variation, volume, edge cases, and format mix
+5. **Generate** realistic datasets across complex formats
+6. **Use** the outputs for model evaluation, testing, and fine-tuning
+## Built for real enterprise data
+DataFramer works with **any textual dataset — any format, any domain, any complexity**, including:
 - long-form documents and PDFs
+- structured and semi-structured records
+- nested and hierarchical data
+- multi-file workflows
+- high-variability business inputs
+## Best-fit use cases
+- **LLM and AI evaluations**
+  Build stronger eval datasets with better coverage across common, rare, and edge-case scenarios.
+- **Privacy-safe testing**
+  Use realistic datasets for testing and iteration without exposing sensitive production data.
+- **Anonymization for AI workflows**
+  Transform restricted real-world data into safe seed inputs for downstream generation and evaluation.
+- **Fine-tuning and dataset expansion**
+  Extend sparse datasets with more realistic variation while preserving fidelity to source patterns.
+## Enterprise-ready
+Built for teams in regulated and data-sensitive environments.
+**Your data never has to leave.**
+Learn more at **https://www.dataframer.ai**
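
The "How it works" steps in the new README (seed, anonymize, analyze, generate) can be illustrated with a toy sketch. This is not DataFramer's API: every function name, field name, and parameter below is hypothetical and chosen only for illustration, and the logic is a deliberately simple stand-in that masks one field, measures one categorical distribution, and samples new records from it.

```python
import random
from collections import Counter

# Hypothetical seed records; the field names are illustrative only.
SEED = [
    {"name": "Alice Smith", "plan": "basic", "claims": 0},
    {"name": "Bob Jones", "plan": "basic", "claims": 2},
    {"name": "Cara Lee", "plan": "premium", "claims": 1},
]


def anonymize(records, sensitive=("name",)):
    """Replace sensitive fields with placeholder tokens (hypothetical helper)."""
    out = []
    for i, rec in enumerate(records):
        clean = dict(rec)
        for field in sensitive:
            if field in clean:
                clean[field] = f"person_{i}"
        out.append(clean)
    return out


def analyze(records, field):
    """Return the empirical distribution of one categorical field."""
    counts = Counter(rec[field] for rec in records)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def generate(records, n, edge_case_rate=0.1, seed=0):
    """Sample n synthetic records that follow the seed 'plan' distribution,
    occasionally injecting an edge case (an unusually high claim count)."""
    rng = random.Random(seed)
    dist = analyze(records, "plan")
    plans, weights = zip(*dist.items())
    max_claims = max(rec["claims"] for rec in records)
    synthetic = []
    for i in range(n):
        is_edge = rng.random() < edge_case_rate
        synthetic.append({
            "name": f"synthetic_{i}",
            "plan": rng.choices(plans, weights=weights)[0],
            "claims": max_claims * 10 if is_edge else rng.randint(0, max_claims),
        })
    return synthetic


safe_seed = anonymize(SEED)
dataset = generate(safe_seed, n=100)
```

The sketch keeps the categorical mix of the seed data while adding volume and a configurable share of out-of-range values, which is the general idea behind the "Configure" and "Generate" steps. A real pipeline would have to handle documents, nested records, and multi-file samples, none of which this toy example attempts.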