MEscriva committed on
Commit 8f85972 · verified · 1 Parent(s): 17e196e

Update README.md

Files changed (1)
  1. README.md +1 -73
README.md CHANGED
@@ -26,76 +26,4 @@ The core objectives include:
  - ensuring robustness across heterogeneous conditions (telephone bandwidth, online conferencing, real-world noise);
  - maintaining strong multilingual capabilities while specializing in French;
  - delivering reproducible and scientifically-rigorous evaluation frameworks;
- - enabling deployment on sovereign infrastructures with strict data governance constraints.
-
- ---
-
- ## 2. Research Themes
-
- ### **2.1 High-precision French ASR**
- Tailored to:
-
- - meetings and long-form speech
- - lectures and institutional discourse
- - spontaneous multi-speaker environments
- - regional and international accents
-
- ### **2.2 Multilingual preservation**
- Although optimized for French, models are evaluated to maintain strong cross-lingual performance.
-
- ### **2.3 Robustness, frugality, and domain adaptation**
- Research includes:
-
- - noise augmentation and channel variability
- - 8 kHz telephony adaptation
- - low-compute inference
- - distillation for edge deployment
-
- ### **2.4 Benchmarking and dataset engineering**
- Gilbert designs domain-specific datasets and benchmark suites for evaluating ASR in conditions representative of operational environments.
-
- ---
-
- ## 3. Current Model Family
-
- ### **Gilbert-FR-Source (2025)**
- Baseline model fine-tuned on curated high-quality French corpora; foundation for all domain-specific variants.
-
- ### Upcoming research releases:
-
- - **Gilbert-FR-Longform-v1** - extended-speech and meeting optimization
- - **Gilbert-FR-Accents-v1** - regional and international accent specialization
- - **Gilbert-FR-Telephone-v1** - 8 kHz telephony channel adaptation
- - **Gilbert-Edu-ASR-v1** - education-specific speech model
- - **Gilbert-Multilingual-v1** - French-centric multilingual enhancement
-
- ---
-
- ## 4. Baseline Performance (WER)
-
- | Dataset | WER |
- |--------|------|
- | MLS (FR) | **3.98%** |
- | Common Voice v13 (FR) | **7.28%** |
- | VoxPopuli (FR) | **8.91%** |
- | Fleurs (FR) | **4.84%** |
- | African-accent French | **4.20%** |
-
- These results position Gilbert-FR among the strongest open-source ASR models available for the French language.
-
- ---
-
- ## 5. Research Principles
-
- 1. **Reproducibility**
- 2. **Scientific rigor**
- 3. **Transparent evaluation**
- 4. **Sovereign infrastructure**
- 5. **Operational relevance**
-
- ---
-
- ## 6. Contact
-
- - Website: **https://gilbert-assistant.fr**
- - Contact: **mathis@lexiapro.fr**
+ - enabling deployment on sovereign infrastructures with strict data governance constraints.