Optitransfer committed on
Commit 1d289b2 · verified · 1 Parent(s): 82a16a0

Corporate clean org page: no emojis, Swiss datasets only

Files changed (1)
  1. index.html +99 -80
index.html CHANGED
@@ -3,7 +3,7 @@
3
  <head>
4
  <meta charset="UTF-8" />
5
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
6
- <title>OptiTransferData — Sovereign AI Data for Europe</title>
7
  <style>
8
  * { box-sizing: border-box; margin: 0; padding: 0; }
9
  body {
@@ -12,48 +12,61 @@
12
  color: #1e293b;
13
  padding: 2rem 1rem;
14
  }
15
- .container { max-width: 820px; margin: 0 auto; }
16
  .hero {
17
  text-align: center;
18
  padding: 2.5rem 1rem 2rem;
19
  }
20
  .hero h1 { font-size: 2rem; font-weight: 700; color: #0f172a; margin-bottom: 0.5rem; }
21
- .hero p { font-size: 1.1rem; color: #475569; max-width: 600px; margin: 0 auto; line-height: 1.6; }
22
  .badge {
23
  display: inline-block;
24
- background: #e0f2fe;
25
- color: #0369a1;
 
26
  border-radius: 999px;
27
- padding: 0.25rem 0.75rem;
28
  font-size: 0.8rem;
29
  font-weight: 600;
30
  margin: 0.75rem 0.25rem 0;
31
  }
32
  hr { border: none; border-top: 1px solid #e2e8f0; margin: 2rem 0; }
33
- h2 { font-size: 1.25rem; font-weight: 700; color: #0f172a; margin-bottom: 1rem; }
34
  .section { margin-bottom: 2rem; }
35
  .features {
36
  display: grid;
37
- grid-template-columns: repeat(auto-fit, minmax(180px, 1fr));
38
  gap: 1rem;
39
  margin-top: 0.5rem;
40
  }
41
  .feature-card {
42
  background: #fff;
43
  border: 1px solid #e2e8f0;
44
- border-radius: 12px;
45
- padding: 1rem;
46
  }
47
- .feature-card .icon { font-size: 1.5rem; margin-bottom: 0.4rem; }
48
- .feature-card strong { display: block; font-size: 0.9rem; margin-bottom: 0.25rem; color: #0f172a; }
49
- .feature-card p { font-size: 0.82rem; color: #64748b; line-height: 1.4; }
50
  table { width: 100%; border-collapse: collapse; font-size: 0.9rem; }
51
- th { background: #f1f5f9; text-align: left; padding: 0.6rem 0.75rem; font-weight: 600; font-size: 0.8rem; color: #475569; text-transform: uppercase; letter-spacing: 0.05em; }
52
- td { padding: 0.65rem 0.75rem; border-top: 1px solid #f1f5f9; vertical-align: middle; }
53
- tr:hover td { background: #f8fafc; }
54
  a { color: #2563eb; text-decoration: none; }
55
  a:hover { text-decoration: underline; }
56
- .tag { display: inline-block; background: #f1f5f9; color: #475569; border-radius: 4px; padding: 0.15rem 0.45rem; font-size: 0.75rem; margin-right: 0.25rem; }
57
  .pricing-grid {
58
  display: grid;
59
  grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
@@ -63,165 +76,171 @@
63
  .price-card {
64
  background: #fff;
65
  border: 1px solid #e2e8f0;
66
- border-radius: 12px;
67
  padding: 1.25rem;
68
  }
69
  .price-card.featured {
70
  border-color: #2563eb;
71
- box-shadow: 0 0 0 1px #2563eb20;
72
  }
73
- .price-card h3 { font-size: 1rem; font-weight: 700; margin-bottom: 0.5rem; }
74
- .price-card .price { font-size: 1.3rem; font-weight: 800; color: #2563eb; margin-bottom: 0.5rem; }
75
- .price-card p { font-size: 0.82rem; color: #64748b; line-height: 1.4; }
76
- .coming-soon { color: #94a3b8; font-size: 0.85rem; }
77
  .payment { display: flex; flex-wrap: wrap; gap: 0.5rem; margin-top: 0.75rem; }
78
  .payment-item {
79
  background: #fff;
80
  border: 1px solid #e2e8f0;
81
  border-radius: 8px;
82
  padding: 0.4rem 0.85rem;
83
- font-size: 0.85rem;
84
  color: #374151;
85
  }
86
  .contact-bar {
87
  background: #0f172a;
88
  color: #e2e8f0;
89
- border-radius: 12px;
90
  padding: 1.5rem;
91
  text-align: center;
92
  margin-top: 2rem;
93
  }
 
94
  .contact-bar a { color: #93c5fd; }
95
- .contact-bar p { font-size: 0.9rem; margin-top: 0.35rem; }
96
- .quality-list { list-style: none; }
97
- .quality-list li { padding: 0.35rem 0; font-size: 0.9rem; color: #374151; }
98
- .quality-list li::before { content: "✅ "; }
99
  </style>
100
  </head>
101
  <body>
102
  <div class="container">
103
 
104
  <div class="hero">
105
- <h1>🏔️ OptiTransferData</h1>
106
- <p>Sovereign AI Data for Europe — production-grade web corpora for LLM training, RAG pipelines, and NLP research.</p>
107
  <div>
108
- <span class="badge">🇨🇭 Curated in Switzerland</span>
109
- <span class="badge">EU AI Act Compliant</span>
110
- <span class="badge">🔒 Independently Audited</span>
 
111
  </div>
112
  </div>
113
 
114
  <hr />
115
 
116
  <div class="section">
117
- <h2>🎯 What We Do</h2>
118
  <div class="features">
119
  <div class="feature-card">
120
- <div class="icon">🤖</div>
121
  <strong>LLM Training</strong>
122
- <p>Sovereign national web corpora at scale for pre-training and fine-tuning</p>
123
  </div>
124
  <div class="feature-card">
125
- <div class="icon">🔍</div>
126
  <strong>RAG Pipelines</strong>
127
  <p>Pre-chunked, embedding-ready corpora with quality scores per chunk</p>
128
  </div>
129
  <div class="feature-card">
130
- <div class="icon">🏛️</div>
131
  <strong>Regulatory NLP</strong>
132
- <p>Domain-classified, jurisdiction-specific government &amp; institutional data</p>
133
  </div>
134
  <div class="feature-card">
135
- <div class="icon">📊</div>
136
  <strong>Research</strong>
137
- <p>Reproducible datasets with full metadata and provenance tracking</p>
138
  </div>
139
  </div>
140
  </div>
141
 
142
  <div class="section">
143
- <h2>📦 Available Datasets</h2>
144
  <table>
145
  <thead>
146
  <tr>
147
  <th>Dataset</th>
148
  <th>Records</th>
149
- <th>Format</th>
150
  <th>Access</th>
151
  </tr>
152
  </thead>
153
  <tbody>
154
  <tr>
155
  <td>
156
- 🇱🇮 <a href="https://huggingface.co/datasets/OptiTransferData/liechtenstein-ultra-premium-li" target="_blank">Liechtenstein Ultra Premium</a><br/>
157
- <span class="tag">German</span><span class="tag">Multilingual</span><span class="tag">37 fields</span>
158
  </td>
159
- <td>35,748</td>
160
- <td>JSONL</td>
161
- <td><a href="https://huggingface.co/datasets/OptiTransferData/liechtenstein-ultra-premium-li" target="_blank">Sample →</a></td>
162
- </tr>
163
- <tr>
164
  <td>
165
- 🇫🇷 <a href="https://huggingface.co/datasets/OptiTransferData/france-sovereign-rag-chunks" target="_blank">France Sovereign RAG Chunks</a><br/>
166
- <span class="tag">French</span><span class="tag">Pre-chunked</span><span class="tag">RAG-ready</span>
167
- </td>
168
- <td>348,829</td>
169
- <td>JSONL (7 shards)</td>
170
- <td><a href="https://huggingface.co/datasets/OptiTransferData/france-sovereign-rag-chunks" target="_blank">Sample →</a></td>
171
- </tr>
172
- <tr>
173
- <td colspan="4" style="color:#94a3b8; font-size:0.85rem; padding-top:0.75rem;">
174
- 🚀 <strong>Coming soon:</strong> 🇩🇪 Germany · 🇦🇹 Austria · 🇨🇭 Switzerland · 🇮🇹 Italy · 🇪🇸 Spain
175
  </td>
176
  </tr>
177
  </tbody>
178
  </table>
179
- <p style="font-size:0.82rem; color:#64748b; margin-top:0.75rem;">Free gated samples available on each dataset — request access to evaluate before purchasing.</p>
180
  </div>
181
 
182
  <div class="section">
183
- <h2>Quality Standards</h2>
184
  <ul class="quality-list">
185
  <li>Independent QA audits with documented accuracy metrics</li>
186
  <li>SHA-256 integrity verification on all production files</li>
187
- <li>Quality scoring per record (0–100 scale)</li>
188
  <li>Domain classification and language detection</li>
189
- <li>EU AI Act compliance – full data provenance and licensing transparency</li>
190
- <li>Deduplication at content-level and URL-level</li>
191
- <li>PII detection and handling</li>
 
192
  </ul>
193
  </div>
194
 
195
  <div class="section">
196
- <h2>💼 Licensing &amp; Pricing</h2>
197
  <div class="pricing-grid">
198
  <div class="price-card">
199
- <h3>📋 Sample</h3>
200
  <div class="price">Free</div>
201
- <p>Gated access – evaluate data quality before committing</p>
202
  </div>
203
  <div class="price-card featured">
204
- <h3>📦 Full Dataset</h3>
205
  <div class="price">Commercial</div>
206
- <p>Complete production data with commercial licence</p>
207
  </div>
208
  <div class="price-card">
209
- <h3>🏢 Enterprise</h3>
210
  <div class="price">Custom</div>
211
- <p>Dedicated support, SLA, bespoke corpora, bulk pricing</p>
212
  </div>
213
  </div>
214
- <p style="margin-top:1rem; font-size:0.9rem; color:#374151;">📧 Contact us for a quote: <a href="mailto:data@optitransfer.ch">data@optitransfer.ch</a></p>
215
  <div class="payment">
216
- <div class="payment-item">🏦 Bank Transfer (SEPA/SWIFT)</div>
217
- <div class="payment-item">📱 TWINT (Swiss)</div>
218
- <div class="payment-item">Crypto (BTC/ETH/SOL)</div>
219
  </div>
220
  </div>
221
 
222
  <div class="contact-bar">
223
- <strong>🏔️ OptiTransferData</strong>
224
- <p>Curated in Switzerland · <a href="https://optitransfer.ch">optitransfer.ch</a> · <a href="mailto:data@optitransfer.ch">data@optitransfer.ch</a></p>
225
  </div>
226
 
227
  </div>
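Reviewer note on the `.price-card.featured` shadow: the removed value `#2563eb20` uses 8-digit hex notation, where the last two digits are the alpha channel, while the added value is `rgba(37,99,235,0.1)`. The sketch below does the conversion arithmetic to check whether the two are equivalent. It is written in Python purely for illustration, and the helper name `hex8_to_rgba` is hypothetical, not part of the commit.

```python
def hex8_to_rgba(value):
    """Convert an 8-digit CSS hex colour (#RRGGBBAA) to (r, g, b, alpha).

    Alpha is returned as a fraction between 0 and 1, rounded to 3 places.
    """
    digits = value.lstrip("#")
    if len(digits) != 8:
        raise ValueError("expected 8 hex digits (RRGGBBAA)")
    # Split into four two-digit channels and parse each as base 16.
    r, g, b, a = (int(digits[i:i + 2], 16) for i in range(0, 8, 2))
    return r, g, b, round(a / 255, 3)


print(hex8_to_rgba("#2563eb20"))  # (37, 99, 235, 0.125)
```

The RGB channels match the new declaration exactly. The old alpha is 0x20 = 32, and 32 / 255 is about 0.125, so the replacement at 0.1 is marginally lighter rather than an exact equivalent.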
 
3
  <head>
4
  <meta charset="UTF-8" />
5
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
6
+ <title>OptiTransfer Data</title>
7
  <style>
8
  * { box-sizing: border-box; margin: 0; padding: 0; }
9
  body {
 
12
  color: #1e293b;
13
  padding: 2rem 1rem;
14
  }
15
+ .container { max-width: 860px; margin: 0 auto; }
16
  .hero {
17
  text-align: center;
18
  padding: 2.5rem 1rem 2rem;
19
  }
20
  .hero h1 { font-size: 2rem; font-weight: 700; color: #0f172a; margin-bottom: 0.5rem; }
21
+ .hero p { font-size: 1.05rem; color: #475569; max-width: 640px; margin: 0 auto; line-height: 1.7; }
22
  .badge {
23
  display: inline-block;
24
+ background: #f1f5f9;
25
+ color: #334155;
26
+ border: 1px solid #e2e8f0;
27
  border-radius: 999px;
28
+ padding: 0.25rem 0.85rem;
29
  font-size: 0.8rem;
30
  font-weight: 600;
31
  margin: 0.75rem 0.25rem 0;
32
  }
33
  hr { border: none; border-top: 1px solid #e2e8f0; margin: 2rem 0; }
34
+ h2 { font-size: 1.2rem; font-weight: 700; color: #0f172a; margin-bottom: 1rem; letter-spacing: -0.01em; }
35
  .section { margin-bottom: 2rem; }
36
  .features {
37
  display: grid;
38
+ grid-template-columns: repeat(auto-fit, minmax(190px, 1fr));
39
  gap: 1rem;
40
  margin-top: 0.5rem;
41
  }
42
  .feature-card {
43
  background: #fff;
44
  border: 1px solid #e2e8f0;
45
+ border-radius: 10px;
46
+ padding: 1.1rem;
47
  }
48
+ .feature-card strong { display: block; font-size: 0.9rem; margin-bottom: 0.3rem; color: #0f172a; }
49
+ .feature-card p { font-size: 0.82rem; color: #64748b; line-height: 1.5; }
 
50
  table { width: 100%; border-collapse: collapse; font-size: 0.9rem; }
51
+ th { background: #f1f5f9; text-align: left; padding: 0.6rem 0.75rem; font-weight: 600; font-size: 0.78rem; color: #475569; text-transform: uppercase; letter-spacing: 0.04em; }
52
+ td { padding: 0.65rem 0.75rem; border-top: 1px solid #f1f5f9; vertical-align: top; }
53
+ tr:hover td { background: #fafbfc; }
54
  a { color: #2563eb; text-decoration: none; }
55
  a:hover { text-decoration: underline; }
56
+ .tag {
57
+ display: inline-block;
58
+ background: #f1f5f9;
59
+ color: #475569;
60
+ border-radius: 4px;
61
+ padding: 0.15rem 0.5rem;
62
+ font-size: 0.73rem;
63
+ margin: 0.15rem 0.15rem 0.15rem 0;
64
+ white-space: nowrap;
65
+ }
66
+ .tag-group { margin-top: 0.5rem; line-height: 1.9; }
67
+ .quality-list { list-style: none; }
68
+ .quality-list li { padding: 0.35rem 0; font-size: 0.88rem; color: #374151; padding-left: 1.2rem; position: relative; }
69
+ .quality-list li::before { content: "\2713"; position: absolute; left: 0; color: #16a34a; font-weight: 700; }
70
  .pricing-grid {
71
  display: grid;
72
  grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
 
76
  .price-card {
77
  background: #fff;
78
  border: 1px solid #e2e8f0;
79
+ border-radius: 10px;
80
  padding: 1.25rem;
81
  }
82
  .price-card.featured {
83
  border-color: #2563eb;
84
+ box-shadow: 0 0 0 1px rgba(37,99,235,0.1);
85
  }
86
+ .price-card h3 { font-size: 0.95rem; font-weight: 700; margin-bottom: 0.5rem; color: #0f172a; }
87
+ .price-card .price { font-size: 1.2rem; font-weight: 800; color: #2563eb; margin-bottom: 0.5rem; }
88
+ .price-card p { font-size: 0.82rem; color: #64748b; line-height: 1.5; }
 
89
  .payment { display: flex; flex-wrap: wrap; gap: 0.5rem; margin-top: 0.75rem; }
90
  .payment-item {
91
  background: #fff;
92
  border: 1px solid #e2e8f0;
93
  border-radius: 8px;
94
  padding: 0.4rem 0.85rem;
95
+ font-size: 0.84rem;
96
  color: #374151;
97
  }
98
  .contact-bar {
99
  background: #0f172a;
100
  color: #e2e8f0;
101
+ border-radius: 10px;
102
  padding: 1.5rem;
103
  text-align: center;
104
  margin-top: 2rem;
105
  }
106
+ .contact-bar strong { font-size: 1rem; }
107
  .contact-bar a { color: #93c5fd; }
108
+ .contact-bar p { font-size: 0.88rem; margin-top: 0.35rem; }
109
+ .dataset-detail { margin-top: 0.75rem; }
110
+ .dataset-detail p { font-size: 0.84rem; color: #64748b; line-height: 1.5; margin-bottom: 0.4rem; }
 
111
  </style>
112
  </head>
113
  <body>
114
  <div class="container">
115
 
116
  <div class="hero">
117
+ <h1>OptiTransfer Data</h1>
118
+ <p>Premium web corpora for LLM pre-training, fine-tuning, RAG, and multilingual NLP. Swiss-registered. EU AI Act compliant. Quality-scored, PII-redacted, SHA256-verified.</p>
119
  <div>
120
+ <span class="badge">Swiss-Registered</span>
121
+ <span class="badge">EU AI Act Compliant</span>
122
+ <span class="badge">SHA256 Verified</span>
123
+ <span class="badge">PII Redacted</span>
124
  </div>
125
  </div>
126
 
127
  <hr />
128
 
129
  <div class="section">
130
+ <h2>Capabilities</h2>
131
  <div class="features">
132
  <div class="feature-card">
 
133
  <strong>LLM Training</strong>
134
+ <p>Sovereign national web corpora at scale for pre-training and supervised fine-tuning</p>
135
  </div>
136
  <div class="feature-card">
 
137
  <strong>RAG Pipelines</strong>
138
  <p>Pre-chunked, embedding-ready corpora with quality scores per chunk</p>
139
  </div>
140
  <div class="feature-card">
 
141
  <strong>Regulatory NLP</strong>
142
+ <p>Domain-classified, jurisdiction-specific government and institutional data</p>
143
  </div>
144
  <div class="feature-card">
 
145
  <strong>Research</strong>
146
+ <p>Reproducible datasets with full metadata, provenance tracking, and QA reports</p>
147
  </div>
148
  </div>
149
  </div>
150
 
151
  <div class="section">
152
+ <h2>Available Datasets</h2>
153
  <table>
154
  <thead>
155
  <tr>
156
  <th>Dataset</th>
157
  <th>Records</th>
158
+ <th>Formats</th>
159
  <th>Access</th>
160
  </tr>
161
  </thead>
162
  <tbody>
163
  <tr>
164
  <td>
165
+ <strong><a href="https://huggingface.co/datasets/OptiTransferData/swiss-web-premium-ch" target="_blank">*.ch Swiss Web Premium (A+)</a></strong>
 
166
  </td>
167
+ <td>110,491</td>
168
+ <td>Parquet, JSONL, Language Splits, RAG Chunks</td>
169
  <td>
170
+ <a href="https://huggingface.co/datasets/OptiTransferData/swiss-web-premium-ch" target="_blank">Sample</a> |
171
+ <a href="https://huggingface.co/datasets/OptiTransferData/swiss-web-premium-ch-full" target="_blank">Full</a>
172
  </td>
173
  </tr>
174
  </tbody>
175
  </table>
176
+
177
+ <div class="dataset-detail">
178
+ <p>Flagship Swiss web corpus from the .ch ccTLD. 112.4M tokens across 78 fields. Multilingual coverage: German (61.2%), French (19.0%), English (10.5%), Italian (4.7%), and 25 additional languages. Nine-component quality model, full provenance chain, and independent QA report.</p>
179
+ <div class="tag-group">
180
+ <span class="tag">LLM Pre-Training</span>
181
+ <span class="tag">Supervised Fine-Tuning (SFT)</span>
182
+ <span class="tag">Retrieval-Augmented Generation</span>
183
+ <span class="tag">Multilingual NLP</span>
184
+ <span class="tag">German Language Models</span>
185
+ <span class="tag">French Language Models</span>
186
+ <span class="tag">Swiss Market AI</span>
187
+ <span class="tag">EU AI Act Compliance</span>
188
+ <span class="tag">Domain-Specific Training</span>
189
+ <span class="tag">Web Corpus Research</span>
190
+ <span class="tag">Text Classification</span>
191
+ <span class="tag">Summarisation</span>
192
+ <span class="tag">Question Answering</span>
193
+ <span class="tag">Translation</span>
194
+ </div>
195
+ </div>
196
+
197
+ <p style="font-size:0.82rem; color:#64748b; margin-top:1rem;">Free gated samples available on each dataset. Request access to evaluate before purchasing.</p>
198
  </div>
199
 
200
  <div class="section">
201
+ <h2>Quality Standards</h2>
202
  <ul class="quality-list">
203
  <li>Independent QA audits with documented accuracy metrics</li>
204
  <li>SHA-256 integrity verification on all production files</li>
205
+ <li>Quality scoring per record (0 to 100 scale, nine components)</li>
206
  <li>Domain classification and language detection</li>
207
+ <li>EU AI Act compliance with full data provenance and licensing transparency</li>
208
+ <li>Content-level and URL-level deduplication</li>
209
+ <li>PII detection and redaction (email, phone, IBAN, AHV, credit card)</li>
210
+ <li>Croissant metadata for ML interoperability</li>
211
  </ul>
212
  </div>
213
 
214
  <div class="section">
215
+ <h2>Licensing and Pricing</h2>
216
  <div class="pricing-grid">
217
  <div class="price-card">
218
+ <h3>Sample</h3>
219
  <div class="price">Free</div>
220
+ <p>Gated access. Evaluate data quality, schema, and documentation before committing.</p>
221
  </div>
222
  <div class="price-card featured">
223
+ <h3>Full Dataset</h3>
224
  <div class="price">Commercial</div>
225
+ <p>Complete production data with commercial licence. All formats included.</p>
226
  </div>
227
  <div class="price-card">
228
+ <h3>Enterprise</h3>
229
  <div class="price">Custom</div>
230
+ <p>Dedicated support, SLA, bespoke corpora, volume pricing.</p>
231
  </div>
232
  </div>
233
+ <p style="margin-top:1rem; font-size:0.88rem; color:#374151;">Contact us for a quote: <a href="mailto:data@optitransfer.ch">data@optitransfer.ch</a></p>
234
  <div class="payment">
235
+ <div class="payment-item">Bank Transfer (SEPA/SWIFT)</div>
236
+ <div class="payment-item">TWINT (Swiss)</div>
237
+ <div class="payment-item">Crypto (BTC / ETH / SOL)</div>
238
  </div>
239
  </div>
240
 
241
  <div class="contact-bar">
242
+ <strong>OptiTransfer Data</strong>
243
+ <p>Swiss-registered | <a href="https://optitransfer.ch">optitransfer.ch</a> | <a href="mailto:data@optitransfer.ch">data@optitransfer.ch</a></p>
244
  </div>
245
 
246
  </div>
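The updated page lists "SHA-256 integrity verification on all production files" among its quality standards, but the diff does not say how checksums are published. As a minimal sketch, assuming the expected digest is supplied as a hex string alongside each file, a consumer could check a download as follows. The function names and the file name in the usage comment are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Stream the file so large dataset shards never sit fully in memory.
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_file(path, expected_hex):
    """Compare a file's digest with a published one (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (hypothetical file name and digest source):
# ok = verify_file("shard-000.parquet", "<hex digest supplied with the dataset>")
```

Reading in fixed-size chunks keeps memory use constant regardless of file size, which matters for multi-gigabyte Parquet or JSONL files.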