Commit fbc41ee (verified) · committed by lee-futo · 1 parent: 7c00d97

Model card: drop pipeline_tag (no HF hosted inference for .pte)

Files changed (1)
  1. README.md +6 -7
README.md CHANGED
@@ -3,7 +3,6 @@ license: other
  license_name: futo-model-weights-license-1.0
  license_link: LICENSE.md
  library_name: executorch
- pipeline_tag: token-classification
  tags:
  - swipe-typing
  - gesture-typing
@@ -23,13 +22,13 @@ Mobile-oriented models for decoding swipe gestures into text.
  <img src="https://huggingface.co/futo-org/futo-swipe/resolve/main/animations/swipe_demo_computer.gif" width="640" alt="Swipe decode of the word 'computer'">
  </p>
 
- See the paper, [*coming soon*](https://huggingface.co/futo-org/futo-swipe),
- for details.
+ See the paper [(*coming soon*)](https://huggingface.co/futo-org/futo-swipe)
+ for more details.
 
  ## Models
 
- This repository contains 3 models that compose together.
- Only the encoder is required; the decoder and language model are
+ This repository contains 3 CNN models that compose together.
+ Only the encoder is required. The decoder and language model are
  additional refinements, leveraging specific layout and language information.
  The encoder can decode for **any** keyboard layout,
  while the decoder is English/QWERTY-only and the language model is English-only.
@@ -161,8 +160,8 @@ greedy decode: computer
  | **input** | `layout_keys` | `[1, 64, 2]` | Per-key `(x, y)` centers, padded to 64 keys |
  | **input** | `layout_mask` | `[1, 64]` | Boolean mask of valid keys |
  | **output** | `log_emissions` | `[1, 32, 65]` | Log-probabilities over 64 keys + blank |
- | **output** | `coefficients` | `[1, 32, 64]` | Spectral coefficients |
- | **output** | `lambda` | `[1, 32, 1]` | *Intention* gate |
+ | **output** | `coefficients` | `[1, 32, 64]` | Spectral coefficients (decoder features) |
+ | **output** | `lambda` | `[1, 32, 1]` | *Intention* gate (decoder features) |
 
  The output time dimension is 32, half the 64 input points. The encoder
  applies a 2× temporal downsample (a stride-2 adapter) inside the network, so
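
The encoder I/O table in the last hunk lists `log_emissions` with shape `[1, 32, 65]`, described as log-probabilities over 64 keys plus a blank. The sketch below shows how CTC emissions of that shape are commonly collapsed into text with a greedy decode. It is illustrative only and is not taken from the repository: the function name `ctc_greedy_decode`, the `key_labels` list, and the choice of index 64 for the blank class are all assumptions that the diff does not confirm.

```python
# Assumption: the blank is the last of the 65 classes. The diff does not say
# which index the model actually uses.
BLANK = 64


def ctc_greedy_decode(log_emissions, key_labels, blank=BLANK):
    """Greedy CTC decode of a [1, T, C] nested sequence of log-probabilities.

    key_labels is a hypothetical list mapping each key index to its character.
    """
    chars = []
    prev = blank
    for row in log_emissions[0]:  # one row of C log-probabilities per time step
        idx = max(range(len(row)), key=row.__getitem__)  # argmax over classes
        # CTC collapse: skip blanks and skip repeats of the previous class.
        if idx != blank and idx != prev:
            chars.append(key_labels[idx])
        prev = idx
    return "".join(chars)


if __name__ == "__main__":
    # Hypothetical layout: 26 letters, padded to 64 keys with empty labels.
    labels = list("abcdefghijklmnopqrstuvwxyz") + [""] * 38
    # Synthetic best path: h h _ e l _ l o, then blanks up to 32 time steps.
    path = [7, 7, 64, 4, 11, 64, 11, 14] + [64] * 24
    emissions = [[[0.0 if k == c else -10.0 for k in range(65)] for c in path]]
    print(ctc_greedy_decode(emissions, labels))
```

A tensor returned by a runtime can be passed in after converting it to nested lists (for example with `.tolist()`). Greedy decoding only follows the single best class per time step, so it ignores the `coefficients` and `lambda` outputs that the table marks as decoder features.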