nielsr (HF Staff) committed on 12eb488 (verified) · 1 parent: 2fe5112

Improve model card: Add metadata and GitHub link


This PR enhances the model card for the `Bochkov/bvv241-max` tokenizer by:
- Adding `license: apache-2.0`, `library_name: transformers`, and `pipeline_tag: feature-extraction` to the YAML metadata. This improves discoverability on the Hugging Face Hub and ensures the correct "how to use" widget appears for the tokenizer.
- Adding a link to the associated research paper for easy reference.
- Including a direct link to the GitHub repository where the code and research resources are hosted.

These changes provide more comprehensive information for users and integrate the model better within the Hugging Face ecosystem.
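For reference, the metadata described above lives in the YAML front matter at the top of `README.md`, between the `---` markers. After this change the header reads as follows (values taken directly from this PR):

```yaml
---
license: apache-2.0
library_name: transformers
pipeline_tag: feature-extraction
---
```

The Hub reads `pipeline_tag` to choose which inference widget to display and `library_name` to pick the "how to use" code-snippet template on the model page.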

Files changed (1): README.md (+8 -4)
README.md CHANGED
@@ -1,11 +1,15 @@
 ---
-# For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
-# Doc / guide: https://huggingface.co/docs/hub/model-cards
-{}
+license: apache-2.0
+library_name: transformers
+pipeline_tag: feature-extraction
 ---
 
 # bvv241-max: Unified Unicode Tokenizer (SOTA Intersection) with Frozen Embeddings
 
+This model was presented in the paper [Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate](https://arxiv.org/abs/2507.07129).
+
+Code: [https://github.com/Bochkov/BVV241](https://github.com/Bochkov/BVV241)
+
 ## Tokenizer Description
 
 <!-- Provide a longer summary of what this model is. -->
@@ -76,4 +80,4 @@ If you use this model or the underlying concepts in your research, please cite o
 }
 ```
 
-This work demonstrates that transformer blocks, not token embeddings, carry the semantic burden in LLMs — a step toward modular, fusable, multilingual LMs.
+This work demonstrates that transformer blocks, not token embeddings, carry the semantic burden in LLMs — a step toward modular, fusable, multilingual LMs.