Implementation of the paper "How Many Layers and Why? An Analysis of the Model D…"
## Model architecture

We augment a multi-layer transformer encoder with a halting mechanism, which dynamically adjusts the number of layers applied to each token.
We directly adapted this mechanism from Graves ([2016](#graves-2016)): at each iteration, we compute a probability for each token to stop updating its state.
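The halting idea above can be sketched in a few lines. The snippet below is a toy illustration in the spirit of Graves (2016), not the repository's implementation: the mocked "layer", the halting head `w`, `b`, and all shapes are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, hidden_size, max_layers = 6, 8, 12
hidden = rng.normal(size=(seq_len, hidden_size))   # one state per token
w = rng.normal(size=hidden_size)                   # hypothetical halting head
b = 0.0
eps = 0.01

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

cumulative = np.zeros(seq_len)                # accumulated halting probability
n_updates = np.zeros(seq_len, dtype=int)      # layers applied to each token
still_running = np.ones(seq_len, dtype=bool)

for _ in range(max_layers):
    # Stand-in for a transformer layer: only non-halted tokens are updated.
    hidden[still_running] += 0.1 * rng.normal(
        size=(int(still_running.sum()), hidden_size))
    n_updates[still_running] += 1
    # At each iteration, compute a probability for each token to stop.
    p = sigmoid(hidden @ w + b)
    cumulative[still_running] += p[still_running]
    # A token halts once its cumulative probability reaches 1 - eps.
    still_running &= cumulative < 1.0 - eps
    if not still_running.any():
        break

# n_updates plays the role of the model's `updates` output:
# the effective depth chosen for each token.
print(n_updates)
```

Tokens whose cumulative halting probability crosses the threshold early stop receiving updates, which is what makes the effective depth vary per token.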

## Model use

The architecture is not yet directly included in the Transformers library. The code used for pre-training is available in this [GitHub repository](https://github.com/AntoineSimoulin/adaptive-depth-transformers), so you should install the implementation first:

```bash
pip install git+https://github.com/AntoineSimoulin/adaptive-depth-transformers
```

Then you can use the model directly:

```python
import sys
sys.path.append('adaptative-depth-transformers')

# ...

outputs.updates
# tensor([[[[15., 9., 10., 7., 3., 8., 5., 7., 12., 10., 6., 8., 8., 9., 5., 8.]]]])
```
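The `updates` output reports how many layer updates each token received, so the average effective depth of a sentence can be read off directly. A quick illustration in plain NumPy, using the values from the example output above:

```python
import numpy as np

# Per-token update counts copied from the example output above.
updates = np.array([15., 9., 10., 7., 3., 8., 5., 7.,
                    12., 10., 6., 8., 8., 9., 5., 8.])
print(updates.mean())  # average effective depth over the sentence -> 8.125
```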

## Citations

### BibTeX entry and citation info