---
language:
- en
tags:
- pytorch
- causal-lm
license: bigscience-openrail-m
---

[GeoV](https://github.com/geov-ai/geov)-9B-r2 is a 9-billion-parameter causal language model.

It is still being trained and has the same architecture as the [GeoV-9b](https://huggingface.co/GeoV/GeoV-9b) model, but its training data is sampled without replacement (the GeoV-9b model's training data was sampled with replacement).

The GeoV model was designed by Georges Harik and uses
[Rotary Positional Embeddings with Relative distances (RoPER)](https://research.labml.ai/RoPER.html)
by [Georges Harik](https://twitter.com/gharik) and [Varuna Jayasiri](https://twitter.com/vpj).

[RoPER](https://research.labml.ai/RoPER.html), in addition to using relative positions in the attention score calculation via RoPE embeddings, adds relative positional information explicitly to the value embeddings. Specifically, it incorporates the relative positions of the tokens that are attended to. RoPER has given better performance on some algorithmic tasks and appears comparable to RoPE in language modeling.
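
The value-side rotation described above can be sketched roughly as follows. This is an illustrative reimplementation of the idea (rotate each value by its source position, take the attention-weighted sum, then rotate back by the query position, so each value contributes with its relative distance), not the GeoV code; the function names are ours.

```python
import numpy as np

def rotate(x, pos, theta=10000.0):
    """Apply a RoPE-style rotation by position `pos` to feature pairs of x."""
    d = x.shape[-1]
    freqs = theta ** (-np.arange(0, d, 2) / d)  # one frequency per feature pair
    angles = pos * freqs
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def roper_output(attn, values, positions):
    """attn: (n, n) attention weights; values: (n, d); positions: (n,)."""
    n = len(values)
    # Rotate each value by its own (source) position j.
    rotated = np.stack([rotate(values[j], positions[j]) for j in range(n)])
    summed = attn @ rotated  # attention-weighted sum per query i
    # Undo the query position i; rotations compose, so only (j - i) remains.
    return np.stack([rotate(summed[i], -positions[i]) for i in range(n)])
```

Because only relative distances survive, shifting every position by a constant leaves the output unchanged.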

## Model details

- Developed by: [Georges Harik](http://twitter.com/gharik)
- Model type: Transformer-based Language Model
- Language: English

<figure style="width:30em">

| Hyperparameter         | Value |
| ---------------------- | ----- |
| n<sub>parameters</sub> | 9B    |
| n<sub>layers</sub>     | 32    |
| d<sub>model</sub>      | 5120  |
| n<sub>heads</sub>      | 40    |
| d<sub>head</sub>       | 128   |
| n<sub>vocab</sub>      | 65500 |
| Sequence Length        | 2048  |

</figure>
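
As a rough sanity check, these hyperparameters are consistent in order of magnitude with the 9B headline count under the common dense-transformer rule of thumb; this is only a sketch, and GeoV's exact layer shapes may differ.

```python
# Back-of-the-envelope parameter count from the table above, assuming the
# common "12 * n_layers * d_model^2" dense-transformer estimate plus the
# token embedding matrix. GeoV's actual layer shapes may differ.
n_layers, d_model, n_vocab = 32, 5120, 65500

block_params = 12 * n_layers * d_model ** 2  # attention + MLP weights
embed_params = n_vocab * d_model             # token embedding matrix
total = block_params + embed_params

print(f"~{total / 1e9:.1f}B parameters")     # same order as the quoted 9B
```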

The currently released weights were trained on ~39 billion tokens. We plan to continue training up to 300 billion tokens. This training run is monolingual and uses the C4 (en) and English Wikipedia datasets.

## Test results

These are the results from [EleutherAI/lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) at the 39B-token checkpoint.

| Task           | Version | Metric   |   Value |   | Stderr |
|----------------|--------:|----------|--------:|---|-------:|
| anli_r1        |       0 | acc      |  0.3390 | ± | 0.0150 |
| anli_r2        |       0 | acc      |  0.3350 | ± | 0.0149 |
| anli_r3        |       0 | acc      |  0.3400 | ± | 0.0137 |
| hellaswag      |       0 | acc      |  0.4332 | ± | 0.0049 |
|                |         | acc_norm |  0.5628 | ± | 0.0050 |
| lambada_openai |       0 | ppl      | 13.2084 | ± | 0.4599 |
|                |         | acc      |  0.4890 | ± | 0.0070 |
| mathqa         |       0 | acc      |  0.2235 | ± | 0.0076 |
|                |         | acc_norm |  0.2275 | ± | 0.0077 |
| piqa           |       0 | acc      |  0.7361 | ± | 0.0103 |
|                |         | acc_norm |  0.7399 | ± | 0.0102 |
| winogrande     |       0 | acc      |  0.5596 | ± | 0.0140 |
| wsc            |       0 | acc      |  0.3942 | ± | 0.0482 |

## Installation

```shell
pip install geov
```

## Generation

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/geov-ai/geov/blob/master/notebooks/generate.ipynb)

```python
from geov import GeoVForCausalLM, GeoVTokenizer

model = GeoVForCausalLM.from_pretrained("GeoV/GeoV-9b-r2")
tokenizer = GeoVTokenizer.from_pretrained("GeoV/GeoV-9b-r2")

prompt = "In mathematics, topology is the study of"

input_ids = tokenizer(prompt, return_tensors="pt").input_ids

gen_tokens = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.9,
    max_length=100,
)
gen_text = tokenizer.batch_decode(gen_tokens)[0]
```
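
The `temperature=0.9` argument above rescales the logits before sampling: values below 1 sharpen the distribution toward the most likely token, values above 1 flatten it. A minimal standalone illustration (plain NumPy, independent of GeoV):

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Sampling distribution after dividing logits by the temperature."""
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()              # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, 0.5)  # low T: more peaked
base = softmax_with_temperature(logits, 1.0)   # plain softmax
flat = softmax_with_temperature(logits, 2.0)   # high T: closer to uniform
```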