Upload folder using huggingface_hub
- README.md +70 -0
- config.json +24 -0
- generation_config.json +9 -0
- merges.txt +0 -0
- model.safetensors +3 -0
- special_tokens_map.json +6 -0
- tied_weights.json +3 -0
- tokenizer.json +0 -0
- tokenizer_config.json +21 -0
- vocab.json +0 -0
README.md
ADDED
@@ -0,0 +1,70 @@
+---
+language:
+- en
+license: apache-2.0
+tags:
+- pytorch
+- causal-lm
+- arxiv
+- lfm2
+datasets:
+- FlameF0X/arXiv-AI-ML
+---
+
+# LFM2 Model - Pretrained on arXiv AI/ML Papers
+
+## Model Description
+
+This model uses the LFM2 (Liquid Foundation Model 2) architecture and was pretrained on the arXiv AI/ML dataset.
+
+### Model Architecture
+
+- **Model Type:** Causal Language Model
+- **Architecture:** LFM2
+- **Hidden Size:** 512
+- **Layers:** 8
+- **Attention Heads:** 8
+- **KV Heads:** 4 (Grouped Query Attention)
+- **Max Sequence Length:** 2048
+- **Vocabulary Size:** 50257
+
+### Training Details
+
+- **Dataset:** FlameF0X/arXiv-AI-ML
+- **Training Samples:** 2500
+- **Batch Size:** 4
+- **Learning Rate:** 0.0003
+- **Epochs:** 3
+- **Final Loss:** 3.8660
+
+### Usage
+
+```python
+from transformers import AutoTokenizer
+import torch
+
+# Load tokenizer
+tokenizer = AutoTokenizer.from_pretrained("./lfm2-arxiv-model")
+
+# Load model (you'll need the model class definition)
+# model = Lfm2ForCausalLM.from_pretrained("./lfm2-arxiv-model")
+
+# Generate text
+# input_text = "Attention mechanisms in neural networks"
+# inputs = tokenizer(input_text, return_tensors="pt")
+# outputs = model.generate(**inputs, max_length=100)
+# print(tokenizer.decode(outputs[0]))
+```
+
+### Citation
+
+If you use this model, please cite:
+
+```bibtex
+@misc{lfm2-arxiv-2025,
+  title={LFM2 Pretrained on arXiv AI/ML},
+  author={Your Name},
+  year={2025},
+  publisher={HuggingFace}
+}
+```
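The README's usage snippet leaves model loading commented out because the custom model class ships separately from this upload. Below is a minimal end-to-end sketch of how the full flow could look, assuming `Lfm2ForCausalLM` is importable from the training code; the module name `modeling_lfm2` is hypothetical.

```python
import torch
from transformers import AutoTokenizer
from modeling_lfm2 import Lfm2ForCausalLM  # hypothetical module; not part of this upload

# Load tokenizer and model from the local directory used in the README
tokenizer = AutoTokenizer.from_pretrained("./lfm2-arxiv-model")
model = Lfm2ForCausalLM.from_pretrained("./lfm2-arxiv-model")
model.eval()

# Generate a short continuation
inputs = tokenizer("Attention mechanisms in neural networks", return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_length=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```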
config.json
ADDED
@@ -0,0 +1,24 @@
+{
+  "model_type": "lfm2",
+  "architectures": [
+    "Lfm2ForCausalLM"
+  ],
+  "vocab_size": 50257,
+  "hidden_size": 512,
+  "intermediate_size": 2048,
+  "num_hidden_layers": 8,
+  "num_attention_heads": 8,
+  "num_key_value_heads": 4,
+  "max_position_embeddings": 2048,
+  "initializer_range": 0.02,
+  "norm_eps": 1e-05,
+  "pad_token_id": 0,
+  "bos_token_id": 1,
+  "eos_token_id": 2,
+  "rope_theta": 1000000.0,
+  "block_multiple_of": 256,
+  "block_ffn_dim_multiplier": 1.0,
+  "block_auto_adjust_ff_dim": true,
+  "torch_dtype": "float32",
+  "transformers_version": "4.36.0"
+}
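A quick sanity check of the numbers above. This sketch only reads the config file; the parameter figure is a back-of-the-envelope estimate, not the model's measured size.

```python
import json

with open("config.json") as f:
    cfg = json.load(f)

# Per-head dimension and the grouped-query-attention ratio
head_dim = cfg["hidden_size"] // cfg["num_attention_heads"]           # 512 // 8 = 64
gqa_ratio = cfg["num_attention_heads"] // cfg["num_key_value_heads"]  # 2 query heads per KV head

# Embedding table dominates at this scale; counted once because
# lm_head is tied to it (see tied_weights.json below).
embed_params = cfg["vocab_size"] * cfg["hidden_size"]                 # ~25.7M
print(f"head_dim={head_dim}, gqa_ratio={gqa_ratio}, embed_params={embed_params:,}")
```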
generation_config.json
ADDED
@@ -0,0 +1,9 @@
+{
+  "bos_token_id": 1,
+  "eos_token_id": 2,
+  "pad_token_id": 0,
+  "max_length": 2048,
+  "do_sample": true,
+  "temperature": 0.7,
+  "top_p": 0.9
+}
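These sampling defaults are picked up automatically by `model.generate()`. A sketch of loading them explicitly, assuming the local directory layout from the README:

```python
from transformers import GenerationConfig

gen_cfg = GenerationConfig.from_pretrained("./lfm2-arxiv-model")
print(gen_cfg.do_sample, gen_cfg.temperature, gen_cfg.top_p)  # True 0.7 0.9

# Equivalent explicit call:
# model.generate(**inputs, do_sample=True, temperature=0.7, top_p=0.9, max_length=2048)
```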
merges.txt
ADDED
The diff for this file is too large to render. See raw diff
model.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8ed7ec33289f4e1b601bd300c9cd697d65283618f7085e9abedf2b45a97aa8e3
+size 203632480
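The entry above is a git-lfs pointer, not the weights themselves. Once the real file has been pulled, it can be inspected directly; a sketch using the `safetensors` library:

```python
from safetensors.torch import load_file

# Requires the actual LFS object (~203 MB), not the pointer file
state = load_file("model.safetensors")
for name, tensor in list(state.items())[:5]:
    print(name, tuple(tensor.shape), tensor.dtype)
```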
special_tokens_map.json
ADDED
@@ -0,0 +1,6 @@
+{
+  "bos_token": "<|endoftext|>",
+  "eos_token": "<|endoftext|>",
+  "pad_token": "<|endoftext|>",
+  "unk_token": "<|endoftext|>"
+}
tied_weights.json
ADDED
@@ -0,0 +1,3 @@
+{
+  "lm_head.weight": "model.embed_tokens.weight"
+}
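The mapping above records that the output head shares its matrix with the input embedding. What that tying looks like in plain PyTorch, as a standalone sketch with dimensions taken from config.json:

```python
import torch.nn as nn

vocab_size, hidden_size = 50257, 512  # from config.json
embed_tokens = nn.Embedding(vocab_size, hidden_size)
lm_head = nn.Linear(hidden_size, vocab_size, bias=False)

# One tensor serves both roles, saving ~25.7M parameters
lm_head.weight = embed_tokens.weight
assert lm_head.weight.data_ptr() == embed_tokens.weight.data_ptr()
```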
tokenizer.json
ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json
ADDED
@@ -0,0 +1,21 @@
+{
+  "add_prefix_space": false,
+  "added_tokens_decoder": {
+    "50256": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "bos_token": "<|endoftext|>",
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|endoftext|>",
+  "extra_special_tokens": {},
+  "model_max_length": 1024,
+  "pad_token": "<|endoftext|>",
+  "tokenizer_class": "GPT2Tokenizer",
+  "unk_token": "<|endoftext|>"
+}
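The tokenizer files reuse the GPT-2 BPE vocabulary, with `<|endoftext|>` (id 50256) doubling as bos/eos/pad/unk. A quick check against the local directory, assuming the same path as in the README:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("./lfm2-arxiv-model")
print(tok.eos_token, tok.eos_token_id)  # <|endoftext|> 50256
print(tok("Attention is all you need").input_ids[:8])
```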
vocab.json
ADDED
The diff for this file is too large to render. See raw diff