PicoKittens committed · verified · Commit 6cdd078 · 1 Parent(s): 331104e

Update README.md

Files changed (1): README.md (+63 −3)

README.md CHANGED
---
license: apache-2.0
datasets:
- arxiv_abstracts
language:
- en
pipeline_tag: text-generation
tags:
- tiny
- pico
- scratch
- llama-2
- academic
---

# AbstractsLlama-8M

AbstractsLlama-8M is an ultra-compact, "pico-sized" language model **trained from scratch** by **Pico-Kittens**. It uses the **Llama 2 architecture** and is optimized for generating scientific and academic text.

## Model Details

- **Developed by:** Pico-Kittens
- **Model type:** Llama 2-based causal language model
- **Training status:** Trained from scratch (not a fine-tune)
- **Parameters:** ~8 million
- **Language(s):** English
- **License:** apache-2.0

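The card does not publish the exact architecture configuration. As a rough illustration of how an ~8M parameter budget could break down in a Llama-style model, the sketch below computes a parameter count from a *hypothetical* tiny config; the hidden size, layer count, FFN width, and vocabulary size are assumptions for illustration, not the model's actual values:

```python
# Hypothetical tiny Llama-style config -- illustrative only, NOT the
# actual AbstractsLlama-8M configuration (which is not published here).
vocab_size = 32000   # assumed Llama 2 tokenizer vocabulary
hidden = 160         # assumed hidden size
layers = 6           # assumed number of transformer blocks
ffn = 448            # assumed SwiGLU intermediate size

embed = vocab_size * hidden                # token embeddings (output head tied)
attn_per_layer = 4 * hidden * hidden       # Q, K, V, O projections
mlp_per_layer = 3 * hidden * ffn           # gate, up, down projections (SwiGLU)
norms = (2 * layers + 1) * hidden          # two RMSNorms per block + final norm

total = embed + layers * (attn_per_layer + mlp_per_layer) + norms
print(f"~{total / 1e6:.1f}M parameters")
```

Note how, at this scale, the token embedding table dominates the budget; the transformer blocks themselves account for only a couple of million parameters.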
## Training Data

The model was trained on a large-scale collection of **arXiv abstracts**. The training objective was to compress the structural patterns, technical nomenclature, and "academic tone" of scientific research into a minimal parameter budget.

## Capabilities & Limitations

AbstractsLlama-8M is an experimental model. While it effectively mimics the syntax of research papers, users should be aware of the following:

* **Scientific syntax:** Highly competent; it excels at producing the "feel" of a formal research proposal or abstract.
* **Architecture:** Implements the Llama 2 transformer block structure at a micro scale.
* **Hallucinations:** Extremely high. The model will invent methodologies, chemical structures, and mathematical frameworks that do not exist.
* **Context:** Limited. It is best suited for short-form generation (under 128 tokens).

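Given the short-form recommendation above, callers may want to clamp their generation budget before invoking the model. A minimal helper sketch (the 128-token cap comes from this card's guidance; the function name is ours, not part of any API):

```python
def clamp_generation_budget(requested_tokens: int, cap: int = 128) -> int:
    """Clamp a requested max_new_tokens to this card's short-form budget.

    The default 128-token cap reflects the guidance above that the model
    is best suited for short-form generation; nothing enforces it model-side.
    """
    if requested_tokens < 1:
        raise ValueError("requested_tokens must be positive")
    return min(requested_tokens, cap)

print(clamp_generation_budget(256))  # capped to 128
print(clamp_generation_budget(50))   # left at 50
```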
---

## Generation Sample

**User:** *We propose*

**AbstractsLlama-8M:**
> We propose a unified framework for modeling large-scale non-linearity of Cancer (NCI) problems with a variable-scale dataset for the linearized dynamics of polynomial conjugal structure. Our key idea of a multi-objective-centile-based model with a fixed, non-preferred variational autoencoder (NMAE) for feature extraction, which includes ax-aware, non-convex optimization formulation for both a single

---

## How to Get Started

```python
import torch
from transformers import pipeline

# Use the first GPU if available, otherwise fall back to CPU.
device = 0 if torch.cuda.is_available() else -1
pipe = pipeline("text-generation", model="PicoKittens/AbstractsLlama-8M", device=device)

# Keep generations short: the model is best suited to under ~128 tokens.
output = pipe("We propose", max_new_tokens=100, do_sample=True)
print(output[0]["generated_text"])
```