---
license: mit
model_name: Avern Prism 1.0X
version: 1.0X
tags:
- text-generation
- LLM
- PyTorch
- unsloth
- code
- Qwen
- Qwen2.5
- reasoning
- general-intelligence
- programming
- avern
- uk
library_name: transformers
pipeline_tag: text-generation
metrics:
- accuracy
- perplexity
- character
---

# Avern Prism 1.0X
**Avern Prism 1.0X** is a state-of-the-art language model developed by **Avern Technology UKI**, built on the **Qwen2.5 14B** architecture and optimized with the **Unsloth** framework. Prism 1.0X is designed to perform at the intersection of **reasoning**, **coding**, and **general intelligence**, making it suitable for complex problem-solving, logical tasks, and applications ranging from software development to AI-driven research and creative work.

## Model Description

- **Base Model**: Qwen2.5 14B
- **Architecture**: Transformer (decoder-only)
- **Training Framework**: PyTorch + Unsloth
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Context Length**: Up to 4096 tokens
- **Use Cases**: Advanced reasoning, problem-solving, code generation, creative content generation, AI research, knowledge extraction, and more.

## Key Features

- **Reasoning**: Optimized for solving complex logical problems, answering deep conceptual questions, and providing step-by-step reasoning for mathematical and algorithmic problems.
- **Code Generation**: Supports multi-language code generation (Python, JavaScript, C++, and more), helping developers write, debug, and optimize code.
- **General Intelligence**: Built with broad capabilities for general-purpose tasks such as understanding abstract concepts, producing creative content, and answering domain-specific queries across multiple fields.
- **Size**: 14B parameters, balancing capability against computational cost.
- **Adaptability**: Can be fine-tuned for specific domains, allowing customization for applications in research, business, education, or entertainment.

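The LoRA method mentioned above adapts a frozen weight matrix `W` by adding a trainable low-rank update `B @ A`, so only the small factor matrices are trained. A minimal NumPy sketch of the idea (illustrative only, with made-up dimensions; this is not Prism's actual training code):

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, r = 512, 512, 8            # layer dimensions and LoRA rank (illustrative values)
W = rng.standard_normal((d, k))  # frozen pretrained weight, never updated

# Trainable low-rank factors; B starts at zero so the update is a no-op initially
A = rng.standard_normal((r, k)) * 0.01
B = np.zeros((d, r))

def adapted_forward(x):
    # Original path plus the low-rank update: equivalent to (W + B @ A) @ x
    return W @ x + B @ (A @ x)

x = rng.standard_normal(k)
# Before training, B is zero, so the adapted layer matches the original exactly
assert np.allclose(adapted_forward(x), W @ x)

full = W.size
lora = A.size + B.size
print(f"trainable params: {lora} vs {full} ({lora / full:.2%} of full fine-tuning)")
```

With rank 8 on a 512x512 layer, the trainable parameter count drops to about 3% of the full matrix, which is what makes LoRA fine-tuning of a 14B model practical on modest hardware.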
## Intended Use

This model is ideal for:

- **Developers**: Assisting with code generation, algorithmic problem solving, and software development tasks.
- **Researchers**: Leveraging its broad general intelligence for exploratory research, hypothesis generation, and complex problem-solving.
- **Educators and Students**: Providing tools for learning programming, mathematics, and critical thinking.
- **Creative Applications**: Writing, brainstorming, and idea generation for creative work.
- **AI Enthusiasts**: Building custom AI-driven applications with advanced reasoning and coding capabilities.

## Training Data

Prism 1.0X was fine-tuned on a combination of datasets:

- **Code**: A wide variety of programming languages and coding tasks.
- **Reasoning**: Logical reasoning, problem-solving, mathematics, and algorithm design.
- **General Knowledge**: General-domain knowledge, creative writing, and abstract reasoning, including encyclopedic and instructional content.

**Note**: The training data excludes proprietary or private data.

## Limitations

- **Reasoning and Accuracy**: While Prism 1.0X excels at reasoning, it may not always produce correct solutions for highly specialized problems or previously unseen domains.
- **Hallucination Risk**: As with most large language models, Prism 1.0X may generate incorrect or fabricated information, especially in abstract or speculative scenarios.
- **Context**: It can struggle to maintain context over long conversations or complex multi-step tasks without additional fine-tuning.

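A practical mitigation for the context limitation is to trim older conversation turns so the prompt stays within the 4096-token window. A hedged sketch, using a rough 4-characters-per-token estimate (a heuristic of ours, not the model's real tokenizer; in practice, count tokens with the tokenizer itself):

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text.
    return max(1, len(text) // 4)

def trim_history(turns: list[str], budget: int = 4096) -> list[str]:
    """Keep the most recent turns whose estimated token count fits the budget."""
    kept: list[str] = []
    used = 0
    for turn in reversed(turns):          # walk newest-to-oldest
        cost = estimate_tokens(turn)
        if used + cost > budget:
            break                         # dropping this and all older turns
        kept.append(turn)
        used += cost
    return list(reversed(kept))           # restore chronological order

history = ["old turn " * 300, "recent question?"]
print(trim_history(history, budget=100))  # keeps only ["recent question?"]
```

The oldest turns are dropped first, so the model always sees the most recent exchange; a production setup would use the real tokenizer and reserve room for the system prompt and the generation budget.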
## How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("avernai/prism-1.0x")
tokenizer = AutoTokenizer.from_pretrained("avernai/prism-1.0x")

def generate(prompt: str, max_new_tokens: int = 150) -> str:
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example: code generation
print(generate("Write a Python function that calculates the Fibonacci sequence up to n."))

# Example: logical reasoning
print(generate("What is the next number in the sequence: 2, 4, 8, 16, ?"))

# Example: general intelligence
print(generate("Explain the theory of relativity in simple terms.", max_new_tokens=200))
```
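For reference, a correct answer to the Fibonacci prompt above would look something like this, under one reasonable reading of "up to n" (all Fibonacci numbers not exceeding n):

```python
def fibonacci_up_to(n: int) -> list[int]:
    """Return all Fibonacci numbers less than or equal to n."""
    seq = []
    a, b = 0, 1
    while a <= n:
        seq.append(a)
        a, b = b, a + b
    return seq

print(fibonacci_up_to(20))  # → [0, 1, 1, 2, 3, 5, 8, 13]
```

Having an expected answer like this on hand makes it easy to spot-check the model's code-generation output.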