shafire committed on
Commit 87efaf9 · verified · 1 Parent(s): 5c2098f

Update README.md

Files changed (1): README.md (+51 -22)
README.md CHANGED
@@ -1,26 +1,27 @@
- ---
- tags:
- - autotrain
- - text-generation-inference
- - text-generation
- - peft
- library_name: transformers
- base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
- widget:
- - messages:
-   - role: user
-     content: What is your favorite condiment?
- license: other
- ---

- # Model Trained Using AutoTrain

- This model was trained using AutoTrain. For more information, please visit [AutoTrain](https://hf.co/docs/autotrain).

- # Usage
-
- ```python

  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_path = "PATH_TO_THIS_REPO"
@@ -32,7 +33,7 @@ model = AutoModelForCausalLM.from_pretrained(
  torch_dtype='auto'
  ).eval()

- # Prompt content: "hi"
  messages = [
  {"role": "user", "content": "hi"}
  ]
@@ -41,6 +42,34 @@ input_ids = tokenizer.apply_chat_template(conversation=messages, tokenize=True,
  output_ids = model.generate(input_ids.to('cuda'))
  response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)

- # Model response: "Hello! How can I assist you today?"
  print(response)
- ```
 
+ **QuantumAI: Zero LLM Quantum AI Model**

+ This is QuantumAI, a text-generation model based on Meta-Llama-3.1-8B-Instruct, fine-tuned for conversational tasks using AutoTrain. The model is designed to handle a variety of natural language processing tasks, with a focus on interactive dialogue, text generation, and inference.

+ ![Zero LLM Quantum AI](https://huggingface.co/shafire/QuantumAI/blob/main/ZeroQuantumAI.png)

+ **Model Information**
+
+ - Base Model: meta-llama/Meta-Llama-3.1-8B
+ - Fine-tuned Model: meta-llama/Meta-Llama-3.1-8B-Instruct
+ - Training Framework: AutoTrain
+ - Training Data: conversational and text-generation focused dataset
+ - Tech Stack:
+   - Transformers
+   - PEFT (Parameter-Efficient Fine-Tuning)
+   - TensorBoard (for logging and metrics)
+   - Safetensors
+ - Language Model Task: Conversational and Text Generation
+ - Usage Type: interactive dialogue and text-generation applications
+ - Quantization: supports 4-bit quantization for efficient inference
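The 4-bit support noted above can be illustrated with a minimal sketch of symmetric int4 quantization. This is pure Python and illustrative only; production int4 kernels (e.g. in bitsandbytes) use block-wise scales and packed storage, and the example weights are made up:

```python
def quantize_int4(weights):
    """Map floats to integers in [-7, 7] with a single shared scale."""
    scale = max(abs(w) for w in weights) / 7 or 1.0
    q = [max(-7, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int4(q, scale):
    """Recover approximate floats from int4 codes."""
    return [v * scale for v in q]

weights = [0.21, -0.07, 0.7, -0.33]
q, scale = quantize_int4(weights)       # q = [2, -1, 7, -3]
approx = dequantize_int4(q, scale)      # each value within scale/2 of the original
```

Each weight is stored in 4 bits instead of 16 or 32, at the cost of a bounded rounding error per weight.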
 
+ **Installation and Usage**
+ To use this model in your code, follow the instructions below:
+
+ ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_path = "PATH_TO_THIS_REPO"

  torch_dtype='auto'
  ).eval()

+ # Example usage
  messages = [
  {"role": "user", "content": "hi"}
  ]

  output_ids = model.generate(input_ids.to('cuda'))
  response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)

+ # Output
  print(response)
+ ```
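As a rough illustration of what `apply_chat_template` assembles for Llama-3.1-style models, here is a plain-Python sketch of the chat prompt layout. The special-token names follow the published Llama 3.1 chat format, but this is an approximation; the template bundled with the tokenizer is authoritative:

```python
def build_llama31_prompt(messages):
    """Illustrative Llama-3.1-style chat prompt assembly."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        # Each turn: role header, blank line, content, end-of-turn token
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    # add_generation_prompt=True appends an empty assistant header
    # so the model continues as the assistant
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama31_prompt([{"role": "user", "content": "hi"}])
```

In the snippet above, `tokenize=True` would then convert this string into the `input_ids` tensor passed to `model.generate`.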
+ **Inference API**
+ This model is not yet deployed to the Hugging Face Inference API. However, you can deploy it to Inference Endpoints for dedicated serverless inference.
+
+ **Training Process**
+ The QuantumAI model was trained using AutoTrain with the following configuration:
+
+ - Hardware: CUDA 12.1
+ - Training Precision: mixed FP16
+ - Batch Size: 2
+ - Learning Rate: 3e-05
+ - Epochs: 5
+ - Optimizer: AdamW
+ - PEFT: enabled (LoRA with lora_r=16, lora_alpha=32)
+ - Quantization: int4 for efficient deployment
+ - Scheduler: linear with warmup
+ - Gradient Accumulation: 4 steps
+ - Max Sequence Length: 2048 tokens
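A few numbers implied by this configuration can be checked with quick arithmetic. The hidden size of 4096 and the single square projection are illustrative assumptions (AutoTrain applies LoRA to several weight matrices, and Llama-3.1-8B also has non-square projections):

```python
# Effective batch size: micro-batches accumulated before each optimizer step
batch_size = 2
grad_accum_steps = 4
effective_batch = batch_size * grad_accum_steps  # 8 sequences per optimizer step

# Trainable-parameter fraction for LoRA on one hypothetical d x d projection
d = 4096                        # assumed hidden size for Llama-3.1-8B
r = 16                          # lora_r from the config above
full_params = d * d             # dense projection: 16,777,216 weights
lora_params = r * d + d * r     # A is r x d, B is d x r: 131,072 weights
fraction = lora_params / full_params  # 2r/d = 0.78% of the layer is trained
```

This is why LoRA fine-tuning fits on modest hardware: only the small A and B matrices receive gradients, while the base weights stay frozen.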
+ **Training Metrics**
+ The model was monitored using TensorBoard during training. Key training metrics included:
+
+ - Training Loss: 1.74
+ - Learning Rate: adjusted per epoch, starting at 3e-05
+
+ **Model Features**
+ - Text Generation: handles various types of user queries and provides coherent responses.
+ - Conversational AI: optimized for dialogue generation.
+ - Efficient Inference: supports int4 quantization for faster inference on limited hardware.
+
+ **License**
+ This model is governed by a custom license. Please refer to the QuantumAI License (based on the Llama 3.1 license).