Improve language tag

#2
by lbourdois - opened
Files changed (1)
  1. README.md +107 -95
README.md CHANGED
@@ -1,96 +1,108 @@
- ---
- license: creativeml-openrail-m
- datasets:
- - prithivMLmods/Prompt-Enhancement-Mini
- - gokaygokay/prompt-enhancement-75k
- - gokaygokay/prompt-enhancer-dataset
- language:
- - en
- base_model:
- - Qwen/Qwen2.5-7B-Instruct
- pipeline_tag: text-generation
- library_name: transformers
- tags:
- - Qwen2.5
- - Prompt_Enhance
- - 7B
- - Instruct
- - safetensors
- - pytorch
- - Promptist-Instruct
- - text-generation-inference
- - art
- ---
-
- ### Novaeus-Promptist-7B-Instruct Uploaded Model Files
-
- The **Novaeus-Promptist-7B-Instruct** is a fine-tuned large language model derived from the **Qwen2.5-7B-Instruct** base model. It is optimized for **prompt enhancement, text generation**, and **instruction-following tasks**, providing high-quality outputs tailored to various applications.
-
- | **File Name [ Uploaded Files ]** | **Size** | **Description** | **Upload Status** |
- |--------------------------------------------|---------------|------------------------------------------|-------------------|
- | `.gitattributes` | 1.57 kB | Git attributes configuration for LFS. | Uploaded |
- | `README.md` | 400 Bytes | Documentation about the model. | Updated |
- | `added_tokens.json` | 657 Bytes | Custom tokens for tokenizer. | Uploaded |
- | `config.json` | 860 Bytes | Configuration for the model. | Uploaded |
- | `generation_config.json` | 281 Bytes | Configuration for text generation. | Uploaded |
- | `merges.txt` | 1.82 MB | Byte-pair encoding (BPE) merge rules. | Uploaded |
- | `pytorch_model-00001-of-00004.bin` | 4.88 GB | Model weights (split part 1). | Uploaded (LFS) |
- | `pytorch_model-00002-of-00004.bin` | 4.93 GB | Model weights (split part 2). | Uploaded (LFS) |
- | `pytorch_model-00003-of-00004.bin` | 4.33 GB | Model weights (split part 3). | Uploaded (LFS) |
- | `pytorch_model-00004-of-00004.bin` | 1.09 GB | Model weights (split part 4). | Uploaded (LFS) |
- | `pytorch_model.bin.index.json` | 28.1 kB | Index file for model weights. | Uploaded |
- | `special_tokens_map.json` | 644 Bytes | Map of special tokens for tokenizer. | Uploaded |
- | `tokenizer.json` | 11.4 MB | Tokenizer data in JSON format. | Uploaded (LFS) |
- | `tokenizer_config.json` | 7.73 kB | Tokenizer configuration file. | Uploaded |
- | `vocab.json` | 2.78 MB | Vocabulary for tokenizer. | Uploaded |
-
- ---
- ![Screenshot 2024-12-07 113150.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/pqFaT-78hssi106bfJwpN.png)
- ### **Key Features:**
-
- 1. **Prompt Refinement:**
- Designed to enhance input prompts by rephrasing, clarifying, and optimizing for more precise outcomes.
-
- 2. **Instruction Following:**
- Accurately follows complex user instructions for various generation tasks, including creative writing, summarization, and question answering.
-
- 3. **Customization and Fine-Tuning:**
- Incorporates datasets specifically curated for prompt optimization, enabling seamless adaptation to specific user needs.
-
- ---
- ### **Training Details:**
- - **Base Model:** [Qwen2.5-7B-Instruct](#)
- - **Datasets Used for Fine-Tuning:**
- - **gokaygokay/prompt-enhancer-dataset:** Focuses on prompt engineering with 17.9k samples.
- - **gokaygokay/prompt-enhancement-75k:** Encompasses a wider array of prompt styles with 73.2k samples.
- - **prithivMLmods/Prompt-Enhancement-Mini:** A compact dataset (1.16k samples) for iterative refinement.
-
- ---
- ### **Capabilities:**
-
- - **Prompt Optimization:**
- Automatically refines and enhances user-input prompts for better generation results.
-
- - **Instruction-Based Text Generation:**
- Supports diverse tasks, including:
- - Creative writing (stories, poems, scripts).
- - Summaries and paraphrasing.
- - Custom Q&A systems.
-
- - **Efficient Fine-Tuning:**
- Adaptable to additional fine-tuning tasks by leveraging the model's existing high-quality instruction-following capabilities.
-
- ---
-
- ### **Usage Instructions:**
-
- 1. **Setup:**
- - Ensure all necessary model files, including shards, tokenizer configurations, and index files, are downloaded and placed in the correct directory.
-
- 2. **Load Model:**
- Use PyTorch or Hugging Face Transformers to load the model and tokenizer. Ensure `pytorch_model.bin.index.json` is correctly set for efficient shard-based loading.
-
- 3. **Customize Generation:**
- Adjust parameters in `generation_config.json` to control aspects such as temperature, top-p sampling, and maximum sequence length.
-
+ ---
+ license: creativeml-openrail-m
+ datasets:
+ - prithivMLmods/Prompt-Enhancement-Mini
+ - gokaygokay/prompt-enhancement-75k
+ - gokaygokay/prompt-enhancer-dataset
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ base_model:
+ - Qwen/Qwen2.5-7B-Instruct
+ pipeline_tag: text-generation
+ library_name: transformers
+ tags:
+ - Qwen2.5
+ - Prompt_Enhance
+ - 7B
+ - Instruct
+ - safetensors
+ - pytorch
+ - Promptist-Instruct
+ - text-generation-inference
+ - art
+ ---
+
+ ### Novaeus-Promptist-7B-Instruct Uploaded Model Files
+
+ The **Novaeus-Promptist-7B-Instruct** is a fine-tuned large language model derived from the **Qwen2.5-7B-Instruct** base model. It is optimized for **prompt enhancement, text generation**, and **instruction-following tasks**, providing high-quality outputs tailored to various applications.
+
+ | **File Name [ Uploaded Files ]** | **Size** | **Description** | **Upload Status** |
+ |--------------------------------------------|---------------|------------------------------------------|-------------------|
+ | `.gitattributes` | 1.57 kB | Git attributes configuration for LFS. | Uploaded |
+ | `README.md` | 400 Bytes | Documentation about the model. | Updated |
+ | `added_tokens.json` | 657 Bytes | Custom tokens for tokenizer. | Uploaded |
+ | `config.json` | 860 Bytes | Configuration for the model. | Uploaded |
+ | `generation_config.json` | 281 Bytes | Configuration for text generation. | Uploaded |
+ | `merges.txt` | 1.82 MB | Byte-pair encoding (BPE) merge rules. | Uploaded |
+ | `pytorch_model-00001-of-00004.bin` | 4.88 GB | Model weights (split part 1). | Uploaded (LFS) |
+ | `pytorch_model-00002-of-00004.bin` | 4.93 GB | Model weights (split part 2). | Uploaded (LFS) |
+ | `pytorch_model-00003-of-00004.bin` | 4.33 GB | Model weights (split part 3). | Uploaded (LFS) |
+ | `pytorch_model-00004-of-00004.bin` | 1.09 GB | Model weights (split part 4). | Uploaded (LFS) |
+ | `pytorch_model.bin.index.json` | 28.1 kB | Index file for model weights. | Uploaded |
+ | `special_tokens_map.json` | 644 Bytes | Map of special tokens for tokenizer. | Uploaded |
+ | `tokenizer.json` | 11.4 MB | Tokenizer data in JSON format. | Uploaded (LFS) |
+ | `tokenizer_config.json` | 7.73 kB | Tokenizer configuration file. | Uploaded |
+ | `vocab.json` | 2.78 MB | Vocabulary for tokenizer. | Uploaded |
+
+ ---
+ ![Screenshot 2024-12-07 113150.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/pqFaT-78hssi106bfJwpN.png)
+ ### **Key Features:**
+
+ 1. **Prompt Refinement:**
+ Designed to enhance input prompts by rephrasing, clarifying, and optimizing for more precise outcomes.
+
+ 2. **Instruction Following:**
+ Accurately follows complex user instructions for various generation tasks, including creative writing, summarization, and question answering.
+
+ 3. **Customization and Fine-Tuning:**
+ Incorporates datasets specifically curated for prompt optimization, enabling seamless adaptation to specific user needs.
+
+ ---
+ ### **Training Details:**
+ - **Base Model:** [Qwen2.5-7B-Instruct](#)
+ - **Datasets Used for Fine-Tuning:**
+ - **gokaygokay/prompt-enhancer-dataset:** Focuses on prompt engineering with 17.9k samples.
+ - **gokaygokay/prompt-enhancement-75k:** Encompasses a wider array of prompt styles with 73.2k samples.
+ - **prithivMLmods/Prompt-Enhancement-Mini:** A compact dataset (1.16k samples) for iterative refinement.
+
+ ---
+ ### **Capabilities:**
+
+ - **Prompt Optimization:**
+ Automatically refines and enhances user-input prompts for better generation results.
+
+ - **Instruction-Based Text Generation:**
+ Supports diverse tasks, including:
+ - Creative writing (stories, poems, scripts).
+ - Summaries and paraphrasing.
+ - Custom Q&A systems.
+
+ - **Efficient Fine-Tuning:**
+ Adaptable to additional fine-tuning tasks by leveraging the model's existing high-quality instruction-following capabilities.
+
+ ---
+
+ ### **Usage Instructions:**
+
+ 1. **Setup:**
+ - Ensure all necessary model files, including shards, tokenizer configurations, and index files, are downloaded and placed in the correct directory.
+
+ 2. **Load Model:**
+ Use PyTorch or Hugging Face Transformers to load the model and tokenizer. Ensure `pytorch_model.bin.index.json` is correctly set for efficient shard-based loading.
+
+ 3. **Customize Generation:**
+ Adjust parameters in `generation_config.json` to control aspects such as temperature, top-p sampling, and maximum sequence length.
+
  ---
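
The only substantive change in this diff is the `language:` list, which goes from the single tag `en` to thirteen three-letter codes. A minimal sketch of how one might sanity-check that block before merging is below. It is not part of the PR: the helper names `read_list` and `looks_like_iso639_3` are hypothetical, the front matter is an abridged copy of the `+` side above, and the check only tests the three-lowercase-letter shape of ISO 639-3 codes rather than looking them up in the registry.

```python
import re

# Abridged copy of the new front matter from the "+" side of the diff
# (the datasets and tags lists are omitted for brevity).
FRONT_MATTER = """\
---
license: creativeml-openrail-m
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
base_model:
- Qwen/Qwen2.5-7B-Instruct
pipeline_tag: text-generation
---
"""


def read_list(front_matter: str, key: str) -> list[str]:
    """Collect the '- item' lines that directly follow 'key:'.

    Hypothetical helper: handles only the flat list style used in this
    README, not general YAML.
    """
    items = []
    collecting = False
    for line in front_matter.splitlines():
        stripped = line.strip()
        if stripped == f"{key}:":
            collecting = True
        elif collecting and stripped.startswith("- "):
            items.append(stripped[2:].strip())
        elif collecting:
            break  # the list ends at the first non-item line
    return items


def looks_like_iso639_3(code: str) -> bool:
    """Shape check only: exactly three lowercase ASCII letters."""
    return re.fullmatch(r"[a-z]{3}", code) is not None


languages = read_list(FRONT_MATTER, "language")
bad = [code for code in languages if not looks_like_iso639_3(code)]
print(len(languages), "language tags;", "malformed:", bad)
```

Run against the old front matter, the same check would flag `en`, since it is a two-letter code. A stricter version could compare each tag against the published ISO 639-3 code table instead of relying on the pattern.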