Davidsv committed (verified)
Commit b0e19cd · 1 parent: cc033ff

Update README.md

Files changed (1):
  1. README.md (+68 −27)
README.md CHANGED
@@ -1,22 +1,49 @@
  ---
  base_model:
  - mistralai/Mistral-7B-v0.1
  - HuggingFaceH4/zephyr-7b-beta
  tags:
  - merge
  - mergekit
- - lazymergekit
- - mistralai/Mistral-7B-v0.1
- - HuggingFaceH4/zephyr-7b-beta
  ---

- # CosmeticVenture

- CosmeticVenture is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
- * [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
- * [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta)

- ## 🧩 Configuration

  ```yaml
  slices:
@@ -35,29 +62,43 @@ parameters:
  value: [1, 0.5, 0.7, 0.3, 0]
  - value: 0.5
  dtype: bfloat16
  ```

- ## 💻 Usage

- ```python
- !pip install -qU transformers accelerate
-
- from transformers import AutoTokenizer
- import transformers
- import torch
-
- model = "Davidsv/CosmeticVenture"
- messages = [{"role": "user", "content": "What is a large language model?"}]
-
- tokenizer = AutoTokenizer.from_pretrained(model)
- prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
- pipeline = transformers.pipeline(
-     "text-generation",
-     model=model,
-     torch_dtype=torch.float16,
-     device_map="auto",
- )
-
- outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
- print(outputs[0]["generated_text"])
- ```
 
  ---
+ license: apache-2.0
  base_model:
  - mistralai/Mistral-7B-v0.1
  - HuggingFaceH4/zephyr-7b-beta
  tags:
  - merge
  - mergekit
+ - mistral
+ - zephyr
+ - slerp
+ - cosmetic
+ - business
  ---

+ # CosmetiGuide-7B
+
+ CosmetiGuide-7B is a specialized model merge created with MergeKit. It combines the French-language expertise of Mistral-7B with the business acumen of Zephyr-7B through SLERP fusion with variable interpolation values, and is optimized for the cosmetic industry.
+
+ ## About Me
+
+ I'm David Soeiro-Vuong, a third-year Computer Science student at STS1 working on specialized language models. Passionate about artificial intelligence and its business applications, I focus on creating domain-specific model merges that enhance performance for targeted industry use cases.
+
+ 🔗 [Connect with me on LinkedIn](https://www.linkedin.com/in/david-soeiro-vuong-a28b582ba/)
+
+ ## Project Overview
+
+ CosmetiGuide-7B is a specialized LLM for the cosmetic industry, built to provide expert business guidance to entrepreneurs and professionals in the sector. It combines Mistral's strong French-language capabilities with Zephyr's business understanding to create a capable advisor for cosmetic business development.
+
+ ## Merge Details
+
+ ### Merge Method
+
+ This model uses SLERP (Spherical Linear Interpolation) with carefully calibrated parameters:
+
+ - **Attention layers**: variable interpolation values [0, 0.5, 0.3, 0.7, 1], leveraging Zephyr's instruction-following and business capabilities
+ - **MLP layers**: variable interpolation values [1, 0.5, 0.7, 0.3, 0], maintaining Mistral's French-language expertise and reasoning
+ - **Other parameters**: a 0.5 interpolation value, creating a balanced fusion
+ - **Format**: bfloat16 precision for efficient memory usage
+
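For intuition on the SLERP method named above, here is a minimal plain-Python sketch of spherical linear interpolation between two weight vectors. It is illustrative only, not MergeKit's actual implementation:

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation: blend v0 and v1 along the arc
    between their directions. t=0 returns v0, t=1 returns v1."""
    dot = sum(a * b for a, b in zip(v0, v1))
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    cos_omega = max(-1.0, min(1.0, dot / (norm0 * norm1)))
    omega = math.acos(cos_omega)      # angle between the two vectors
    if math.sin(omega) < eps:         # near-parallel: plain lerp is stable
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * omega) / math.sin(omega)
    s1 = math.sin(t * omega) / math.sin(omega)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# Halfway between two orthogonal unit vectors lands on the diagonal:
mid = slerp(0.5, [1.0, 0.0], [0.0, 1.0])   # ≈ [0.7071, 0.7071]
```

With a single scalar t this is the textbook formula; the bracketed lists in the configuration vary t per parameter group instead of using one global value.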
+ ### Models Merged
+
+ * [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) - a powerful French-origin base model with excellent multilingual capabilities
+ * [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) - an instruction-tuned model with strong business reasoning capabilities
+
+ ### Configuration
+
  ```yaml
  slices:

  value: [1, 0.5, 0.7, 0.3, 0]
  - value: 0.5
  dtype: bfloat16
+ MODEL_NAME: "CosmetiGuide-7B"
  ```
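A note on how I read the bracketed value lists: on my understanding of MergeKit's gradient notation (worth verifying against the MergeKit documentation), a list such as [0, 0.5, 0.3, 0.7, 1] gives anchor points that are interpolated linearly across the layers of the slice, so each layer gets its own blend weight. The helper below is a hypothetical sketch of that expansion, not MergeKit code:

```python
def expand_gradient(anchors, num_layers):
    """Spread a short list of anchor values across num_layers layers by
    linear interpolation, yielding one blend weight per layer."""
    if num_layers == 1:
        return [float(anchors[0])]
    weights = []
    for layer in range(num_layers):
        # Map this layer to a fractional position along the anchor list.
        pos = layer / (num_layers - 1) * (len(anchors) - 1)
        lo = min(int(pos), len(anchors) - 2)
        frac = pos - lo
        weights.append(anchors[lo] * (1 - frac) + anchors[lo + 1] * frac)
    return weights

# Attention gradient from the config, spread over 9 layers; the
# endpoints hit the outer anchors exactly (0 and 1).
attn = expand_gradient([0, 0.5, 0.3, 0.7, 1], 9)
```

Under this reading, early attention layers stay close to one parent model and late layers close to the other, which is what the "variable interpolation" bullets above describe.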

+ ## Model Capabilities
+
+ This specialized merge combines:
+
+ - Mistral's excellent French-language understanding and multilingual capabilities
+ - Zephyr's instruction-following and business reasoning abilities
+ - Domain adaptation for the cosmetic industry through strategic parameter fusion
+
+ The resulting model is optimized for tasks in the cosmetic business sector, such as:
+
+ - Regulatory guidance for cosmetic products in various markets
+ - Business planning and market analysis for cosmetic startups
+ - Ingredient formulation and technical documentation assistance
+ - Marketing strategy and brand positioning advice
+ - Trend analysis and innovation forecasting for beauty products
+
+ ## Future Development
+
+ This interim model will be further enhanced once access to Meta-Llama-3-8B is granted, providing even stronger business reasoning capabilities. The model will also be fine-tuned on a specialized dataset consisting of:
+
+ - Regulatory documentation (EU regulations, BPF, ANSM guidelines)
+ - Market research and industry analyses (FEBEA reports, segment studies)
+ - Technical documentation (ingredient specifications, manufacturing processes)
+ - Business resources (business plans, pricing strategies, case studies)
+ - Industry trends (clean beauty, solid cosmetics, circular economy)
+ - Marketing strategies (brand positioning, digital strategies)
+
+ ## Limitations
+
+ - Limited domain-specific training beyond parameter merging
+ - May require additional fine-tuning for highly specialized cosmetic-industry tasks
+ - General limitations inherited from the base 7B-parameter models
+ - Pending enhancement with Meta-Llama-3-8B once access is granted
+
+ ## License
+
+ This model is released under the Apache 2.0 license, consistent with the licenses of the underlying models.