---
license: mit
datasets:
- mengmeong/coding-skill-real-world-needs
language:
- en
base_model: nisten/Biggie-SmoLlm-0.15B-Base
pipeline_tag: text-generation
inference:
  parameters:
    model_file: meng-coding-skill.gguf
    temperature: 1
---
# Programming Skills Learning Path Model

This model is a fine-tuned version of `nisten/Biggie-SmoLlm-0.15B-Base`, designed to generate a learning path for a skill from input text. It's particularly useful for identifying emerging trends and skill combinations in the rapidly evolving tech landscape.

## Usage & Limitations

![llama.cpp demo](meng-cli.gif)

The model is intended for:
 - Deployment on limited CPU resources, averaging about 40 tokens per second (tps) on 1 CPU core

The model has limitations:
 - The dataset might not capture the very latest tool developments in the programming world
 - The model is not suited to chatbot use cases
 - The model only returns its response as a JSON list

Please note that this model was trained on a custom dataset and may reflect biases present in that data.
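
Because the response comes back as a JSON list, downstream code usually needs to pull that list out of the raw generated text. The following is a minimal sketch of one way to do that with the Python standard library. The function name `parse_skill_path` and the sample string are hypothetical, and the exact shape of the list items is not documented here, so treat the example as an assumption about the output rather than a description of it.

```python
import json


def parse_skill_path(raw_output: str) -> list:
    """Return the first JSON list found in the raw model output.

    Hypothetical helper: scans for '[' and tries to decode a JSON value
    starting there, so surrounding non-JSON text is ignored.
    """
    decoder = json.JSONDecoder()
    start = raw_output.find("[")
    while start != -1:
        try:
            value, _end = decoder.raw_decode(raw_output, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, list):
            return value
        # Not a valid list at this position; try the next '['.
        start = raw_output.find("[", start + 1)
    raise ValueError("no JSON list found in model output")


if __name__ == "__main__":
    # Made-up sample text, not real model output.
    sample = 'result: ["HTML", "CSS", "JavaScript"]'
    print(parse_skill_path(sample))
```

Raising `ValueError` when no list is found lets the caller decide whether to retry generation or fall back to a default.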

### Training Hyperparameters

- **Batch Size:** 4
- **Optimizer:** Experimental GrokAdamW

## Training Metrics

![Eval Loss](eval_loss.png)
![Eval Runtime](eval_runtime.png)
![Eval Samples per Second](eval_sample_per_secs.png)
![Eval Steps per Second](eval_sps.png)
![Loss on Train](train_loss.png)