---
license: apache-2.0
library_name: transformers
tags:
- text-generation
- sequential-fine-tuning
- lora
- multi-dataset-model
pipeline_tag: text-generation
widget:
- text: "Once upon a time, in a land far away,"
---
# Sequentially Fine-Tuned Language Model: jnjj/xd_v1
## Model Description
This repository hosts a language model that is **sequentially fine-tuned** with Low-Rank Adaptation (LoRA) on a diverse range of datasets from the Hugging Face Hub.
Each cycle starts from the base model `jnjj/multi-dataset-model` (or the last fine-tuned state stored in this repository), trains a LoRA adapter on one dataset, and merges the adapter weights back into the model.
The experiment aims to build a model with broad, cumulatively acquired knowledge.
**Current Base for Fine-Tuning:** [jnjj/multi-dataset-model](https://huggingface.co/jnjj/multi-dataset-model)
The fully merged model weights and tokenizer are updated periodically at the root of this repository.
## Training Methodology
- **Iterative Fine-Tuning:** The model undergoes cycles of training on different dataset configurations.
- **LoRA Integration:** PEFT's LoRA is employed for parameter-efficient fine-tuning. Adapters are merged post-training.
- **Dynamic Dataset Source:** The script iterates through a wide array of Hugging Face Hub datasets.
- **Rapid Iteration Strategy:** Training per dataset configuration is brief (`max_steps=1`), prioritizing breadth of exposure over depth on any single dataset.
## Training Progress
- **Datasets Processed (Successfully trained on at least one config):** 1
- **Text Examples Streamed (Total):** 6
- **Tokens Processed (Total):** 3072
- **Last Successful Model Update:** 2025-05-08 18:02:08 UTC
### Evaluation Snapshot (Approximate)

- **Current Perplexity (wikitext Subset):** 282.70
- **Perplexity Change:** `-0.51` ⬇️ (vs. the previous cycle)
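For context, perplexity is the exponential of the mean per-token negative log-likelihood under the model. A minimal, self-contained illustration (the token log-probabilities here are made up, not taken from the evaluation run):

```python
import math

def perplexity(token_log_probs):
    """exp of the average negative log-likelihood per token."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# A model that assigned uniform probability over a 283-token vocabulary
# would score log(1/283) per token, giving perplexity exactly 283 --
# close to the ~282.7 reported above.
uniform = [math.log(1 / 283)] * 10
print(round(perplexity(uniform), 1))  # -> 283.0
```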

#### Generated Examples (Qualitative Assessment)

| Category                   | Input Prompt Snippet                   | Generated Output Snippet                     |
|----------------------------|----------------------------------------|----------------------------------------------|
| Story Continuation         | `Once upon a time, in a small villag...` | `How do I get the best picture of what we... ` |
| Simple Instruction         | `Explain in one sentence why trees a...` | `I have been trying to make progress and ... ` |
| Creative Prompt            | `Describe a friendly robot that love...` | `We are pleased to announce the launch of... ` |
| Question Answering (Basic) | `What is the main color of a ripe ba...` | `As an example we've been using the same ... ` |
| Code Generation (Simple Python) | `Write a Python function that takes ...` | `We are looking forward to seeing us in t... ` |
| Reasoning (Simple)         | `If a train leaves station A at 10:0...` | `The time of day we were trying to get ou... ` |

#### Standard Benchmarks (via `lighteval`)
_Note: Running standard benchmarks requires a dedicated setup using the `lighteval` harness. The table below shows scores if available in `evaluation_stats.json`, otherwise `N/A`._

**Common Benchmarks**
| Category              | Benchmark         | # Shots | Metric         | This Model (`xd_v1`) | Llama 3.1 70B (Ref) |
|-----------------------|-------------------|---------|----------------|----------------------|---------------------|
| Reasoning & Knowledge | MMLU (Avg)        | 5       | acc_norm       | `N/A`                | 79.3                |
| Reasoning & Knowledge | MMLU-Pro          | 5       | acc            | `N/A`                | 53.8                |
| Reasoning & Knowledge | MATH              | 4       | acc            | `N/A`                | 41.6                |
| Reasoning & Knowledge | TruthfulQA (MC2)  | 0       | mc2            | `N/A`                | -                   |
| Reasoning & Knowledge | GPQA Diamond      | 0       | acc            | `N/A`                | 50.5                |
| Code                  | MBPP              | 3       | pass@1         | `N/A`                | 66.4                |
| Code                  | LiveCodeBench     | 0       | pass@1         | `N/A`                | 33.3                |
| Multilingual          | TydiQA            | 1       | f1             | `N/A`                | 29.9                |
| Multilingual          | MGSM              | 0       | acc            | `N/A`                | 91.1                |

## How to Use
Load the model and tokenizer via `transformers`:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jnjj/xd_v1"
# For local usage after downloading:
# model_id = "./model_files"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
# model.to("cuda")  # uncomment if a GPU is available

prompt = "Explain the concept of photosynthesis in simple terms:"
inputs = tokenizer(prompt, return_tensors="pt")  # .to("cuda") if the model is on GPU

output_sequences = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,          # sample rather than greedy-decode
    temperature=0.7,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,  # avoids a missing-pad-token warning
)
print(tokenizer.decode(output_sequences[0], skip_special_tokens=True))
```
## Limitations & Considerations
- This model is an experimental artifact of continuous learning; quality and coherence may vary.
- Biases present in the underlying datasets may be reflected or amplified.
- Performance on specific tasks is not guaranteed and may fluctuate as new datasets are incorporated.
- Intended for research and exploration of sequential fine-tuning dynamics. For rigorous benchmarking, consider using tools like [`lighteval`](https://github.com/huggingface/lighteval).
## Disclaimer
This model is provided as-is. It may generate inaccurate, biased, or otherwise problematic content. Users should exercise discretion.