End of training

Changed files:

- README.md +20 -87
- TESTE_RAPIDO.md +55 -0
- config.json +45 -0
- generation_config.json +6 -0
- merges.txt +0 -0
- model.safetensors +3 -0
- runs/Aug18_14-40-34_estudio-6b/events.out.tfevents.1755538840.estudio-6b.2677472.0 +3 -0
- runs/Aug18_14-41-19_estudio-6b/events.out.tfevents.1755538887.estudio-6b.2677472.1 +3 -0
- runs/Aug18_14-42-45_estudio-6b/events.out.tfevents.1755538967.estudio-6b.2677472.2 +3 -0
- runs/Aug18_15-34-14_estudio-6b/events.out.tfevents.1755542059.estudio-6b.2677472.3 +3 -0
- runs/Aug18_15-34-44_estudio-6b/events.out.tfevents.1755542085.estudio-6b.2677472.4 +3 -0
- runs/Aug18_15-35-06_estudio-6b/events.out.tfevents.1755542107.estudio-6b.2677472.5 +3 -0
- runs/Aug18_15-35-09_estudio-6b/events.out.tfevents.1755542110.estudio-6b.2677472.6 +3 -0
- runs/Aug18_15-35-11_estudio-6b/events.out.tfevents.1755542112.estudio-6b.2677472.7 +3 -0
- runs/Aug18_15-35-14_estudio-6b/events.out.tfevents.1755542114.estudio-6b.2677472.8 +3 -0
- runs/Aug18_15-35-16_estudio-6b/events.out.tfevents.1755542116.estudio-6b.2677472.9 +3 -0
- runs/Aug18_15-35-18_estudio-6b/events.out.tfevents.1755542118.estudio-6b.2677472.10 +3 -0
- runs/Aug18_15-35-20_estudio-6b/events.out.tfevents.1755542121.estudio-6b.2677472.11 +3 -0
- runs/Aug18_15-35-22_estudio-6b/events.out.tfevents.1755542123.estudio-6b.2677472.12 +3 -0
- runs/Aug18_15-35-28_estudio-6b/events.out.tfevents.1755542129.estudio-6b.2677472.13 +3 -0
- runs/Aug18_15-35-30_estudio-6b/events.out.tfevents.1755542132.estudio-6b.2677472.14 +3 -0
- runs/Aug18_15-35-48_estudio-6b/events.out.tfevents.1755542149.estudio-6b.2677472.15 +3 -0
- runs/Aug18_15-35-51_estudio-6b/events.out.tfevents.1755542152.estudio-6b.2677472.16 +3 -0
- runs/Aug18_15-35-55_estudio-6b/events.out.tfevents.1755542155.estudio-6b.2677472.17 +3 -0
- runs/Aug18_15-40-59_estudio-6b/events.out.tfevents.1755542461.estudio-6b.2677472.18 +3 -0
- runs/Aug18_15-40-59_estudio-6b/events.out.tfevents.1755549261.estudio-6b.2677472.19 +3 -0
- special_tokens_map.json +6 -0
- test_inference.py +92 -0
- tokenizer.json +0 -0
- tokenizer_config.json +21 -0
- training_args.bin +3 -0
- vocab.json +0 -0
README.md
CHANGED

@@ -1,97 +1,49 @@ and @@ -99,29 +51,10 @@

Removed (the previous hand-written card, originally in Portuguese):

---
license: cc-by-4.0
language: pt
library_name: transformers
base_model: distilbert/distilgpt2
tags:
- generated_from_trainer
model-index:
- name: eli5_clm-model
  results: []
---

Causal Language Model (CLM) fine-tuned from [distilbert/distilgpt2](https://huggingface.co/distilbert/distilgpt2).

https://huggingface.co/docs/transformers/tasks/language_modeling#causal-language-modeling

- Loss: 3.8254

##

A CLM learns to predict the next token from the preceding context, which makes it suitable for auto-regressive text generation. Here we use DistilGPT-2 as the base model and fine-tune it on a local dataset (not specified in this card). The goal is to adapt the model to the target domain/style.

## Intended uses and limitations

- Text generation conditioned on a prompt.
- Completing sentences or paragraphs in Portuguese/English (depending on the training data).
- It is not a fact checker; it can hallucinate content.
- Avoid use in sensitive scenarios without human validation.

## Quick test (command line)

1) Create/activate a Python environment and install the minimal dependencies:
   - transformers, torch, accelerate, safetensors
2) Run the `test_inference.py` script (provided in this folder):

```bash
python test_inference.py \
  --model_dir . \
  --prompt "Explique em termos simples o que é aprendizado de máquina." \
  --max_new_tokens 80
```

- `--temperature` (creativity control, e.g. 0.7)
- `--top_p` (nucleus sampling, e.g. 0.9)
- `--seed` (reproducibility)

##

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir)

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=80,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
    )
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

##

- Task: causal language modeling (next token).
- Note: for full reproducibility, record and publish the data provenance whenever possible.

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: ADAMW_TORCH_FUSED
- lr_scheduler_type: linear
- num_epochs: 3.0

###

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.8243        | 2.0   | 2622 | 3.8266          |
| 3.7832        | 3.0   | 3933 | 3.8254          |

- Transformers 4.55.1
- Pytorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4

## Reproducing the training

The fine-tuning followed the official Transformers CLM guide (link above), using `Trainer` with `AutoModelForCausalLM` and `AutoTokenizer`. To reproduce:
1) Prepare the dataset as plain text (one example per line works well).
2) Tokenize with the base model's tokenizer.
3) Train with the hyperparameters above, saving checkpoints to this folder.

## Folder structure

- `config.json`, `tokenizer.json`, `tokenizer_config.json`, `vocab.json`, `merges.txt`: model/tokenizer artifacts.
- `model.safetensors`, `generation_config.json`: weights and generation config.
- `checkpoint-*`: training checkpoints.
- `runs/`: training logs (e.g. TensorBoard).
- `test_inference.py`: CLI test script.
- `TESTE_RAPIDO.md`: quick-start guide.

## Notice

This model can produce inaccurate or biased output. Evaluate and filter its output according to the intended use.

Added (the auto-generated Trainer card):

---
library_name: transformers
license: apache-2.0
base_model: distilbert/distilgpt2
tags:
- generated_from_trainer
model-index:
- name: eli5_clm-model
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# eli5_clm-model

This model is a fine-tuned version of [distilbert/distilgpt2](https://huggingface.co/distilbert/distilgpt2) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 3.8254

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 3.0

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.8243        | 2.0   | 2622 | 3.8266          |
| 3.7832        | 3.0   | 3933 | 3.8254          |

### Framework versions

- Transformers 4.55.1
- Pytorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4
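The validation loss of 3.8254 reported in the card above is the mean cross-entropy per token, so exponentiating it gives the model's perplexity on the evaluation set. A minimal sketch of that conversion (only the loss value is taken from the card; the rest is standard math):

```python
import math

# Validation loss from the model card (mean cross-entropy per token, in nats).
val_loss = 3.8254

# Perplexity is the exponential of the cross-entropy loss.
perplexity = math.exp(val_loss)

# About 45.8: on average the model is as uncertain as a uniform
# choice over roughly 46 tokens at each step.
print(f"perplexity ≈ {perplexity:.1f}")
```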
TESTE_RAPIDO.md
ADDED

@@ -0,0 +1,55 @@

# Quick Test – eli5_clm-model

This guide shows how to run a quick inference against the model in `eli5_clm-model/`.

## Requirements

- Python 3.9+
- Packages:
  - transformers
  - torch
  - accelerate
  - safetensors

Installation (venv example):

```bash
python -m venv .venv
source .venv/bin/activate   # Linux/macOS
# .venv\Scripts\activate    # Windows (PowerShell)
pip install --upgrade pip
pip install transformers torch accelerate safetensors
```

Alternatively you can use the `huggin-face/causal-language-model/requirements.txt` file, but for inference the packages above are enough.

## Running inference

From inside the `eli5_clm-model/` folder, run:

```bash
python test_inference.py \
  --model_dir . \
  --prompt "Explique em termos simples o que é aprendizado de máquina." \
  --max_new_tokens 80 \
  --temperature 0.7 \
  --top_p 0.9
```

Expected output: a text that continues the given prompt.

Useful parameters:

- `--max_new_tokens`: maximum number of tokens to generate.
- `--temperature`: controls randomness (0.7 is a good starting point).
- `--top_p`: nucleus sampling (0.9 is common).
- `--seed`: fixes the randomness so results can be reproduced.
- `--device`: `auto` (default), `cpu`, or `cuda`.

## Tips

- If a CUDA GPU is available the script will use it automatically, unless `--device cpu` is given.
- For more deterministic results, use `--seed 42` (or another fixed value) and disable sampling (`--do_sample false`).
- For long prompts, increase `max_new_tokens` carefully to avoid overly long responses.

## Reference

The model was trained following the Transformers Causal Language Modeling tutorial:
https://huggingface.co/docs/transformers/tasks/language_modeling#causal-language-modeling
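The `--temperature` and `--top_p` flags described above can be illustrated without loading the model: temperature rescales the logits before the softmax, and top-p keeps only the smallest set of tokens whose cumulative probability reaches `p`. A self-contained sketch with toy logits (not the model's real distribution):

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities, dividing by the temperature first."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of indices whose cumulative probability >= top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept

logits = [2.0, 1.0, 0.5, -1.0]                         # toy next-token scores
cool = softmax_with_temperature(logits, temperature=0.7)
hot = softmax_with_temperature(logits, temperature=1.5)

print(top_p_filter(cool, 0.9))  # [0, 1]    — low temperature concentrates mass
print(top_p_filter(hot, 0.9))   # [0, 1, 2] — high temperature spreads it out
```

Lower temperature sharpens the distribution, so the same `top_p` cutoff admits fewer candidate tokens; this is why `--temperature 0.7 --top_p 0.9` produces more focused text than the defaults.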
config.json
ADDED

@@ -0,0 +1,45 @@

```json
{
  "_num_labels": 1,
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 50256,
  "embd_pdrop": 0.1,
  "eos_token_id": 50256,
  "id2label": {
    "0": "LABEL_0"
  },
  "initializer_range": 0.02,
  "label2id": {
    "LABEL_0": 0
  },
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_inner": null,
  "n_layer": 6,
  "n_positions": 1024,
  "reorder_and_upcast_attn": false,
  "resid_pdrop": 0.1,
  "scale_attn_by_inverse_layer_idx": false,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50
    }
  },
  "torch_dtype": "float32",
  "transformers_version": "4.55.1",
  "use_cache": true,
  "vocab_size": 50257
}
```
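The sizes in the config above pin down the model's parameter count. A sketch of the arithmetic, assuming the standard GPT-2 block layout (two LayerNorms, fused QKV attention, 4x MLP) with the LM head tied to the token embeddings; the float32 total should land within a few KB of the 327,657,928-byte `model.safetensors` listed below, the small difference being the safetensors header:

```python
# Values taken from config.json above.
n_layer, n_embd, n_ctx, vocab = 6, 768, 1024, 50257

wte = vocab * n_embd     # token embeddings (tied with the LM head, so counted once)
wpe = n_ctx * n_embd     # learned position embeddings

# One transformer block: 2 LayerNorms, attention (c_attn, c_proj), MLP (c_fc, c_proj).
ln = 2 * n_embd                                            # weight + bias
attn = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
block = 2 * ln + attn + mlp

total = wte + wpe + n_layer * block + ln                   # + final LayerNorm
print(total)        # 81912576 parameters (~82M, DistilGPT-2's size)
print(total * 4)    # ~327.65 MB in float32, matching model.safetensors
```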
generation_config.json
ADDED

@@ -0,0 +1,6 @@

```json
{
  "_from_model_config": true,
  "bos_token_id": 50256,
  "eos_token_id": 50256,
  "transformers_version": "4.55.1"
}
```
merges.txt
ADDED

The diff for this file is too large to render. See raw diff.
model.safetensors
ADDED

@@ -0,0 +1,3 @@

version https://git-lfs.github.com/spec/v1
oid sha256:71a3a5892221cf87ec69583ec8590a40f910f2e6c9e8d4f04c380e0889b8b599
size 327657928
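The three lines above are a Git LFS pointer, not the weights themselves: `oid` is the SHA-256 of the real file and `size` its byte count. A small sketch that parses this pointer format (pointer text copied verbatim from above):

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict of its 'key value' lines."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:71a3a5892221cf87ec69583ec8590a40f910f2e6c9e8d4f04c380e0889b8b599
size 327657928"""

info = parse_lfs_pointer(pointer)
print(info["oid"])                    # sha256 of the actual weights file
print(int(info["size"]) / 1e6, "MB")  # ~327.7 MB of float32 weights
```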
runs/* (TensorBoard event files)
ADDED

Each file below is a new 3-line Git LFS pointer (`@@ -0,0 +1,3 @@`, `version https://git-lfs.github.com/spec/v1`):

| File | oid sha256 | Size (bytes) |
|------|------------|--------------|
| runs/Aug18_14-40-34_estudio-6b/events.out.tfevents.1755538840.estudio-6b.2677472.0 | 0d2ffe968be2081242c1deb76850443a96c0d62f453506cf834196aff79c327a | 5291 |
| runs/Aug18_14-41-19_estudio-6b/events.out.tfevents.1755538887.estudio-6b.2677472.1 | b79b7a30b416e03298a4e6a7a50f946258eba8559cfac5eea232ce803b209843 | 5291 |
| runs/Aug18_14-42-45_estudio-6b/events.out.tfevents.1755538967.estudio-6b.2677472.2 | 7c0d3a5f44a6cd7e6c746167dd38a10dc740c40359d80e1eb8c60240bdc3446c | 5291 |
| runs/Aug18_15-34-14_estudio-6b/events.out.tfevents.1755542059.estudio-6b.2677472.3 | 2483e19c0209be95712673b6f96f13f4153fe4b692b21939a44c07a6ce614e88 | 5291 |
| runs/Aug18_15-34-44_estudio-6b/events.out.tfevents.1755542085.estudio-6b.2677472.4 | a03d0db9e9efecf74a032110d131f9cffcde69f432c8fb0519eadef6f0acf266 | 5291 |
| runs/Aug18_15-35-06_estudio-6b/events.out.tfevents.1755542107.estudio-6b.2677472.5 | 74595dba73a0bb4a2b62eb6782332f8c392eb29e03f29ff443f93ad8069ed6ba | 5291 |
| runs/Aug18_15-35-09_estudio-6b/events.out.tfevents.1755542110.estudio-6b.2677472.6 | 3623d332d40d156daf3ea57514264e0c9f979432cf72aa6f0e416398c3397b75 | 5291 |
| runs/Aug18_15-35-11_estudio-6b/events.out.tfevents.1755542112.estudio-6b.2677472.7 | 56082a3f64801a1a4a1af2d99072eedb4739ec3be5116f73b289870e6da1f094 | 5291 |
| runs/Aug18_15-35-14_estudio-6b/events.out.tfevents.1755542114.estudio-6b.2677472.8 | 11c5ad4478836f800eb4c64c21e0dcabe3e2f18dc2e1f3bb292a8914d1bca584 | 5291 |
| runs/Aug18_15-35-16_estudio-6b/events.out.tfevents.1755542116.estudio-6b.2677472.9 | b0b31439e498a9e140fa7eff109d01d086c5c4dc5a76dfc4aa47d649b376a002 | 5291 |
| runs/Aug18_15-35-18_estudio-6b/events.out.tfevents.1755542118.estudio-6b.2677472.10 | db9ee4957a12f893de2b440e6a7ee4a5135c18c850e6ffe484b6a8b256d15041 | 5291 |
| runs/Aug18_15-35-20_estudio-6b/events.out.tfevents.1755542121.estudio-6b.2677472.11 | 7606b6660494b032cc72ae18eb4a045bd9534f4b28e27cc190375e1e3992d092 | 5291 |
| runs/Aug18_15-35-22_estudio-6b/events.out.tfevents.1755542123.estudio-6b.2677472.12 | 22db0d70178b20208ad0cbbae791d8b5bfb0c5a84fac8aaf711eded155e71414 | 5291 |
| runs/Aug18_15-35-28_estudio-6b/events.out.tfevents.1755542129.estudio-6b.2677472.13 | 009ab06e291c49a23676c50819401dad47e8de821c962bae757304d8d5c22da8 | 5291 |
| runs/Aug18_15-35-30_estudio-6b/events.out.tfevents.1755542132.estudio-6b.2677472.14 | cf99476a4aecd93a4483358c6ae4f3aed10b854520633820d8d180074939f537 | 5291 |
| runs/Aug18_15-35-48_estudio-6b/events.out.tfevents.1755542149.estudio-6b.2677472.15 | 6f6c0ebc37c7c4b4561caebfc4e62714164f6f82fe59af6cd6bcf699a1c9866a | 5291 |
| runs/Aug18_15-35-51_estudio-6b/events.out.tfevents.1755542152.estudio-6b.2677472.16 | fe4421fb136a6fd4c037c05eef54fb85555cd11aa49e30b7541bcbad5211d061 | 5291 |
| runs/Aug18_15-35-55_estudio-6b/events.out.tfevents.1755542155.estudio-6b.2677472.17 | af0a24869419ec436322017e64ad2b0b571e38de2594dae9bad72f988b27c301 | 5291 |
| runs/Aug18_15-40-59_estudio-6b/events.out.tfevents.1755542461.estudio-6b.2677472.18 | 2bcb32f288417dc24e6a157a80e216ce2f16790336f21a8c61778ec1920d0f85 | 7935 |
| runs/Aug18_15-40-59_estudio-6b/events.out.tfevents.1755549261.estudio-6b.2677472.19 | a1da0dc72af66f9378254bfa8afee39c34f027538308dc40d129af73be6e43da | 359 |
special_tokens_map.json
ADDED

@@ -0,0 +1,6 @@

```json
{
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "pad_token": "<|endoftext|>",
  "unk_token": "<|endoftext|>"
}
```
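GPT-2 ships with a single special token, so the map above reuses `<|endoftext|>` for all four roles; in particular `pad_token` equals `eos_token`, the usual workaround for GPT-2's missing padding token. A quick stdlib check against the JSON above:

```python
import json

# The exact contents of special_tokens_map.json shown above.
special_tokens_map = json.loads("""{
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "pad_token": "<|endoftext|>",
  "unk_token": "<|endoftext|>"
}""")

# All four roles resolve to GPT-2's single special token (id 50256).
assert len(set(special_tokens_map.values())) == 1
print(special_tokens_map["pad_token"])  # <|endoftext|>
```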
test_inference.py
ADDED

@@ -0,0 +1,92 @@

```python
#!/usr/bin/env python3
import argparse
import os
import sys

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def parse_args():
    parser = argparse.ArgumentParser(description="Inference test for eli5_clm-model (CLM)")
    parser.add_argument("--model_dir", type=str, default=".",
                        help="Model directory (folder containing config.json, tokenizer, weights, etc.)")
    parser.add_argument("--prompt", type=str, required=True, help="Input text for generation")
    parser.add_argument("--max_new_tokens", type=int, default=80, help="Maximum number of new tokens to generate")
    parser.add_argument("--temperature", type=float, default=0.7, help="Sampling temperature (creativity)")
    parser.add_argument("--top_p", type=float, default=0.9, help="Top-p (nucleus sampling)")
    parser.add_argument("--do_sample", type=lambda x: str(x).lower() in {"1", "true", "yes", "y"}, default=True,
                        help="If true, sample; if false, greedy decoding (default: true)")
    parser.add_argument("--seed", type=int, default=None, help="Seed for reproducibility")
    parser.add_argument("--device", type=str, choices=["auto", "cpu", "cuda"], default="auto",
                        help="Force device: auto/cpu/cuda")
    return parser.parse_args()


def select_device(choice: str) -> torch.device:
    if choice == "cpu":
        return torch.device("cpu")
    if choice == "cuda":
        if torch.cuda.is_available():
            return torch.device("cuda")
        print("[warning] CUDA not available, falling back to CPU.")
        return torch.device("cpu")
    # auto
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")


def main():
    args = parse_args()

    if args.seed is not None:
        torch.manual_seed(args.seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(args.seed)

    device = select_device(args.device)
    print(f"[info] Using device: {device}")

    model_dir = os.path.abspath(args.model_dir)
    if not os.path.isdir(model_dir):
        print(f"[error] Model directory not found: {model_dir}")
        sys.exit(1)

    print("[info] Loading tokenizer and model...")
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir)
    model.to(device)
    model.eval()

    inputs = tokenizer(args.prompt, return_tensors="pt")
    inputs = {k: v.to(device) for k, v in inputs.items()}

    gen_kwargs = {
        "max_new_tokens": args.max_new_tokens,
        "do_sample": args.do_sample,
    }
    # Temperature and top-p only apply when sampling.
    if args.do_sample:
        gen_kwargs.update({
            "temperature": args.temperature,
            "top_p": args.top_p,
        })

    print("[info] Generating text...")
    with torch.no_grad():
        outputs = model.generate(**inputs, **gen_kwargs)

    full_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    print("\n=== Full output ===\n")
    print(full_text)

    # Extract only the generated continuation by slicing off the prompt tokens
    # (more robust than slicing by character length).
    prompt_token_count = inputs["input_ids"].shape[1]
    continuation = tokenizer.decode(outputs[0][prompt_token_count:], skip_special_tokens=True)
    print("\n=== Generated continuation ===\n")
    print(continuation)


if __name__ == "__main__":
    main()
```
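The script above parses `--do_sample` with a lambda rather than `type=bool` because `bool("false")` is `True` in Python: any non-empty string is truthy, so `type=bool` would make it impossible to turn sampling off from the command line. A sketch of the pitfall and the string-based parser the script uses:

```python
def parse_bool(value):
    """Treat 1/true/yes/y (case-insensitive) as True, everything else as False."""
    return str(value).lower() in {"1", "true", "yes", "y"}

# The naive approach misbehaves: bool() treats any non-empty string as True.
print(bool("false"))        # True  — not what a CLI user expects
print(parse_bool("false"))  # False
print(parse_bool("True"))   # True
```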
tokenizer.json
ADDED

The diff for this file is too large to render. See raw diff.
tokenizer_config.json
ADDED

@@ -0,0 +1,21 @@

```json
{
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "50256": {
      "content": "<|endoftext|>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "<|endoftext|>",
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|endoftext|>",
  "extra_special_tokens": {},
  "model_max_length": 1024,
  "pad_token": "<|endoftext|>",
  "tokenizer_class": "GPT2Tokenizer",
  "unk_token": "<|endoftext|>"
}
```
training_args.bin
ADDED

@@ -0,0 +1,3 @@

version https://git-lfs.github.com/spec/v1
oid sha256:88174501987faa52255f1774d5641a1710e68886fe84b3299dbeda12317d34ea
size 5777
vocab.json
ADDED

The diff for this file is too large to render. See raw diff.