---
base_model: google/gemma-3-270m-it
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:google/gemma-3-270m-it
- lora
- transformers
---

# 🧠 MODEL CARD — DogeAI-v1.0-instruct

## Model Details

### Model Description

DogeAI-v1.0-instruct is an early-stage instruction-following language model fine-tuned for conversational use and experimentation. This version is intended as a proof of concept (v1) and focuses on language generation rather than reliable logical reasoning.

- **Developed by:** Arthur (loboGOAT)
- **Funded by:** Independent / community-driven
- **Shared by:** Arthur (loboGOAT)
- **Model type:** Small instruction-tuned language model
- **Language(s):** Portuguese (primary), with multilingual tendencies inherited from the base model
- **License:** Apache 2.0 (or the same license as the base model, if different)
- **Finetuned from model:** google/gemma-3-270m-it

### Model Sources

- **Repository:** loboGOAT/DogeAI-v1.0-instruct
- **Paper:** Not available
- **Demo:** Not available

## Uses

### Direct Use

- Conversational experiments
- Text generation and rewriting
- Prompt testing and evaluation
- Educational use to study the limitations of small LLMs

### Downstream Use (Optional)

- Further fine-tuning
- Research on alignment, reasoning, and instruction following
- Benchmarking small models
### Out-of-Scope Use

- Tasks requiring reliable logical reasoning
- Mathematical proof or formal logic
- Decision-making systems
- Safety-critical or automated validation tasks
### Recommendations

This model should not be relied upon for reasoning-intensive tasks.
Users are encouraged to treat DogeAI-v1.0-instruct as an experimental model and to expect occasional logical inconsistencies, multilingual drift, or overgeneration.

Future versions aim to address these limitations through:

- cleaner datasets
- improved stopping criteria (see the sketch after this list)
- alternative base models
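As an illustration of what "improved stopping criteria" could mean at inference time, the hedged sketch below caps generation length, stops at the end-of-sequence token, and adds a custom `StoppingCriteria` that halts once a chosen stop string appears. The stop string and token budget are illustrative assumptions, not settings shipped with this model.

```python
# Hedged sketch: one possible way to curb overgeneration at inference time.
# The stop string and token budget are illustrative assumptions, not part
# of the released DogeAI-v1.0-instruct configuration.
from transformers import StoppingCriteria, StoppingCriteriaList


class StopOnSubstring(StoppingCriteria):
    """Stop generation once a given substring appears in the decoded output."""

    def __init__(self, tokenizer, stop_string, prompt_length):
        self.tokenizer = tokenizer
        self.stop_string = stop_string
        self.prompt_length = prompt_length  # skip the prompt when decoding

    def __call__(self, input_ids, scores, **kwargs):
        generated = input_ids[0, self.prompt_length:]
        text = self.tokenizer.decode(generated, skip_special_tokens=True)
        return self.stop_string in text


# Usage (assuming `model`, `tokenizer`, and `inputs` from the example below):
# stop = StopOnSubstring(tokenizer, "\nUser:", inputs["input_ids"].shape[1])
# outputs = model.generate(
#     **inputs,
#     max_new_tokens=128,                   # hard cap on response length
#     eos_token_id=tokenizer.eos_token_id,  # stop when end-of-sequence is emitted
#     stopping_criteria=StoppingCriteriaList([stop]),
# )
```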
## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Loading this adapter repository directly requires `peft` to be installed;
# transformers then resolves the base model and applies the LoRA adapter.
tokenizer = AutoTokenizer.from_pretrained("loboGOAT/DogeAI-v1.0-instruct")
model = AutoModelForCausalLM.from_pretrained("loboGOAT/DogeAI-v1.0-instruct")

inputs = tokenizer("Olá! Vamos conversar?", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,   # temperature/top_p only take effect when sampling
    temperature=0.65,
    top_p=0.95,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
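Because this repository contains a PEFT/LoRA adapter (see the `library_name: peft` metadata), the adapter can also be attached to the base model explicitly. The minimal sketch below assumes standard PEFT adapter files are present in the repo.

```python
# Hedged alternative: load the base Gemma model and attach the LoRA adapter
# explicitly with peft (assumes this repo ships standard PEFT adapter files).
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("google/gemma-3-270m-it")
model = PeftModel.from_pretrained(base, "loboGOAT/DogeAI-v1.0-instruct")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-270m-it")

# Optionally merge the adapter into the base weights for faster inference.
model = model.merge_and_unload()
```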
## Training Details

### Training Data

The model was fine-tuned on a custom instruction-style dataset, primarily in Portuguese, designed to encourage conversational responses.
The dataset does not focus on formal logic or structured reasoning.
### Training Procedure

#### Preprocessing

- Instruction–response formatting (illustrated below)
- Text normalization
- No explicit chain-of-thought supervision
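The exact formatting used during fine-tuning is not published here. As an illustration, Gemma-style instruction–response pairs can be rendered with the tokenizer's chat template; the example pair below is invented and the actual training format may differ.

```python
# Hedged sketch of instruction–response formatting via the tokenizer's chat
# template. The example pair is invented; the real training format may differ.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-270m-it")

pair = {
    "instruction": "Explique o que é um modelo de linguagem.",
    "response": "É um modelo que prevê a próxima palavra em um texto.",
}

messages = [
    {"role": "user", "content": pair["instruction"]},
    {"role": "assistant", "content": pair["response"]},
]

# Render the pair as a single training string in the model's chat format.
text = tokenizer.apply_chat_template(messages, tokenize=False)
print(text)
```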
#### Training Hyperparameters

- **Training regime:** Supervised fine-tuning (SFT)
- **PEFT:** Yes (LoRA-based fine-tuning; see the sketch below)
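The concrete LoRA hyperparameters are not listed in this card. The sketch below shows how a LoRA adapter of this kind is typically configured with `peft`; the rank, alpha, dropout, and target modules are illustrative assumptions, not the values used for this model.

```python
# Hedged sketch of a LoRA setup with peft; r, lora_alpha, lora_dropout and
# target_modules are illustrative assumptions, not the values used here.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("google/gemma-3-270m-it")

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```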
## Evaluation

### Testing Data

Manual testing and prompt-based evaluation.
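As an illustration of this kind of manual, prompt-based evaluation, a minimal loop might look like the sketch below; the probe prompts are invented and are not the ones actually used.

```python
# Hedged sketch of manual, prompt-based evaluation: generate for a handful
# of probe prompts and inspect the outputs by hand. Prompts are invented.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("loboGOAT/DogeAI-v1.0-instruct")
model = AutoModelForCausalLM.from_pretrained("loboGOAT/DogeAI-v1.0-instruct")

probe_prompts = [
    "Olá! Vamos conversar?",
    "Resuma em uma frase o que é aprendizado de máquina.",
    "Se todos os gatos são animais e Mimi é um gato, Mimi é um animal?",
]

for prompt in probe_prompts:
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
    print("PROMPT:", prompt)
    print("OUTPUT:", tokenizer.decode(outputs[0], skip_special_tokens=True))
    print("-" * 40)
```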
### Factors

- Logical consistency
- Instruction-following
- Language fluency

### Metrics

No automated benchmarks were used for this version.

### Results

- Strong conversational fluency for the model size
- Inconsistent logical reasoning
- Occasional overgeneration beyond the intended response
#### Summary

Conversationally fluent for its size, but not reliable for reasoning-heavy or structured tasks.

## Model Examination

DogeAI-v1.0-instruct demonstrates the strengths and limitations of small instruction-tuned language models.
While capable of natural conversation, it lacks robust reasoning abilities, which will be a focus of future iterations.
## Environmental Impact

- **Hardware Type:** Consumer GPU / local machine
- **Hours used:** Low
- **Cloud Provider:** None
- **Compute Region:** Local
- **Carbon Emitted:** Negligible
## Technical Specifications

### Model Architecture and Objective

- Decoder-only Transformer
- Next-token prediction (see the objective below)
- Instruction-following objective
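For reference, "next-token prediction" here refers to the standard causal language-modeling objective: minimizing the negative log-likelihood of each token given its prefix,

$$
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta(x_t \mid x_{<t})
$$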
### Compute Infrastructure

Local training environment.

#### Hardware

- Consumer-grade GPU / CPU

#### Software

- Transformers
- PEFT 0.18.0
- PyTorch
## Citation

**BibTeX:**

```bibtex
@misc{dogeai_v1_2025,
  title={DogeAI-v1.0-instruct},
  author={Arthur},
  year={2025},
  note={Early experimental instruction-tuned language model}
}
```

**APA:**

Arthur (2025). *DogeAI-v1.0-instruct: An experimental instruction-tuned language model.*
## Model Card Authors

Arthur

## Model Card Contact

(your Hugging Face profile or GitHub)