FastAccounting/Rize-0.5-tiny

	---
	license: mit
	language:
	- en
	- ja
	---
	# Model Card
	## Overview
	Rize is a causal language model for pretraining research and general text generation.
	It uses a Transformer decoder architecture with Mixture-of-Experts (MoE) layers.
	The model is designed for research and experimental development.
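
To illustrate what a Mixture-of-Experts layer does, here is a minimal NumPy sketch of top-k routing with one always-on shared expert. It is not the model's actual implementation. The expert counts (64 routed, top-4, 1 shared) follow this card. The tiny dimensions, the softmax over the selected experts, the ReLU MLP experts, and all function names are assumptions made for illustration.

```python
import numpy as np


def moe_layer(x, router_w, experts, shared_expert, top_k=4):
    """Top-k routed MoE layer plus one shared expert (illustrative only).

    x:             (tokens, hidden) activations
    router_w:      (hidden, num_experts) router weights
    experts:       list of callables, one per routed expert
    shared_expert: callable applied to every token
    """
    logits = x @ router_w                               # (tokens, num_experts)
    # Indices of the top_k highest-scoring experts for each token.
    top_idx = np.argsort(logits, axis=-1)[:, -top_k:]   # (tokens, top_k)
    top_logits = np.take_along_axis(logits, top_idx, axis=-1)
    # Assumption: gates are a softmax over the selected experts only.
    top_logits = top_logits - top_logits.max(axis=-1, keepdims=True)
    gates = np.exp(top_logits)
    gates /= gates.sum(axis=-1, keepdims=True)

    out = shared_expert(x)  # the shared expert processes every token
    for t in range(x.shape[0]):
        for slot in range(top_k):
            e = top_idx[t, slot]
            out[t] += gates[t, slot] * experts[e](x[t])
    return out, top_idx, gates


def make_expert(rng, hidden, inner):
    """Hypothetical two-layer ReLU MLP standing in for an expert."""
    w_in = rng.normal(scale=0.02, size=(hidden, inner))
    w_out = rng.normal(scale=0.02, size=(inner, hidden))
    return lambda h: np.maximum(h @ w_in, 0.0) @ w_out


rng = np.random.default_rng(0)
hidden, inner, num_experts, top_k = 16, 32, 64, 4   # toy sizes, not the real ones
experts = [make_expert(rng, hidden, inner) for _ in range(num_experts)]
shared = make_expert(rng, hidden, inner)
router_w = rng.normal(size=(hidden, num_experts))
x = rng.normal(size=(5, hidden))

out, top_idx, gates = moe_layer(x, router_w, experts, shared, top_k)
print(out.shape, top_idx.shape)
```

Each token runs through only 4 of the 64 routed experts plus the shared one. This is why the active parameter count per token can be much smaller than the total parameter count.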

	## Model Size and Architecture
	This tiny variant, Rize-0.5-tiny, has about 4 billion total parameters, of which about 1 billion are active per token.

	Main architecture points:
	- decoder-only Transformer
	- 19 hidden layers
	- hidden size of 1536
	- 12 attention heads
	- 64 routed experts
	- top-4 expert routing per token
	- 1 shared expert
	- vocabulary size of 163,840
	- maximum context length of 8,192 tokens
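
The figures above can be collected in a small config object to derive a few quantities. This is a sketch: the field names are hypothetical and do not come from the model's actual configuration file. The per-head dimension assumes it equals hidden size divided by head count, which this card does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RizeTinyConfig:
    # Values copied from this card; field names are hypothetical.
    num_hidden_layers: int = 19
    hidden_size: int = 1536
    num_attention_heads: int = 12
    num_routed_experts: int = 64
    experts_per_token: int = 4
    num_shared_experts: int = 1
    vocab_size: int = 163_840
    max_context_length: int = 8_192


cfg = RizeTinyConfig()

# Assumption: head_dim = hidden_size / num_attention_heads.
head_dim = cfg.hidden_size // cfg.num_attention_heads

# Parameters in one token-embedding matrix (vocab_size x hidden_size).
embedding_params = cfg.vocab_size * cfg.hidden_size

# Experts evaluated per token versus experts stored per MoE layer.
active_experts = cfg.experts_per_token + cfg.num_shared_experts
total_experts = cfg.num_routed_experts + cfg.num_shared_experts

print(head_dim)                        # 128
print(embedding_params)                # 251658240
print(active_experts, total_experts)   # 5 65
```

The expert intermediate size is not listed in this card, so the sketch does not attempt to reproduce the 4 billion and 1 billion parameter totals.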

	## Intended Use
	This model is intended for:
	- language modeling research
	- evaluation of training settings and architectures
	- general text generation benchmarks

	This model is not intended to be used as a source of factual truth or professional advice.

	## Training
	The model is trained with autoregressive next-token prediction on text data.
	It is developed as a research model and may change across checkpoints, runs, and configurations.
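
As a rough illustration of autoregressive next-token prediction, the NumPy sketch below computes the mean cross-entropy between the logits at each position and the token that follows it. The function name and the toy shapes are assumptions. The actual training code, loss weighting, and any auxiliary MoE losses are not described in this card.

```python
import numpy as np


def next_token_loss(logits, token_ids):
    """Mean cross-entropy for predicting token t+1 from position t.

    logits:    (seq_len, vocab) scores; row t is computed from tokens[0..t]
    token_ids: (seq_len,) integer token ids
    """
    pred = logits[:-1]        # the last position has no next token to predict
    target = token_ids[1:]    # the tokens that actually came next
    # Numerically stable log-softmax over the vocabulary.
    shifted = pred - pred.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(target)), target].mean()


# Toy check: uniform logits over 8 tokens give a loss of log(8).
tokens = np.array([1, 2, 3, 4, 5])
uniform_loss = next_token_loss(np.zeros((5, 8)), tokens)
print(np.isclose(uniform_loss, np.log(8)))   # True
```

A model that puts nearly all probability on the correct next token drives this loss toward zero.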

	## Capabilities
	- text continuation
	- general question answering
	- instruction-style response generation
	- multilingual text handling (the card metadata lists English and Japanese), depending on training data

	## Limitations
	- may generate incorrect or misleading information
	- may reflect biases in training data
	- may produce unsafe, harmful, or inappropriate text
	- performance may vary across languages and domains
	- not optimized for high-stakes decisions

	## Safety and Responsible Use
	Users should review outputs before any real-world use.
	The model should not be used on its own for:
	- medical advice
	- legal advice
	- financial advice
	- safety-critical decisions
	- sensitive personal decisions

	Human oversight is required.

	## Disclaimer
	This model is provided for research and experimental purposes only.
	The FA Research Team makes no guarantees regarding accuracy, completeness, reliability, safety, or fitness for a particular purpose.
	Use of this model and its outputs is at the user’s own risk.

	## Contact
	FA Research Team