	---
	language:
	- ja
	library_name: transformers
	tags:
	- myllm
	- causal-lm
	- custom-code
	- safetensors
	pipeline_tag: text-generation
	---

	# lambda-1-160m-base

	lambda-1-160m-base is an experimental Japanese base language model built with `myllm`, a custom decoder-only Transformer implementation.

	All training code is publicly available at [KeisukeMiyamoto1324/myllm](https://github.com/KeisukeMiyamoto1324/myllm).

	## Model Details

	| Item | Value |
	|---|---:|
	| Parameters | 164.5M |
	| Architecture | Decoder-only Transformer |
	| Context length | 1,024 tokens |
	| Tokenizer | Byte-level BPE |
	| Vocabulary size | 65,536 |
	| Layers | 16 |
	| Hidden size | 768 |
	| Attention heads | 12 |
	| FFN size | 3,072 |
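
As a sanity check on the table, the parameter count can be roughly reproduced from the listed dimensions. The sketch below assumes a GPT-2-style layout (tied input/output embeddings, learned positional embeddings, bias terms, and an un-gated two-matrix FFN); none of these choices are stated in this card, so the actual `myllm` architecture may differ. `estimate_params` is a hypothetical helper written for this illustration, not part of `myllm`.

```python
def estimate_params(vocab_size, hidden_size, num_layers, ffn_size,
                    context_length, tied_embeddings=True,
                    learned_positions=True, biases=True):
    """Rough parameter count for a GPT-2-style decoder-only Transformer.

    Every architectural choice here is an assumption, not a fact about myllm.
    """
    b = 1 if biases else 0
    # Attention: Q, K, V and output projections, each hidden x hidden.
    attention = 4 * (hidden_size * hidden_size + b * hidden_size)
    # Feed-forward: one up-projection and one down-projection (no gating).
    ffn = (hidden_size * ffn_size + b * ffn_size) \
        + (ffn_size * hidden_size + b * hidden_size)
    # Two normalization layers per block: a scale, plus a shift if biases are on.
    norms = 2 * (1 + b) * hidden_size
    per_layer = attention + ffn + norms

    embeddings = vocab_size * hidden_size
    if not tied_embeddings:
        embeddings *= 2  # separate output projection matrix
    positions = context_length * hidden_size if learned_positions else 0
    final_norm = (1 + b) * hidden_size
    return num_layers * per_layer + embeddings + positions + final_norm


# Dimensions taken from the Model Details table.
total = estimate_params(vocab_size=65_536, hidden_size=768, num_layers=16,
                        ffn_size=3_072, context_length=1_024)
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

Under these assumptions the estimate comes out at about 164.5M, which is consistent with the table but does not confirm the layout. The table also implies a per-head dimension of 768 / 12 = 64.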

	## Training Data

	The model was pretrained on a Japanese text mixture.

	| Dataset | Notes |
	|---|---|
	| `MK0727/CleanedFineWeb2Edu-jp` | Filtered Japanese web corpus |
	| `MK0727/SyntheticTextbook-jp` | Synthetic Japanese corpus |

	## Usage
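	```bash
	# Clone the repository and set up an isolated Python environment
	git clone https://github.com/KeisukeMiyamoto1324/lambda.git
	cd lambda
	python3 -m venv venv
	source venv/bin/activate
	pip3 install -r requirements.txt

	# Generate up to 64 new tokens from a Japanese prompt
	# ("人工知能とは" is roughly "What is artificial intelligence")
	python3 src/inference_base/inference_hf.py \
	  --prompt "人工知能とは" \
	  --max-new-tokens 64
	```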
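
As an alternative to the repository script, the following is a minimal sketch of loading the checkpoint directly with `transformers`. It assumes the Hub repo id is `MK0727/lambda-1-160m-base` and that the repo ships the custom modeling code that `trust_remote_code=True` relies on (the `custom-code` tag suggests this, but it is not verified here). `clamp_new_tokens` and `generate` are hypothetical helpers introduced for illustration; the 1,024-token limit comes from the Model Details table.

```python
MODEL_ID = "MK0727/lambda-1-160m-base"  # assumed Hub repo id
CONTEXT_LENGTH = 1024  # context length listed in Model Details


def clamp_new_tokens(prompt_tokens, requested, context_length=CONTEXT_LENGTH):
    """Limit generation so that prompt + new tokens fit in the context window."""
    return max(0, min(requested, context_length - prompt_tokens))


def generate(prompt, max_new_tokens=64):
    """Greedy-decode a continuation of `prompt` (downloads the model on first use)."""
    # Imported lazily so clamp_new_tokens works without torch/transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code=True executes Python code from the model repo;
    # only enable it for repositories you trust.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)

    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    budget = clamp_new_tokens(input_ids.shape[1], max_new_tokens)
    if budget == 0:
        raise ValueError("Prompt already fills the context window.")

    output_ids = model.generate(input_ids, max_new_tokens=budget, do_sample=False)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("人工知能とる")` is not what the script above does; to mirror it, call `generate("人工知能とは", max_new_tokens=64)`. Because this is a base model, the output is a plain continuation of the prompt rather than an answer to it.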


	## Limitations

	This model is not instruction-tuned or safety-aligned. It may generate incorrect, biased, unsafe, or low-quality text.

	The model was trained on a limited Japanese corpus mixture and has not been evaluated on standard benchmarks.