metadata
language:
  - ja
library_name: transformers
tags:
  - myllm
  - causal-lm
  - custom-code
  - safetensors
pipeline_tag: text-generation

lambda-1-160m-base

lambda-1-160m-base is an experimental Japanese base language model built with myllm, a custom decoder-only Transformer implementation.

All training code is publicly available at KeisukeMiyamoto1324/myllm.

Model Details

| Item | Value |
|---|---|
| Parameters | 164.5M |
| Architecture | Decoder-only Transformer |
| Context length | 1024 tokens |
| Tokenizer | Byte-level BPE |
| Vocabulary size | 65,536 |
| Layers | 16 |
| Hidden size | 768 |
| Attention heads | 12 |
| FFN size | 3,072 |
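
As a sanity check on the parameter count, the figures above can be plugged into a back-of-the-envelope estimate. The sketch below assumes a GPT-2-style layout: learned positional embeddings, an output projection tied to the token embedding, bias terms on every linear layer, two LayerNorms per block, and a final LayerNorm. None of these choices is stated in this card, so treat the function `estimate_parameters` and its layout as illustrative rather than as a description of the actual myllm code.

```python
def estimate_parameters(vocab_size=65_536, context_length=1024, layers=16,
                        hidden=768, ffn=3_072):
    """Rough parameter count for an ASSUMED GPT-2-style decoder.

    Assumptions (not confirmed by the model card): tied input/output
    embeddings, learned positional embeddings, biases on all linear
    layers, two LayerNorms per block plus one final LayerNorm.
    """
    token_embedding = vocab_size * hidden
    position_embedding = context_length * hidden

    # Q, K, V and output projections: four hidden x hidden matrices plus biases.
    attention = 4 * hidden * hidden + 4 * hidden
    # Two linear layers: hidden -> ffn and ffn -> hidden, each with a bias.
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)
    # Each LayerNorm has a scale and a shift vector of size hidden.
    layer_norms = 2 * (2 * hidden)

    per_block = attention + feed_forward + layer_norms
    final_layer_norm = 2 * hidden

    return (token_embedding + position_embedding
            + layers * per_block + final_layer_norm)


total = estimate_parameters()
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

Under these assumptions the estimate rounds to the 164.5M in the table. That is consistent with the reported size, but it does not by itself confirm the architecture details. The attention head count does not affect the total, since 12 heads only split the 768-dimensional hidden state into 64-dimensional slices.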

Training Data

The model was pretrained on a Japanese text mixture.

| Dataset | Notes |
|---|---|
| MK0727/CleanedFineWeb2Edu-jp | Filtered Japanese web corpus |
| MK0727/SyntheticTextbook-jp | Synthetic Japanese corpus |

Usage

git clone https://github.com/KeisukeMiyamoto1324/lambda.git
cd lambda
python3 -m venv venv
source venv/bin/activate
pip3 install -r requirements.txt

python3 src/inference_base/inference_hf.py \
  --prompt "人工知能とは" \
  --max-new-tokens 64
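
For readers who would rather call the model from Python than through the script, loading might look like the sketch below. The `library_name: transformers` and `custom-code` tags suggest that `AutoModelForCausalLM` with `trust_remote_code=True` applies, but that is an assumption, as is the repository ID `MK0727/lambda-1-160m-base`. The function name `generate_text` is hypothetical and is not part of the repository.

```python
def generate_text(prompt, max_new_tokens=64,
                  repo_id="MK0727/lambda-1-160m-base"):
    """Generate a continuation of `prompt` with the base model.

    ASSUMPTIONS: `repo_id` is a guess at the Hub repository name, and the
    model is assumed to load through the transformers Auto classes with
    custom code enabled. Adjust both if the repository differs.
    """
    # Imported lazily so the module can be imported without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code=True runs code shipped with the model repository;
    # review that code before enabling it.
    tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
    model.eval()

    # Pass only the token IDs, since a custom model may not accept extra
    # tokenizer outputs such as token_type_ids.
    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("人工知能とは")` would then mirror the command above. Because this is a base model, the output is a raw continuation of the prompt rather than an answer to it.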

Limitations

This model is not instruction-tuned or safety-aligned. It may generate incorrect, biased, unsafe, or low-quality text.

The model was trained on a limited Japanese corpus mixture and has not been evaluated on standard benchmarks.