ToyLlama 13M

ToyLlama 13M is, as far as I can tell, about the smallest model size that can do something minimally realistic in terms of English knowledge. (With a large max-token setting, though, it gets stuck in repetitive loops.)

Example generations

All examples were generated with the interactive CLI in test.py. (Usage: python3 test.py)

>>> Enter prompt (temp: 0.7): Hello
------------------------------------------------------------
Hello!  I'm sorry, I can't believe you, but I don't know if you're a
creative person, I'm just a little punk, but I'm sure I can do it.  I'm
not really interested in the way I'm really in the computer, and I'm sure
you can't believe me, and I'm sure you're not in the computer system.
I've never met my parents, I've had a chance to have a computer
crime.  I'm not sure why I'm a little bit younger than I'd ever been
talking to.  I'm really just a little bit surprised, but I don't
think so.  I don't
------------------------------------------------------------
>>> Enter prompt (temp: 0.7): The Art of Technology Digest is an open forum dedicated to sharing
------------------------------------------------------------
The Art of Technology Digest is an open forum dedicated to sharing
information among computerists and to the presentation and debate of
diverse views.  CuD material may  be reprinted for non-profit as long
as the source is cited. Authors hold a presumptive copyright, and
they should be contacted for reprint permission.  It is assumed that
non-personal mail to the moderators may be reprinted unless otherwise
specified.  Readers are encouraged to submit reasoned articles
relating to computer culture and communication.  Articles are
preferred to short responses.  Please avoid quoting previous posts
unless absolutely necessary.

DISCLAIMER: The views represented herein do not necessarily represent
------------------------------------------------------------
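
The prompts above show a temperature of 0.7. How test.py applies it is not documented here, so the following is only a minimal sketch of standard softmax temperature sampling, assuming that is what the CLI does. The function names `temperature_probs` and `sample_token` and the example logits are hypothetical and are not taken from test.py.

```python
import math
import random


def temperature_probs(logits, temperature=0.7):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=0.7, rng=None):
    """Draw one token index according to the temperature-scaled distribution."""
    rng = rng or random.Random()
    probs = temperature_probs(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.0]  # hypothetical logits for a 3-token vocabulary
    print(temperature_probs(toy_logits, 1.0))
    print(temperature_probs(toy_logits, 0.7))
    print(sample_token(toy_logits, 0.7, random.Random(0)))
```

A temperature below 1.0 sharpens the distribution toward the most likely token, which tends to make output more coherent but can also make repetitive loops more likely.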

Training information

Trained for 1 hour on 130M training tokens using one RX 6600 (8GB VRAM).
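
As a rough sanity check, the figures above imply an average training throughput and a tokens-per-parameter ratio. This is a back-of-the-envelope sketch that assumes "1 hour" means about 3600 seconds of wall-clock time and uses the rounded 13.6M parameter count listed under Model size; the real numbers will differ slightly.

```python
TRAINING_TOKENS = 130_000_000   # "130M training tokens"
TRAINING_SECONDS = 60 * 60      # assumption: "1 hour" is about 3600 s
PARAMETERS = 13_600_000         # rounded "13.6M params"

tokens_per_second = TRAINING_TOKENS / TRAINING_SECONDS
tokens_per_parameter = TRAINING_TOKENS / PARAMETERS

print(f"{tokens_per_second:,.0f} tokens/s")               # 36,111 tokens/s
print(f"{tokens_per_parameter:.1f} tokens per parameter")  # 9.6 tokens per parameter
```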

Training data

The data came from Gutenberg (just 6MB) and textfiles.com (BBS forums, 455MB). So it has NO knowledge of anything other than English.

Model size: 13.6M params (Safetensors)

Tensor type: F32
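
The parameter count and tensor type give a quick estimate of the weight size. This sketch assumes every parameter is stored as a 4-byte F32 value and uses the rounded 13.6M figure, so the actual Safetensors file (which also carries a small header) will not match exactly.

```python
PARAMETERS = 13_600_000   # rounded "13.6M params"
BYTES_PER_F32 = 4         # F32 uses 4 bytes per value

weight_bytes = PARAMETERS * BYTES_PER_F32

print(f"{weight_bytes / 1_000_000:.1f} MB")   # 54.4 MB
print(f"{weight_bytes / 2**20:.1f} MiB")      # 51.9 MiB
```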

This model (sapbot/toyllama-13m) is part of the ToyLlama collection: Llamas trained as an experiment on one RX 6600.