	Quantization made by Richard Erkhov.

	[Github](https://github.com/RichardErkhov)

	[Discord](https://discord.gg/pvy7H8DZMG)

	[Request more models](https://github.com/RichardErkhov/quant_request)


	tiny_starcoder_py - GGUF
	- Model creator: https://huggingface.co/bigcode/
	- Original model: https://huggingface.co/bigcode/tiny_starcoder_py/


	| Name | Quant method | Size |
	| ---- | ---- | ---- |
	| [tiny_starcoder_py.Q2_K.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.Q2_K.gguf) | Q2_K | 0.1GB |
	| [tiny_starcoder_py.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.IQ3_XS.gguf) | IQ3_XS | 0.1GB |
	| [tiny_starcoder_py.IQ3_S.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.IQ3_S.gguf) | IQ3_S | 0.1GB |
	| [tiny_starcoder_py.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.Q3_K_S.gguf) | Q3_K_S | 0.1GB |
	| [tiny_starcoder_py.IQ3_M.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.IQ3_M.gguf) | IQ3_M | 0.11GB |
	| [tiny_starcoder_py.Q3_K.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.Q3_K.gguf) | Q3_K | 0.11GB |
	| [tiny_starcoder_py.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.Q3_K_M.gguf) | Q3_K_M | 0.11GB |
	| [tiny_starcoder_py.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.Q3_K_L.gguf) | Q3_K_L | 0.12GB |
	| [tiny_starcoder_py.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.IQ4_XS.gguf) | IQ4_XS | 0.11GB |
	| [tiny_starcoder_py.Q4_0.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.Q4_0.gguf) | Q4_0 | 0.12GB |
	| [tiny_starcoder_py.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.IQ4_NL.gguf) | IQ4_NL | 0.12GB |
	| [tiny_starcoder_py.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.Q4_K_S.gguf) | Q4_K_S | 0.12GB |
	| [tiny_starcoder_py.Q4_K.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.Q4_K.gguf) | Q4_K | 0.12GB |
	| [tiny_starcoder_py.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.Q4_K_M.gguf) | Q4_K_M | 0.12GB |
	| [tiny_starcoder_py.Q4_1.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.Q4_1.gguf) | Q4_1 | 0.12GB |
	| [tiny_starcoder_py.Q5_0.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.Q5_0.gguf) | Q5_0 | 0.13GB |
	| [tiny_starcoder_py.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.Q5_K_S.gguf) | Q5_K_S | 0.13GB |
	| [tiny_starcoder_py.Q5_K.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.Q5_K.gguf) | Q5_K | 0.14GB |
	| [tiny_starcoder_py.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.Q5_K_M.gguf) | Q5_K_M | 0.14GB |
	| [tiny_starcoder_py.Q5_1.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.Q5_1.gguf) | Q5_1 | 0.14GB |
	| [tiny_starcoder_py.Q6_K.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.Q6_K.gguf) | Q6_K | 0.15GB |
	| [tiny_starcoder_py.Q8_0.gguf](https://huggingface.co/RichardErkhov/bigcode_-_tiny_starcoder_py-gguf/blob/main/tiny_starcoder_py.Q8_0.gguf) | Q8_0 | 0.18GB |
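
The files in the table can also be fetched programmatically. The sketch below builds a filename and a direct-download URL from a quant method, following the naming pattern visible in the table. The helper names `gguf_filename`, `gguf_url`, and `download_gguf` are hypothetical, and `download_gguf` assumes the `huggingface_hub` package is installed.

```python
REPO_ID = "RichardErkhov/bigcode_-_tiny_starcoder_py-gguf"
MODEL_NAME = "tiny_starcoder_py"


def gguf_filename(quant: str) -> str:
    """Build a filename following the table's pattern, e.g. tiny_starcoder_py.Q4_K_M.gguf."""
    return f"{MODEL_NAME}.{quant}.gguf"


def gguf_url(quant: str) -> str:
    """Build a direct-download URL ('resolve' instead of the table's 'blob' page URL)."""
    return f"https://huggingface.co/{REPO_ID}/resolve/main/{gguf_filename(quant)}"


def download_gguf(quant: str = "Q4_K_M") -> str:
    """Download one quantized file into the local Hugging Face cache and return its path."""
    # Imported lazily; assumes `pip install huggingface_hub` has been run.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=gguf_filename(quant))


if __name__ == "__main__":
    print(gguf_url("Q4_K_M"))
```

The path returned by `download_gguf` can then be handed to a GGUF-capable runtime. Whether a given runtime build supports this architecture is worth checking before relying on it.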




	Original model description:
	---
	pipeline_tag: text-generation
	inference: true
	widget:
	- text: 'def print_hello_world():'
	  example_title: Hello world
	  group: Python
	license: bigcode-openrail-m
	datasets:
	- bigcode/the-stack-dedup
	metrics:
	- code_eval
	library_name: transformers
	tags:
	- code
	model-index:
	- name: Tiny-StarCoder-Py
	  results:
	  - task:
	      type: text-generation
	    dataset:
	      type: openai_humaneval
	      name: HumanEval
	    metrics:
	    - name: pass@1
	      type: pass@1
	      value: 7.84%
	      verified: false
	---

	# TinyStarCoderPy

	This is a 164M-parameter model with the same architecture as [StarCoder](https://huggingface.co/bigcode/starcoder) (8k context length, MQA & FIM). It was trained on the Python data from [StarCoderData](https://huggingface.co/datasets/bigcode/starcoderdata) for ~6 epochs, which amounts to 100B tokens.


	## Use

	### Intended use

	The model was trained on GitHub code to assist with tasks such as [Assisted Generation](https://huggingface.co/blog/assisted-generation). For pure code completion, we advise using our 15B models, [StarCoder](https://huggingface.co/bigcode/starcoder) or StarCoderBase.
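
As a rough illustration of assisted generation, the sketch below lets this small model draft tokens for a larger target model through the `assistant_model` argument of `generate`, as described in the linked blog post. The function name `assisted_generate` is hypothetical, the choice of `bigcode/starcoder` as the target is an assumption, and the approach assumes both checkpoints share a tokenizer. Access to the target checkpoint may require accepting its license on the Hub.

```python
def assisted_generate(
    prompt: str,
    target_checkpoint: str = "bigcode/starcoder",  # assumption: any larger model with the same tokenizer
    assistant_checkpoint: str = "bigcode/tiny_starcoder_py",
    max_new_tokens: int = 64,
    device: str = "cuda",
) -> str:
    """Generate with a large target model while a small assistant drafts candidate tokens."""
    # Imported lazily so the function can be defined without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(target_checkpoint)
    target = AutoModelForCausalLM.from_pretrained(target_checkpoint).to(device)
    assistant = AutoModelForCausalLM.from_pretrained(assistant_checkpoint).to(device)

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    # The assistant proposes tokens; the target model verifies them.
    outputs = target.generate(
        **inputs, assistant_model=assistant, max_new_tokens=max_new_tokens
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Calling `assisted_generate("def print_hello_world():")` downloads both checkpoints on first use, so it needs enough memory for the target model.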


	### Generation
	```python
	# pip install -q transformers
	from transformers import AutoModelForCausalLM, AutoTokenizer

	checkpoint = "bigcode/tiny_starcoder_py"
	device = "cuda" # for GPU usage or "cpu" for CPU usage

	tokenizer = AutoTokenizer.from_pretrained(checkpoint)
	model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

	inputs = tokenizer.encode("def print_hello_world():", return_tensors="pt").to(device)
	outputs = model.generate(inputs)
	print(tokenizer.decode(outputs[0]))
	```

	### Fill-in-the-middle
	Fill-in-the-middle uses special tokens to identify the prefix/middle/suffix part of the input and output:

	```python
	# Reuses tokenizer, model, and device from the Generation example above.
	input_text = "<fim_prefix>def print_one_two_three():\n    print('one')\n    <fim_suffix>\n    print('three')<fim_middle>"
	inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
	outputs = model.generate(inputs)
	print(tokenizer.decode(outputs[0]))
	```
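
Assembling the prompt by hand is error-prone, so a pair of small string helpers can help. This is a minimal sketch: the names `build_fim_prompt` and `extract_middle` are hypothetical, and the end-of-text token `<|endoftext|>` is an assumption about the tokenizer's output.

```python
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"
EOS = "<|endoftext|>"  # assumption: end-of-text marker in the decoded output


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before and after the gap in the fill-in-the-middle tokens."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def extract_middle(decoded: str) -> str:
    """Return the text after <fim_middle>, cut at the end-of-text token if present."""
    middle = decoded.split(FIM_MIDDLE, 1)[-1]
    return middle.split(EOS, 1)[0]


if __name__ == "__main__":
    prompt = build_fim_prompt(
        "def print_one_two_three():\n    print('one')\n    ",
        "\n    print('three')",
    )
    print(prompt)
```

The prompt string can be passed to `tokenizer.encode` as in the example above, and `extract_middle` applied to the decoded output to keep only the infilled part.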

	# Training

	## Model

	- Architecture: GPT-2 model with multi-query attention and Fill-in-the-Middle objective
	- Pretraining steps: 50k
	- Pretraining tokens: 100 billion
	- Precision: bfloat16
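
The figures above imply a batch size that the card does not state directly. The arithmetic below is a back-of-the-envelope sketch, assuming "8k" means 8192 tokens and that every training sequence is packed to the full context length.

```python
PRETRAINING_TOKENS = 100e9   # "100 billion" from the list above
PRETRAINING_STEPS = 50_000   # "50k" from the list above
EPOCHS = 6                   # "~6 epochs", so per-epoch figures are approximate
CONTEXT_LENGTH = 8192        # assumption: "8k context length" means 8192 tokens

tokens_per_step = PRETRAINING_TOKENS / PRETRAINING_STEPS
sequences_per_step = tokens_per_step / CONTEXT_LENGTH  # assumes fully packed sequences
tokens_per_epoch = PRETRAINING_TOKENS / EPOCHS

print(f"tokens per step:    {tokens_per_step:,.0f}")
print(f"sequences per step: {sequences_per_step:.0f}")
print(f"tokens per epoch:   {tokens_per_epoch / 1e9:.1f}B")
```

Under these assumptions each step processes about 2M tokens (roughly 244 full-length sequences), and one pass over the Python data is about 16.7B tokens.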

	## Hardware

	- GPUs: 32 Tesla A100
	- Training time: 18 hours
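
Combined with the 100B pretraining tokens, these numbers give a rough compute budget and throughput. The sketch assumes the 18 hours is wall-clock time for the whole run and that all 32 GPUs were in use throughout.

```python
NUM_GPUS = 32
TRAINING_HOURS = 18
PRETRAINING_TOKENS = 100e9  # from the Model section

gpu_hours = NUM_GPUS * TRAINING_HOURS
tokens_per_second = PRETRAINING_TOKENS / (TRAINING_HOURS * 3600)
tokens_per_gpu_second = tokens_per_second / NUM_GPUS

print(f"GPU-hours:             {gpu_hours}")
print(f"tokens / second:       {tokens_per_second:,.0f}")
print(f"tokens / GPU / second: {tokens_per_gpu_second:,.0f}")
```

That works out to 576 GPU-hours and roughly 48k tokens per GPU per second under the stated assumptions.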

	## Software

	- Orchestration: [Megatron-LM](https://github.com/bigcode-project/Megatron-LM)
	- Neural networks: [PyTorch](https://github.com/pytorch/pytorch)
	- BF16 if applicable: [apex](https://github.com/NVIDIA/apex)

	# License
	The model is licensed under the BigCode OpenRAIL-M v1 license agreement. You can find the full agreement [here](https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement).