	---
	language: en
	tags:
	- Explain code
	- Code Summarization
	- Summarization

	license: mit
	---


	# Gemini

	For an in-depth look at our model and methods, please see our [blog post](https://www.describe-ai.com/gemini).

	## Model description

	Gemini is a transformer based on Google's T5 model. The model is pre-trained on approximately 800k code/description pairs and then fine-tuned on 10k synthetically generated, higher-level explanations. Gemini can summarize and explain short to medium code snippets in:

	- Python
	- JavaScript (mostly vanilla JS, although it can handle frameworks like React as well)
	- Java
	- Ruby
	- Go

	It outputs a description in English.

	## Intended uses & limitations

	Without any additional fine-tuning, Gemini can explain code in a sentence or two, and it typically performs best on Python and JavaScript. We recommend using Gemini for simple code explanation, for documentation, or for producing more synthetic data to improve its explanations.

	### How to use

	You can use this model directly with a pipeline for Text2Text generation, as shown below:

	```python
	from transformers import pipeline

	# Load the model into a text2text-generation pipeline.
	summarizer = pipeline('text2text-generation', model='describeai/gemini-small')
	code = "print('hello world!')"

	# Beam search with 3 beams; the output is capped at 100 tokens.
	response = summarizer(code, max_length=100, num_beams=3)
	print("Summarized code: " + response[0]['generated_text'])
	```

	This should yield something along the lines of:

	```
	Summarized code: The following code is greeting the world.
	```
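
For the documentation use case mentioned earlier, the generated description can be attached to the snippet as a comment. The following is a minimal sketch and not part of the Gemini project: `describe_as_comment` and its `comment_prefix` parameter are hypothetical names, and the `summarizer` argument is assumed to behave like the pipeline object created above (a callable that returns a list of dicts with a `generated_text` key).

```python
def describe_as_comment(code, summarizer, comment_prefix="# "):
    """Return `code` preceded by its generated description as a comment.

    `summarizer` is assumed to behave like the text2text-generation
    pipeline from the usage example. This helper is illustrative only.
    """
    response = summarizer(code, max_length=100, num_beams=3)
    description = response[0]["generated_text"].strip()
    # Prefix every line of the description so multi-line output stays a comment.
    comment = "\n".join(comment_prefix + line for line in description.splitlines())
    return comment + "\n" + code
```

With the pipeline from the usage example, `print(describe_as_comment("print('hello world!')", summarizer))` would print the generated description as a comment line followed by the original snippet. For languages that do not use `#` for comments, pass a different `comment_prefix`, such as `"// "`.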

	### Model sizes

	- Gemini: 770 Million Parameters
	- Gemini-Small (this repo): 220 Million Parameters
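
As a rough guide to what these sizes mean in practice, the memory needed just to hold the weights can be estimated from the parameter count. The sketch below assumes 4 bytes per parameter for 32-bit floats and 2 bytes for 16-bit floats, and it treats the parameter counts above as exact. It ignores activations, the tokenizer, and framework overhead, so actual usage will be higher. The function name is hypothetical.

```python
def weight_memory_gb(num_parameters, bytes_per_parameter=4):
    """Approximate memory for the model weights alone, in gigabytes (10**9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9

for name, params in [("Gemini", 770_000_000), ("Gemini-Small", 220_000_000)]:
    fp32 = weight_memory_gb(params)
    fp16 = weight_memory_gb(params, bytes_per_parameter=2)
    print(f"{name}: ~{fp32:.2f} GB in float32, ~{fp16:.2f} GB in float16")
```

Under these assumptions, Gemini-Small needs a little under 1 GB for its weights in 32-bit precision, and the larger Gemini model needs roughly 3 GB.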


	### Limitations

	Gemini may produce overly simplistic descriptions that do not cover the entire code snippet. We suspect that more training data would mitigate this and produce better results.
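
One possible workaround, offered as an assumption rather than something the authors have evaluated, is to break a longer Python file into its top-level functions and classes and summarize each one separately, so that every input stays in the short-to-medium range the model is described as handling. The sketch uses the standard-library `ast` module. `split_top_level_definitions` and `summarize_definitions` are hypothetical helper names, and `summarizer` is assumed to be the pipeline object from the usage example.

```python
import ast

def split_top_level_definitions(source):
    """Return (name, source_text) pairs for top-level functions and classes."""
    tree = ast.parse(source)
    pieces = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # get_source_segment recovers the node's text (Python 3.8+);
            # decorators above a definition are not included.
            pieces.append((node.name, ast.get_source_segment(source, node)))
    return pieces

def summarize_definitions(source, summarizer):
    """Map each top-level definition name to its generated description."""
    summaries = {}
    for name, snippet in split_top_level_definitions(source):
        response = summarizer(snippet, max_length=100, num_beams=3)
        summaries[name] = response[0]["generated_text"]
    return summaries
```

This only applies to Python input, since `ast` parses Python source. Snippets in the other supported languages would need a different way of splitting.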


	### About Us

	At Describe.ai, we are focused on building Artificial Intelligence systems that can understand language as well as humans. While it is a long path, we plan to contribute our findings to our API and to the open source community.