Update README.md
README.md
CHANGED
|
@@ -4,6 +4,11 @@ language:
|
|
| 4 |
- en
|
| 5 |
pipeline_tag: text-generation
|
| 6 |
library_name: transformers
|
| 7 |
tags:
|
| 8 |
- llm
|
| 9 |
- code
|
|
@@ -36,7 +41,8 @@ We also performed the same finetuning on the last **CrystalCoder** checkpoint of
|
|
| 36 |
|
| 37 |
# Instruction Tuning Data
|
| 38 |
|
| 39 |
-
The fine-tuning data is a mix of publicly available language and code datasets, plus an originally created dataset called **WebAlpaca
|
| 40 |
|
| 41 |
The summary of the fine-tuning data is as follows:
|
| 42 |
|
|
@@ -57,8 +63,6 @@ The summary of the fine-tuning data is as follows:
|
|
| 57 |
| [Programming Books](https://huggingface.co/datasets/open-phi/programming_books_llama) | 395,628,544 | Not instruction data
|
| 58 |
| Total | 1,102,516,224
|
| 59 |
|
| 60 |
-
The HTML Instruction dataset was curated by LLM360 and will be made available shortly.
|
| 61 |
-
|
| 62 |
For more details, check out the [data table](https://huggingface.co/LLM360/CrystalChat/blob/main/data_table.jpg).
|
| 63 |
|
| 64 |
# Instruction Format
|
|
|
|
| 4 |
- en
|
| 5 |
pipeline_tag: text-generation
|
| 6 |
library_name: transformers
|
| 7 |
+
widget:
|
| 8 |
+
- example_title: "Write python function and explain"
|
| 9 |
+
text: "<s> <|sys_start|> You are an AI assistant. You will be given a task. You must generate a detailed and long answer. <|sys_end|> <|im_start|> Write a python function that takes a list of integers and returns the squared sum of the list. <|im_end|>"
|
| 10 |
+
output:
|
| 11 |
+
text: "Response:\n````\nHere's a Python function named `squared_sum_list` that takes a list of integers as input and returns the squared sum of the list:\n\n```python\ndef squared_sum_list(lst):\n\treturn sum([num ** 2 for num in lst])\n```\nThe function `squared_sum_list` uses a list comprehension to iterate over each number in the input list `lst` and calculate its square. Then, it uses the `sum` function to accumulate all the squared numbers in a single value - the squared sum of the list.\n\nFor example:\n\n```python\nnumbers = [1, 2, 3, 4, 5]\nprint(squared_sum_list(numbers)) # Outputs: 55\n```\n\nIn the above code, the list `[1, 2, 3, 4, 5]` is passed as an argument to the `squared_sum_list` function. The function calculates the sum of the squares of the elements in the list, which is `1 + 4 + 9 + 16 + 25 = 55`. The function then returns this result, which is printed to the console.</s>"
|
| 12 |
tags:
|
| 13 |
- llm
|
| 14 |
- code
|
|
|
|
| 41 |
|
| 42 |
# Instruction Tuning Data
|
| 43 |
|
| 44 |
+
The fine-tuning data is a mix of publicly available language and code datasets, plus an originally created dataset called **WebAlpaca** on HTML coding instructions.
|
| 45 |
+
The WebAlpaca dataset is created by us and is used as part of our instruction tuning training data. We will release the WebAlpaca dataset in a separate repository soon.
|
| 46 |
|
| 47 |
The summary of the fine-tuning data is as follows:
|
| 48 |
|
|
|
|
| 63 |
| [Programming Books](https://huggingface.co/datasets/open-phi/programming_books_llama) | 395,628,544 | Not instruction data
|
| 64 |
| Total | 1,102,516,224
|
| 65 |
|
| 66 |
For more details, check out the [data table](https://huggingface.co/LLM360/CrystalChat/blob/main/data_table.jpg).
|
| 67 |
|
| 68 |
# Instruction Format
|
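Note on the widget prompt added in this change: the example text wraps a system message in `<|sys_start|>` / `<|sys_end|>` and the user message in `<|im_start|>` / `<|im_end|>`, after a leading `<s>`. A minimal sketch of assembling such a prompt is below. The helper name `build_prompt` and the exact single-space separators are assumptions inferred from that one widget example only; the README's Instruction Format section remains the authoritative description.

```python
# Hypothetical helper: not part of the repository or of any library.
# The token layout is copied from the widget example in this diff.
DEFAULT_SYSTEM = (
    "You are an AI assistant. You will be given a task. "
    "You must generate a detailed and long answer."
)


def build_prompt(user_message: str, system_message: str = DEFAULT_SYSTEM) -> str:
    """Wrap a system and a user message in the special tokens seen in the widget text."""
    return (
        f"<s> <|sys_start|> {system_message} <|sys_end|> "
        f"<|im_start|> {user_message} <|im_end|>"
    )


prompt = build_prompt(
    "Write a python function that takes a list of integers "
    "and returns the squared sum of the list."
)
print(prompt)
```

With the default system message, this reproduces the widget's `text` field character for character, so the string can be compared against the metadata when editing it.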