| {% extends "page.html" %} {% block stylesheet %} | |
| <style> | |
| .left-align { | |
| text-align: left; | |
| } | |
| .center-align { | |
| text-align: center; | |
| } | |
| .container { | |
| margin: 20px auto; | |
| max-width: 800px; | |
| } | |
| h2, | |
| h3, | |
| h4 { | |
| margin-top: 20px; | |
| } | |
| p { | |
| line-height: 1.6; | |
| } | |
| ul { | |
| margin-left: 20px; | |
| } | |
| </style> | |
| {% endblock %} {% block site %} | |
| <div id="jupyter-main-app" class="container"> | |
| <div class="center-align"> | |
| <img | |
| src="https://huggingface.co/datasets/davanstrien/assets/resolve/main/logo.jpg" | |
| alt="Space Logo" | |
| style="width: 75%" | |
| /> | |
| <p> | |
| This Space gives you an easy way to start generating synthetic datasets, | |
| using Spaces compute to host open LLMs. It comes with a ready-to-go | |
| environment and a series of notebooks that walk through examples of | |
| generating synthetic datasets. | |
| </p> | |
| </div> | |
| <div class="left-align"> | |
| <h2>What's covered?</h2> | |
| <p>Currently this Space has notebooks covering the following topics:</p> | |
| <h3>Creating synthetic text similarity datasets</h3> | |
| <p> | |
| A set of notebooks covering the steps for creating a synthetic dataset for | |
| fine-tuning a sentence similarity model. These notebooks cover: | |
| </p> | |
| <ul> | |
| <li> | |
| How to do structured generation using the | |
| <a href="https://github.com/outlines-dev/outlines">outlines</a> library | |
| to gain more control over the outputs generated by an LLM. | |
| </li> | |
| <li> | |
| How to use | |
| <a href="https://docs.llamaindex.ai/en/stable/">LlamaIndex</a> to chunk | |
| texts to fit into the context length of sentence embedding models. | |
| </li> | |
| <li> | |
| How to use <a href="https://github.com/vllm-project/vllm">vLLM</a> to | |
| efficiently create a dataset that can be used to fine-tune a sentence | |
| similarity model. | |
| </li> | |
| </ul> | |
| </div> | |
| <div class="center-align"> | |
| <h2>Using the Space</h2> | |
| <p> | |
| To use this Space, duplicate it with the duplicate button. You'll want to | |
| enable persistent storage so you can save your work. To start, you may | |
| want to use a smaller GPU such as the T4, then switch to a larger GPU | |
| when you want to generate data with bigger models. | |
| </p> | |
| <h4>Duplicate the Space to run your own instance</h4> | |
| <h4>The default token is <span style="color: orange">huggingface</span></h4> | |
| </div> | |
| {% if login_available %} | |
| <div class="center-align"> | |
| <form | |
| action="{{base_url}}login?next={{next}}" | |
| method="post" | |
| class="form-inline" | |
| > | |
| {{ xsrf_form_html() | safe }} {% if token_available %} | |
| <label for="password_input" | |
| ><strong>{% trans %}Token:{% endtrans %}</strong></label | |
| > | |
| {% else %} | |
| <label for="password_input" | |
| ><strong>{% trans %}Password:{% endtrans %}</strong></label | |
| > | |
| {% endif %} | |
| <input | |
| type="password" | |
| name="password" | |
| id="password_input" | |
| class="form-control" | |
| /> | |
| <button type="submit" class="btn btn-default" id="login_submit"> | |
| {% trans %}Log in{% endtrans %} | |
| </button> | |
| </form> | |
| </div> | |
| {% else %} | |
| <div class="center-align"> | |
| <p> | |
| {% trans %}No login available, you shouldn't be seeing this page.{% | |
| endtrans %} | |
| </p> | |
| </div> | |
| {% endif %} | |
| <div class="center-align" style="font-size: 0.8em; color: #888"> | |
| <p> | |
| This template was created by | |
| <a href="https://twitter.com/camenduru" target="_blank">camenduru</a> and | |
| <a href="https://huggingface.co/nateraw" target="_blank">nateraw</a>, with | |
| contributions from | |
| <a href="https://huggingface.co/osanseviero" target="_blank" | |
| >osanseviero</a | |
| > | |
| and <a href="https://huggingface.co/azzr" target="_blank">azzr</a> | |
| </p> | |
| </div> | |
| {% if message %} | |
| <div class="row"> | |
| {% for key in message %} | |
| <div class="message {{key}}">{{message[key]}}</div> | |
| {% endfor %} | |
| </div> | |
| {% endif %} {% if token_available %} {% block token_message %} {% endblock | |
| token_message %} {% endif %} | |
| </div> | |
| {% endblock %} {% block script %} {% endblock %} | |