| <meta charset="utf-8" /><meta name="hf:doc:metadata" content="{"title":"Create a video dataset","local":"create-a-video-dataset","sections":[{"title":"VideoFolder","local":"videofolder","sections":[{"title":"Video captioning","local":"video-captioning","sections":[],"depth":3},{"title":"Upload dataset to the Hub","local":"upload-dataset-to-the-hub","sections":[],"depth":3}],"depth":2},{"title":"WebDataset","local":"webdataset","sections":[],"depth":2}],"depth":1}"> | |
| <link rel="modulepreload" href="/docs/datasets/pr_7489/en/_app/immutable/chunks/index.6bcf9ddd.js"><!-- HEAD_svelte-u9bgzb_START --><meta name="hf:doc:metadata" content="{"title":"Create a video dataset","local":"create-a-video-dataset","sections":[{"title":"VideoFolder","local":"videofolder","sections":[{"title":"Video captioning","local":"video-captioning","sections":[],"depth":3},{"title":"Upload dataset to the Hub","local":"upload-dataset-to-the-hub","sections":[],"depth":3}],"depth":2},{"title":"WebDataset","local":"webdataset","sections":[],"depth":2}],"depth":1}"><!-- HEAD_svelte-u9bgzb_END --> <p></p> <h1 class="relative group"><a id="create-a-video-dataset" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#create-a-video-dataset"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Create a video dataset</span></h1> <p data-svelte-h="svelte-1bv94t6">This guide will show you how to create a video dataset with <code>VideoFolder</code> and some metadata. 
This is a no-code solution for quickly creating a video dataset with several thousand videos.</p> <div class="course-tip bg-gradient-to-br dark:bg-gradient-to-r before:border-green-500 dark:before:border-green-800 from-green-50 dark:from-gray-900 to-white dark:to-gray-950 border border-green-50 text-green-700 dark:text-gray-400"><p data-svelte-h="svelte-ztusze">You can control access to your dataset by requiring users to share their contact information first. Check out the <a href="https://huggingface.co/docs/hub/datasets-gated" rel="nofollow">Gated datasets</a> guide for more information about how to enable this feature on the Hub.</p></div> <h2 class="relative group"><a id="videofolder" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#videofolder"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>VideoFolder</span></h2> <p data-svelte-h="svelte-1ycgm9f">The <code>VideoFolder</code> is a dataset builder designed to quickly load a video dataset with several thousand videos without requiring you to write any code.</p> <div class="course-tip bg-gradient-to-br dark:bg-gradient-to-r before:border-green-500 
dark:before:border-green-800 from-green-50 dark:from-gray-900 to-white dark:to-gray-950 border border-green-50 text-green-700 dark:text-gray-400"><p data-svelte-h="svelte-1iaaatg">💡 Take a look at the <a href="repository_structure#split-pattern-hierarchy">Split pattern hierarchy</a> to learn more about how <code>VideoFolder</code> creates dataset splits based on your dataset repository structure.</p></div> <p data-svelte-h="svelte-17l0fb0"><code>VideoFolder</code> automatically infers the class labels of your dataset based on the directory name. Store your dataset in a directory structure like:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->folder<span class="hljs-regexp">/train/</span>dog/golden_retriever.mp4 | |
| folder<span class="hljs-regexp">/train/</span>dog/german_shepherd.mp4 | |
| folder<span class="hljs-regexp">/train/</span>dog/chihuahua.mp4 | |
| folder<span class="hljs-regexp">/train/</span>cat/maine_coon.mp4 | |
| folder<span class="hljs-regexp">/train/</span>cat/bengal.mp4 | |
| folder<span class="hljs-regexp">/train/</span>cat/birman.mp4<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-4if838">If the dataset follows the <code>VideoFolder</code> structure, then you can load it directly with <a href="/docs/datasets/pr_7489/en/package_reference/loading_methods#datasets.load_dataset">load_dataset()</a>:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span><span class="hljs-keyword">from</span> datasets <span class="hljs-keyword">import</span> load_dataset | |
| <span class="hljs-meta">>>> </span>dataset = load_dataset(<span class="hljs-string">"path/to/folder"</span>)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1nxmmti">This is equivalent to passing <code>videofolder</code> manually in <a href="/docs/datasets/pr_7489/en/package_reference/loading_methods#datasets.load_dataset">load_dataset()</a> and the directory in <code>data_dir</code>:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span>dataset = load_dataset(<span class="hljs-string">"videofolder"</span>, data_dir=<span class="hljs-string">"/path/to/folder"</span>)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-hzh3ks">You can also use <code>videofolder</code> to load datasets involving multiple splits. 
To do so, your dataset directory should have the following structure:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->folder<span class="hljs-regexp">/train/</span>dog/golden_retriever.mp4 | |
| folder<span class="hljs-regexp">/train/</span>cat/maine_coon.mp4 | |
| folder<span class="hljs-regexp">/test/</span>dog/german_shepherd.mp4 | |
| folder<span class="hljs-regexp">/test/</span>cat/bengal.mp4<!-- HTML_TAG_END --></pre></div> <div class="course-tip course-tip-orange bg-gradient-to-br dark:bg-gradient-to-r before:border-orange-500 dark:before:border-orange-800 from-orange-50 dark:from-gray-900 to-white dark:to-gray-950 border border-orange-50 text-orange-700 dark:text-gray-400"><p data-svelte-h="svelte-bgajrm">If all video files are contained in a single directory or if they are not on the same level of directory structure, the <code>label</code> column won’t be added automatically. If you need it, set <code>drop_labels=False</code> explicitly.</p></div> <p data-svelte-h="svelte-69vkpo">If there is additional information you’d like to include about your dataset, like text captions or bounding boxes, add it as a <code>metadata.csv</code> file in your folder. This lets you quickly create datasets for different computer vision tasks like video captioning or object detection. You can also use a JSONL file <code>metadata.jsonl</code> or a Parquet file <code>metadata.parquet</code>.</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 
translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->folder<span class="hljs-regexp">/train/m</span>etadata.csv | |
| folder<span class="hljs-regexp">/train/</span><span class="hljs-number">0001</span>.mp4 | |
| folder<span class="hljs-regexp">/train/</span><span class="hljs-number">0002</span>.mp4 | |
| folder<span class="hljs-regexp">/train/</span><span class="hljs-number">0003</span>.mp4<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-dyxa8w">Your <code>metadata.csv</code> file must have a <code>file_name</code> or <code>*_file_name</code> field which links video files with their metadata:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->file_name,additional_feature | |
| <span class="hljs-number">0001.</span>mp4,This is <span class="hljs-keyword">a</span> <span class="hljs-keyword">first</span> <span class="hljs-built_in">value</span> <span class="hljs-keyword">of</span> <span class="hljs-keyword">a</span> <span class="hljs-keyword">text</span> feature you added <span class="hljs-built_in">to</span> your videos | |
| <span class="hljs-number">0002.</span>mp4,This is <span class="hljs-keyword">a</span> <span class="hljs-keyword">second</span> <span class="hljs-built_in">value</span> <span class="hljs-keyword">of</span> <span class="hljs-keyword">a</span> <span class="hljs-keyword">text</span> feature you added <span class="hljs-built_in">to</span> your videos | |
| <span class="hljs-number">0003.</span>mp4,This is <span class="hljs-keyword">a</span> <span class="hljs-keyword">third</span> <span class="hljs-built_in">value</span> <span class="hljs-keyword">of</span> <span class="hljs-keyword">a</span> <span class="hljs-keyword">text</span> feature you added <span class="hljs-built_in">to</span> your videos<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-16ywdcf">or using <code>metadata.jsonl</code>:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->{<span class="hljs-comment">"file_name"</span>: <span class="hljs-comment">"0001.mp4"</span>, <span class="hljs-comment">"additional_feature"</span>: <span class="hljs-comment">"This is a first value of a text feature you added to your videos"</span>} | |
| {<span class="hljs-comment">"file_name"</span>: <span class="hljs-comment">"0002.mp4"</span>, <span class="hljs-comment">"additional_feature"</span>: <span class="hljs-comment">"This is a second value of a text feature you added to your videos"</span>} | |
| {<span class="hljs-comment">"file_name"</span>: <span class="hljs-comment">"0003.mp4"</span>, <span class="hljs-comment">"additional_feature"</span>: <span class="hljs-comment">"This is a third value of a text feature you added to your videos"</span>}<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1stbr8p">Here the <code>file_name</code> must be the name of the video file next to the metadata file. More generally, it must be the relative path from the directory containing the metadata to the video file.</p> <p data-svelte-h="svelte-ft8iu1">It’s possible to point to more than one video in each row in your dataset, for example if both your input and output are videos:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-punctuation">{</span><span class="hljs-attr">"input_file_name"</span><span 
class="hljs-punctuation">:</span> <span class="hljs-string">"0001.mp4"</span><span class="hljs-punctuation">,</span> <span class="hljs-attr">"output_file_name"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"0001_output.mp4"</span><span class="hljs-punctuation">}</span> | |
| <span class="hljs-punctuation">{</span><span class="hljs-attr">"input_file_name"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"0002.mp4"</span><span class="hljs-punctuation">,</span> <span class="hljs-attr">"output_file_name"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"0002_output.mp4"</span><span class="hljs-punctuation">}</span> | |
| <span class="hljs-punctuation">{</span><span class="hljs-attr">"input_file_name"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"0003.mp4"</span><span class="hljs-punctuation">,</span> <span class="hljs-attr">"output_file_name"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"0003_output.mp4"</span><span class="hljs-punctuation">}</span><!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-a7fbwu">You can also define lists of videos. In that case you need to name the field <code>file_names</code> or <code>*_file_names</code>. Here is an example:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->{<span class="hljs-string">"videos_file_names"</span>: [<span class="hljs-string">"0001_left.mp4"</span>, <span class="hljs-string">"0001_right.mp4"</span>], <span 
class="hljs-string">"label"</span>: <span class="hljs-string">"moving_up"</span>} | |
| {<span class="hljs-string">"videos_file_names"</span>: [<span class="hljs-string">"0002_left.mp4"</span>, <span class="hljs-string">"0002_right.mp4"</span>], <span class="hljs-string">"label"</span>: <span class="hljs-string">"moving_down"</span>} | |
| {<span class="hljs-string">"videos_file_names"</span>: [<span class="hljs-string">"0003_left.mp4"</span>, <span class="hljs-string">"0003_right.mp4"</span>], <span class="hljs-string">"label"</span>: <span class="hljs-string">"moving_right"</span>}<!-- HTML_TAG_END --></pre></div> <h3 class="relative group"><a id="video-captioning" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#video-captioning"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Video captioning</span></h3> <p data-svelte-h="svelte-wjodyx">Video captioning datasets have text describing a video. 
An example <code>metadata.csv</code> may look like:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->file_name,text | |
| <span class="hljs-number">0001</span><span class="hljs-selector-class">.mp4</span>,This is <span class="hljs-selector-tag">a</span> golden retriever playing with <span class="hljs-selector-tag">a</span> ball | |
| <span class="hljs-number">0002</span><span class="hljs-selector-class">.mp4</span>,A german shepherd | |
| <span class="hljs-number">0003</span><span class="hljs-selector-class">.mp4</span>,One chihuahua<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-10mkyte">Load the dataset with <code>VideoFolder</code>, and it will create a <code>text</code> column for the video captions:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span>dataset = load_dataset(<span class="hljs-string">"videofolder"</span>, data_dir=<span class="hljs-string">"/path/to/folder"</span>, split=<span class="hljs-string">"train"</span>) | |
| <span class="hljs-meta">>>> </span>dataset[<span class="hljs-number">0</span>][<span class="hljs-string">"text"</span>] | |
| <span class="hljs-string">"This is a golden retriever playing with a ball"</span><!-- HTML_TAG_END --></pre></div> <h3 class="relative group"><a id="upload-dataset-to-the-hub" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#upload-dataset-to-the-hub"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Upload dataset to the Hub</span></h3> <p data-svelte-h="svelte-lvrx1l">Once you’ve created a dataset, you can share it to the Hub using <code>huggingface_hub</code>, for example. 
Make sure you have the <a href="https://huggingface.co/docs/huggingface_hub/index" rel="nofollow">huggingface_hub</a> library installed and you’re logged in to your Hugging Face account (see the <a href="upload_dataset#upload-with-python">Upload with Python tutorial</a> for more details).</p> <p data-svelte-h="svelte-1y2guln">Upload your dataset with <code>huggingface_hub.HfApi.upload_folder</code>:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> huggingface_hub <span class="hljs-keyword">import</span> HfApi | |
| api = HfApi() | |
| api.upload_folder( | |
| folder_path=<span class="hljs-string">"/path/to/local/dataset"</span>, | |
| repo_id=<span class="hljs-string">"username/my-cool-dataset"</span>, | |
| repo_type=<span class="hljs-string">"dataset"</span>, | |
| )<!-- HTML_TAG_END --></pre></div> <h2 class="relative group"><a id="webdataset" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#webdataset"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>WebDataset</span></h2> <p data-svelte-h="svelte-vihcr3">The <a href="https://github.com/webdataset/webdataset" rel="nofollow">WebDataset</a> format is based on TAR archives and is suitable for big video datasets. | |
| Indeed you can group your videos in TAR archives (e.g. 1GB of videos per TAR archive) and have thousands of TAR archives:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->folder<span class="hljs-regexp">/train/</span><span class="hljs-number">00000</span>.tar | |
| folder<span class="hljs-regexp">/train/</span><span class="hljs-number">00001</span>.tar | |
| folder<span class="hljs-regexp">/train/</span><span class="hljs-number">00002</span>.tar | |
| ...<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-69nbou">In the archives, each example is made of files sharing the same prefix:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->e39871fd9fd74f55<span class="hljs-selector-class">.mp4</span> | |
| e39871fd9fd74f55<span class="hljs-selector-class">.json</span> | |
| f18b91585c4d3f3e<span class="hljs-selector-class">.mp4</span> | |
| f18b91585c4d3f3e<span class="hljs-selector-class">.json</span> | |
| ede6e66b2fb59aab<span class="hljs-selector-class">.mp4</span> | |
| ede6e66b2fb59aab<span class="hljs-selector-class">.json</span> | |
| ed600d57fcee4f94<span class="hljs-selector-class">.mp4</span> | |
| ed600d57fcee4f94<span class="hljs-selector-class">.json</span> | |
| ...<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-15hkn5q">You can store your videos’ labels, captions, or features in JSON or text files, for example.</p> <p data-svelte-h="svelte-12q6x7f">For more details on the WebDataset format and the Python library, please check the <a href="https://webdataset.github.io/webdataset" rel="nofollow">WebDataset documentation</a>.</p> <p data-svelte-h="svelte-1dk8isd">Load your WebDataset and it will create one column per file suffix (here “mp4” and “json”):</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-meta">>>> </span><span class="hljs-keyword">from</span> datasets <span class="hljs-keyword">import</span> load_dataset | |
| <span class="hljs-meta">>>> </span>dataset = load_dataset(<span class="hljs-string">"webdataset"</span>, data_dir=<span class="hljs-string">"/path/to/folder"</span>, split=<span class="hljs-string">"train"</span>) | |
| <span class="hljs-meta">>>> </span>dataset[<span class="hljs-number">0</span>][<span class="hljs-string">"json"</span>] | |
| {<span class="hljs-string">"bbox"</span>: [[<span class="hljs-number">302.0</span>, <span class="hljs-number">109.0</span>, <span class="hljs-number">73.0</span>, <span class="hljs-number">52.0</span>]], <span class="hljs-string">"categories"</span>: [<span class="hljs-number">0</span>]}<!-- HTML_TAG_END --></pre></div> <p></p> | |
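<p>The shard layout described above (one TAR archive per shard, with each example stored as files sharing a prefix) can be sketched with Python’s standard <code>tarfile</code> module. This is a minimal illustration, not the WebDataset library’s own writer: the <code>write_shard</code> helper is hypothetical, the video bytes are placeholders rather than real MP4 data, and the second example’s metadata is invented for the demonstration.</p>

```python
import io
import json
import tarfile
import tempfile
from pathlib import Path


def write_shard(tar_path, examples):
    """Write (key, video_bytes, metadata) triples into one TAR shard.

    Hypothetical helper: each example becomes two members that share
    the same prefix, `<key>.mp4` and `<key>.json`.
    """
    with tarfile.open(tar_path, "w") as tar:
        for key, video_bytes, metadata in examples:
            members = {
                f"{key}.mp4": video_bytes,
                f"{key}.json": json.dumps(metadata).encode("utf-8"),
            }
            for name, payload in members.items():
                info = tarfile.TarInfo(name=name)
                info.size = len(payload)  # the TAR header must carry the size
                tar.addfile(info, io.BytesIO(payload))


# Placeholder bytes stand in for real encoded video data (assumption).
examples = [
    ("e39871fd9fd74f55", b"\x00placeholder-video",
     {"bbox": [[302.0, 109.0, 73.0, 52.0]], "categories": [0]}),
    ("f18b91585c4d3f3e", b"\x00placeholder-video",
     {"bbox": [], "categories": []}),  # invented metadata for illustration
]

# Mirror the folder/train/00000.tar layout inside a temporary directory.
shard_dir = Path(tempfile.mkdtemp()) / "train"
shard_dir.mkdir(parents=True)
shard_path = shard_dir / "00000.tar"
write_shard(shard_path, examples)

# List the members to confirm that each example's files share a prefix.
with tarfile.open(shard_path) as tar:
    member_names = tar.getnames()
print(member_names)
```

<p>Keeping the files of one example adjacent in the archive, as this sketch does, lets a reader group them by prefix while streaming through the TAR sequentially.</p>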