desu
README.md
CHANGED
@@ -1,41 +1,45 @@
 ---
 license: cc0-1.0
 ---

-

-

-
-This
-
-The
-
-The Sora backend that was used for generation was the following:
-`https://sora.openai.com/backend/video_gen`
-
-Please note that user prompts are often "augmented" (changed by some LLM) before generating videos, so the prompts listed may not be the exact one used by the model.
-The prompt used for four videos are not known, and these are denoted as [unknown_n].

 ---
 ### Archive versions


 **sora-turbo-vids.zip**
-This
-
-
-
-The ten longer prompts in "full_long_prompts.txt" were used for the videos in the "long_prompts" directory.

-
-**videos_only.zip** and **videos_only.7z**
 These identical archives (in different compression formats) contain only the original videos, with names such as `video_24.mp4`.
 The `video_24` part is the video ID, and the prompt used for a specific video ID is listed in the separate CSV and JSONL files (video_id, prompt).
-You

 ---

 ~ desuAnon

 https://rentry.org/desuAnon

 ---
 license: cc0-1.0
 ---
+### Release Information

+Temporary access to OpenAI's video generation model Sora (turbo) was provided by the HF repo [PR-Puppet-Sora](https://huggingface.co/spaces/PR-Puppets/PR-Puppet-Sora), on November 26th.
+After a few hours, OpenAI revoked the API key used by the repo and removed access to the generated videos.
+In anticipation of that event, the publicly displayed videos and their prompts were archived.

+This release contains 87 archived videos (~702 MB) and 83 of their prompts, and is dedicated to the public domain (CC0 1.0 Universal).
+The generation parameters may be found in the app.py of the original repo [here](https://huggingface.co/spaces/PR-Puppets/PR-Puppet-Sora/blob/main/app.py). An archive of this script is available [here](https://archive.is/r70Ao).
+User prompts are often "augmented" (changed by some LLM) before generating videos, and this may be true for these videos as well.
+The Sora backend that was used for generation was `https://sora.openai.com/backend/video_gen`

+Contrary to claims online, the generations were *not* uncensored. User prompts, as well as the generated videos, passed through OpenAI's content moderation normally.
+This is partly the reason why none of the videos in this archive are NSFW, or similar, despite a few *brave attempts* in the prompts.
+It is also incorrect that "Sora leaked", since the model itself (its model parameters) had not been acquired by outsiders.
+The only thing that "leaked" was previewer/beta tester access to Sora video generation, via a single HF repo, which kept its API keys secret.

 ---
 ### Archive versions

+All videos are `.mp4` files of varying resolutions, with a framerate of 30 FPS.
+Not all of the generated videos could be archived, due to HF server load issues.
+The prompts used for four videos are not known, and these are denoted as [unknown_n].
+Hugging Face performs *File Security Scans* of uploaded files, and you can click on the icon next to each file to see the result.

 **sora-turbo-vids.zip**
+This is the original archive containing both videos and their prompts, and some users experienced encoding/compatibility issues with it.
+Consider using the more recent "separated" uploads if you encounter similar issues.
+The filenames in the `short_prompts` directory are the full prompts used for each video generation request.
+The filenames in the `long_prompts` directory are shortened versions of the long prompts (above 256 chars), and their full versions are found in `full_long_prompts.txt`.

+**videos_only.zip** & **videos_only.7z**
 These identical archives (in different compression formats) contain only the original videos, with names such as `video_24.mp4`.
 The `video_24` part is the video ID, and the prompt used for a specific video ID is listed in the separate CSV and JSONL files (video_id, prompt).
+You can view both of those files in a text editor, and they are easy to import and process in various programming languages.
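
The video ID to prompt lookup described above could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not part of the release: the file names `prompts.csv` and `prompts.jsonl` are hypothetical (the README does not name the files), the CSV is assumed to have a header row with `video_id` and `prompt` columns, each JSONL line is assumed to be an object with those same two keys, and the stored ID is assumed to match the file stem (e.g. `video_24`).

```python
import csv
import json
from pathlib import Path


def load_prompts_csv(path):
    """Return {video_id: prompt} from a CSV file.

    Assumes a header row containing 'video_id' and 'prompt' columns.
    """
    with open(path, newline="", encoding="utf-8") as f:
        return {row["video_id"]: row["prompt"] for row in csv.DictReader(f)}


def load_prompts_jsonl(path):
    """Return {video_id: prompt} from a JSONL file (one JSON object per line).

    Assumes each object has 'video_id' and 'prompt' keys.
    """
    prompts = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            prompts[record["video_id"]] = record["prompt"]
    return prompts


def prompt_for_video(video_path, prompts):
    """Look up the prompt for a file such as video_24.mp4 by its stem.

    Returns None when the ID is absent from the mapping.
    """
    return prompts.get(Path(video_path).stem)
```

For example, `prompt_for_video("video_24.mp4", load_prompts_csv("prompts.csv"))` would return the prompt recorded for `video_24`, or `None` if that ID is not listed. If the files store the ID differently (for instance as a bare number), the lookup key would need adjusting.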

 ---

+Even though this is a *dataset* upload, I went with a *model* repo because a) the URL is shorter, and b) the original upload wasn't compatible with the HF dataset viewer.
+
 ~ desuAnon

 https://rentry.org/desuAnon