facesaver
A tool that processes video files into stills and clips for image and video AI training. It uses YOLOv11 face detection to find scenes with people in them, within a certain size and position range.
Requirements:
- CUDA 12.x
- A GPU with 6GB or more VRAM
- Raw video rips, unless you want subtitles in your training data
Usage:
1. Create a conda env:
   `conda create -n facesaver python=3.12`
2. Activate the env:
   `conda activate facesaver`
3. Install the requirements:
   `pip3 install -r requirements.txt`
4. Put your video files into the input directory.
5. Run the command for stills:
   `python3 main.py -I ./input -O ./output -w 200 -m 200`
   or the command for clips:
   `python3 clipsaver.py -I ./input -O ./output -w 200 -m 200`
Notes:
You can use -w and -m to specify the minimum bounding-box width and height for face detection, to avoid triggering on small background faces.
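The size filter behaves roughly like this. The helper below is a hypothetical illustration (not the tool's actual code), assuming boxes in the xyxy pixel format that YOLO results use:

```python
def passes_size_filter(box, min_w, min_h):
    # box is (x1, y1, x2, y2) in pixels, as in YOLO's xyxy output.
    # Reject detections smaller than the -w / -m thresholds so tiny
    # background faces don't trigger a save.
    x1, y1, x2, y2 = box
    return (x2 - x1) >= min_w and (y2 - y1) >= min_h

# A 300x250 face passes a 200x200 minimum; a 120x110 face does not.
```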
If you find you're getting too many false positives or not enough faces, adjust the code here:

```python
# Perform face detection if no face has been detected in this scene
if not face_detected_in_scene:
    try:
        results = model.predict(frame, classes=[0], conf=0.75, device=device)
```

by raising conf to cut false positives, or lowering it to catch more faces.
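To see how the threshold trades precision against recall, here is a toy illustration with mock confidence scores (not real model output):

```python
def keep_detections(scores, conf):
    # Keep only detections whose confidence meets the threshold,
    # mimicking what the conf argument to model.predict does.
    return [s for s in scores if s >= conf]

# Mock scores for one frame: two confident faces, one borderline, one junk.
scores = [0.92, 0.81, 0.60, 0.30]
# Higher threshold -> fewer false positives, but may drop real faces:
#   keep_detections(scores, 0.75) keeps [0.92, 0.81]
# Lower threshold -> more faces found, at the cost of more junk:
#   keep_detections(scores, 0.50) keeps [0.92, 0.81, 0.60]
```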
You will have to do some cleanup to remove the occasional non-face and faces from credit scenes.
If you process something like a 12-episode anime, you should end up with 250-1000 usable stills or clips after manual cleanup.