torch.OutOfMemoryError: CUDA out of memory
I am using the webcam in real-time, but after a few minutes, it crashes with an out of memory error. Has anyone solved this?
edgetam_video_output = model(inference_session=inference_session, frame=inputs.pixel_values[0])
video_res_masks = processor.post_process_masks(
[edgetam_video_output.pred_masks], original_sizes=inputs.original_sizes, binarize=False
)[0]
Hello @leocneves! Where are you storing the inference states and processed video frames when you instantiate the inference session?
Something like this should relieve the pressure on CUDA memory, at the cost of slightly slower inference:
inference_session = processor.init_video_session(
video=video_frames,
inference_device="cuda",
inference_state_device="cpu",
processing_device="cpu",
video_storage_device="cpu",
dtype=torch.bfloat16,
)
Granted, the states will still accumulate in RAM, which might also cause issues for long sessions. As a temporary solution, you could manually delete old states from the inference session as you go; if this is a recurring problem, we can try to add a clean way to do this in Transformers directly. Feel free to open an issue on Transformers!
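To make the "progressively delete old states" idea concrete, here is a minimal sketch of the pruning logic on its own. It assumes the per-frame data lives in a plain dict keyed by frame index; `prune_old_entries`, `per_frame_store`, and `keep_last` are hypothetical names for illustration, not part of the Transformers API, and the actual attributes of the inference session may be organized differently.

```python
def prune_old_entries(per_frame_store, current_frame_idx, keep_last=30):
    """Delete entries older than the last `keep_last` frames.

    `per_frame_store` is assumed to be a dict keyed by integer frame index
    (a stand-in for wherever the session keeps its per-frame data).
    Returns the number of entries removed.
    """
    cutoff = current_frame_idx - keep_last
    # Collect keys first so the dict is not mutated while iterating over it.
    stale = [idx for idx in per_frame_store if idx <= cutoff]
    for idx in stale:
        del per_frame_store[idx]
    return len(stale)


# Usage: after processing frame 99, keep only frames 90..99.
store = {i: f"state-{i}" for i in range(100)}
removed = prune_old_entries(store, current_frame_idx=99, keep_last=10)
```

Frames that carry the original prompts may need to be kept for tracking to remain stable, so a real version would likely exclude those from pruning.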
Thanks for the help! I solved it by recreating the session every N frames.
Basically, I keep the new processor in memory and use 5 random points from the last largest mask to forward the target object… It works like a charm!
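For anyone wanting to try the same workaround, a rough sketch of the point-sampling part might look like the following. It uses only NumPy and assumes the masks have already been binarized into an array of shape `(num_objects, H, W)`. The helper names (`largest_mask_index`, `sample_points_from_mask`, `should_reset`) are hypothetical, and reading "largest mask" as "the mask with the most foreground pixels" is an interpretation rather than something stated above. Creating the new session and passing the points to it is left out, since it depends on the processor calls you already use.

```python
import numpy as np


def largest_mask_index(masks):
    """Return the index of the mask with the most foreground pixels.

    `masks` is assumed to be a (num_objects, H, W) boolean or 0/1 array.
    """
    areas = masks.reshape(masks.shape[0], -1).sum(axis=1)
    return int(np.argmax(areas))


def sample_points_from_mask(mask, num_points=5, rng=None):
    """Pick up to `num_points` random foreground pixels as [x, y] pairs."""
    rng = np.random.default_rng() if rng is None else rng
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return []  # target lost: nothing to re-prompt with
    chosen = rng.choice(len(xs), size=min(num_points, len(xs)), replace=False)
    return [[int(xs[i]), int(ys[i])] for i in chosen]


def should_reset(frame_idx, reset_every):
    """True on every `reset_every`-th frame, excluding frame 0."""
    return frame_idx > 0 and frame_idx % reset_every == 0


# Usage with dummy masks: two objects on an 8x8 frame.
masks = np.zeros((2, 8, 8), dtype=bool)
masks[0, 1:3, 1:3] = True
masks[1, 2:7, 2:7] = True
target = largest_mask_index(masks)
points = sample_points_from_mask(masks[target], num_points=5)
```

When `should_reset` returns true, the sampled points would serve as positive prompts for the freshly created session, so the old session and its accumulated states can be dropped.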