How can I use this code to perform the "temporal image classification" task described in the paper?
Hello, I did not find the inference code for the 'temporal image classification' task.
Could you tell me where it is?
Thanks very much!
Hi @fepegar. The snippet in the README performs a "Temporal Sentence Similarity" analysis, which is a different task discussed in the paper. I have the following three questions and would be very grateful if you could answer them.
1. I am confused about the "zero-shot temporal image classification" task in the paper. According to the paper, this task was performed after "Fine-tuning BioViL-T for report generation." Is the currently released open-source model the one obtained after this fine-tuning step?
2. Is biovil_t_image_model_proj_size_128 a single-layer linear head on the image encoder, or a multi-layer classification head attached to the BioViL-T image encoder?
3. Is there any evaluation code on GitHub for Section F.4, "Auto-regressive prompting for zero-shot temporal image classification"?
Thank you for your reply and for your contribution to the community.
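To make the question about Section F.4 concrete, here is a minimal sketch of how auto-regressive prompting is commonly used for zero-shot classification in general. It is not the paper's evaluation code and may not match what the authors did. The assumption is that a decoder fine-tuned for report generation is given a text prompt (plus image features) and the next-token probabilities of a few candidate words are compared. The function name, class labels, token ids and logits below are all hypothetical and chosen only for illustration; in practice the logits would come from the fine-tuned model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_from_next_token_logits(next_token_logits, class_token_ids):
    """Score each class by the probability of its candidate token.

    next_token_logits: logits over the vocabulary for the token that
        follows the prompt (hypothetical; would come from a decoder).
    class_token_ids: mapping from class label to the vocabulary id of
        the word that stands for that class (hypothetical ids).
    Returns the best label and the scores renormalised over candidates.
    """
    probs = softmax(next_token_logits)
    candidate = {label: probs[tid] for label, tid in class_token_ids.items()}
    norm = sum(candidate.values())
    scores = {label: p / norm for label, p in candidate.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy vocabulary and logits, invented for illustration only.
vocab = ["the", "effusion", "has", "improved", "stable", "worsened", "."]
class_token_ids = {"improving": 3, "stable": 4, "worsening": 5}
toy_logits = [0.1, -1.0, 0.3, 2.0, 0.5, 1.0, -0.5]

label, scores = classify_from_next_token_logits(toy_logits, class_token_ids)
print(label, scores)
```

Renormalising over the candidate tokens only is a design choice in this sketch: it turns the comparison into a distribution over the classes of interest and ignores probability mass assigned to unrelated words.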