
Minimal Inference Setup

This project provides a lightweight setup for running inference with a pre-trained model.
It contains the model configuration, trained weights, and a Python script to perform inference.


Project Structure

.
├── model/
│   ├── config.json         # Model configuration file
│   └── model.safetensors   # Pre-trained model weights
└── infer.py                # Script to run inference on input data

Prerequisites

  • Python 3.8+
  • PyTorch
  • Transformers library
  • safetensors
  • PIL (Pillow)
  • (Optional) tkinter if a GUI is implemented in infer.py

Install required packages:

pip install torch transformers safetensors pillow

File Descriptions

model/config.json

Defines the architecture and hyperparameters of the model (e.g., hidden size, number of layers, vocabulary size).

Required to correctly instantiate the model before loading the weights.
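As an illustration, the snippet below shows the kind of JSON fields a config file like this typically holds and how a script would read it. The field names and values here are placeholders for a generic vision model, not the actual configuration shipped with this checkpoint.

```python
import json
import os
import tempfile

# Hypothetical example of what model/config.json might contain; the real
# fields and values depend on the model architecture.
example_config = {
    "model_type": "vit",       # placeholder architecture name
    "hidden_size": 768,
    "num_hidden_layers": 12,
    "num_attention_heads": 12,
    "image_size": 224,
}

# Write and read the file the same way an inference script would
# load model/config.json before instantiating the model.
config_dir = tempfile.mkdtemp()
config_path = os.path.join(config_dir, "config.json")
with open(config_path, "w") as f:
    json.dump(example_config, f, indent=2)

with open(config_path) as f:
    config = json.load(f)

print(config["hidden_size"])  # -> 768
```

With the transformers library, the equivalent one-liner is typically `AutoConfig.from_pretrained("model/")`, which parses the same file.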

model/model.safetensors

Contains the trained weights of the model.

Stored in the Safetensors format for safety and efficiency.

infer.py

Main script to perform inference with the pre-trained model.

Responsibilities:

  • Loads config.json and model.safetensors
  • Preprocesses input text/image (depending on model type)
  • Runs the model forward pass
  • Outputs predictions
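A minimal skeleton for a script with those responsibilities might look like the following. The actual infer.py may be structured differently; the `run_inference` helper here is purely illustrative and only dispatches on input type rather than calling a real model.

```python
import argparse
import os


def run_inference(user_input: str) -> str:
    # Illustrative stand-in for the real forward pass: decide whether
    # the input is a path to an image file or raw text, then dispatch.
    if os.path.isfile(user_input):
        return f"image input: {user_input}"
    return f"text input: {user_input}"


def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Run inference on input text or an image path.")
    parser.add_argument("--input", required=True,
                        help="Input text or path to an image file")
    args = parser.parse_args(argv)
    result = run_inference(args.input)
    print(result)
    return result


if __name__ == "__main__":
    main()
```

Passing `argv=None` lets `main` read `sys.argv` when run from the command line while remaining callable with an explicit argument list in tests.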

Usage:

python infer.py --input "your input text or path to image"

Example:

python infer.py --input "Hello, how are you?"

Usage Workflow

  1. Place the model files (config.json and model.safetensors) inside the model/ directory.
  2. Run infer.py with your desired input.
  3. The script will display the prediction/classification result.

Notes

  • Ensure the model files are compatible (same checkpoint version).
  • For image-based models, inputs must be resized to the expected dimensions (e.g., 224x224 RGB).
  • For text-based models, ensure the tokenizer is compatible with the config (may require adding tokenizer files).
  • GPU is recommended for faster inference, but CPU is supported.
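The image-resizing note above can be sketched with Pillow as follows. The 224x224 target is an assumption typical of ViT-style encoders, not a confirmed requirement of this checkpoint; check model/config.json for the actual expected input size.

```python
from PIL import Image

# Create a dummy RGB image in place of a real input file;
# with a real file this would be Image.open(path).
img = Image.new("RGB", (640, 480), color=(128, 128, 128))

# Convert to RGB (normalizes grayscale/RGBA inputs) and resize to the
# dimensions the model expects -- 224x224 here as an assumption.
img = img.convert("RGB").resize((224, 224))

print(img.size)  # -> (224, 224)
```

Most transformers vision models would additionally normalize pixel values, which an image processor class usually handles alongside the resize.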

License

[Add license information here if applicable]


Contributing

[Add contribution guidelines here if applicable]