---
license: apache-2.0
---

# Seq2Seq Transformer for Function Call Generation

This repository hosts a custom-trained Seq2Seq Transformer model that converts natural language queries into corresponding function call representations. The model uses an encoder-decoder Transformer architecture built from scratch in PyTorch and supports versioning to facilitate continuous improvement and updates.

## Model Description

- **Architecture:** A full Transformer-based encoder-decoder model with multi-head attention and feed-forward layers. Sinusoidal positional encoding captures sequential information (a standard implementation sketch appears at the end of this card).
- **Tokenization & Vocabulary:** The model uses a custom-built vocabulary derived from the training data. Special tokens include:
  - `<pad>` for padding,
  - `<sos>` to denote the beginning of a sequence,
  - `<eos>` to denote the end of a sequence, and
  - `<unk>` for unknown tokens.
- **Training:** Trained on paired examples of natural language inputs and function call outputs with a cross-entropy loss. Training supports versioning: each run increments the model version, and every version is stored for reproducibility and comparison.
- **Inference:** Greedy decoding generates an output sequence from an input sequence. Users can specify a model version to load the corresponding checkpoint for inference.

## Intended Use

This model is primarily intended for:

- Automated function call generation from natural language instructions.
- Enhancing natural language interfaces for code generation or task automation.
- Integrating into virtual assistants and chatbots to execute backend function calls.

## Limitations

- **Data Dependency:** Performance depends on the quality and representativeness of the training data; out-of-distribution inputs may yield suboptimal or erroneous outputs.
- **Decoding Strategy:** Greedy decoding may not always produce the most diverse or optimal outputs; alternative strategies (e.g., beam search) may improve results.
- **Generalization:** The model works well on data similar to its training examples, but performance may degrade on substantially different domains or complex instructions.

## Training Data

The model is trained on custom datasets of natural language inputs paired with function call outputs. Users are encouraged to fine-tune the model on domain-specific data to maximize its utility in real-world applications.

## How to Use

1. **Loading a Specific Version:** The system supports multiple versions. Specify the model version when performing inference to load the desired checkpoint (see the sketch below).
2. **Inference:** Provide an input text (e.g., "Book me a flight from London to NYC") and the model will generate the corresponding function call output.
3. **Publishing:** The model can be published to the Hugging Face Hub with version-specific details for reproducibility and community sharing.
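
The following is a minimal sketch of versioned loading and greedy decoding. The `Seq2SeqTransformer` class, the `model_v{N}.pt` checkpoint naming, and the `vocab.stoi`/`vocab.itos` interface are illustrative assumptions, not this repository's exact API:

```python
import torch

# Hypothetical model class and checkpoint naming -- not this repo's exact API.
def load_model(version: int, vocab_size: int, device: str = "cpu"):
    model = Seq2SeqTransformer(vocab_size=vocab_size)   # assumed class name
    state = torch.load(f"model_v{version}.pt", map_location=device)
    model.load_state_dict(state)
    return model.to(device).eval()

@torch.no_grad()
def greedy_decode(model, src_ids, vocab, max_len: int = 64, device: str = "cpu"):
    src = torch.tensor([src_ids], dtype=torch.long, device=device)
    ys = torch.tensor([[vocab.stoi["<sos>"]]], dtype=torch.long, device=device)
    for _ in range(max_len):
        logits = model(src, ys)                         # (1, tgt_len, vocab_size)
        next_id = logits[0, -1].argmax().item()         # greedy: most likely token
        ys = torch.cat([ys, torch.tensor([[next_id]], device=device)], dim=1)
        if next_id == vocab.stoi["<eos>"]:              # stop at end-of-sequence
            break
    out = ys[0].tolist()[1:]                            # drop leading <sos>
    if out and out[-1] == vocab.stoi["<eos>"]:
        out = out[:-1]                                  # drop trailing <eos>
    return " ".join(vocab.itos[i] for i in out)
```

With these assumptions, `greedy_decode(load_model(2, len(vocab)), vocab.encode("Book me a flight from London to NYC"), vocab)` would return the generated function call string (again assuming a hypothetical `vocab.encode`).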
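
To publish a trained version, the `huggingface_hub` client can upload checkpoints to the Hub; the repository id and file names below are placeholders:

```python
from huggingface_hub import HfApi

version = 3                                             # placeholder version number
repo_id = "your-username/seq2seq-function-calls"        # placeholder repo id

api = HfApi()                                           # uses your cached HF token
api.create_repo(repo_id=repo_id, exist_ok=True)         # no-op if the repo exists
api.upload_file(
    path_or_fileobj=f"model_v{version}.pt",
    path_in_repo=f"model_v{version}.pt",
    repo_id=repo_id,
    commit_message=f"Upload model version {version}",   # version-specific detail
)
```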
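
For reference, the sinusoidal positional encoding mentioned under Model Description is typically implemented as follows (the standard "Attention Is All You Need" formulation; the repository's exact module may differ):

```python
import math
import torch
import torch.nn as nn

class PositionalEncoding(nn.Module):
    def __init__(self, d_model: int, max_len: int = 5000):
        super().__init__()
        position = torch.arange(max_len).unsqueeze(1)           # (max_len, 1)
        div_term = torch.exp(
            torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model)
        )
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)            # even dimensions
        pe[:, 1::2] = torch.cos(position * div_term)            # odd dimensions
        self.register_buffer("pe", pe.unsqueeze(0))             # (1, max_len, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); add the encoding for each position
        return x + self.pe[:, : x.size(1)]
```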
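
Likewise, a single training step with cross-entropy loss might look like this sketch, reusing the hypothetical `model` and `vocab` from the loading example above:

```python
import torch
import torch.nn as nn

# Assumes `model` and `vocab` from the loading sketch above (hypothetical API).
pad_id = vocab.stoi["<pad>"]
criterion = nn.CrossEntropyLoss(ignore_index=pad_id)    # skip padded positions
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(src: torch.Tensor, tgt: torch.Tensor) -> float:
    # Teacher forcing: feed tgt shifted right, predict the next token at each step.
    logits = model(src, tgt[:, :-1])                    # (batch, tgt_len-1, vocab)
    loss = criterion(
        logits.reshape(-1, logits.size(-1)),            # flatten to (N, vocab)
        tgt[:, 1:].reshape(-1),                         # gold next tokens
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```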