---
language:
  - en
license: mit
library_name: transformers
tags:
  - bert
  - text-classification
  - nlp
  - test
model_name: Dummy BERT for Testing
model_id: test/bert-dummy
inference: true
---

# Dummy BERT Model

This is a test model created to exercise upload workflows to the Hugging Face Hub using dmf-ng.

## Model Details

### Model Description

A minimal BERT model for testing artifact upload workflows with dmf-ng to the Hugging Face Hub.

- **Developed by:** dmf-ng Test Suite
- **Model type:** Transformer-based language model
- **Library:** Transformers
- **License:** MIT

## Model Architecture

- **Architecture:** BERT (Bidirectional Encoder Representations from Transformers)
- **Hidden Size:** 768
- **Number of Hidden Layers:** 12
- **Number of Attention Heads:** 12
- **Intermediate Size:** 3,072
- **Maximum Position Embeddings:** 512
- **Vocabulary Size:** 30,522
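As a sanity check, the hyperparameters above reproduce the standard BERT-base parameter count. The following is an illustrative sketch in plain Python (no `transformers` dependency); the function name is ours, and it assumes the standard BERT layout of learned embeddings, identical encoder layers, and a pooler:

```python
# Estimate parameter count from the hyperparameters listed above.
# Assumes the standard BERT-base layout; purely illustrative.

def bert_param_count(hidden=768, layers=12, intermediate=3072,
                     vocab=30522, max_pos=512, type_vocab=2):
    ln = 2 * hidden  # LayerNorm weight + bias
    # Token, position, and segment embeddings share the hidden size.
    embeddings = (vocab + max_pos + type_vocab) * hidden + ln
    # Q, K, V, and output projections, each with a bias.
    attention = 4 * (hidden * hidden + hidden)
    # Feed-forward: expand to intermediate size, then project back.
    ffn = (hidden * intermediate + intermediate
           + intermediate * hidden + hidden)
    per_layer = attention + ffn + 2 * ln  # two LayerNorms per layer
    pooler = hidden * hidden + hidden
    return embeddings + layers * per_layer + pooler

print(bert_param_count())  # 109482240, the commonly quoted "110M"
```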

## Model Configuration

```json
{
  "model_type": "bert",
  "hidden_size": 768,
  "num_hidden_layers": 12,
  "num_attention_heads": 12,
  "intermediate_size": 3072,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "attention_probs_dropout_prob": 0.1,
  "max_position_embeddings": 512,
  "type_vocab_size": 2,
  "initializer_range": 0.02,
  "layer_norm_eps": 1e-12,
  "pad_token_id": 0
}
```
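A quick way to sanity-check an uploaded `config.json` is to parse it with the standard library and verify a few invariants, such as the hidden size dividing evenly across attention heads. A minimal sketch, with the configuration above inlined as a string in place of reading the file:

```python
import json

# The configuration from above, inlined for illustration;
# in practice you would read it from config.json.
config_text = """{
  "model_type": "bert",
  "hidden_size": 768,
  "num_hidden_layers": 12,
  "num_attention_heads": 12,
  "intermediate_size": 3072,
  "hidden_act": "gelu",
  "max_position_embeddings": 512,
  "pad_token_id": 0
}"""

cfg = json.loads(config_text)

# The hidden dimension must split evenly across attention heads.
assert cfg["hidden_size"] % cfg["num_attention_heads"] == 0
head_dim = cfg["hidden_size"] // cfg["num_attention_heads"]
print(head_dim)  # 64
```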

## Files

- `model.pt` - PyTorch model weights (placeholder)
- `config.json` - Model configuration in HuggingFace format
- `tokenizer.json` - Tokenizer configuration
- `vocab.txt` - Vocabulary file with token mappings
- `README.md` - This model card

## Intended Use

This model is for testing purposes only and should not be used for actual inference or production workloads.

### Primary Intended Use

- Testing artifact upload workflows with dmf-ng
- Validating model card metadata
- Experimenting with Hugging Face Hub integration
- Testing lineage tracking with MLflow

### Out-of-Scope Use Cases

- Production inference
- Real-world text classification tasks
- Fine-tuning on real datasets
- Deploying to inference endpoints

## Technical Details

### Model Inputs

- `input_ids`: Token IDs (shape: `[batch_size, sequence_length]`)
- `attention_mask`: Binary mask for padding (shape: `[batch_size, sequence_length]`)
- `token_type_ids`: Segment IDs for sentence pairs (shape: `[batch_size, sequence_length]`)

### Model Outputs

- Hidden states from the last transformer layer (shape: `[batch_size, sequence_length, 768]`)
- `[CLS]` token representation for sequence classification tasks
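To make the input shapes concrete, here is a sketch of how a padded batch is assembled by hand, with plain Python lists standing in for tensors. The token IDs are arbitrary placeholders, not real vocabulary entries:

```python
# Build input_ids, attention_mask, and token_type_ids for a batch of two
# single-sentence examples of different lengths, padded to a common length.
PAD_ID = 0  # matches pad_token_id in the configuration above

batch = [[101, 7592, 102], [101, 7592, 1010, 2088, 999, 102]]
max_len = max(len(seq) for seq in batch)

# Pad every sequence to max_len; mask out the padding positions.
input_ids = [seq + [PAD_ID] * (max_len - len(seq)) for seq in batch]
attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in batch]
token_type_ids = [[0] * max_len for _ in batch]  # single segment: all zeros

# Each list is [batch_size, sequence_length] = [2, 6]
print(input_ids[0])       # [101, 7592, 102, 0, 0, 0]
print(attention_mask[0])  # [1, 1, 1, 0, 0, 0]
```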

## Limitations and Biases

This is a dummy model created for testing purposes and does not represent a real, trained model. It has not been trained on any data and produces random outputs.

## Training Data

None - this model was generated as test data.

## Evaluation Results

Not applicable - this is a test model.

## Environmental Impact

Minimal environmental impact - this is a test model used only for software development and testing.

## How to Get Started

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "your-username/test-model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# This model is not trained, so outputs are random
inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model(**inputs)
```

## Model Card Contact

For issues related to this test model, please open an issue on the dmf-ng repository.


**Note:** This is a test artifact. For production models, provide a comprehensive model card with real training data, evaluation metrics, and bias analysis.