beckerv committed · Commit ac908fb · verified · 1 Parent(s): 7e50eed

Upload from model-test

Files changed (5)
  1. README.md +140 -0
  2. config.json +17 -0
  3. model.pt +3 -0
  4. tokenizer.json +31 -0
  5. vocab.txt +84 -0
README.md ADDED
@@ -0,0 +1,140 @@
+ ---
+ language:
+ - en
+ license: mit
+ library_name: transformers
+ tags:
+ - bert
+ - text-classification
+ - nlp
+ - test
+ model_name: "Dummy BERT for Testing"
+ model_id: "test/bert-dummy"
+ inference: true
+ ---
+
+ # Dummy BERT Model
+
+ This is a test model created to exercise experimental uploads to the Hugging Face Hub using dmf-ng.
+
+ ## Model Details
+
+ ### Model Description
+
+ A minimal BERT model for testing artifact upload workflows from dmf-ng to the Hugging Face Hub.
+
+ - **Developed by:** dmf-ng Test Suite
+ - **Model type:** Transformer-based language model
+ - **Library:** Transformers
+ - **License:** MIT
+
+ ### Model Architecture
+
+ - **Architecture:** BERT (Bidirectional Encoder Representations from Transformers)
+ - **Hidden Size:** 768
+ - **Number of Hidden Layers:** 12
+ - **Number of Attention Heads:** 12
+ - **Intermediate Size:** 3,072
+ - **Maximum Position Embeddings:** 512
+ - **Vocabulary Size:** 30,522
+
+ ### Model Configuration
+
+ ```json
+ {
+   "model_type": "bert",
+   "hidden_size": 768,
+   "num_hidden_layers": 12,
+   "num_attention_heads": 12,
+   "intermediate_size": 3072,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "attention_probs_dropout_prob": 0.1,
+   "max_position_embeddings": 512,
+   "type_vocab_size": 2,
+   "initializer_range": 0.02,
+   "layer_norm_eps": 1e-12,
+   "pad_token_id": 0
+ }
+ ```
+
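+ A minimal sketch of turning this configuration into a randomly initialized model with the `transformers` library; since model.pt is only a placeholder, random weights are exactly what this repo provides:
+
+ ```python
+ from transformers import BertConfig, BertModel
+
+ # Build the config from the values documented above.
+ config = BertConfig(
+     hidden_size=768,
+     num_hidden_layers=12,
+     num_attention_heads=12,
+     intermediate_size=3072,
+     max_position_embeddings=512,
+     vocab_size=30522,
+ )
+
+ # BertModel(config) creates a model with randomly initialized weights.
+ model = BertModel(config)
+ print(model.config.hidden_size)  # 768
+ ```
+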
+ ## Files
+
+ - **model.pt** - PyTorch model weights (placeholder)
+ - **config.json** - Model configuration in Hugging Face format
+ - **tokenizer.json** - Tokenizer configuration
+ - **vocab.txt** - Vocabulary file with token mappings
+ - **README.md** - This model card
+
+ ## Intended Use
+
+ This model is **for testing purposes only** and should not be used for actual inference or production workloads.
+
+ ### Primary Intended Use
+
+ - Testing artifact upload workflows with dmf-ng
+ - Validating model card metadata
+ - Experimenting with Hugging Face Hub integration
+ - Testing lineage tracking with MLflow
+
+ ## Out-of-Scope Use Cases
+
+ - Production inference
+ - Real-world text classification tasks
+ - Fine-tuning on real datasets
+ - Deploying to inference endpoints
+
+ ## Technical Details
+
+ ### Model Inputs
+
+ - **input_ids**: Token IDs (shape: `[batch_size, sequence_length]`)
+ - **attention_mask**: Binary mask for padding (shape: `[batch_size, sequence_length]`)
+ - **token_type_ids**: Segment IDs for sentence pairs (shape: `[batch_size, sequence_length]`)
+
+ ### Model Outputs
+
+ - **Hidden states** from the last transformer layer (shape: `[batch_size, sequence_length, 768]`)
+ - **[CLS] token representation** for sequence classification tasks
+
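+ To make these shapes concrete, a minimal sketch that encodes a sentence pair and inspects the tensors; `bert-base-uncased` is used only to obtain a compatible tokenizer, and `model` is the randomly initialized instance from the configuration sketch above:
+
+ ```python
+ from transformers import AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
+
+ # Encoding a sentence pair populates all three inputs described above.
+ inputs = tokenizer("First sentence.", "Second sentence.", return_tensors="pt")
+ print(inputs["input_ids"].shape)       # [1, sequence_length]
+ print(inputs["attention_mask"].shape)  # [1, sequence_length]
+ print(inputs["token_type_ids"].shape)  # [1, sequence_length]
+
+ # Outputs from the untrained model are random but correctly shaped.
+ outputs = model(**inputs)
+ print(outputs.last_hidden_state.shape)  # [1, sequence_length, 768]
+ cls_repr = outputs.last_hidden_state[:, 0]  # [CLS] representation, [1, 768]
+ ```
+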
+ ## Limitations and Biases
+
+ This is a dummy model created for testing purposes and does not represent a real, trained model. It has not been trained on any data and produces random outputs.
+
+ ## Training Data
+
+ None - this model was generated as test data.
+
+ ## Evaluation Results
+
+ Not applicable - this is a test model.
+
+ ## Environmental Impact
+
+ Minimal - this is a test model used only for software development and testing.
+
+ ## How to Get Started
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForMaskedLM
+
+ model_id = "your-username/test-model"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForMaskedLM.from_pretrained(model_id)
+
+ # This model is not trained, so outputs are random
+ inputs = tokenizer("Hello, world!", return_tensors="pt")
+ outputs = model(**inputs)
+ ```
+
+ ## Model Card Contact
+
+ For issues related to this test model, please open an issue on the dmf-ng repository.
+
+ ---
+
+ **Note:** This is a test artifact. For production models, ensure comprehensive model cards with real training data, evaluation metrics, and bias analysis.
config.json ADDED
@@ -0,0 +1,17 @@
+ {
+   "model_type": "bert",
+   "hidden_size": 768,
+   "num_hidden_layers": 12,
+   "num_attention_heads": 12,
+   "intermediate_size": 3072,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "attention_probs_dropout_prob": 0.1,
+   "max_position_embeddings": 512,
+   "type_vocab_size": 2,
+   "initializer_range": 0.02,
+   "layer_norm_eps": 1e-12,
+   "pad_token_id": 0,
+   "vocab_size": 30522,
+   "description": "Dummy BERT model configuration for testing"
+ }
model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1c6bed4e8beea2ccd1c7e1ce5c86c9317c3651524d9e0fbe65789f2e2f5a431b
+ size 170
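model.pt is stored via Git LFS, so the diff shows only the pointer file: `oid` is the SHA-256 of the actual content and `size` is its byte count (170 bytes, consistent with a tiny placeholder). A minimal sketch, assuming the real file has been fetched with `git lfs pull`, for verifying a local copy against the pointer:

```python
import hashlib

# Hash the downloaded model.pt and compare it with the oid in the pointer.
digest = hashlib.sha256(open("model.pt", "rb").read()).hexdigest()
expected = "1c6bed4e8beea2ccd1c7e1ce5c86c9317c3651524d9e0fbe65789f2e2f5a431b"
print(digest == expected)
```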
tokenizer.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "version": "1.0",
+   "truncation": null,
+   "padding": null,
+   "added_tokens": [],
+   "normalizer": {
+     "type": "Sequence",
+     "normalizers": [
+       {"type": "Lowercase"},
+       {"type": "StripAccents"}
+     ]
+   },
+   "pre_tokenizer": {
+     "type": "WhitespaceSplit"
+   },
+   "post_processor": {
+     "type": "TemplateProcessing",
+     "single": "[CLS] $A [SEP]",
+     "pair": "[CLS] $A [SEP] $B:1 [SEP]:1"
+   },
+   "decoder": {
+     "type": "WordPiece",
+     "unknown": "[UNK]",
+     "prefix": "##"
+   },
+   "model": {
+     "type": "BPE",
+     "vocab_size": 30522,
+     "merges": []
+   }
+ }
vocab.txt ADDED
@@ -0,0 +1,84 @@
+ [PAD]
+ [unused0]
+ [unused1]
+ [unused2]
+ [unused3]
+ [unused4]
+ [unused5]
+ [unused6]
+ [unused7]
+ [unused8]
+ [unused9]
+ [unused10]
+ [unused11]
+ [unused12]
+ [unused13]
+ [unused14]
+ [unused15]
+ [unused16]
+ [unused17]
+ [unused18]
+ [unused19]
+ [unused20]
+ [UNK]
+ [CLS]
+ [SEP]
+ [MASK]
+ !
+ "
+ #
+ $
+ %
+ &
+ '
+ (
+ )
+ *
+ +
+ ,
+ -
+ .
+ /
+ 0
+ 1
+ 2
+ 3
+ 4
+ 5
+ 6
+ 7
+ 8
+ 9
+ :
+ ;
+ <
+ =
+ >
+ ?
+ @
+ a
+ b
+ c
+ d
+ e
+ f
+ g
+ h
+ i
+ j
+ k
+ l
+ m
+ n
+ o
+ p
+ q
+ r
+ s
+ t
+ u
+ v
+ w
+ x
+ y
+ z
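The shipped tokenizer.json is itself a placeholder (its `model` section declares BPE with an empty merge list and no vocabulary, so it will not load as a working tokenizer), but vocab.txt is a valid, if tiny, WordPiece vocabulary. A minimal sketch, assuming the `tokenizers` library, of building a BERT-style tokenizer from it; with only single-character pieces and no `##` continuations in the vocabulary, multi-character words fall back to `[UNK]`:

```python
from tokenizers import BertWordPieceTokenizer

# Build a WordPiece tokenizer from the 84-entry vocab.txt above.
tokenizer = BertWordPieceTokenizer("vocab.txt", lowercase=True)

encoding = tokenizer.encode("a b c hello")
# 'a', 'b', 'c' are in the vocab; "hello" needs a '##e' continuation
# piece that does not exist, so the whole word maps to [UNK].
print(encoding.tokens)  # ['[CLS]', 'a', 'b', 'c', '[UNK]', '[SEP]']
```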