AdilHayat173/token_classification

@@ -1,92 +1,63 @@
----
-license: apache-2.0
-base_model: bert-base-cased
-tags:
-- generated_from_trainer
-datasets:
-- conll2003
-metrics:
-- precision
-- recall
-- f1
-- accuracy
-model-index:
-- name: token_classification
-  results:
-  - task:
-      name: Token Classification
-      type: token-classification
-    dataset:
-      name: conll2003
-      type: conll2003
-      config: conll2003
-      split: validation
-      args: conll2003
-    metrics:
-    - name: Precision
-      type: precision
-      value: 0.9325062034739454
-    - name: Recall
-      type: recall
-      value: 0.9486704813194211
-    - name: F1
-      type: f1
-      value: 0.9405188954700927
-    - name: Accuracy
-      type: accuracy
-      value: 0.9859745687878966
----
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# token_classification
-This model is a fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on the conll2003 dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.0628
-- Precision: 0.9325
-- Recall: 0.9487
-- F1: 0.9405
-- Accuracy: 0.9860
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 2e-05
-- train_batch_size: 8
-- eval_batch_size: 8
-- seed: 42
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- num_epochs: 3
-### Training results
-| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
-|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
-| 0.0751        | 1.0   | 1756 | 0.0683          | 0.8977    | 0.9291 | 0.9132 | 0.9812   |
-| 0.0349        | 2.0   | 3512 | 0.0682          | 0.9289    | 0.9433 | 0.9360 | 0.9844   |
-| 0.0206        | 3.0   | 5268 | 0.0628          | 0.9325    | 0.9487 | 0.9405 | 0.9860   |
-### Framework versions
-- Transformers 4.42.4
-- Pytorch 2.3.1+cu121
-- Datasets 2.21.0
-- Tokenizers 0.19.1

+# Token Classification Model
+## Description
+This project fine-tunes a BERT model from the Hugging Face library for token classification, specifically Named Entity Recognition (NER). The model assigns each token in a text to a predefined entity category such as person, organization, or location.
+The model is trained on a dataset annotated with entity labels so that it can classify each token. Token classification of this kind is useful for information extraction, document processing, and conversational AI applications.
+## Technologies Used
+### Dataset
+- **Source:** Kaggle (conll2003)
+- **Purpose:** Contains text data with annotated entities for token classification.
+### Model
+- **Base Model:** BERT (bert-base-uncased)
+- **Library:** Hugging Face transformers
+- **Task:** Token Classification (Named Entity Recognition)
+### Approach
+#### Preprocessing:
+- Load and preprocess the dataset.
+- Tokenize the text data and align labels with tokens.
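A minimal sketch of the label-alignment step, not necessarily the exact code used in this project. It assumes the word index of each sub-token is available (for example from `word_ids()` on the output of a fast Hugging Face tokenizer) and that special tokens and sub-word continuations are masked with `-100`, the default `ignore_index` of PyTorch's cross-entropy loss. The function name and the sample values are hypothetical.

```python
from typing import List, Optional

IGNORE_INDEX = -100  # skipped by PyTorch's CrossEntropyLoss by default


def align_labels_with_tokens(
    word_ids: List[Optional[int]],
    word_labels: List[int],
    label_all_subwords: bool = False,
) -> List[int]:
    """Expand word-level labels to one label per sub-token."""
    aligned = []
    previous = None
    for word_id in word_ids:
        if word_id is None:
            # Special tokens such as [CLS] and [SEP] have no source word.
            aligned.append(IGNORE_INDEX)
        elif word_id != previous:
            # First sub-token of a word carries the word's label.
            aligned.append(word_labels[word_id])
        else:
            # Later sub-tokens are masked unless explicitly requested.
            aligned.append(word_labels[word_id] if label_all_subwords else IGNORE_INDEX)
        previous = word_id
    return aligned


# Hypothetical input: three words, the second split into two sub-tokens.
word_ids = [None, 0, 1, 1, 2, None]
word_labels = [3, 0, 5]
print(align_labels_with_tokens(word_ids, word_labels))
# [-100, 3, 0, -100, 5, -100]
```

Masking continuation sub-tokens keeps the loss tied to one prediction per word; passing `label_all_subwords=True` is an alternative that trains on every sub-token.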
+#### Fine-Tuning:
+- Fine-tune the BERT model on the token classification dataset.
+#### Training:
+- Train the model to classify each token into predefined entity labels.
+#### Inference:
+- Use the trained model to predict entity labels for new text inputs.
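At inference time the per-token predictions usually need to be merged into entity spans. The sketch below assumes the model emits BIO-style tags such as `B-PER` and `I-PER`, as used in CoNLL-2003; the function name and the example sentence are hypothetical, and the Streamlit app may post-process its predictions differently.

```python
from typing import List, Tuple


def group_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge consecutive B-/I- tagged tokens into (text, entity_type) spans."""
    entities = []
    current_words: List[str] = []
    current_type = None
    for token, tag in zip(tokens, tags):
        # An I- tag only continues a span of the same entity type.
        continues = tag.startswith("I-") and current_type == tag[2:]
        if current_words and not continues:
            # Any other tag closes the span that is currently open.
            entities.append((" ".join(current_words), current_type))
            current_words, current_type = [], None
        if continues:
            current_words.append(token)
        elif tag != "O":
            # B- opens a new span; a stray I- is treated leniently as a start.
            current_words, current_type = [token], tag[2:]
    if current_words:
        entities.append((" ".join(current_words), current_type))
    return entities


tokens = ["Angela", "Merkel", "visited", "Paris", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
print(group_entities(tokens, tags))
# [('Angela Merkel', 'PER'), ('Paris', 'LOC')]
```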
+### Key Technologies
+- **Deep Learning (BERT):** For advanced token classification and contextual understanding.
+- **Natural Language Processing (NLP):** For text preprocessing, tokenization, and entity recognition.
+- **Machine Learning Algorithms:** For model training and prediction tasks.
+## Streamlit App
+You can view and interact with the Streamlit app for token classification [here](https://huggingface.co/spaces/AdilHayat173/token_classifcation).
+## Examples
+Here are some examples of outputs from the model:
+![example1](https://github.com/user-attachments/assets/9e9dd85c-1447-4229-b691-febec17439cf)
+![example2](https://github.com/user-attachments/assets/97dfc391-bda9-4614-93f7-a5f45d64dd03)
+## Google Colab Notebook
+You can view and run the Google Colab notebook for this project [here](https://colab.research.google.com/drive/1GYVlIToQ_lnT8XEjGrR2WFkUQWpWXgQi#scrollTo=ZlyX1Lgn8gjj).
+## Acknowledgements
+- Hugging Face for transformer models and libraries.
+- Streamlit for creating the interactive web interface.
+- Kaggle for hosting the conll2003 token classification dataset.
+## Author
+- AdilHayat
+- [Hugging Face Profile](https://huggingface.co/AdilHayat173)
+- [GitHub Profile](https://github.com/AdilHayat21173)
+## Feedback
+If you have any feedback, please reach out at hayatadil300@gmail.com.