crisisMMD_task1 / README.md
Source: CrisisMMD (Alam et al., 2018)
Data Type: Multimodal. Each sample includes:
tweet_text (social media text)
tweet_image (corresponding image from the tweet)
Total Samples Used: ~18,802 (from the dataset)
Class Labels:
0 → Non-informative
1 → Informative
Only samples whose tweet_text and tweet_image labels are equal were kept. This yielded 12,743 tweets, which were converted into train and test .pt files.
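The filtering step above, read as keeping only rows whose text and image annotations agree, could be sketched roughly as follows. The column names `label_text` and `label_image`, the label strings, and the helper `keep_agreeing` are assumptions for illustration; the actual annotation files may use different names.

```python
# Hypothetical rows standing in for the annotation file; the column names
# and label strings are assumptions, not taken from this README.
rows = [
    {"tweet_id": "1", "label_text": "informative", "label_image": "informative"},
    {"tweet_id": "2", "label_text": "informative", "label_image": "not_informative"},
    {"tweet_id": "3", "label_text": "not_informative", "label_image": "not_informative"},
]

# Class encoding as listed in this README: 0 = Non-informative, 1 = Informative.
LABEL_TO_ID = {"not_informative": 0, "informative": 1}


def keep_agreeing(rows):
    """Keep rows where the text and image labels match, adding an integer label."""
    kept = []
    for row in rows:
        if row["label_text"] == row["label_image"]:
            kept.append(dict(row, label=LABEL_TO_ID[row["label_text"]]))
    return kept


kept = keep_agreeing(rows)
print([(r["tweet_id"], r["label"]) for r in kept])
```

Rows with disagreeing annotations are dropped entirely rather than assigned to either class.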
✅ Preprocessing Done
Text:
Tokenized using BERT tokenizer (bert-base-uncased)
Extracted input_ids and attention_mask
Image:
Processed using ResNet-50
Extracted 2048-dimensional feature vectors
Label:
Encoded to 0 or 1 as per class
The final preprocessed dataset was saved as .pt files:
train_info.pt
test_info.pt
Each contains: input_ids, attention_mask, image_vector, and label tensors.
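A minimal sketch of how such a file might be saved, loaded, and batched, assuming each .pt file holds a plain dict of equally sized tensors keyed by the four names above. The dict layout, the sequence length of 128, and the class name `CrisisPTDataset` are assumptions; the real files may be structured differently. An in-memory buffer stands in for `train_info.pt`.

```python
import io

import torch
from torch.utils.data import DataLoader, Dataset


class CrisisPTDataset(Dataset):
    """Wraps a dict of tensors that share the same first dimension (hypothetical layout)."""

    def __init__(self, data):
        self.data = data

    def __len__(self):
        return self.data["label"].shape[0]

    def __getitem__(self, idx):
        return {key: tensor[idx] for key, tensor in self.data.items()}


# Dummy tensors with the shapes implied by the README; seq_len is an assumption.
n, seq_len = 6, 128
dummy = {
    "input_ids": torch.randint(0, 30522, (n, seq_len)),  # bert-base-uncased vocab size
    "attention_mask": torch.ones(n, seq_len, dtype=torch.long),
    "image_vector": torch.randn(n, 2048),
    "label": torch.randint(0, 2, (n,)),
}

# Round trip through an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
torch.save(dummy, buffer)
buffer.seek(0)
loaded = torch.load(buffer)

loader = DataLoader(CrisisPTDataset(loaded), batch_size=2)
batch = next(iter(loader))
print({key: tuple(value.shape) for key, value in batch.items()})
```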
✅ Model Architecture
A custom multimodal neural network combining both BERT and ResNet features:
| Component | Details |
|---|---|
| Text Encoder | BERT base model (bert-base-uncased); outputs pooler_output (768-d) |
| Image Encoder | ResNet-50 pre-extracted features (2048-d) |
| Fusion | Concatenation → FC layers → Softmax |
| Classifier | Fully connected layers with BatchNorm, ReLU, Dropout |
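The fusion head described above could look roughly like the sketch below. To stay self-contained it takes a 768-d text vector as input in place of running BERT, so random tensors stand in for `pooler_output`. The hidden size, dropout rate, number of layers, and the class name `FusionClassifier` are assumptions, since the README does not specify them.

```python
import torch
from torch import nn


class FusionClassifier(nn.Module):
    """Concatenates text and image features, then classifies (hypothetical sizes)."""

    def __init__(self, text_dim=768, image_dim=2048, hidden_dim=512,
                 num_classes=2, dropout=0.3):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(text_dim + image_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, text_features, image_vector):
        fused = torch.cat([text_features, image_vector], dim=1)  # (batch, 2816)
        return self.classifier(fused)  # raw logits


model = FusionClassifier()
model.eval()  # use BatchNorm running stats and disable Dropout for inference

text_features = torch.randn(4, 768)   # stand-in for BERT pooler_output
image_vector = torch.randn(4, 2048)   # stand-in for ResNet-50 features

with torch.no_grad():
    logits = model(text_features, image_vector)
    probs = torch.softmax(logits, dim=1)

print(logits.shape, probs.sum(dim=1))
```

The sketch returns raw logits because `nn.CrossEntropyLoss` applies log-softmax internally; softmax is applied only when probabilities are needed at inference time.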
✅ Training Setup
Loss Function: CrossEntropyLoss
Optimizer: AdamW
Scheduler: StepLR (γ = 0.9)
Epochs: 8
Batch Size: 16
Device: CUDA (if available)
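To make the scheduler setting concrete: StepLR multiplies the learning rate by γ once every `step_size` epochs. The sketch below reproduces that closed-form rule in plain Python. The base learning rate of 2e-5 and `step_size=1` are assumptions, as the README states neither.

```python
def step_lr(base_lr, gamma, step_size, epoch):
    """Learning rate after `epoch` completed epochs under a StepLR-style decay."""
    return base_lr * gamma ** (epoch // step_size)


# Assumed values: base_lr and step_size are not given in the README.
base_lr, gamma, step_size, epochs = 2e-5, 0.9, 1, 8

schedule = [step_lr(base_lr, gamma, step_size, e) for e in range(epochs)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr:.3e}")
```

Under these assumptions the rate in the eighth and final epoch is the base rate times 0.9^7, a bit under half of its starting value.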
✅ Evaluation Metrics
Accuracy
Precision
Recall
F1 Score
✅ Test Accuracy: 0.8518
✅ Precision: 0.8289
✅ Recall: 0.8032
✅ F1 Score: 0.8142
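The README does not say how precision, recall, and F1 were averaged across the two classes. The reported F1 (0.8142) differs slightly from the harmonic mean of the reported precision and recall (about 0.8158), which would be consistent with macro averaging, though that is only an inference. As a sketch under that assumption, macro-averaged metrics can be computed as follows; the function names and the toy labels are hypothetical.

```python
def safe_div(numerator, denominator):
    """Division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def macro_metrics(y_true, y_pred, classes=(0, 1)):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = safe_div(tp, tp + fp)
        recall = safe_div(tp, tp + fn)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(safe_div(2 * precision * recall, precision + recall))
    n = len(classes)
    accuracy = safe_div(sum(1 for t, p in zip(y_true, y_pred) if t == p), len(y_true))
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,  # mean of per-class F1, not F1 of the mean P and R
    }


# Toy labels for illustration only; these are not results from the model.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
metrics = macro_metrics(y_true, y_pred)
print(metrics)
```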
Newly created dataset: https://huggingface.co/datasets/Henishma/crisisMMD_cleaned_task1