Source: CrisisMMD (Alam et al., 2018)
Data Type: Multimodal – each sample includes:
- tweet_text (social media text)
- tweet_image (corresponding image from the tweet)
Total Samples Used: ~18,802 (from the dataset)
Class Labels:
- 0 → Non-informative
- 1 → Informative
Only tweets whose text-level and image-level informativeness labels agree were kept (12,743 tweets), which were then split and saved as train and test .pt files.
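The label-agreement filter above can be sketched as follows. This is a minimal illustration: the column names (`text_info`, `image_info`) are assumptions and should be adjusted to the actual CrisisMMD annotation file.

```python
# Sketch of the label-agreement filter: keep only tweets whose
# text-level and image-level informativeness labels match.
# Column names are assumed, not taken from the real annotation schema.
import pandas as pd

def filter_agreeing(df: pd.DataFrame) -> pd.DataFrame:
    """Keep rows where the text label equals the image label."""
    return df[df["text_info"] == df["image_info"]].reset_index(drop=True)

rows = pd.DataFrame({
    "tweet_text": ["flood in the city", "nice weather", "bridge collapsed"],
    "text_info":  ["informative", "not_informative", "informative"],
    "image_info": ["informative", "informative", "informative"],
})
kept = filter_agreeing(rows)
print(len(kept))  # rows 0 and 2 agree, so 2
```

Applied to the full annotation table, this is the step that reduces ~18,802 samples to the 12,743 agreeing tweets.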
🔹 Preprocessing Done
Text:
- Tokenized using the BERT tokenizer (bert-base-uncased)
- Extracted input_ids and attention_mask
Image:
- Processed using ResNet-50
- Extracted 2048-dimensional feature vectors
Label:
- Encoded to 0 or 1 according to class
The final preprocessed dataset was saved as .pt files:
- train_info.pt
- test_info.pt
Each contains: input_ids, attention_mask, image_vector, and label tensors.
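A minimal sketch of the .pt file layout described above, using random tensors in place of real BERT and ResNet-50 outputs. The tensor names mirror the description; the sequence length (128) and sample count are illustrative assumptions.

```python
# Save/load sketch of the train_info.pt layout. Random tensors stand in
# for real tokenizer output and pre-extracted image features.
import torch

n, seq_len = 4, 128  # assumed sample count and max sequence length
data = {
    "input_ids": torch.randint(0, 30522, (n, seq_len)),        # BERT token ids
    "attention_mask": torch.ones(n, seq_len, dtype=torch.long),
    "image_vector": torch.randn(n, 2048),                      # ResNet-50 features
    "label": torch.randint(0, 2, (n,)),                        # 0/1 class
}
torch.save(data, "train_info.pt")

loaded = torch.load("train_info.pt")
print(loaded["image_vector"].shape)  # torch.Size([4, 2048])
```

Storing plain dicts of tensors this way lets the training loop read everything with a single `torch.load` instead of re-running BERT and ResNet-50 each epoch.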
🔹 Model Architecture
A custom multimodal neural network combining BERT and ResNet features:

| Component | Details |
| --- | --- |
| Text Encoder | BERT base model (bert-base-uncased); outputs pooler_output (768-d) |
| Image Encoder | ResNet-50 pre-extracted features (2048-d) |
| Fusion | Concatenation → FC layers → Softmax |
| Classifier | Fully connected layers with BatchNorm, ReLU, Dropout |
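The fusion-and-classifier portion of the table can be sketched as below. It assumes BERT's pooler_output (768-d) and the pre-extracted ResNet-50 vector (2048-d) are computed upstream; the hidden size (512) and dropout rate (0.3) are illustrative assumptions, not the project's actual hyperparameters.

```python
# Sketch of the concatenation-fusion classifier head. The BERT and
# ResNet-50 encoders are assumed to run upstream; hidden size and
# dropout are illustrative guesses.
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    def __init__(self, text_dim=768, image_dim=2048, hidden=512, num_classes=2):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(text_dim + image_dim, hidden),
            nn.BatchNorm1d(hidden),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(hidden, num_classes),  # raw logits; softmax at inference
        )

    def forward(self, text_feat, image_feat):
        fused = torch.cat([text_feat, image_feat], dim=1)  # concatenation fusion
        return self.classifier(fused)

model = FusionClassifier()
logits = model(torch.randn(8, 768), torch.randn(8, 2048))
print(logits.shape)  # torch.Size([8, 2])
```

Note that the final Linear layer emits logits: CrossEntropyLoss applies log-softmax internally, so an explicit Softmax is only needed when reporting probabilities.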
🔹 Training Setup
- Loss Function: CrossEntropyLoss
- Optimizer: AdamW
- Scheduler: StepLR (γ = 0.9)
- Epochs: 8
- Batch Size: 16
- Device: CUDA (if available)
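The setup above can be wired together as in this minimal loop. The learning rate (2e-5), StepLR step size (1 epoch), and the tiny stand-in model and random data are assumptions; only the loss, optimizer, scheduler γ, epoch count, and batch size come from the list above.

```python
# Minimal training-loop sketch: CrossEntropyLoss + AdamW + StepLR(gamma=0.9),
# 8 epochs, batch size 16. The linear model and random tensors are
# placeholders for the real fusion network and the .pt data.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(768 + 2048, 2)                 # stand-in for the fusion model
criterion = nn.CrossEntropyLoss()                # expects raw logits
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # assumed lr
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.9)

features = torch.randn(64, 768 + 2048)
labels = torch.randint(0, 2, (64,))

for epoch in range(8):
    for i in range(0, len(features), 16):        # batch size 16
        optimizer.zero_grad()
        loss = criterion(model(features[i:i + 16]), labels[i:i + 16])
        loss.backward()
        optimizer.step()
    scheduler.step()                             # multiply lr by 0.9 each epoch

print(scheduler.get_last_lr()[0])
```

After 8 epochs the learning rate has decayed to lr × 0.9⁸ ≈ 0.43 × the initial value.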
🔹 Evaluation Metrics
- Accuracy
- Precision
- Recall
- F1 Score
✅ Test Accuracy: 0.8518
✅ Precision: 0.8289
✅ Recall: 0.8032
✅ F1 Score: 0.8142
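The four reported metrics can be reproduced with scikit-learn, treating class 1 (Informative) as the positive class. The tiny arrays below are illustrative stand-ins, not the model's actual predictions.

```python
# Computing accuracy/precision/recall/F1 for the binary task with
# scikit-learn; positive class = Informative (label 1). Toy data only.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # illustrative ground-truth labels
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # illustrative model predictions

acc = accuracy_score(y_true, y_pred)
prec = precision_score(y_true, y_pred)
rec = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
print(acc, prec, rec, f1)  # 0.75 0.75 0.75 0.75 on this toy data
```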
Newly created dataset: https://huggingface.co/datasets/Henishma/crisisMMD_cleaned_task1