Saving local graphs and readme

Files changed (10)

.idea/.gitignore +10 -0
.idea/document-classification.iml +8 -0
.idea/inspectionProfiles/profiles_settings.xml +6 -0
.idea/modules.xml +8 -0
README.md +1 -25
main.ipynb +0 -0
res.ipynb +0 -0
results.ipynb +0 -0
results/loss_and_acc_curve.png +2 -2
test.ipynb +0 -0

.idea/.gitignore ADDED

	@@ -0,0 +1,10 @@

+# Default ignored files
+/shelf/
+/workspace.xml
+# Ignored default folder with query files
+/queries/
+# Datasource local storage ignored files
+/dataSources/
+/dataSources.local.xml
+# Editor-based HTTP Client requests
+/httpRequests/

.idea/document-classification.iml ADDED

	@@ -0,0 +1,8 @@

+<?xml version="1.0" encoding="UTF-8"?>
+<module type="PYTHON_MODULE" version="4">
+  <component name="NewModuleRootManager">
+    <content url="file://$MODULE_DIR$" />
+    <orderEntry type="inheritedJdk" />
+    <orderEntry type="sourceFolder" forTests="false" />
+  </component>
+</module>

.idea/inspectionProfiles/profiles_settings.xml ADDED

	@@ -0,0 +1,6 @@

+<component name="InspectionProjectProfileManager">
+  <settings>
+    <option name="USE_PROJECT_PROFILE" value="false" />
+    <version value="1.0" />
+  </settings>
+</component>

.idea/modules.xml ADDED

	@@ -0,0 +1,8 @@

+<?xml version="1.0" encoding="UTF-8"?>
+<project version="4">
+  <component name="ProjectModuleManager">
+    <modules>
+      <module fileurl="file://$PROJECT_DIR$/.idea/document-classification.iml" filepath="$PROJECT_DIR$/.idea/document-classification.iml" />
+    </modules>
+  </component>
+</project>

README.md CHANGED

@@ -38,8 +38,6 @@ This model is a **ResNet-50** Convolutional Neural Network (CNN) finetuned to cl
 ## Model Details
-![Model Architecture](aechitecture.png)
 ### Model Description
 This model utilizes the standard ResNet-50 architecture designed for image classification. Instead of "reading" the text like an OCR system, it analyzes the visual layout, structure, and low-level texture features of a whole document page to determine its category (e.g., recognizing the block layout of a resume versus the dense, two-column text of a scientific report).
@@ -50,20 +48,7 @@ It was trained using **Transfer Learning**, starting with weights pre-trained on
 - **Model type:** Computer Vision (Image Classification / CNN)
 - **Language(s) (NLP):** English (Implicitly, via the text present in the RVL-CDIP dataset images)
 - **License:** MIT
-## Why ResNet50
-| Model      | Approximate Parameters | Year Released | Layers |
-|------------|------------------------|---------------|--------|
-| VGG16      | 138.4 Million          | 2014          | 16     |
-| AlexNet    | 61.1 Million           | 2012          | 8      |
-| ResNet-50  | 25.6 Million           | 2015          | 50     |
-| Model      | FLOPs (Billions) | Efficiency Score      |
-|------------|------------------|-----------------------|
-| AlexNet    | 0.7 GFLOPs       | Low Cost / Low Acc    |
-| ResNet-50  | 3.8 GFLOPs       | High Efficiency       |
-| VGG-16     | 15.5 GFLOPs      | Terribly Inefficient  |
+- **Finetuned from model:** ResNet-50 (ImageNet weights)
 ### Model Sources
@@ -198,15 +183,6 @@ The model was evaluated on the standard, unseen **RVL-CDIP Test Split** containi
 | **Overall Accuracy** | **88.46%** | Solid baseline performance. |
 | **Top-3 Accuracy** | **95.62%** | Excellent reliability for triage tasks. |
-![Loss and Accuracy Curves](results/loss_and_acc_curve.png)
-#### Confusion Matrix
-![Confusion Matrix](results/cm.png)
-#### Detailed Classificatio report
-![Detailed Classification report](results/detailed_classification_report.png)
 #### Detailed Performance Analysis (The "Traffic Light" Report)
 An analysis of per-class F1-scores reveals distinct tiers of performance:

main.ipynb CHANGED

The diff for this file is too large to render. See raw diff

res.ipynb ADDED

The diff for this file is too large to render. See raw diff

results.ipynb ADDED

The diff for this file is too large to render. See raw diff

results/loss_and_acc_curve.png CHANGED

Git LFS Details

SHA256: 4984e1dc4b6048fc5328a18c94c2ac3db5b244538178b443e25fbf32158f06b3
Pointer size: 130 Bytes
Size of remote file: 72.9 kB

Git LFS Details

SHA256: cd82f3722791d2fc18c1f4fee3d0c7f5ea8a349ba26ac89ae8099ebea7aba8fe
Pointer size: 131 Bytes
Size of remote file: 262 kB

test.ipynb ADDED

The diff for this file is too large to render. See raw diff