Aashish34 committed
Commit 6d6a6aa · Parent(s): 14422b0

add deeplearning

DeepLearning/Deep Learning Curriculum.html CHANGED
@@ -730,7 +730,497 @@
  }
  ];
 
  function createModuleHTML(module) {
  return `
  <div class="module" id="${module.id}-module">
  <button class="btn-back" onclick="switchTo('dashboard')">← Back to Dashboard</button>
@@ -751,27 +1241,31 @@
  <div id="${module.id}-overview" class="tab active">
  <div class="section">
  <h2>📖 Overview</h2>
- <p>Complete coverage of ${module.title.toLowerCase()}. Learn the fundamentals, mathematics, real-world applications, and implementation details.</p>
- <div class="info-box">
- <div class="box-title">Learning Objectives</div>
- <div class="box-content">
- ✓ Understand core concepts and theory<br>
- ✓ Master mathematical foundations<br>
- ✓ Learn practical applications<br>
- ✓ Implement and experiment
  </div>
- </div>
  </div>
  </div>
 
  <div id="${module.id}-concepts" class="tab">
  <div class="section">
  <h2>🎯 Key Concepts</h2>
- <p>Fundamental concepts and building blocks for ${module.title.toLowerCase()}.</p>
- <div class="callout insight">
- <div class="callout-title">💡 Main Ideas</div>
- This section covers the core ideas you need to understand before diving into mathematics.
- </div>
  </div>
  </div>
 
@@ -812,13 +1306,15 @@
  <div id="${module.id}-applications" class="tab">
  <div class="section">
  <h2>🌍 Real-World Applications</h2>
- <p>How ${module.title.toLowerCase()} is used in practice across different industries.</p>
- <div class="info-box">
- <div class="box-title">Use Cases</div>
- <div class="box-content">
- Common applications and practical examples
  </div>
- </div>
  </div>
  <div class="section">
  <h2>📊 Application Scenarios Visualization</h2>
  }
  ];
 
+ // Comprehensive content for all modules
+ const MODULE_CONTENT = {
+ "nn-basics": {
+ overview: `
+ <h3>What are Neural Networks?</h3>
+ <p>Neural Networks are computational models inspired by the human brain's structure. They consist of interconnected nodes (neurons) organized in layers that process information through weighted connections.</p>
+
+ <h3>Why Use Neural Networks?</h3>
+ <ul>
+ <li><strong>Universal Approximation:</strong> Can theoretically approximate any continuous function</li>
+ <li><strong>Feature Learning:</strong> Automatically discover representations from raw data</li>
+ <li><strong>Adaptability:</strong> Learn from examples without explicit programming</li>
+ <li><strong>Parallel Processing:</strong> Highly parallelizable for modern hardware</li>
+ </ul>
+
+ <div class="callout tip">
+ <div class="callout-title">✅ Advantages</div>
+ • Non-linear problem solving<br>
+ • Robust to noisy data<br>
+ • Works with incomplete information<br>
+ • Continuous learning capability
+ </div>
+
+ <div class="callout warning">
+ <div class="callout-title">⚠️ Disadvantages</div>
+ • Requires large amounts of training data<br>
+ • Computationally expensive<br>
+ • "Black box" - difficult to interpret<br>
+ • Prone to overfitting without regularization
+ </div>
+ `,
+ concepts: `
+ <h3>Core Components</h3>
+ <div class="list-item">
+ <div class="list-num">01</div>
+ <div><strong>Neurons (Nodes):</strong> Basic computational units that receive inputs, apply weights, add bias, and apply an activation function</div>
+ </div>
+ <div class="list-item">
+ <div class="list-num">02</div>
+ <div><strong>Layers:</strong> Input layer (receives data), hidden layers (feature extraction), output layer (predictions)</div>
+ </div>
+ <div class="list-item">
+ <div class="list-num">03</div>
+ <div><strong>Weights:</strong> Parameters learned during training that determine connection strength</div>
+ </div>
+ <div class="list-item">
+ <div class="list-num">04</div>
+ <div><strong>Bias:</strong> Allows shifting the activation function for better fitting</div>
+ </div>
+ <div class="list-item">
+ <div class="list-num">05</div>
+ <div><strong>Activation Function:</strong> Introduces non-linearity (ReLU, Sigmoid, Tanh)</div>
+ </div>
+ `,
+ applications: `
+ <h3>Real-World Applications</h3>
+ <div class="info-box">
+ <div class="box-title">🏥 Healthcare</div>
+ <div class="box-content">Disease diagnosis, medical image analysis, drug discovery, patient risk prediction</div>
+ </div>
+ <div class="info-box">
+ <div class="box-title">💰 Finance</div>
+ <div class="box-content">Fraud detection, algorithmic trading, credit scoring, portfolio optimization</div>
+ </div>
+ <div class="info-box">
+ <div class="box-title">🛒 E-commerce</div>
+ <div class="box-content">Recommendation systems, demand forecasting, customer segmentation, price optimization</div>
+ </div>
+ `
+ },
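The neuron described in the added content (weighted sum of inputs, plus a bias, passed through an activation) can be sanity-checked with a minimal single-neuron sketch; the input, weight, and bias values below are illustrative, not taken from the curriculum:

```python
import math

def neuron(inputs, weights, bias):
    # weighted sum of the inputs, shifted by the bias ...
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # ... then squashed into (0, 1) by a sigmoid activation
    return 1.0 / (1.0 + math.exp(-z))

out = neuron(inputs=[0.5, -1.0, 2.0], weights=[0.4, 0.3, 0.1], bias=0.2)
```

Whatever the inputs, the sigmoid guarantees an output strictly between 0 and 1.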
+ "activation": {
+ overview: `
+ <h3>What are Activation Functions?</h3>
+ <p>Activation functions introduce non-linearity into neural networks, enabling them to learn complex patterns. Without activation functions, a neural network would be just a linear regression model regardless of depth.</p>
+
+ <h3>Why Do We Need Them?</h3>
+ <ul>
+ <li><strong>Non-linearity:</strong> Real-world problems are rarely linear</li>
+ <li><strong>Complex Pattern Learning:</strong> Enable learning of intricate decision boundaries</li>
+ <li><strong>Gradient Flow:</strong> Control how gradients propagate during backpropagation</li>
+ <li><strong>Range Normalization:</strong> Keep activations in manageable ranges</li>
+ </ul>
+
+ <h3>Common Activation Functions Comparison</h3>
+ <table>
+ <tr>
+ <th>Function</th>
+ <th>Range</th>
+ <th>Best Use</th>
+ <th>Issue</th>
+ </tr>
+ <tr>
+ <td>ReLU</td>
+ <td>[0, ∞)</td>
+ <td>Hidden layers (default)</td>
+ <td>Dying ReLU problem</td>
+ </tr>
+ <tr>
+ <td>Sigmoid</td>
+ <td>(0, 1)</td>
+ <td>Binary classification output</td>
+ <td>Vanishing gradients</td>
+ </tr>
+ <tr>
+ <td>Tanh</td>
+ <td>(-1, 1)</td>
+ <td>RNNs, zero-centered</td>
+ <td>Vanishing gradients</td>
+ </tr>
+ <tr>
+ <td>Leaky ReLU</td>
+ <td>(-∞, ∞)</td>
+ <td>Fixes dying ReLU</td>
+ <td>Extra hyperparameter</td>
+ </tr>
+ <tr>
+ <td>Softmax</td>
+ <td>(0, 1), sum = 1</td>
+ <td>Multi-class output</td>
+ <td>Computationally expensive</td>
+ </tr>
+ </table>
+ `,
+ concepts: `
+ <h3>Key Properties</h3>
+ <div class="list-item">
+ <div class="list-num">01</div>
+ <div><strong>Differentiability:</strong> Must have derivatives for backpropagation to work</div>
+ </div>
+ <div class="list-item">
+ <div class="list-num">02</div>
+ <div><strong>Monotonicity:</strong> Preferably monotonic for easier optimization</div>
+ </div>
+ <div class="list-item">
+ <div class="list-num">03</div>
+ <div><strong>Zero-Centered:</strong> Helps with faster convergence (Tanh)</div>
+ </div>
+ <div class="list-item">
+ <div class="list-num">04</div>
+ <div><strong>Computational Efficiency:</strong> Should be fast to compute (ReLU wins)</div>
+ </div>
+
+ <div class="callout tip">
+ <div class="callout-title">💡 Best Practices</div>
+ • Use <strong>ReLU</strong> for hidden layers by default<br>
+ • Use <strong>Sigmoid</strong> for binary classification output<br>
+ • Use <strong>Softmax</strong> for multi-class classification<br>
+ • Try <strong>Leaky ReLU</strong> or <strong>ELU</strong> if ReLU neurons are dying<br>
+ • Avoid Sigmoid/Tanh in deep networks (gradient vanishing)
+ </div>
+ `
+ },
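The ranges in the comparison table above are easy to verify numerically; this sketch implements the listed functions in plain Python (Tanh comes from the standard library):

```python
import math

def relu(x):
    return max(0.0, x)                 # range [0, inf)

def leaky_relu(x, alpha=0.01):
    return x if x > 0 else alpha * x   # small negative slope avoids dying ReLU

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))  # range (0, 1)

def softmax(xs):
    m = max(xs)                        # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]   # each in (0, 1), summing to 1

probs = softmax([2.0, 1.0, 0.1])
```

Note how softmax returns a full probability distribution, which is why it fits multi-class output layers.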
+ "conv-layer": {
+ overview: `
+ <h3>What are Convolutional Layers?</h3>
+ <p>Convolutional layers are the fundamental building blocks of CNNs. They apply learnable filters (kernels) across input data to detect local patterns like edges, textures, and shapes.</p>
+
+ <h3>Why Use Convolutions Instead of Fully Connected Layers?</h3>
+ <ul>
+ <li><strong>Parameter Efficiency:</strong> Share weights across spatial locations (fewer parameters)</li>
+ <li><strong>Translation Invariance:</strong> Detect features regardless of position</li>
+ <li><strong>Local Connectivity:</strong> Each neuron sees only a small region (receptive field)</li>
+ <li><strong>Hierarchical Learning:</strong> Build complex features from simple ones</li>
+ </ul>
+
+ <div class="callout insight">
+ <div class="callout-title">🔍 Example: Parameter Comparison</div>
+ For a 224×224 RGB image:<br>
+ • <strong>Fully Connected:</strong> 224 × 224 × 3 × 1000 = 150M parameters (for 1000 neurons)<br>
+ • <strong>Convolutional (3×3):</strong> 3 × 3 × 3 × 64 = 1,728 parameters (for 64 filters)<br>
+ <strong>Result:</strong> ~87,000× fewer parameters! 🚀
+ </div>
+
+ <div class="callout tip">
+ <div class="callout-title">✅ Advantages</div>
+ • Drastically reduced parameters<br>
+ • Spatial hierarchy (edges → textures → parts → objects)<br>
+ • GPU-friendly (highly parallelizable)<br>
+ • Built-in translation equivariance
+ </div>
+
+ <div class="callout warning">
+ <div class="callout-title">⚠️ Disadvantages</div>
+ • Not rotation invariant (requires data augmentation)<br>
+ • Fixed receptive field size<br>
+ • Memory intensive during training<br>
+ • Requires careful hyperparameter tuning (kernel size, stride, padding)
+ </div>
+ `,
+ concepts: `
+ <h3>Key Hyperparameters</h3>
+ <div class="list-item">
+ <div class="list-num">01</div>
+ <div><strong>Kernel/Filter Size:</strong> Typically 3×3 or 5×5. Smaller = more layers needed, larger = more parameters</div>
+ </div>
+ <div class="list-item">
+ <div class="list-num">02</div>
+ <div><strong>Stride:</strong> Step size when sliding the filter. Stride=1 preserves size, stride=2 downsamples by 2×</div>
+ </div>
+ <div class="list-item">
+ <div class="list-num">03</div>
+ <div><strong>Padding:</strong> Add zeros around borders. 'SAME' keeps size, 'VALID' shrinks output</div>
+ </div>
+ <div class="list-item">
+ <div class="list-num">04</div>
+ <div><strong>Number of Filters:</strong> Each filter learns different features. More filters = more capacity but slower</div>
+ </div>
+ <div class="list-item">
+ <div class="list-num">05</div>
+ <div><strong>Dilation:</strong> Spacing between kernel elements. Increases the receptive field without adding parameters</div>
+ </div>
+
+ <div class="formula">
+ Output Size Formula:<br>
+ W_out = floor((W_in + 2×padding - kernel_size) / stride) + 1<br>
+ H_out = floor((H_in + 2×padding - kernel_size) / stride) + 1
+ </div>
+ `
+ },
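The output-size formula in the block above can be checked directly; this helper just transcribes it, with integer division playing the role of floor:

```python
def conv_out(size, kernel, stride=1, padding=0):
    # W_out = floor((W_in + 2*padding - kernel_size) / stride) + 1
    return (size + 2 * padding - kernel) // stride + 1

# a 3x3 kernel with padding 1 and stride 1 preserves a 224-pixel side ('SAME'-style)
same = conv_out(224, kernel=3, stride=1, padding=1)
# the same kernel with stride 2 halves the spatial size
halved = conv_out(224, kernel=3, stride=2, padding=1)
```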
+ "yolo": {
+ overview: `
+ <h3>What is YOLO?</h3>
+ <p>YOLO (You Only Look Once) treats object detection as a single regression problem, going directly from image pixels to bounding box coordinates and class probabilities in one forward pass.</p>
+
+ <h3>Why YOLO Over R-CNN?</h3>
+ <ul>
+ <li><strong>Speed:</strong> 45+ FPS (real-time) vs R-CNN's ~0.05 FPS</li>
+ <li><strong>Global Context:</strong> Sees the entire image during training (fewer background errors)</li>
+ <li><strong>One Network:</strong> Unlike R-CNN's multi-stage pipeline</li>
+ <li><strong>End-to-End Training:</strong> Optimize detection directly</li>
+ </ul>
+
+ <div class="callout tip">
+ <div class="callout-title">✅ Advantages</div>
+ • <strong>Lightning Fast:</strong> Real-time inference (YOLOv8 at 100+ FPS)<br>
+ • <strong>Simple Architecture:</strong> Single network, easy to train<br>
+ • <strong>Generalizes Well:</strong> Works on natural images and artwork<br>
+ • <strong>Small Model Size:</strong> Can run on edge devices (mobile, IoT)
+ </div>
+
+ <div class="callout warning">
+ <div class="callout-title">⚠️ Disadvantages</div>
+ • <strong>Struggles with Small Objects:</strong> Grid limitation affects tiny items<br>
+ • <strong>Localization Errors:</strong> Less precise than two-stage detectors<br>
+ • <strong>Limited Objects per Cell:</strong> Can't detect many close objects<br>
+ • <strong>Aspect Ratio Issues:</strong> Struggles with unusual object shapes
+ </div>
+
+ <h3>YOLO Evolution</h3>
+ <table>
+ <tr>
+ <th>Version</th>
+ <th>Year</th>
+ <th>Key Innovation</th>
+ <th>mAP</th>
+ </tr>
+ <tr>
+ <td>YOLOv1</td>
+ <td>2015</td>
+ <td>Original single-shot detector</td>
+ <td>63.4%</td>
+ </tr>
+ <tr>
+ <td>YOLOv3</td>
+ <td>2018</td>
+ <td>Multi-scale predictions</td>
+ <td>57.9% (faster)</td>
+ </tr>
+ <tr>
+ <td>YOLOv5</td>
+ <td>2020</td>
+ <td>PyTorch, Auto-augment</td>
+ <td>~50% (optimized)</td>
+ </tr>
+ <tr>
+ <td>YOLOv8</td>
+ <td>2023</td>
+ <td>Anchor-free, SOTA speed</td>
+ <td>53.9% (real-time)</td>
+ </tr>
+ </table>
+ `,
+ concepts: `
+ <h3>How YOLO Works (3 Steps)</h3>
+ <div class="list-item">
+ <div class="list-num">01</div>
+ <div><strong>Grid Division:</strong> Divide the image into an S×S grid (e.g., 7×7). Each cell predicts B bounding boxes</div>
+ </div>
+ <div class="list-item">
+ <div class="list-num">02</div>
+ <div><strong>Predictions Per Cell:</strong> Each box predicts (x, y, w, h, confidence) plus class probabilities</div>
+ </div>
+ <div class="list-item">
+ <div class="list-num">03</div>
+ <div><strong>Non-Max Suppression:</strong> Remove duplicate detections, keep the highest-confidence boxes</div>
+ </div>
+
+ <div class="formula">
+ Output Tensor Shape (YOLOv1):<br>
+ S × S × (B × 5 + C)<br>
+ Example: 7 × 7 × (2 × 5 + 20) = 7 × 7 × 30<br>
+ <br>
+ Where:<br>
+ • S = grid size (7)<br>
+ • B = boxes per cell (2)<br>
+ • 5 = (x, y, w, h, confidence)<br>
+ • C = number of classes (20 for PASCAL VOC)
+ </div>
+ `,
+ applications: `
+ <h3>Industry Applications</h3>
+ <div class="info-box">
+ <div class="box-title">🚗 Autonomous Vehicles</div>
+ <div class="box-content">
+ Real-time detection of pedestrians, vehicles, traffic signs, and lane markings for self-driving cars
+ </div>
+ </div>
+ <div class="info-box">
+ <div class="box-title">🏭 Manufacturing</div>
+ <div class="box-content">
+ Quality control, defect detection on assembly lines, robot guidance, inventory management
+ </div>
+ </div>
+ <div class="info-box">
+ <div class="box-title">🛡️ Security & Surveillance</div>
+ <div class="box-content">
+ Intrusion detection, crowd monitoring, suspicious behavior analysis, license plate recognition
+ </div>
+ </div>
+ <div class="info-box">
+ <div class="box-title">🏥 Medical Imaging</div>
+ <div class="box-content">
+ Tumor localization, cell counting, anatomical structure detection in X-rays/CT scans
+ </div>
+ </div>
+ `
+ },
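The YOLOv1 output-tensor formula above is a one-liner to verify:

```python
def yolo_output_shape(S, B, C):
    # each of the S*S grid cells predicts B boxes of (x, y, w, h, confidence)
    # plus C class probabilities shared by the cell
    return (S, S, B * 5 + C)

shape = yolo_output_shape(S=7, B=2, C=20)  # YOLOv1 on PASCAL VOC
```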
+ "transformers": {
+ overview: `
+ <h3>What are Transformers?</h3>
+ <p>Transformers are neural architectures based entirely on attention mechanisms, eliminating recurrence and convolutions. Introduced in "Attention is All You Need" (2017), they revolutionized NLP and are now conquering computer vision.</p>
+
+ <h3>Why Transformers Over RNNs/LSTMs?</h3>
+ <ul>
+ <li><strong>Parallelization:</strong> Process the entire sequence at once (vs sequential RNNs)</li>
+ <li><strong>Long-Range Dependencies:</strong> Direct connections between any two positions</li>
+ <li><strong>No Gradient Vanishing:</strong> Skip connections and attention bypass depth issues</li>
+ <li><strong>Scalability:</strong> Performance improves with more data and compute</li>
+ </ul>
+
+ <div class="callout tip">
+ <div class="callout-title">✅ Advantages</div>
+ • <strong>Superior Performance:</strong> SOTA on nearly all NLP benchmarks<br>
+ • <strong>Highly Parallelizable:</strong> Train 100× faster than RNNs on TPUs/GPUs<br>
+ • <strong>Transfer Learning:</strong> Pre-train once, fine-tune for many tasks<br>
+ • <strong>Interpretability:</strong> Attention weights show what the model focuses on<br>
+ • <strong>Multi-Modal:</strong> Works for text, images, audio, video
+ </div>
+
+ <div class="callout warning">
+ <div class="callout-title">⚠️ Disadvantages</div>
+ • <strong>Quadratic Complexity:</strong> O(n²) in sequence length (memory intensive)<br>
+ • <strong>Massive Data Requirements:</strong> Need millions of examples to train from scratch<br>
+ • <strong>Computational Cost:</strong> Training GPT-3 cost ~$4.6M<br>
+ • <strong>Position Encoding:</strong> Requires explicit positional information<br>
+ • <strong>Limited Context:</strong> Most models cap at 512-4096 tokens
+ </div>
+
+ <h3>Transformer Variants</h3>
+ <table>
+ <tr>
+ <th>Model</th>
+ <th>Type</th>
+ <th>Architecture</th>
+ <th>Best For</th>
+ </tr>
+ <tr>
+ <td>BERT</td>
+ <td>Encoder-only</td>
+ <td>Bidirectional</td>
+ <td>Understanding (classification, QA)</td>
+ </tr>
+ <tr>
+ <td>GPT</td>
+ <td>Decoder-only</td>
+ <td>Autoregressive</td>
+ <td>Generation (text, code)</td>
+ </tr>
+ <tr>
+ <td>T5</td>
+ <td>Encoder-Decoder</td>
+ <td>Full Transformer</td>
+ <td>Text-to-text tasks (translation)</td>
+ </tr>
+ <tr>
+ <td>ViT</td>
+ <td>Encoder-only</td>
+ <td>Patch embeddings</td>
+ <td>Image classification</td>
+ </tr>
+ </table>
+ `,
+ concepts: `
+ <h3>Core Components</h3>
+ <div class="list-item">
+ <div class="list-num">01</div>
+ <div><strong>Self-Attention:</strong> Each token attends to all other tokens, learning contextual relationships</div>
+ </div>
+ <div class="list-item">
+ <div class="list-num">02</div>
+ <div><strong>Multi-Head Attention:</strong> Multiple attention mechanisms in parallel (8-16 heads), each learning different patterns</div>
+ </div>
+ <div class="list-item">
+ <div class="list-num">03</div>
+ <div><strong>Positional Encoding:</strong> Add position information since attention is permutation-invariant</div>
+ </div>
+ <div class="list-item">
+ <div class="list-num">04</div>
+ <div><strong>Feed-Forward Networks:</strong> Two-layer MLPs applied to each position independently</div>
+ </div>
+ <div class="list-item">
+ <div class="list-num">05</div>
+ <div><strong>Layer Normalization:</strong> Stabilizes training, applied before attention and FFN</div>
+ </div>
+ <div class="list-item">
+ <div class="list-num">06</div>
+ <div><strong>Residual Connections:</strong> Skip connections around each sub-layer for gradient flow</div>
+ </div>
+
+ <div class="formula">
+ Self-Attention Formula:<br>
+ Attention(Q, K, V) = softmax(QK<sup>T</sup> / √d<sub>k</sub>) V<br>
+ <br>
+ Where:<br>
+ • Q = Queries (what we're looking for)<br>
+ • K = Keys (what each token represents)<br>
+ • V = Values (actual information to aggregate)<br>
+ • d<sub>k</sub> = dimension of keys (for scaling)<br>
+ <br>
+ Multi-Head Attention:<br>
+ MultiHead(Q,K,V) = Concat(head₁,...,head<sub>h</sub>)W<sup>O</sup><br>
+ where head<sub>i</sub> = Attention(QW<sub>i</sub><sup>Q</sup>, KW<sub>i</sub><sup>K</sup>, VW<sub>i</sub><sup>V</sup>)
+ </div>
+ `,
+ applications: `
+ <h3>Revolutionary Applications</h3>
+ <div class="info-box">
+ <div class="box-title">💬 Large Language Models</div>
+ <div class="box-content">
+ <strong>ChatGPT, GPT-4, Claude:</strong> Conversational AI, code generation, creative writing, reasoning<br>
+ <strong>BERT, RoBERTa:</strong> Search engines (Google), question answering, sentiment analysis
+ </div>
+ </div>
+ <div class="info-box">
+ <div class="box-title">🌐 Machine Translation</div>
+ <div class="box-content">
+ <strong>Google Translate, DeepL:</strong> Transformers achieved human-level translation quality<br>
+ Supports 100+ languages, real-time translation
+ </div>
+ </div>
+ <div class="info-box">
+ <div class="box-title">🎨 Multi-Modal AI</div>
+ <div class="box-content">
+ <strong>DALL-E, Midjourney:</strong> Text-to-image generation<br>
+ <strong>CLIP:</strong> Image-text understanding<br>
+ <strong>Whisper:</strong> Speech recognition
+ </div>
+ </div>
+ <div class="info-box">
+ <div class="box-title">🧬 Scientific Discovery</div>
+ <div class="box-content">
+ <strong>AlphaFold:</strong> Protein structure prediction (Nobel Prize-worthy breakthrough)<br>
+ <strong>Drug Discovery:</strong> Molecule generation and property prediction
+ </div>
+ </div>
+ <div class="info-box">
+ <div class="box-title">💻 Code Intelligence</div>
+ <div class="box-content">
+ <strong>GitHub Copilot:</strong> AI pair programmer<br>
+ <strong>CodeGen, AlphaCode:</strong> Automated coding, bug detection
+ </div>
+ </div>
+ `
+ }
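The self-attention formula above can be exercised on tiny matrices without any framework; this pure-Python sketch covers a single head with no batching, and the Q/K/V values are made up for illustration:

```python
import math

def softmax(xs):
    m = max(xs)                               # stabilize before exponentiating
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V for lists of row vectors (one head)."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # scaled dot products of this query against every key
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)             # attention weights sum to 1
        # output row = attention-weighted average of the value rows
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

O = attention(Q=[[1.0, 0.0], [0.0, 1.0]],
              K=[[1.0, 0.0], [0.0, 1.0]],
              V=[[1.0, 2.0], [3.0, 4.0]])
```

Because each row of attention weights is a probability distribution, every output row is a convex combination of the value rows.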
+ };
+
  function createModuleHTML(module) {
+ const content = MODULE_CONTENT[module.id] || {};
+
  return `
  <div class="module" id="${module.id}-module">
  <button class="btn-back" onclick="switchTo('dashboard')">← Back to Dashboard</button>
 
  <div id="${module.id}-overview" class="tab active">
  <div class="section">
  <h2>📖 Overview</h2>
+ ${content.overview || `
+ <p>Complete coverage of ${module.title.toLowerCase()}. Learn the fundamentals, mathematics, real-world applications, and implementation details.</p>
+ <div class="info-box">
+ <div class="box-title">Learning Objectives</div>
+ <div class="box-content">
+ ✓ Understand core concepts and theory<br>
+ ✓ Master mathematical foundations<br>
+ ✓ Learn practical applications<br>
+ ✓ Implement and experiment
+ </div>
  </div>
+ `}
  </div>
  </div>
 
  <div id="${module.id}-concepts" class="tab">
  <div class="section">
  <h2>🎯 Key Concepts</h2>
+ ${content.concepts || `
+ <p>Fundamental concepts and building blocks for ${module.title.toLowerCase()}.</p>
+ <div class="callout insight">
+ <div class="callout-title">💡 Main Ideas</div>
+ This section covers the core ideas you need to understand before diving into mathematics.
+ </div>
+ `}
  </div>
  </div>
 
  <div id="${module.id}-applications" class="tab">
  <div class="section">
  <h2>🌍 Real-World Applications</h2>
+ ${content.applications || `
+ <p>How ${module.title.toLowerCase()} is used in practice across different industries.</p>
+ <div class="info-box">
+ <div class="box-title">Use Cases</div>
+ <div class="box-content">
+ Common applications and practical examples
+ </div>
  </div>
+ `}
  </div>
  <div class="section">
  <h2>📊 Application Scenarios Visualization</h2>