AZIIIIIIIIZ committed on
Commit
0ddf827
·
verified ·
1 Parent(s): d670799

Update README.md

Files changed (1)
  1. README.md +91 -91
README.md CHANGED
@@ -1,92 +1,92 @@
- ---
- title: GenVidBench - Video Action Recognition
- emoji: 🎬
- colorFrom: blue
- colorTo: purple
- sdk: gradio
- sdk_version: 4.0.0
- app_file: app.py
- pinned: false
- license: apache-2.0
- short_description: State-of-the-art video action recognition using MMAction2
- ---
-
- # GenVidBench - Video Action Recognition
-
- A powerful video analysis tool that uses state-of-the-art deep learning models to recognize actions and activities in videos. Built on top of MMAction2 framework with a user-friendly Gradio interface.
-
- ## 🚀 Features
-
- - **Action Recognition**: Identify actions and activities in videos using TSN (Temporal Segment Networks)
- - **Top-5 Predictions**: Get the most likely actions with confidence scores
- - **Multiple Formats**: Support for MP4, AVI, MOV, and other video formats
- - **Real-time Processing**: Fast inference optimized for web deployment
- - **User-friendly Interface**: Clean and intuitive Gradio web interface
-
- ## 🎯 Model Details
-
- This demo uses:
- - **Model**: TSN (Temporal Segment Networks) with ResNet-50 backbone
- - **Dataset**: Trained on Kinetics-400 dataset (400 action classes)
- - **Framework**: MMAction2 (OpenMMLab)
- - **Input**: RGB video frames
- - **Output**: Top-5 action predictions with confidence scores
-
- ## 🛠️ Technical Stack
-
- - **Backend**: Python, PyTorch, MMAction2
- - **Frontend**: Gradio
- - **Video Processing**: OpenCV, Decord
- - **Deployment**: Hugging Face Spaces
-
- ## 📖 How to Use
-
- 1. **Upload Video**: Click the upload area or drag and drop your video file
- 2. **Wait for Processing**: The model will analyze your video (usually takes a few seconds)
- 3. **View Results**: See the top 5 predicted actions with confidence scores
-
- ## 💡 Tips for Best Results
-
- - **Video Length**: Shorter videos (under 30 seconds) process faster
- - **Video Quality**: Clear, well-lit videos work best
- - **Action Clarity**: Videos with clear, distinct actions yield better results
- - **Supported Formats**: MP4, AVI, MOV, and other common video formats
-
- ## 🔬 Supported Actions
-
- The model can recognize 400 different action classes from the Kinetics-400 dataset, including:
- - Sports activities (basketball, soccer, tennis, etc.)
- - Daily activities (cooking, cleaning, reading, etc.)
- - Physical exercises (push-ups, jumping jacks, etc.)
- - Musical activities (playing instruments, singing, etc.)
- - And many more!
-
- ## 🏗️ Architecture
-
- ```
- Video Input → Frame Sampling → Feature Extraction → Classification → Top-5 Predictions
- ```
-
- ## 📊 Performance
-
- - **Accuracy**: State-of-the-art performance on Kinetics-400
- - **Speed**: Optimized for real-time inference
- - **Memory**: Efficient GPU/CPU utilization
-
- ## 🤝 Contributing
-
- This project is part of the GenVidBench framework. Contributions are welcome!
-
- ## 📄 License
-
- This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
-
- ## 🙏 Acknowledgments
-
- - [MMAction2](https://github.com/open-mmlab/mmaction2) - The underlying framework
- - [OpenMMLab](https://openmmlab.com/) - For the excellent computer vision tools
- - [Hugging Face](https://huggingface.co/) - For the deployment platform
-
- ---
-
  **Note**: This is a demonstration of video action recognition capabilities. For production use, consider additional validation and error handling.
 
+ ---
+ title: GenVidBench - Video Action Recognition
+ emoji: 🎬
+ colorFrom: blue
+ colorTo: purple
+ sdk: gradio
+ sdk_version: 5.47.2
+ app_file: app.py
+ pinned: false
+ license: apache-2.0
+ short_description: State-of-the-art video action recognition using MMAction2
+ ---
+
+ # GenVidBench - Video Action Recognition
+
+ A powerful video analysis tool that uses state-of-the-art deep learning models to recognize actions and activities in videos. Built on top of MMAction2 framework with a user-friendly Gradio interface.
+
+ ## 🚀 Features
+
+ - **Action Recognition**: Identify actions and activities in videos using TSN (Temporal Segment Networks)
+ - **Top-5 Predictions**: Get the most likely actions with confidence scores
+ - **Multiple Formats**: Support for MP4, AVI, MOV, and other video formats
+ - **Real-time Processing**: Fast inference optimized for web deployment
+ - **User-friendly Interface**: Clean and intuitive Gradio web interface
+
+ ## 🎯 Model Details
+
+ This demo uses:
+ - **Model**: TSN (Temporal Segment Networks) with ResNet-50 backbone
+ - **Dataset**: Trained on Kinetics-400 dataset (400 action classes)
+ - **Framework**: MMAction2 (OpenMMLab)
+ - **Input**: RGB video frames
+ - **Output**: Top-5 action predictions with confidence scores
+
+ ## 🛠️ Technical Stack
+
+ - **Backend**: Python, PyTorch, MMAction2
+ - **Frontend**: Gradio
+ - **Video Processing**: OpenCV, Decord
+ - **Deployment**: Hugging Face Spaces
+
+ ## 📖 How to Use
+
+ 1. **Upload Video**: Click the upload area or drag and drop your video file
+ 2. **Wait for Processing**: The model will analyze your video (usually takes a few seconds)
+ 3. **View Results**: See the top 5 predicted actions with confidence scores
+
+ ## 💡 Tips for Best Results
+
+ - **Video Length**: Shorter videos (under 30 seconds) process faster
+ - **Video Quality**: Clear, well-lit videos work best
+ - **Action Clarity**: Videos with clear, distinct actions yield better results
+ - **Supported Formats**: MP4, AVI, MOV, and other common video formats
+
+ ## 🔬 Supported Actions
+
+ The model can recognize 400 different action classes from the Kinetics-400 dataset, including:
+ - Sports activities (basketball, soccer, tennis, etc.)
+ - Daily activities (cooking, cleaning, reading, etc.)
+ - Physical exercises (push-ups, jumping jacks, etc.)
+ - Musical activities (playing instruments, singing, etc.)
+ - And many more!
+
+ ## 🏗️ Architecture
+
+ ```
+ Video Input → Frame Sampling → Feature Extraction → Classification → Top-5 Predictions
+ ```
+
+ ## 📊 Performance
+
+ - **Accuracy**: State-of-the-art performance on Kinetics-400
+ - **Speed**: Optimized for real-time inference
+ - **Memory**: Efficient GPU/CPU utilization
+
+ ## 🤝 Contributing
+
+ This project is part of the GenVidBench framework. Contributions are welcome!
+
+ ## 📄 License
+
+ This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
+
+ ## 🙏 Acknowledgments
+
+ - [MMAction2](https://github.com/open-mmlab/mmaction2) - The underlying framework
+ - [OpenMMLab](https://openmmlab.com/) - For the excellent computer vision tools
+ - [Hugging Face](https://huggingface.co/) - For the deployment platform
+
+ ---
+
  **Note**: This is a demonstration of video action recognition capabilities. For production use, consider additional validation and error handling.
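
Reviewer note: the README's Architecture block (Video Input → Frame Sampling → Feature Extraction → Classification → Top-5 Predictions) can be illustrated with a small, dependency-free sketch. This is not the Space's `app.py` and it does not call MMAction2; the function names `sample_segment_indices`, `softmax`, and `top_k_predictions` are hypothetical. It assumes TSN-style sampling that takes the frame nearest the centre of each equal-length segment, and it assumes per-segment class scores are averaged before softmax, which is one common consensus choice. Feature extraction (the ResNet-50 backbone) is stubbed out: the caller supplies per-segment scores directly.

```python
import math
from typing import List, Sequence, Tuple


def sample_segment_indices(num_frames: int, num_segments: int) -> List[int]:
    """Return one frame index near the centre of each equal-length segment."""
    if num_frames <= 0 or num_segments <= 0:
        raise ValueError("num_frames and num_segments must be positive")
    span = num_frames / num_segments
    # Centre of segment i is (i + 0.5) * span; clamp to the last valid index.
    return [min(int((i + 0.5) * span), num_frames - 1) for i in range(num_segments)]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a flat list of class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_predictions(
    per_segment_scores: Sequence[Sequence[float]],
    labels: Sequence[str],
    k: int = 5,
) -> List[Tuple[str, float]]:
    """Average class scores over segments, apply softmax, return the k best."""
    if not per_segment_scores:
        raise ValueError("need scores for at least one segment")
    num_segments = len(per_segment_scores)
    # Assumed consensus step: mean of the raw per-segment scores for each class.
    mean_scores = [
        sum(segment[c] for segment in per_segment_scores) / num_segments
        for c in range(len(labels))
    ]
    probabilities = softmax(mean_scores)
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical 100-frame clip split into 4 segments.
    print(sample_segment_indices(100, 4))
    # Made-up scores for three made-up classes across two segments.
    demo_scores = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
    print(top_k_predictions(demo_scores, ["reading", "cooking", "tennis"], k=2))
```

In the real Space, the per-segment scores would come from the model, and `labels` would be the 400 Kinetics-400 class names mentioned in the README.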