arxiv:2405.03978

VMambaCC: A Visual State Space Model for Crowd Counting

Published on May 7, 2024

Abstract

Visual Mamba is extended to crowd counting through a novel VMambaCC model that incorporates MHF attention and HS2PFN to improve feature representation and achieve competitive performance on public datasets.

Generated by Qwen/Qwen2.5-Coder-32B-Instruct

Visual Mamba (VMamba) is a deep learning model with low computational complexity and a global receptive field, and it has been successfully applied to image classification and detection. To extend its applications, we apply VMamba to crowd counting and propose a novel VMambaCC (VMamba Crowd Counting) model. VMambaCC inherits the merits of VMamba, namely global modeling of images and low computational cost. Additionally, we design a Multi-head High-level Feature (MHF) attention mechanism for VMambaCC. MHF is a new attention mechanism that uses high-level semantic features to augment low-level semantic features, thereby making the spatial feature representation more precise. Building upon MHF, we further present a High-level Semantic Supervised Feature Pyramid Network (HS2PFN) that progressively integrates high-level semantic information with low-level semantic information and enhances both. Extensive experimental results on five public datasets validate the efficacy of our approach. For example, our method achieves a mean absolute error of 51.87 and a mean squared error of 81.3 on the ShangHaiTech_PartA dataset. Our code will be released soon.
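For readers unfamiliar with how the two reported metrics are obtained, the sketch below computes them from per-image predicted and ground-truth head counts. It assumes the convention common in crowd-counting work, where the figure reported as "MSE" is the square root of the mean squared count error; the abstract does not say which definition it uses. The function names and sample counts are illustrative only and are not taken from the paper.

```python
import math


def mean_absolute_error(pred_counts, true_counts):
    """Average absolute difference between predicted and true per-image counts."""
    if len(pred_counts) != len(true_counts) or not pred_counts:
        raise ValueError("count lists must be non-empty and of equal length")
    total = sum(abs(p - t) for p, t in zip(pred_counts, true_counts))
    return total / len(pred_counts)


def root_mean_squared_error(pred_counts, true_counts):
    """Square root of the mean squared count error.

    Assumption: this is the quantity often labelled "MSE" in crowd-counting
    results; it is not confirmed by the abstract.
    """
    if len(pred_counts) != len(true_counts) or not pred_counts:
        raise ValueError("count lists must be non-empty and of equal length")
    total = sum((p - t) ** 2 for p, t in zip(pred_counts, true_counts))
    return math.sqrt(total / len(pred_counts))


# Hypothetical counts for three test images (not real results).
predicted = [10.0, 20.0, 30.0]
ground_truth = [12.0, 18.0, 33.0]

mae = mean_absolute_error(predicted, ground_truth)
rmse = root_mean_squared_error(predicted, ground_truth)
print(f"MAE = {mae:.3f}, RMSE = {rmse:.3f}")
```

Both metrics operate on whole-image counts, so a model that places density in the wrong region of an image can still score well as long as the total is close to the true count.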
