
	---
	library_name: litert
	tags:
	- vision
	- object-detection
	- litert
	- tflite
	- torchvision
	datasets:
	- detection-datasets/coco
	---

	# Faster R-CNN ResNet-50 FPN for LiteRT

	This repository contains an inference-only LiteRT packaging of TorchVision
	`fasterrcnn_resnet50_fpn`.

	The detector is split into three TFLite files:

	| File | Role |
	| --- | --- |
	| `fasterrcnn_resnet50_fpn_backbone_body_dynamic_hw.tflite` | ResNet backbone body: transformed image tensor to C2-C5 feature maps |
	| `fasterrcnn_resnet50_fpn_rpn_head_dynamic_hw.tflite` | RPN head: one FPN level to objectness logits and box deltas |
	| `fasterrcnn_resnet50_fpn_roi_box_dynamic_n.tflite` | ROI box head and predictor: pooled ROI features to class logits and box deltas |

	The rest of the detector stays in TorchVision host code, which orchestrates the
	three LiteRT submodels: preprocessing, FPN, anchor/proposal decode, NMS,
	ROIAlign, and final postprocessing.
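To make the "anchor/proposal decode" step concrete, here is a minimal NumPy sketch of how `(dx, dy, dw, dh)` box deltas are typically applied to reference boxes. It is an illustration, not the code from the sample script. The function name `decode_boxes` is hypothetical, and the clamp value and weights are assumptions based on TorchVision's box coder defaults (weights of 1 for the RPN, and reportedly `(10, 10, 5, 5)` for the ROI heads).

```python
import math

import numpy as np

# Assumed clamp on dw/dh, mirroring TorchVision's box coder, so exp() cannot overflow.
BBOX_XFORM_CLIP = math.log(1000.0 / 16)


def decode_boxes(deltas, anchors, weights=(1.0, 1.0, 1.0, 1.0)):
    """Apply (dx, dy, dw, dh) deltas to (x1, y1, x2, y2) reference boxes."""
    deltas = np.asarray(deltas, dtype=np.float32)
    anchors = np.asarray(anchors, dtype=np.float32)
    wx, wy, ww, wh = weights

    # Convert the reference boxes to center/size form.
    widths = anchors[:, 2] - anchors[:, 0]
    heights = anchors[:, 3] - anchors[:, 1]
    ctr_x = anchors[:, 0] + 0.5 * widths
    ctr_y = anchors[:, 1] + 0.5 * heights

    # Undo the regression weights and clamp the log-scale terms.
    dx = deltas[:, 0] / wx
    dy = deltas[:, 1] / wy
    dw = np.minimum(deltas[:, 2] / ww, BBOX_XFORM_CLIP)
    dh = np.minimum(deltas[:, 3] / wh, BBOX_XFORM_CLIP)

    # Shift the center and rescale the size.
    pred_ctr_x = dx * widths + ctr_x
    pred_ctr_y = dy * heights + ctr_y
    pred_w = np.exp(dw) * widths
    pred_h = np.exp(dh) * heights

    # Back to corner form.
    return np.stack(
        [
            pred_ctr_x - 0.5 * pred_w,
            pred_ctr_y - 0.5 * pred_h,
            pred_ctr_x + 0.5 * pred_w,
            pred_ctr_y + 0.5 * pred_h,
        ],
        axis=1,
    )
```

In the RPN stage the reference boxes would be anchors and the deltas would come from the RPN head; in the ROI stage the reference boxes would be the surviving proposals.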

	Source model documentation:
	https://docs.pytorch.org/vision/main/models/generated/torchvision.models.detection.fasterrcnn_resnet50_fpn.html

	## Usage

	Install runtime dependencies:

	```bash
	pip install ai-edge-litert torch torchvision pillow numpy
	```

	Run the sample from this repository directory:

	```bash
	python sample_torchvision_fasterrcnn_litert_cpu.py
	```

	The script defaults to the three `.tflite` files in the same directory as the
	sample. You can override them explicitly:

	```bash
	python sample_torchvision_fasterrcnn_litert_cpu.py \
	    --backbone-model fasterrcnn_resnet50_fpn_backbone_body_dynamic_hw.tflite \
	    --rpn-head-model fasterrcnn_resnet50_fpn_rpn_head_dynamic_hw.tflite \
	    --roi-model fasterrcnn_resnet50_fpn_roi_box_dynamic_n.tflite \
	    --image https://github.com/pytorch/hub/raw/master/images/dog.jpg
	```

	Example output:

	```text
	input image: original=(1213, 1546) transformed=(1, 3, 800, 1024)
	host proposal decode/NMS: 1000 proposals
	ROI LiteRT outputs: logits=(1000, 91) box_regression=(1000, 364)
	detections above 0.50: 1
	01: dog score=0.9669 box=[137.84, 67.8, 1386.9, 1172.82]
	```
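The ROI output shapes in the log can be read as 91 class logits per proposal (COCO categories plus background) and 91 × 4 = 364 per-class box deltas. The sketch below shows one plausible way to turn those two arrays into candidate detections. It assumes class index 0 is background, as in TorchVision, and uses a hypothetical helper name and threshold parameter. The sample script may do this differently, and a full pipeline would still decode the deltas, clip boxes, and run per-class NMS afterwards.

```python
import numpy as np


def split_roi_outputs(logits, box_regression, score_thresh=0.5):
    """Pick (roi, class) pairs whose softmax score exceeds score_thresh."""
    logits = np.asarray(logits, dtype=np.float32)
    num_rois, num_classes = logits.shape

    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)

    # (N, num_classes * 4) -> (N, num_classes, 4): one delta vector per class.
    deltas = np.asarray(box_regression, dtype=np.float32).reshape(
        num_rois, num_classes, 4
    )

    # Assumption: column 0 is the background class and is never reported.
    fg_scores = probs[:, 1:]
    roi_idx, cls_idx = np.nonzero(fg_scores > score_thresh)
    labels = cls_idx + 1

    return roi_idx, labels, fg_scores[roi_idx, cls_idx], deltas[roi_idx, labels]
```

With the shapes from the example output, `logits` would be `(1000, 91)` and `box_regression` would be `(1000, 364)`; the returned deltas have shape `(K, 4)` for the `K` selected pairs.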

	## Notes

	This is not a single-file object detector. The sample script is part of the
	runtime contract and uses TorchVision host modules for the portions that are
	not represented by the three LiteRT submodels.

	The TFLite files use dynamic image height/width where the current CPU LiteRT
	runtime path supports it. The sample runs with `HardwareAccelerator.CPU`.
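Dynamic height and width matter because the transformed tensor size depends on the input image. The sketch below estimates that size under assumed TorchVision defaults for this model (`min_size=800`, `max_size=1333`, padding to a multiple of 32); none of these values are stated in this README, but they reproduce the `transformed=(1, 3, 800, 1024)` line in the example output. The function name `transformed_hw` is hypothetical. Exact rational arithmetic is used to sidestep floating-point rounding; TorchVision itself works in floating point, so edge cases may differ.

```python
import math
from fractions import Fraction


def transformed_hw(height, width, min_size=800, max_size=1333, divisor=32):
    """Estimate the (H, W) of the padded tensor fed to the backbone."""
    # Scale so the short side reaches min_size unless the long side
    # would then exceed max_size.
    scale = min(
        Fraction(min_size, min(height, width)),
        Fraction(max_size, max(height, width)),
    )
    new_h = math.floor(height * scale)
    new_w = math.floor(width * scale)

    # Round each side up to the next multiple of `divisor`.
    def pad(value):
        return -(-value // divisor) * divisor

    return pad(new_h), pad(new_w)


# The dog image from the example output is 1213 pixels high and 1546 wide.
print(transformed_hw(1213, 1546))  # (800, 1024)
```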



	## BibTeX entry and citation info

	```bibtex
	@article{DBLP:journals/corr/RenHG015,
	  author     = {Shaoqing Ren and
	                Kaiming He and
	                Ross B. Girshick and
	                Jian Sun},
	  title      = {Faster {R-CNN:} Towards Real-Time Object Detection with Region Proposal Networks},
	  journal    = {CoRR},
	  volume     = {abs/1506.01497},
	  year       = {2015},
	  url        = {http://arxiv.org/abs/1506.01497},
	  eprinttype = {arXiv},
	  eprint     = {1506.01497},
	  timestamp  = {Mon, 13 Aug 2018 16:46:02 +0200},
	  biburl     = {https://dblp.org/rec/journals/corr/RenHG015.bib},
	  bibsource  = {dblp computer science bibliography, https://dblp.org}
	}
	```