camenduru
/

ncnn

Model card Files Files and versions

ncnn / docs /developer-guide /custom-allocator.md

camenduru's picture

thanks to ncnn ❤

be903e2 over 2 years ago

|

history blame contribute delete

2.13 kB

	Mat structure is now allocator-aware via an extra allocator parameter with default zero value.

	The good-old ncnn::fastMalloc()/ncnn::fastFree() will be used for a null allocator.

	You could pass a custom allocator to delegate all memory allocation and deallocation.

	```cpp
	class Allocator
	{
	public:
	virtual void* fastMalloc(size_t size) = 0;
	virtual void fastFree(void* ptr) = 0;
	};
	```

	ncnn has already implemented two simple pooled Allocator class, with mutex lock or without it.

	```cpp
	ncnn::PoolAllocator locked_mempool;
	ncnn::UnlockedPoolAllocator unlocked_mempool;
	```

	the two allocator types in ncnn

	* blob allocator

	used to allocate memory for all named blobs, which you could retrieve by Extractor::extract()
	* workspace allocator

	used to allocate memory for internal temporary use in layer implementation, such as the temp blob after padding in convolution

	by default, all Extractor instance use the two allocator in the default option
	You can alter them by ncnn::set_default_option()
	or you can set them per Extractor by Extractor::set_blob_allocator()/Extractor::set_workspace_allocator()

	blob allocator is guaranteed to be called in-order in layer implementation during each Extractor lifecycle
	while workspace allocator may be called synchronously

	the practical usage

	* one network, one-by-one inference

	shared unlocked blob allocator for all Extractor

	shared locked workspace allocator for all Extractor

	* one network, concurrent inference

	shared unlocked blob allocator for all Extractor in each thread

	shared locked workspace allocator for all Extractor among all threads

	* concurrent multiple networks, one-by-one inference for each network

	shared unlocked blob allocator for all Extractor of each network

	shared locked workspace allocator for all Extractor among all networks (for saving memory)

	* concurrent multiple networks, concurrent inference for each network

	shared unlocked blob allocator for all Extractor of each network in each thread

	shared locked workspace allocator for all Extractor among all networks (for saving memory)