

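# How do I join the technical discussion QQ groups?

- Open QQ → tap group search → search for group number 637093648 → enter the answer to the entry question: 卷卷卷卷卷 → join the group → get ready to take a Turing test (just kidding)
- In QQ, search for the Pocky group: 677104663 (lots of experts). Answer to the entry question: multi level intermediate representation

# How do I watch the author's bilibili livestream?

- nihui's bilibili live room: [水竹院落](https://live.bilibili.com/1264617)

# Building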

- ## How do I download the complete source code?

   git clone --recursive https://github.com/Tencent/ncnn/
   
   or
   
   download [ncnn-xxxxx-full-source.zip](https://github.com/Tencent/ncnn/releases)

- ## How do I cross-compile? How do I set up the cmake toolchain?
  
   See https://github.com/Tencent/ncnn/wiki/how-to-build

- ## The submodules were not downloaded! Please update submodules with "git submodule update --init" and try again

   Download the complete source code as described above, or run the command from the message: git submodule update --init

- ## Could NOT find Protobuf (missing: Protobuf_INCLUDE_DIR)
  
   sudo apt-get install libprotobuf-dev protobuf-compiler

- ## Could NOT find CUDA (missing: CUDA_TOOLKIT_ROOT_DIR CUDA_INCLUDE_DIRS CUDA_CUDART_LIBRARY)

   https://github.com/Tencent/ncnn/issues/1873

- ## Could not find a package configuration file provided by "OpenCV" with any of the following names: OpenCVConfig.cmake opencv-config.cmake

   sudo apt-get install libopencv-dev

   Or build and install OpenCV yourself, then set(OpenCV_DIR {directory containing OpenCVConfig.cmake})

- ## Could not find a package configuration file provided by "ncnn" with any of the following names: ncnnConfig.cmake ncnn-config.cmake

   set(ncnn_DIR {directory containing ncnnConfig.cmake})

- ## Vulkan not found

   cmake 3.10 or newer is required; older versions do not ship FindVulkan.cmake

   android-api >= 24

   on macos, run the install script first

- ## How do I install the vulkan sdk?

- ## Libraries not found (what to link depends on the system/compiler)

   undefined reference to __kmpc_for_static_init_4 __kmpc_for_static_fini __kmpc_fork_call ...

   You need to link the openmp library

   undefined reference to vkEnumerateInstanceExtensionProperties vkGetInstanceProcAddr vkQueueSubmit ...

   You need vulkan-1.lib

   undefined reference to glslang::InitializeProcess() glslang::TShader::TShader(EShLanguage) ...

   You need glslang.lib OGLCompiler.lib SPIRV.lib OSDependent.lib

   undefined reference to AAssetManager_fromJava AAssetManager_open AAsset_seek ...

   Add android to find_library and target_link_libraries

   find_package(ncnn)

- ## undefined reference to typeinfo for ncnn::Layer

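   opencv is built with rtti -> switch to opencv-mobile

- ## undefined reference to __cpu_model

   Upgrade the compiler / link libgcc_s libgcc

- ## unrecognized command line option "-mavx2"

   Upgrade gcc

- ## Why is the ncnn android library I built myself so large?

   See https://github.com/Tencent/ncnn/wiki/build-for-android.zh and the entry "How do I build a smaller ncnn library?"

- ## ncnnoptimize and custom layers

   Run ncnnoptimize first and add the custom layers afterwards, because ncnnoptimize cannot process and save custom layers.


- ## rtti/exceptions conflict

   The conflict arises because the libraries used in the project are built with different settings. Decide from your own situation whether rtti/exceptions should be enabled or disabled. The ncnn default is ON; add the following 2 options when rebuilding ncnn:
   - Enable: -DNCNN_DISABLE_RTTI=OFF -DNCNN_DISABLE_EXCEPTION=OFF
   - Disable: -DNCNN_DISABLE_RTTI=ON -DNCNN_DISABLE_EXCEPTION=ON


- ## error: undefined symbol: ncnn::Extractor::extract(char const*, ncnn::Mat&)

   Possible fix:
   - Try upgrading the NDK version used by Android Studio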

- ## CMake 3.14.0 or higher is required.  You are running version 2.8.12.2
```shell
wget https://github.com/Kitware/CMake/releases/download/v3.18.2/cmake-3.18.2-Linux-x86_64.tar.gz
tar zxvf cmake-3.18.2-Linux-x86_64.tar.gz
mv cmake-3.18.2-Linux-x86_64 /opt/cmake-3.18.2
ln -sf /opt/cmake-3.18.2/bin/* /usr/bin/
```

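# How do I add the ncnn library to my project? How do I use it with cmake?

Build ncnn and run make install. On linux/windows, set/export ncnn_DIR to the directory under the install directory that contains ncnnConfig.cmake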

- ## android

- ## ios

- ## linux

- ## windows

- ## macos

- ## arm linux


# Model conversion problems

- ## caffe

   `./caffe2ncnn caffe.prototxt caffe.caffemodel ncnn.param ncnn.bin`

- ## mxnet

   ` ./mxnet2ncnn mxnet-symbol.json mxnet.params ncnn.param ncnn.bin`

- ## darknet

   [https://github.com/xiangweizeng/darknet2ncnn](https://github.com/xiangweizeng/darknet2ncnn)

- ## pytorch - onnx

   [use ncnn with pytorch or onnx](https://github.com/Tencent/ncnn/wiki/use-ncnn-with-pytorch-or-onnx)

- ## tensorflow 1.x/2.x - keras

   [https://github.com/MarsTechHAN/keras2ncnn](https://github.com/MarsTechHAN/keras2ncnn) **[@MarsTechHAN](https://github.com/MarsTechHAN)**

- ## tensorflow 2.x - mlir

   [Converting tensorflow2 models to ncnn via MLIR](https://zhuanlan.zhihu.com/p/152535430) **@[nihui](https://www.zhihu.com/people/nihui-2)**

- ## Shape not supported yet! Gather not supported yet! Cast not supported yet!

   Run onnx-simplifier with a static shape

- ## convertmodel

   [https://convertmodel.com/](https://convertmodel.com/) **[@大老师](https://github.com/daquexian)**

- ## netron

   [https://github.com/lutzroeder/netron](https://github.com/lutzroeder/netron)

- ## How do I generate a model with fixed shape information?

   Input      0=w 1=h 2=c
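
To make the shorthand above concrete: the three key=value pairs are parameters on the `Input` layer line of the .param file, with key 0 for width, 1 for height and 2 for channels. The helper below is a minimal sketch that only formats that parameter string. `input_shape_params` is a hypothetical name used for illustration, not an ncnn API.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of ncnn): builds the fixed-shape parameters
// for the Input layer line of a .param file.
// Key 0 = width, key 1 = height, key 2 = channels.
std::string input_shape_params(int w, int h, int c)
{
    assert(w > 0 && h > 0 && c > 0);
    return "0=" + std::to_string(w) + " 1=" + std::to_string(h) + " 2=" + std::to_string(c);
}
```

For a 224x224 input with 3 channels, the returned text is what you would append to the existing Input line of your .param file.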

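- ## Why can the gpu be faster?

- ## How do I convert to an fp16 model with ncnnoptimize?

   `ncnnoptimize model.param model.bin yolov5s-opt.param yolov5s-opt.bin 65536`

- ## How do I check a model's FLOPS / memory usage with ncnnoptimize?

- ## How do I modify a model to support dynamic shape?

   Interp Reshape

- ## How do I convert a model into code embedded in my program?

   ncnn2mem

- ## How do I encrypt a model?

   https://zhuanlan.zhihu.com/p/268327784

- ## Can an ncnn model converted on Linux be used directly on Windows/MacOS/Android/..?

   Yes, it works on all platforms

- ## How do I remove post-processing before exporting onnx?

   Detection:

   See this article by the author: <https://zhuanlan.zhihu.com/p/128974102>. Step three there removes the post-processing and then exports onnx. "Removing post-processing" can simply mean taking the result you get after dropping the later steps when testing inside the project.

- ## What if some pytorch layers cannot be exported to onnx?

 Option 1:

   ONNX_ATEN_FALLBACK

   For a fully custom op, first replace it with something exportable (such as concat or slice), then edit the param after converting to ncnn

 Option 2:

 You can try PNNX. The following articles give an overview:

   1. [Steps to build PNNX on Windows/Linux/macOS](https://zhuanlan.zhihu.com/p/431833958)

   2. [Learn it in 5 minutes! Convert TorchScript models to ncnn models with PNNX](https://zhuanlan.zhihu.com/p/427512763)

# Usage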

- ## vkEnumeratePhysicalDevices failed -3

- ## vkCreateInstance failed -9

   If you encounter this crash or error, upgrade your GPU driver first.
   Driver download pages for some GPU vendors:
   [Intel](https://downloadcenter.intel.com/product/80939/Graphics-Drivers),[AMD](https://www.amd.com/en/support),[Nvidia](https://www.nvidia.com/Download/index.aspx)

- ## ModuleNotFoundError: No module named 'ncnn.ncnn'

   python setup.py develop

- ## fopen nanodet-m.param failed

   Check the file path and the working directory

   File not found or not readable. Make sure that XYZ.param/XYZ.bin is accessible.
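
A quick way to rule out a path or working-directory problem is to probe the file yourself before handing it to ncnn. The sketch below uses only the C standard library. `file_readable` is a hypothetical helper name, and relative paths are resolved against the process's current working directory.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper (not part of ncnn): returns true if `path` can be
// opened for reading. Relative paths are resolved against the current
// working directory, which is the usual cause of "fopen ... failed".
bool file_readable(const char* path)
{
    assert(path != nullptr);
    FILE* fp = std::fopen(path, "rb");
    if (!fp)
        return false;
    std::fclose(fp);
    return true;
}
```

If this returns false for the same string you pass to the loader, try an absolute path or check which directory the program was started from.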

- ## find_blob_index_by_name data / output / ... failed

   Check that you are passing the blob name, not the layer name
   
   With param.bin, use the enums from xxx.id.h instead

- ## parse magic failed

- ## param is too old, please regenerate

   The model file itself is the problem

   Your model file is in the old format produced by an old caffe2ncnn tool.

   Check out the latest ncnn code, build it, and regenerate the param and model binary files. That should work.

   Make sure that your param file starts with the magic number 7767517.

   You may find more info in use-ncnn-with-alexnet.
   
   When adding the softmax layer yourself, you need to add 1=1
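
The magic-number check can be done by hand before loading. The sketch below reads the first integer of a text .param stream and compares it with 7767517. `starts_with_ncnn_magic` and `kNcnnParamMagic` are hypothetical names; the sketch covers only the text format, not param.bin.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Magic number expected at the start of a text .param file.
const int kNcnnParamMagic = 7767517;

// Hypothetical helper (not part of ncnn): returns true if the first integer
// in the stream equals the magic number. An empty or non-numeric stream
// yields false.
bool starts_with_ncnn_magic(std::istream& in)
{
    int magic = 0;
    if (!(in >> magic))
        return false;
    return magic == kNcnnParamMagic;
}
```

Pass it a `std::ifstream` opened on your .param file, or a `std::istringstream` when the text is already in memory.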

- ## set_vulkan_compute failed, network use_vulkan_compute disabled

   Set net.opt.use_vulkan_compute = true; before calling load_param / load_model

- ## How do I handle multiple input blobs and multiple output blobs?
   Call `ex.input()` and `ex.extract()` multiple times
```
ex.input("data1", in_1);
ex.input("data2", in_2);
ex.extract("output1", out_1);
ex.extract("output2", out_2);
```
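- ## Does calling Extractor extract multiple times repeat the computation?

   No

- ## How do I see the time spent in each layer?

   cmake -DNCNN_BENCHMARK=ON ..

- ## How do I convert a cv::Mat CV_8UC3 BGR image?

   from_pixels to_pixels

- ## How do I convert float data to ncnn::Mat?

   First, memory you allocate yourself is yours to manage: ncnn::Mat will not free the float data you pass in
   ``` c++
   std::vector<float> testData(60, 1.0);                                      // let std::vector<float> manage allocation and release of the memory
   ncnn::Mat in1 = ncnn::Mat(60, (void*)testData.data()).reshape(4, 5, 3);    // just pass the float pointer cast to void*; you can also specify dimensions (the author recommends reshape to handle the channel gap)
   float* a = new float[60];                                                  // memory you new yourself must be freed by you later
   ncnn::Mat in2 = ncnn::Mat(60, (void*)a).reshape(4, 5, 3).clone();          // same usage as above; clone() to transfer data owner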
   ```
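
The "channel gap" mentioned in the comment above is padding between channels. The sketch below computes the per-channel stride under the assumption that each channel of a 3-D ncnn::Mat starts on a 16-byte boundary. That layout rule is an assumption stated here for illustration, and `align_up` and `channel_step` are hypothetical helper names, not ncnn functions.

```cpp
#include <cassert>
#include <cstddef>

// Rounds sz up to the next multiple of n (n must be a power of two).
std::size_t align_up(std::size_t sz, std::size_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    return (sz + n - 1) & ~(n - 1);
}

// Number of elements between the starts of two consecutive channels,
// assuming every channel is padded to a 16-byte boundary.
std::size_t channel_step(int w, int h, std::size_t elemsize)
{
    assert(w > 0 && h > 0 && elemsize > 0);
    return align_up(static_cast<std::size_t>(w) * h * elemsize, 16) / elemsize;
}
```

Under this assumption a 4x5 float channel occupies 80 bytes and has no gap, while a 3x3 float channel occupies 36 bytes and is padded to 48. That padding is why a densely packed float array cannot always be reinterpreted as a multi-channel Mat without a reshape.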

- ## How do I initialize an ncnn::Mat to all zeros?

   `mat.fill(0.f);`

- ## How do I check/get the version number?

   It is printed when running cmake

   c_api.h ncnn_version()

   Or compose it yourself: 1.0+yyyymmdd

- ## How do I convert yuv data?

   yuv420sp2rgb yuv420sp2rgb_nv12

   **[@metarutaiga](https://github.com/metarutaiga/xxYUV)**

- ## How do I resize, crop, or rotate an image?

   [efficient roi resize rotate](https://github.com/Tencent/ncnn/wiki/efficient-roi-resize-rotate)

- ## How do I do 5-point face alignment?

   get_affine_transform

   warpaffine_bilinear_c3

```c
// Compute the transform matrix and its inverse
int type = 0;       // 0 -> fill outside the region with v[0],v[1],v[2]; -233 -> leave outside the region untouched
unsigned int v = 0;
float tm[6];
float tm_inv[6];
// Position and size of the face region in the original image
float src_x = target->det.rect.x / target->det.w * pIveImageU8C3->u32Width;
float src_y = target->det.rect.y / target->det.h * pIveImageU8C3->u32Height;
float src_w = target->det.rect.w / target->det.w * pIveImageU8C3->u32Width;
float src_h = target->det.rect.h / target->det.h * pIveImageU8C3->u32Height;
// Landmarks as (x, y) pairs; y is derived from src_y and src_h
float point_src[10] = {
src_x + src_w * target->attr.land[0][0], src_y + src_h * target->attr.land[0][1],
src_x + src_w * target->attr.land[1][0], src_y + src_h * target->attr.land[1][1],
src_x + src_w * target->attr.land[2][0], src_y + src_h * target->attr.land[2][1],
src_x + src_w * target->attr.land[3][0], src_y + src_h * target->attr.land[3][1],
src_x + src_w * target->attr.land[4][0], src_y + src_h * target->attr.land[4][1],
};
float point_dst[10] = { // +8 because we process 112*112 images
30.2946f + 8.0f, 51.6963f,
65.5318f + 8.0f, 51.5014f,
48.0252f + 8.0f, 71.7366f,
33.5493f + 8.0f, 92.3655f,
62.7299f + 8.0f, 92.2041f,
};
// Option 1: compute the transform, then invert it
AffineTrans::get_affine_transform(point_src, point_dst, 5, tm);
AffineTrans::invert_affine_transform(tm, tm_inv);
// Option 2: get the inverse directly
// AffineTrans::get_affine_transform(point_dst, point_src, 5, tm_inv);
// planar rgb: each channel must be processed separately
for(int c = 0; c < 3; c++)
{
    unsigned char* pSrc = malloc(xxx);
    unsigned char* pDst = malloc(xxx);
    ncnn::warpaffine_bilinear_c1(pSrc, SrcWidth, SrcHeight, SrcStride[c], pDst, DstWidth, DstHeight, DstStride[c], tm_inv, type, v);
}
// packed rgb can be processed in one call
ncnn::warpaffine_bilinear_c3(pSrc, SrcWidth, SrcHeight, SrcStride, pDst, DstWidth, DstHeight, DstStride, tm_inv, type, v);
```

- ## How do I get the blob output of an intermediate layer?
  
   ncnn::Mat output;
   
   ex.extract("your_blob_name", output);

- ## Why is GPU usage 0 even though I am using the GPU?

   Windows 10 Task Manager - Performance tab - GPU - click the drop-down arrow at the top left of one of the graphs and switch it to Compute_0 / Compute_1 / Cuda

   You can also install GPU-Z

- ## layer XYZ not exists or registered

   Your network contains some operations that are not implemented in ncnn.

   You may implement them as custom layers by following how-to-implement-custom-layer-step-by-step.

   Or you could simply register them as no-op if you are sure those operations are not needed.

```
class Noop : public ncnn::Layer {};
DEFINE_LAYER_CREATOR(Noop)

net.register_custom_layer("LinearRegressionOutput", Noop_layer_creator);
net.register_custom_layer("MAERegressionOutput", Noop_layer_creator);
```

- ## network graph not ready

   You shall call Net::load_param() first, then Net::load_model().

   This error may also happen when Net::load_param() failed and the failure was not handled properly.

   For more information about the ncnn model load api, see ncnn-load-model

- ## memory not 32-bit aligned at XYZ

   The pointer passed to Net::load_param() or Net::load_model() is not 32-bit aligned.

   In practice, the head pointer of std::vector is not guaranteed to be 32-bit aligned.

   You can store your binary buffer in an ncnn::Mat, whose internal memory is aligned.
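
To check a buffer yourself, test the pointer value against 4 bytes (32 bits). The sketch below also shows one generic way to obtain an aligned copy without ncnn, by copying into `std::uint32_t` storage. This is a swapped-in standard C++ technique, not the ncnn::Mat approach named above, and `is_aligned` and `copy_to_aligned` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// True if p is aligned to `alignment` bytes (4 bytes = 32 bits).
bool is_aligned(const void* p, std::size_t alignment)
{
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Copies an arbitrary byte buffer into uint32_t storage, whose data()
// pointer is at least 4-byte aligned. The copy is padded with zeros up to
// a whole number of 32-bit words.
std::vector<std::uint32_t> copy_to_aligned(const unsigned char* src, std::size_t size)
{
    std::vector<std::uint32_t> dst((size + 3) / 4, 0u);
    if (size > 0)
        std::memcpy(dst.data(), src, size);
    return dst;
}
```

The returned vector must stay alive for as long as the loader reads from its data() pointer.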

- ## crash on android with '__kmp_abort_process'

   This usually happens if you bundle multiple shared libraries that each link openmp

   It is actually an issue of the android ndk https://github.com/android/ndk/issues/1028

   On old android ndk, modify the link flags as

   -Wl,-Bstatic -lomp -Wl,-Bdynamic

   For recent ndk >= 21

   -fstatic-openmp

- ## dlopen failed: library "libomp.so" not found
   Newer android ndk defaults to dynamic openmp runtime

   modify the link flags as

   -fstatic-openmp -fopenmp

- ## crash when freeing a ncnn dynamic library(.dll/.so) built with openMP

   For optimal performance, the openmp threadpool spin-waits for about a second before shutting down, in case more work becomes available.

   If you unload a dynamic library that is in the middle of spin-waiting, it will crash in the manner you see (most of the time).

   Either set OMP_WAIT_POLICY=passive in your environment before calling loadlibrary, or wait a few seconds before calling freelibrary.

   You can also use the following method to set environment variables in your code:

   for msvc++:

      SetEnvironmentVariable(_T("OMP_WAIT_POLICY"), _T("passive"));

   for g++:

      setenv("OMP_WAIT_POLICY", "passive", 1);
   
      reference: https://stackoverflow.com/questions/34439956/vc-crash-when-freeing-a-dll-built-with-openmp
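
The two per-compiler snippets above can be combined into one function. The sketch below is a minimal wrapper, assuming `_WIN32` is a good enough switch between the Windows and POSIX calls. `set_omp_wait_policy_passive` is a hypothetical name, and the function has to run before the OpenMP-linked library is loaded.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: sets OMP_WAIT_POLICY=passive for the current process.
// Returns true on success.
bool set_omp_wait_policy_passive()
{
#ifdef _WIN32
    return SetEnvironmentVariableA("OMP_WAIT_POLICY", "passive") != 0;
#else
    // The last argument (1) overwrites any existing value.
    return setenv("OMP_WAIT_POLICY", "passive", 1) == 0;
#endif
}
```

Call it once at program start, before the LoadLibrary or dlopen call that brings in the ncnn library.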

# The results do not match

[ncnn-produce-wrong-result](https://github.com/Tencent/ncnn/wiki/FAQ-ncnn-produce-wrong-result)

- ## How do I print the values of an ncnn::Mat?

```C++
void pretty_print(const ncnn::Mat& m)
{
    for (int q=0; q<m.c; q++)
    {
        const float* ptr = m.channel(q);
        for (int y=0; y<m.h; y++)
        {
            for (int x=0; x<m.w; x++)
            {
                printf("%f ", ptr[x]);
            }
            ptr += m.w;
            printf("\n");
        }
        printf("------------------------\n");
    }
}
```
In Android Studio, `printf` will not work; use `__android_log_print` instead. Example:
```C++
#include <android/log.h>  // Don't forget this

void pretty_print(const ncnn::Mat& m)
{
    for (int q=0; q<m.c; q++)
    {
        for (int y=0; y<m.h; y++)
        {
            for (int x=0; x<m.w; x++)
            {
                __android_log_print(ANDROID_LOG_DEBUG,"LOG_TAG","ncnn Mat is : %f", m.channel(q).row(y)[x]);
            }
        }
    }
}
```

- ## How do I visualize the values of an ncnn::Mat?

```
void visualize(const char* title, const ncnn::Mat& m)
{
    std::vector<cv::Mat> normed_feats(m.c);

    for (int i=0; i<m.c; i++)
    {
        cv::Mat tmp(m.h, m.w, CV_32FC1, (void*)(const float*)m.channel(i));

        cv::normalize(tmp, normed_feats[i], 0, 255, cv::NORM_MINMAX, CV_8U);

        cv::cvtColor(normed_feats[i], normed_feats[i], cv::COLOR_GRAY2BGR);

        // check NaN
        for (int y=0; y<m.h; y++)
        {
            const float* tp = tmp.ptr<float>(y);
            uchar* sp = normed_feats[i].ptr<uchar>(y);
            for (int x=0; x<m.w; x++)
            {
                float v = tp[x];
                if (v != v)
                {
                    sp[0] = 0;
                    sp[1] = 0;
                    sp[2] = 255;
                }

                sp += 3;
            }
        }
    }

    int tw = m.w < 10 ? 32 : m.w < 20 ? 16 : m.w < 40 ? 8 : m.w < 80 ? 4 : m.w < 160 ? 2 : 1;
    int th = (m.c - 1) / tw + 1;

    cv::Mat show_map(m.h * th, m.w * tw, CV_8UC3);
    show_map = cv::Scalar(127);

    // tile
    for (int i=0; i<m.c; i++)
    {
        int ty = i / tw;
        int tx = i % tw;

        normed_feats[i].copyTo(show_map(cv::Rect(tx * m.w, ty * m.h, m.w, m.h)));
    }

    cv::resize(show_map, show_map, cv::Size(0,0), 2, 2, cv::INTER_NEAREST);
    cv::imshow(title, show_map);
}
```
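
The `v != v` test in `visualize` works because NaN is the only floating-point value that compares unequal to itself. The sketch below applies the same test to a plain float buffer so you can count NaNs without OpenCV. `count_nan` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical helper: counts NaN entries in a float buffer using the
// self-comparison test (NaN != NaN is true for IEEE 754 floats).
std::size_t count_nan(const float* data, std::size_t n)
{
    assert(data != nullptr || n == 0);
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; i++)
    {
        if (data[i] != data[i])
            count++;
    }
    return count;
}
```

Note that compiler options which assume finite math, such as -ffast-math in GCC, may optimize this comparison away.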

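- ## The output is always the result of the first image

   Are you reusing the Extractor?! Create a new Extractor for each image.

- ## Accuracy differs when fp16 is enabled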

   net.opt.use_fp16_packed = false;

   net.opt.use_fp16_storage = false;

   net.opt.use_fp16_arithmetic = false;

   [ncnn-produce-wrong-result](https://github.com/Tencent/ncnn/wiki/FAQ-ncnn-produce-wrong-result)


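# How do I run faster? Use less memory? Make the library smaller?

- ## fp32 fp16

- ## Binding to big/little cores
   ncnn::set_cpu_powersave(int) binds threads to the big or the little cores.
   Note that core binding is not supported on windows.
   ncnn supports running different models on different cores. Suppose the hardware platform has 2 big cores and 4 little cores, and you want to run netA on the big cores and netB on the little cores.
   Create two threads with std::thread or pthread and run the code below. The argument values are:
   0: all cores
   1: little cores
   2: big cores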
```
   void thread_1()
   {
      ncnn::set_cpu_powersave(2); // bind to big cores
      netA.opt.num_threads = 2;
   }

   void thread_2()
   {
      ncnn::set_cpu_powersave(1); // bind to little cores
      netB.opt.num_threads = 4;
   }
```

   [openmp-best-practice.zh.md](https://github.com/Tencent/ncnn/blob/master/docs/how-to-use-and-FAQ/openmp-best-practice.zh.md)

- ## Querying the number of CPUs or GPUs
   get_cpu_count
   
   get_gpu_count

- ## ncnnoptimize

   Usage 1:
    - ./ncnnoptimize ncnn.param ncnn.bin new.param new.bin flag
    <br/>Note: flag selects fp32 or fp16, where 0 means fp32 and 1 means fp16

   Usage 2:
    - ./ncnnoptimize ncnn.param ncnn.bin new.param new.bin flag cutstartname cutendname
    <br/>cutstartname: where the extracted part of the model starts
     <br/>cutendname: where the extracted part of the model ends


- ## How do I use the quantization tools?

   [Post Training Quantization Tools](https://github.com/Tencent/ncnn/tree/master/tools/quantize)

- ## How do I set the number of threads?

   opt.num_threads

- ## How do I reduce CPU usage?

   net.opt.openmp_blocktime = 0;
   
   OMP_WAIT_POLICY=passive

- ## How do I do batch inference?

```
   int max_batch_size = vkdev->info.compute_queue_count;
   
   ncnn::Mat inputs[1000];
   ncnn::Mat outputs[1000];
   
   #pragma omp parallel for num_threads(max_batch_size)
   for (int i=0; i<1000; i++)
   {
       ncnn::Extractor ex = net1.create_extractor();
       ex.input("data", inputs[i]);
       ex.extract("prob", outputs[i]);
   }
```

   

- ## partial graph inference

   Extract the classification output first, check it, and only then extract the bbox

- ## How do I enable bf16s acceleration?

```
net.opt.use_packing_layout = true;
net.opt.use_bf16_storage = true;
```

   [Accelerating ncnn with bf16](https://zhuanlan.zhihu.com/p/112564372) **@[nihui](https://www.zhihu.com/people/nihui-2)**

   A53

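- ## How do I build a smaller ncnn library?

   [build-minimal-library](https://github.com/Tencent/ncnn/wiki/build-minimal-library)

- ## What do net.opt sgemm, winograd, and fp16_storage each do?

   Their effect on memory consumption

- ## How do I solve the problems caused by the graphics card entering power-saving mode?

   nVidia graphics cards (and probably Intel and AMD ones too) automatically enter a `power-saving mode` when they consider themselves idle, which lowers both the memory and the core clocks.
   
   In short, if your compute workload is `non-continuous`, the timings can look very `uneven`. When an idle gap occurs between computations, the card enters power-saving mode, and the next cold start takes several times longer than normal, as the log below shows:

   ```cpp
   //start playback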
   Total: 162ms, Diff: 0ms, GLTex2Mat: 7ms, calc: 152ms, Mat2GLTex: 3ms
   Total: 43ms, Diff: 0ms, GLTex2Mat: 3ms, calc: 35ms, Mat2GLTex: 2ms
   Total: 45ms, Diff: 0ms, GLTex2Mat: 3ms, calc: 37ms, Mat2GLTex: 3ms
   Total: 40ms, Diff: 0ms, GLTex2Mat: 3ms, calc: 32ms, Mat2GLTex: 4ms
   //pause for 3 seconds
   //resume playback
   Total: 190ms, Diff: 0ms, GLTex2Mat: 9ms, calc: 177ms, Mat2GLTex: 3ms
   Total: 134ms, Diff: 0ms, GLTex2Mat: 5ms, calc: 110ms, Mat2GLTex: 18ms
   Total: 40ms, Diff: 0ms, GLTex2Mat: 3ms, calc: 34ms, Mat2GLTex: 2ms
   Total: 42ms, Diff: 0ms, GLTex2Mat: 3ms, calc: 36ms, Mat2GLTex: 2ms
   Total: 47ms, Diff: 0ms, GLTex2Mat: 5ms, calc: 38ms, Mat2GLTex: 3ms
   ...
   ```

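   For projects that are not time-sensitive this is no big deal and can be ignored. In some scenarios, however, you must accurately estimate the upload, compute, and render time of the next frame and the few frames after it, and there this behavior becomes a nuisance for developers.

   ### 3 solutions
   * Contact the GPU vendor and have them update the driver so that your application is on a whitelist exempt from power-saving mode.
     * Pro: you do not need to change anything. Con: communication is difficult, and the vendor will quite possibly ignore you.
   * [Graphics card control panel] - [Manage 3D settings] - [Power management mode], change it to: [Prefer maximum performance].
     * Pro: no code changes. Con: if the deployment target is a novice user, you have to write a step-by-step manual.
   * While idle (paused), periodically feed in some heartbeat compute tasks (1x1 small images) to keep the GPU in its high-performance state.
     * Pro: it only needs a code change. Con: not low-carbon or eco-friendly.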
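
The third option can be sketched as a small timer that tells your idle loop when a heartbeat is due. Everything below is hypothetical illustration rather than ncnn code: `IdleHeartbeat` is an invented class name, the interval is something you would tune for your device, and the actual heartbeat (a tiny inference on a 1x1 input) is left to the caller.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not part of ncnn): tracks when the GPU was last given
// work and reports when an idle heartbeat is due.
class IdleHeartbeat
{
public:
    using Clock = std::chrono::steady_clock;

    IdleHeartbeat(std::chrono::milliseconds interval, Clock::time_point start)
        : interval_(interval), last_(start)
    {
        assert(interval.count() > 0);
    }

    // Record that real work was submitted at time `now`.
    void mark_activity(Clock::time_point now) { last_ = now; }

    // Returns true when at least one interval has passed since the last
    // activity; the caller should then submit a tiny task (e.g. a 1x1 input).
    bool due(Clock::time_point now)
    {
        if (now - last_ < interval_)
            return false;
        last_ = now; // the heartbeat itself counts as activity
        return true;
    }

private:
    std::chrono::milliseconds interval_;
    Clock::time_point last_;
};
```

In an application you would call `mark_activity(Clock::now())` after every real inference and poll `due(Clock::now())` while paused. Passing the time in explicitly keeps the logic easy to test.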

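# Projects you can use for free

- ## nanodet

# Other

- ## What OS/editor/development environment does the author use?

   | Software type          | Software name |
   | ---------------------- | ------------- |
   | OS                     | Fedora        |
   | Desktop environment    | KDE           |
   | Editor                 | Kate          |
   | Sketching              | kolourpaint   |
   | Function plotting      | kmplot        |
   | bilibili livestreaming | OBS           |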