prasb committed on
Commit d0ad144 · verified · 1 Parent(s): 5ef4913

Add files using upload-large-folder tool

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. .gitattributes +5 -0
  2. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/brotli/_brotli.abi3.so +3 -0
  3. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/_extensions/__pycache__/__init__.cpython-38.pyc +0 -0
  4. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/_extensions/__pycache__/loader.cpython-38.pyc +0 -0
  5. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/_extensions/gmm/gmm.cpp +85 -0
  6. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/_extensions/gmm/gmm.h +53 -0
  7. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/_extensions/gmm/gmm_cpu.cpp +35 -0
  8. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/_extensions/gmm/gmm_cuda.cu +518 -0
  9. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/_extensions/gmm/gmm_cuda_linalg.cuh +144 -0
  10. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/engines/__pycache__/__init__.cpython-38.pyc +0 -0
  11. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/engines/__pycache__/evaluator.cpython-38.pyc +0 -0
  12. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/engines/__pycache__/multi_gpu_supervised_trainer.cpython-38.pyc +0 -0
  13. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/engines/__pycache__/trainer.cpython-38.pyc +0 -0
  14. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/engines/__pycache__/utils.cpython-38.pyc +0 -0
  15. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/engines/__pycache__/workflow.cpython-38.pyc +0 -0
  16. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/__init__.cpython-38.pyc +0 -0
  17. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/checkpoint_saver.cpython-38.pyc +0 -0
  18. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/classification_saver.cpython-38.pyc +0 -0
  19. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/confusion_matrix.cpython-38.pyc +0 -0
  20. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/decollate_batch.cpython-38.pyc +0 -0
  21. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/earlystop_handler.cpython-38.pyc +0 -0
  22. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/garbage_collector.cpython-38.pyc +0 -0
  23. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/hausdorff_distance.cpython-38.pyc +0 -0
  24. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/ignite_metric.cpython-38.pyc +0 -0
  25. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/lr_schedule_handler.cpython-38.pyc +0 -0
  26. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/mean_dice.cpython-38.pyc +0 -0
  27. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/metric_logger.cpython-38.pyc +0 -0
  28. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/metrics_saver.cpython-38.pyc +0 -0
  29. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/mlflow_handler.cpython-38.pyc +0 -0
  30. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/nvtx_handlers.cpython-38.pyc +0 -0
  31. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/postprocessing.cpython-38.pyc +0 -0
  32. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/probability_maps.cpython-38.pyc +0 -0
  33. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/regression_metrics.cpython-38.pyc +0 -0
  34. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/roc_auc.cpython-38.pyc +0 -0
  35. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/smartcache_handler.cpython-38.pyc +0 -0
  36. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/stats_handler.cpython-38.pyc +0 -0
  37. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/surface_distance.cpython-38.pyc +0 -0
  38. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/tensorboard_handlers.cpython-38.pyc +0 -0
  39. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/validation_handler.cpython-38.pyc +0 -0
  40. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/inferers/__pycache__/__init__.cpython-38.pyc +0 -0
  41. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/inferers/__pycache__/inferer.cpython-38.pyc +0 -0
  42. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/inferers/__pycache__/utils.cpython-38.pyc +0 -0
  43. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/metrics/__pycache__/__init__.cpython-38.pyc +0 -0
  44. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/metrics/__pycache__/confusion_matrix.cpython-38.pyc +0 -0
  45. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/metrics/__pycache__/cumulative_average.cpython-38.pyc +0 -0
  46. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/metrics/__pycache__/froc.cpython-38.pyc +0 -0
  47. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/metrics/__pycache__/generalized_dice.cpython-38.pyc +0 -0
  48. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/metrics/__pycache__/hausdorff_distance.cpython-38.pyc +0 -0
  49. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/metrics/__pycache__/meandice.cpython-38.pyc +0 -0
  50. my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/metrics/__pycache__/metric.cpython-38.pyc +0 -0
.gitattributes CHANGED
@@ -331,3 +331,8 @@ my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/opencv_pyth
331
  my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/pyzmq.libs/libsodium-bcf9f097.so.23.3.0 filter=lfs diff=lfs merge=lfs -text
332
  my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/yarl/_quoting_c.cpython-38-x86_64-linux-gnu.so filter=lfs diff=lfs merge=lfs -text
333
  my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/h5py.libs/libaec-9c9e97eb.so.0.0.10 filter=lfs diff=lfs merge=lfs -text
334
+ my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/opencv_python.libs/libQt5Test-c38a5234.so.5.15.0 filter=lfs diff=lfs merge=lfs -text
335
+ my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/scipy.libs/libgfortran-040039e1.so.5.0.0 filter=lfs diff=lfs merge=lfs -text
336
+ my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/brotli/_brotli.abi3.so filter=lfs diff=lfs merge=lfs -text
337
+ my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/opencv_python.libs/libxcb-xkb-9ba31ab3.so.1.0.0 filter=lfs diff=lfs merge=lfs -text
338
+ my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/opencv_python.libs/libssl-28bef1ac.so.1.1 filter=lfs diff=lfs merge=lfs -text
my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/brotli/_brotli.abi3.so ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1c9ea7e74f258c0527249f553bdd1a136e016017a222f38186df92e85361a4f0
3
+ size 746208
my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/_extensions/__pycache__/__init__.cpython-38.pyc ADDED
Binary file (211 Bytes).
 
my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/_extensions/__pycache__/loader.cpython-38.pyc ADDED
Binary file (2.86 kB).
 
my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/_extensions/gmm/gmm.cpp ADDED
@@ -0,0 +1,85 @@
+ /*
+ Copyright (c) MONAI Consortium
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+ */
+
+ #include <torch/extension.h>
+
+ #include "gmm.h"
+
+ py::tuple init() {
+   torch::Tensor gmm_tensor =
+       torch::zeros({GMM_COUNT, GMM_COMPONENT_COUNT}, torch::dtype(torch::kFloat32).device(torch::kCUDA));
+   torch::Tensor scratch_tensor = torch::empty({1}, torch::dtype(torch::kFloat32).device(torch::kCUDA));
+   return py::make_tuple(gmm_tensor, scratch_tensor);
+ }
+
+ void learn(
+     torch::Tensor gmm_tensor,
+     torch::Tensor scratch_tensor,
+     torch::Tensor input_tensor,
+     torch::Tensor label_tensor) {
+   c10::DeviceType device_type = input_tensor.device().type();
+
+   unsigned int batch_count = input_tensor.size(0);
+   unsigned int element_count = input_tensor.stride(1);
+
+   unsigned int scratch_size =
+       batch_count * (element_count + GMM_COMPONENT_COUNT * GMM_COUNT * (element_count / (32 * 32)));
+
+   if (scratch_tensor.size(0) < scratch_size) {
+     scratch_tensor.resize_({scratch_size});
+   }
+
+   float* gmm = gmm_tensor.data_ptr<float>();
+   float* scratch = scratch_tensor.data_ptr<float>();
+   float* input = input_tensor.data_ptr<float>();
+   int* labels = label_tensor.data_ptr<int>();
+
+   if (device_type == torch::kCUDA) {
+     learn_cuda(input, labels, gmm, scratch, batch_count, element_count);
+   } else {
+     learn_cpu(input, labels, gmm, scratch, batch_count, element_count);
+   }
+ }
+
+ torch::Tensor apply(torch::Tensor gmm_tensor, torch::Tensor input_tensor) {
+   c10::DeviceType device_type = input_tensor.device().type();
+
+   unsigned int dim = input_tensor.dim();
+   unsigned int batch_count = input_tensor.size(0);
+   unsigned int element_count = input_tensor.stride(1);
+
+   long int* output_size = new long int[dim];
+   memcpy(output_size, input_tensor.sizes().data(), dim * sizeof(long int));
+   output_size[1] = MIXTURE_COUNT;
+   torch::Tensor output_tensor =
+       torch::empty(c10::IntArrayRef(output_size, dim), torch::dtype(torch::kFloat32).device(device_type));
+   delete[] output_size; // array form, because the buffer was allocated with new[]
+
+   const float* gmm = gmm_tensor.data_ptr<float>();
+   const float* input = input_tensor.data_ptr<float>();
+   float* output = output_tensor.data_ptr<float>();
+
+   if (device_type == torch::kCUDA) {
+     apply_cuda(gmm, input, output, batch_count, element_count);
+   } else {
+     apply_cpu(gmm, input, output, batch_count, element_count);
+   }
+
+   return output_tensor;
+ }
+
+ PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
+   m.def("init", torch::wrap_pybind_function(init));
+   m.def("learn", torch::wrap_pybind_function(learn));
+   m.def("apply", torch::wrap_pybind_function(apply));
+ }
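Review note: `apply` builds the output shape by copying the input shape into a heap buffer and overwriting dimension 1 with `MIXTURE_COUNT`. A minimal sketch of the same shape computation with a `std::vector`, which avoids the manual allocation, could look like the following. The helper name `make_output_shape` is hypothetical and not part of this commit; the sketch is deliberately free of any torch dependency.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not in the commit): copy the input shape and replace
// the channel dimension (index 1) with the number of mixtures.
std::vector<std::int64_t> make_output_shape(
    const std::vector<std::int64_t>& input_shape,
    std::int64_t mixture_count) {
  assert(input_shape.size() >= 2); // needs at least batch and channel dims
  std::vector<std::int64_t> output_shape(input_shape);
  output_shape[1] = mixture_count;
  return output_shape;
}
```

Because `c10::IntArrayRef` can be constructed from a `std::vector<int64_t>`, a vector like this could be passed to `torch::empty` in place of the raw pointer and length pair, with the buffer released automatically.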
my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/_extensions/gmm/gmm.h ADDED
@@ -0,0 +1,53 @@
+ /*
+ Copyright (c) MONAI Consortium
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+ */
+
+ #if !defined(CHANNEL_COUNT) || !defined(MIXTURE_COUNT) || !defined(MIXTURE_SIZE)
+ #error Definition of CHANNEL_COUNT, MIXTURE_COUNT, and MIXTURE_SIZE required
+ #endif
+
+ #if CHANNEL_COUNT < 1 || MIXTURE_COUNT < 1 || MIXTURE_SIZE < 1
+ #error CHANNEL_COUNT, MIXTURE_COUNT, and MIXTURE_SIZE must be positive
+ #endif
+
+ #define MATRIX_COMPONENT_COUNT ((CHANNEL_COUNT + 1) * (CHANNEL_COUNT + 2) / 2)
+ #define SUB_MATRIX_COMPONENT_COUNT (CHANNEL_COUNT * (CHANNEL_COUNT + 1) / 2)
+ #define GMM_COMPONENT_COUNT (MATRIX_COMPONENT_COUNT + 1)
+ #define GMM_COUNT (MIXTURE_COUNT * MIXTURE_SIZE)
+
+ void learn_cpu(
+     const float* input,
+     const int* labels,
+     float* gmm,
+     float* scratch_memory,
+     unsigned int batch_count,
+     unsigned int element_count);
+ void apply_cpu(
+     const float* gmm,
+     const float* input,
+     float* output,
+     unsigned int batch_count,
+     unsigned int element_count);
+
+ void learn_cuda(
+     const float* input,
+     const int* labels,
+     float* gmm,
+     float* scratch_memory,
+     unsigned int batch_count,
+     unsigned int element_count);
+ void apply_cuda(
+     const float* gmm,
+     const float* input,
+     float* output,
+     unsigned int batch_count,
+     unsigned int element_count);
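Review note: the size macros in `gmm.h` are easier to check with concrete numbers. The sketch below re-expresses them as `constexpr` functions and evaluates them for assumed values `CHANNEL_COUNT = 3`, `MIXTURE_COUNT = 2`, `MIXTURE_SIZE = 4`. These values are illustrative only, since the real ones are supplied at build time. The function names are hypothetical. `scratch_size` mirrors the expression used by the `learn` wrapper in `gmm.cpp`.

```cpp
#include <cassert>

// Packed upper triangle of a (C+1) x (C+1) symmetric matrix.
constexpr int matrix_component_count(int channels) {
  return (channels + 1) * (channels + 2) / 2;
}

// Packed upper triangle of the C x C covariance block alone.
constexpr int sub_matrix_component_count(int channels) {
  return channels * (channels + 1) / 2;
}

// One extra slot per Gaussian on top of the packed matrix.
constexpr int gmm_component_count(int channels) {
  return matrix_component_count(channels) + 1;
}

constexpr int gmm_count(int mixture_count, int mixture_size) {
  return mixture_count * mixture_size;
}

// Same arithmetic as the scratch_size expression in learn().
constexpr unsigned int scratch_size(
    unsigned int batch_count,
    unsigned int element_count,
    int channels,
    int mixture_count,
    int mixture_size) {
  return batch_count *
      (element_count +
       static_cast<unsigned int>(gmm_component_count(channels) * gmm_count(mixture_count, mixture_size)) *
           (element_count / (32 * 32)));
}

// Assumed example values: 3 channels, 2 mixtures, 4 Gaussians per mixture.
static_assert(matrix_component_count(3) == 10, "4 * 5 / 2");
static_assert(gmm_component_count(3) == 11, "10 + 1");
static_assert(gmm_count(2, 4) == 8, "2 * 4");
```

With these assumed values, a single 64 x 64 image (4096 elements) would need 4096 + 11 * 8 * 4 = 4448 floats of scratch space.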
my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/_extensions/gmm/gmm_cpu.cpp ADDED
@@ -0,0 +1,35 @@
+ /*
+ Copyright (c) MONAI Consortium
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+ */
+
+ #include <stdexcept>
+
+ #include "gmm.h"
+
+ void learn_cpu(
+     const float* input,
+     const int* labels,
+     float* gmm,
+     float* scratch_memory,
+     unsigned int batch_count,
+     unsigned int element_count) {
+   throw std::invalid_argument("GMM received a cpu tensor but is not yet implemented for the cpu");
+ }
+
+ void apply_cpu(
+     const float* gmm,
+     const float* input,
+     float* output,
+     unsigned int batch_count,
+     unsigned int element_count) {
+   throw std::invalid_argument("GMM received a cpu tensor but is not yet implemented for the cpu");
+ }
my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/_extensions/gmm/gmm_cuda.cu ADDED
@@ -0,0 +1,518 @@
+ /*
+ Copyright (c) MONAI Consortium
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+ */
+
+ #include <cuda.h>
+ #include <cuda_runtime.h>
+
+ #include "gmm.h"
+
+ #include "gmm_cuda_linalg.cuh"
+
+ #define EPSILON 1e-5
+ #define BLOCK_SIZE 32
+ #define TILE(SIZE, STRIDE) ((((SIZE)-1) / (STRIDE)) + 1)
+
+ template <int warp_count, int load_count>
+ __global__ void CovarianceReductionKernel(
+     int gaussian_index,
+     const float* g_image,
+     const int* g_alpha,
+     float* g_matrices,
+     int element_count) {
+   constexpr int block_size = warp_count * 32;
+
+   __shared__ float s_matrix_component[warp_count];
+
+   int batch_index = blockIdx.z;
+
+   const float* g_batch_image = g_image + batch_index * element_count * CHANNEL_COUNT;
+   const int* g_batch_alpha = g_alpha + batch_index * element_count;
+   float* g_batch_matrices = g_matrices + batch_index * GMM_COUNT * GMM_COMPONENT_COUNT * gridDim.x;
+
+   int local_index = threadIdx.x;
+   int block_index = blockIdx.x;
+   int warp_index = local_index >> 5;
+   int lane_index = local_index & 31;
+   int global_index = local_index + block_index * block_size * load_count;
+   int matrix_offset = (gaussian_index * gridDim.x + block_index) * GMM_COMPONENT_COUNT;
+
+   float matrix[MATRIX_COMPONENT_COUNT];
+
+   for (int i = 0; i < MATRIX_COMPONENT_COUNT; i++) {
+     matrix[i] = 0;
+   }
+
+   for (int load = 0; load < load_count; load++) {
+     global_index += load * block_size;
+
+     if (global_index < element_count) {
+       int my_alpha = g_batch_alpha[global_index];
+
+       if (my_alpha != -1) {
+         if (gaussian_index == (my_alpha & 15) + (my_alpha >> 4) * MIXTURE_COUNT) {
+           float feature[CHANNEL_COUNT + 1];
+
+           feature[0] = 1;
+
+           for (int i = 0; i < CHANNEL_COUNT; i++) {
+             feature[i + 1] = g_batch_image[global_index + i * element_count];
+           }
+
+           for (int index = 0, i = 0; i < CHANNEL_COUNT + 1; i++) {
+             for (int j = i; j < CHANNEL_COUNT + 1; j++, index++) {
+               matrix[index] += feature[i] * feature[j];
+             }
+           }
+         }
+       }
+     }
+   }
+
+   __syncthreads();
+
+   for (int i = 0; i < MATRIX_COMPONENT_COUNT; i++) {
+     float matrix_component = matrix[i];
+
+     matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 16);
+     matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 8);
+     matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 4);
+     matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 2);
+     matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 1);
+
+     if (lane_index == 0) {
+       s_matrix_component[warp_index] = matrix_component;
+     }
+
+     __syncthreads();
+
+     if (warp_index == 0) {
+       matrix_component = s_matrix_component[lane_index];
+
+       if (warp_count >= 32) {
+         matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 16);
+       }
+       if (warp_count >= 16) {
+         matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 8);
+       }
+       if (warp_count >= 8) {
+         matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 4);
+       }
+       if (warp_count >= 4) {
+         matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 2);
+       }
+       if (warp_count >= 2) {
+         matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 1);
+       }
+
+       if (lane_index == 0) {
+         g_batch_matrices[matrix_offset + i] = matrix_component;
+       }
+     }
+
+     __syncthreads();
+   }
+ }
+
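Review note: each thread in `CovarianceReductionKernel` accumulates the upper triangle of the outer product of `feature = [1, x_1, ..., x_C]` with itself. The leading 1 means slot 0 ends up holding the sample count, slots 1 to C hold the per-channel sums, and the remaining slots hold the second moments. A host-side sketch of that per-sample step, with a hypothetical function name and `std::vector` in place of the kernel's fixed-size arrays, might be:

```cpp
#include <cassert>
#include <vector>

// Host-side illustration only: add one sample to packed upper-triangular
// moment storage. `moments` must hold (C + 1) * (C + 2) / 2 floats, where
// C is sample.size().
void accumulate_moments(const std::vector<float>& sample, std::vector<float>& moments) {
  const int n = static_cast<int>(sample.size()) + 1;

  std::vector<float> feature(n);
  feature[0] = 1.0f; // homogeneous coordinate: yields the count and the sums
  for (int i = 1; i < n; i++) {
    feature[i] = sample[i - 1];
  }

  // Same traversal order as the kernel: row i, columns j >= i.
  int index = 0;
  for (int i = 0; i < n; i++) {
    for (int j = i; j < n; j++, index++) {
      moments[index] += feature[i] * feature[j];
    }
  }
}
```

For two channels and the samples (2, 3) and (4, 5), the packed result is [2, 6, 8, 20, 26, 34]: count, sum of x, sum of y, sum of x², sum of xy, sum of y².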
+ template <int warp_count, bool invert_matrix>
+ __global__ void CovarianceFinalizationKernel(const float* g_matrices, float* g_gmm, int matrix_count) {
+   constexpr int block_size = warp_count * 32;
+
+   __shared__ float s_matrix_component[warp_count];
+   __shared__ float s_gmm[GMM_COMPONENT_COUNT];
+
+   int batch_index = blockIdx.z;
+
+   const float* g_batch_matrices = g_matrices + batch_index * GMM_COUNT * GMM_COMPONENT_COUNT * matrix_count;
+   float* g_batch_gmm = g_gmm + batch_index * GMM_COUNT * GMM_COMPONENT_COUNT;
+
+   int local_index = threadIdx.x;
+   int warp_index = local_index >> 5;
+   int lane_index = local_index & 31;
+   int gmm_index = blockIdx.x;
+   int matrix_offset = gmm_index * matrix_count;
+
+   int load_count = TILE(matrix_count, block_size);
+
+   float norm_factor = 1.0f;
+
+   for (int index = 0, i = 0; i < CHANNEL_COUNT + 1; i++) {
+     for (int j = i; j < CHANNEL_COUNT + 1; j++, index++) {
+       float matrix_component = 0.0f;
+
+       for (int load = 0; load < load_count; load++) {
+         int matrix_index = local_index + load * block_size;
+
+         if (matrix_index < matrix_count) {
+           matrix_component += g_batch_matrices[(matrix_offset + matrix_index) * GMM_COMPONENT_COUNT + index];
+         }
+       }
+
+       matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 16);
+       matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 8);
+       matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 4);
+       matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 2);
+       matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 1);
+
+       if (lane_index == 0) {
+         s_matrix_component[warp_index] = matrix_component;
+       }
+
+       __syncthreads();
+
+       if (warp_index == 0) {
+         matrix_component = s_matrix_component[lane_index];
+
+         if (warp_count >= 32) {
+           matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 16);
+         }
+         if (warp_count >= 16) {
+           matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 8);
+         }
+         if (warp_count >= 8) {
+           matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 4);
+         }
+         if (warp_count >= 4) {
+           matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 2);
+         }
+         if (warp_count >= 2) {
+           matrix_component += __shfl_down_sync(0xffffffff, matrix_component, 1);
+         }
+
+         if (lane_index == 0) {
+           float constant = i == 0 ? 0.0f : s_gmm[i] * s_gmm[j];
+
+           if (i != 0 && i == j) {
+             constant -= EPSILON;
+           }
+
+           s_gmm[index] = norm_factor * matrix_component - constant;
+
+           if (index == 0 && matrix_component > 0) {
+             norm_factor = 1.0f / matrix_component;
+           }
+         }
+       }
+
+       __syncthreads();
+     }
+   }
+
+   float* matrix = s_gmm + (CHANNEL_COUNT + 1);
+   float* det_ptr = s_gmm + MATRIX_COMPONENT_COUNT;
+
+   if (local_index == 0) {
+     float square_mat[CHANNEL_COUNT][CHANNEL_COUNT];
+     float cholesky_mat[CHANNEL_COUNT][CHANNEL_COUNT];
+
+     for (int i = 0; i < CHANNEL_COUNT; i++) {
+       for (int j = 0; j < CHANNEL_COUNT; j++) {
+         square_mat[i][j] = 0.0f;
+         cholesky_mat[i][j] = 0.0f;
+       }
+     }
+
+     to_square(matrix, square_mat);
+     cholesky(square_mat, cholesky_mat);
+
+     *det_ptr = chol_det(cholesky_mat);
+
+     if (invert_matrix) {
+       chol_inv(cholesky_mat, square_mat);
+       to_triangle(square_mat, matrix);
+     }
+   }
+
+   if (local_index < GMM_COMPONENT_COUNT) {
+     g_batch_gmm[gmm_index * GMM_COMPONENT_COUNT + local_index] = s_gmm[local_index];
+   }
+ }
+
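Review note: after the per-block partial sums are reduced, lane 0 in `CovarianceFinalizationKernel` converts raw moments into a count, a mean vector, and a regularised covariance in a single pass over the packed triangle. The host-side sketch below follows the same arithmetic for one Gaussian. The function name and the use of `std::vector` are assumptions for illustration; the Cholesky, determinant, and inversion steps that follow in the kernel are not reproduced.

```cpp
#include <cassert>
#include <vector>

// Host-side illustration only: turn packed raw moments
// [count, sums..., second moments...] into
// [count, means..., covariance upper triangle...].
std::vector<float> finalize_moments(const std::vector<float>& moments, int channels, float epsilon) {
  std::vector<float> gmm(moments.size());
  float norm_factor = 1.0f;

  int index = 0;
  for (int i = 0; i < channels + 1; i++) {
    for (int j = i; j < channels + 1; j++, index++) {
      // For i >= 1, gmm[i] and gmm[j] already hold the means of channels i and j.
      float constant = (i == 0) ? 0.0f : gmm[i] * gmm[j];

      if (i != 0 && i == j) {
        constant -= epsilon; // adds epsilon to the diagonal of the covariance
      }

      gmm[index] = norm_factor * moments[index] - constant;

      // Slot 0 keeps the raw count; every later slot is divided by it.
      if (index == 0 && moments[0] > 0.0f) {
        norm_factor = 1.0f / moments[0];
      }
    }
  }

  return gmm;
}
```

With the moments [2, 6, 8, 20, 26, 34] and `epsilon` set to 0, this gives a count of 2, a mean of (3, 4), and a covariance triangle of (1, 1, 1), which matches the samples (2, 3) and (4, 5).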
+ struct GMMSplit_t {
+   int idx;
+   float threshold;
+   float eigenvector[CHANNEL_COUNT];
+ };
+
+ // 1 Block, 32xMIXTURE_COUNT
+ __global__ void GMMFindSplit(GMMSplit_t* gmmSplit, int gmmK, float* gmm) {
+   int batch_index = blockIdx.z;
+
+   float* g_batch_gmm = gmm + batch_index * GMM_COUNT * GMM_COMPONENT_COUNT;
+   GMMSplit_t* g_batch_gmmSplit = gmmSplit + batch_index * MIXTURE_COUNT;
+
+   int gmm_idx = threadIdx.x * MIXTURE_COUNT + threadIdx.y;
+
+   float eigenvalue = 0;
+   float eigenvector[CHANNEL_COUNT];
+
+   if (threadIdx.x < gmmK) {
+     float* matrix = g_batch_gmm + gmm_idx * GMM_COMPONENT_COUNT + (CHANNEL_COUNT + 1);
+     largest_eigenpair(matrix, eigenvector, &eigenvalue);
+   }
+
+   float max_value = eigenvalue;
+
+   max_value = max(max_value, __shfl_xor_sync(0xffffffff, max_value, 16));
+   max_value = max(max_value, __shfl_xor_sync(0xffffffff, max_value, 8));
+   max_value = max(max_value, __shfl_xor_sync(0xffffffff, max_value, 4));
+   max_value = max(max_value, __shfl_xor_sync(0xffffffff, max_value, 2));
+   max_value = max(max_value, __shfl_xor_sync(0xffffffff, max_value, 1));
+
+   if (max_value == eigenvalue) {
+     GMMSplit_t split;
+
+     float* average_feature = gmm + gmm_idx * GMM_COMPONENT_COUNT + 1;
+
+     split.idx = threadIdx.x;
+     split.threshold = scalar_prod(average_feature, eigenvector);
+
+     for (int i = 0; i < CHANNEL_COUNT; i++) {
+       split.eigenvector[i] = eigenvector[i];
+     }
+
+     g_batch_gmmSplit[threadIdx.y] = split;
+   }
+ }
+
+ #define DO_SPLIT_DEGENERACY 4
+
+ __global__ void GMMDoSplit(const GMMSplit_t* gmmSplit, int k, const float* image, int* alpha, int element_count) {
+   __shared__ GMMSplit_t s_gmmSplit[MIXTURE_COUNT];
+
+   int batch_index = blockIdx.z;
+
+   const GMMSplit_t* g_batch_gmmSplit = gmmSplit + batch_index * MIXTURE_COUNT;
+   const float* g_batch_image = image + batch_index * element_count * CHANNEL_COUNT;
+   int* g_batch_alpha = alpha + batch_index * element_count;
+
+   int* s_linear = (int*)s_gmmSplit;
+   int* g_linear = (int*)g_batch_gmmSplit;
+
+   if (threadIdx.x < MIXTURE_COUNT * sizeof(GMMSplit_t)) {
+     s_linear[threadIdx.x] = g_linear[threadIdx.x];
+   }
+
+   __syncthreads();
+
+   int index = threadIdx.x + blockIdx.x * BLOCK_SIZE * DO_SPLIT_DEGENERACY;
+
+   for (int i = 0; i < DO_SPLIT_DEGENERACY; i++) {
+     index += BLOCK_SIZE;
+
+     if (index < element_count) {
+       int my_alpha = g_batch_alpha[index];
+
+       if (my_alpha != -1) {
+         int select = my_alpha & 15;
+         int gmm_idx = my_alpha >> 4;
+
+         if (gmm_idx == s_gmmSplit[select].idx) {
+           // in the split cluster now
+           float feature[CHANNEL_COUNT];
+
+           for (int i = 0; i < CHANNEL_COUNT; i++) {
+             feature[i] = g_batch_image[index + i * element_count];
+           }
+
+           float value = scalar_prod(s_gmmSplit[select].eigenvector, feature);
+
+           if (value > s_gmmSplit[select].threshold) {
+             // assign pixel to new cluster
+             g_batch_alpha[index] = k + select;
+           }
+         }
+       }
+     }
+   }
+ }
+
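Review note: the kernels read each label (`alpha`) as a packed integer. The low 4 bits select the mixture (`my_alpha & 15`), the remaining bits select the Gaussian within that mixture (`my_alpha >> 4`), and `-1` marks an unlabelled element. The mask suggests at most 16 mixtures can be represented. The helpers below are hypothetical names that spell out this encoding on the host:

```cpp
#include <cassert>

// Illustration only: the bit layout implied by `& 15` and `>> 4` in the kernels.
inline int pack_label(int mixture, int component) {
  return (component << 4) | mixture;
}

inline int mixture_of(int alpha) {
  return alpha & 15;
}

inline int component_of(int alpha) {
  return alpha >> 4;
}

// Flat Gaussian index as computed in CovarianceReductionKernel:
// Gaussians of different mixtures are interleaved.
inline int gaussian_index(int alpha, int mixture_count) {
  return mixture_of(alpha) + component_of(alpha) * mixture_count;
}
```

Under this reading, the assignment `g_batch_alpha[index] = k + select` in `GMMDoSplit`, where the host passes `k` as the current per-mixture Gaussian count shifted left by 4, is equivalent to `pack_label(select, new_component)`.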
+ // Single block, 32xMIXTURE_COUNT
+ __global__ void GMMcommonTerm(float* g_gmm) {
+   int batch_index = blockIdx.z;
+
+   float* g_batch_gmm = g_gmm + batch_index * GMM_COUNT * GMM_COMPONENT_COUNT;
+
+   int gmm_index = (threadIdx.x * MIXTURE_COUNT) + threadIdx.y;
+
+   float gmm_n = threadIdx.x < MIXTURE_SIZE ? g_batch_gmm[gmm_index * GMM_COMPONENT_COUNT] : 0.0f;
+
+   float sum = gmm_n;
+
+   sum += __shfl_xor_sync(0xffffffff, sum, 1);
+   sum += __shfl_xor_sync(0xffffffff, sum, 2);
+   sum += __shfl_xor_sync(0xffffffff, sum, 4);
+   sum += __shfl_xor_sync(0xffffffff, sum, 8);
+   sum += __shfl_xor_sync(0xffffffff, sum, 16);
+
+   if (threadIdx.x < MIXTURE_SIZE) {
+     float det = g_batch_gmm[gmm_index * GMM_COMPONENT_COUNT + MATRIX_COMPONENT_COUNT] + EPSILON;
+     float commonTerm = det > 0.0f ? gmm_n / (sqrtf(det) * sum) : gmm_n / sum;
+
+     g_batch_gmm[gmm_index * GMM_COMPONENT_COUNT + MATRIX_COMPONENT_COUNT] = commonTerm;
+   }
+ }
+
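Review note: `GMMcommonTerm` sums the per-Gaussian counts with five `__shfl_xor_sync` steps (masks 1, 2, 4, 8, 16). Unlike the `__shfl_down_sync` pattern used in the covariance kernels, where only lane 0 ends with the full sum, this butterfly pattern leaves the total in every lane, so each thread can divide by `sum` directly. A host-side emulation over a 32-element array, with a hypothetical function name, shows the effect:

```cpp
#include <array>
#include <cassert>

// Host-side emulation of an XOR (butterfly) warp reduction: at each step
// every lane adds the value held by the lane whose index differs in one bit.
std::array<float, 32> butterfly_sum(std::array<float, 32> lanes) {
  for (int mask = 1; mask < 32; mask <<= 1) {
    std::array<float, 32> next{};
    for (int lane = 0; lane < 32; lane++) {
      next[lane] = lanes[lane] + lanes[lane ^ mask];
    }
    lanes = next;
  }
  return lanes; // every element now holds the sum of all 32 inputs
}
```

In the kernel the launch shape is `dim3(BLOCK_SIZE, MIXTURE_COUNT)` with `BLOCK_SIZE` equal to 32, so, assuming the usual x-fastest thread ordering, each warp corresponds to one mixture and the reduction runs over that mixture's Gaussians.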
+ __device__ float GMMTerm(float* feature, const float* gmm) {
+   const float* average_feature = gmm + 1;
+   const float* matrix = gmm + CHANNEL_COUNT + 1;
+
+   float diff[CHANNEL_COUNT];
+
+   for (int i = 0; i < CHANNEL_COUNT; i++) {
+     diff[i] = feature[i] - average_feature[i];
+   }
+
+   float value = 0.0f;
+
+   for (int index = 0, i = 0; i < CHANNEL_COUNT; i++) {
+     for (int j = i; j < CHANNEL_COUNT; j++, index++) {
+       float term = diff[i] * diff[j] * matrix[index];
+
+       value += i == j ? term : 2 * term;
+     }
+   }
+
+   return gmm[MATRIX_COMPONENT_COUNT] * expf(-0.5 * value);
+ }
+
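Review note: `GMMTerm` evaluates the Gaussian term for one feature vector from a packed parameter record: slot 0 is the count, slots 1 to C are the mean, the next C(C+1)/2 slots are the upper triangle of the (inverted) covariance, and the last slot is the precomputed common factor. Off-diagonal entries are counted twice because only the upper triangle is stored. A host-side sketch of the same computation, with a hypothetical function name and the channel count taken from the vector length rather than a macro, is:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Host-side illustration only: common_term * exp(-0.5 * d^T * A * d), where
// d = feature - mean and A is stored as a packed upper triangle.
float gmm_term(const std::vector<float>& feature, const std::vector<float>& gmm) {
  const int channels = static_cast<int>(feature.size());
  const float* mean = gmm.data() + 1;
  const float* matrix = gmm.data() + channels + 1;

  std::vector<float> diff(channels);
  for (int i = 0; i < channels; i++) {
    diff[i] = feature[i] - mean[i];
  }

  float value = 0.0f;
  int index = 0;
  for (int i = 0; i < channels; i++) {
    for (int j = i; j < channels; j++, index++) {
      const float term = diff[i] * diff[j] * matrix[index];
      value += (i == j) ? term : 2.0f * term; // symmetric entry appears twice
    }
  }

  // The common factor sits right after the packed (C+1) x (C+1) triangle.
  const int matrix_component_count = (channels + 1) * (channels + 2) / 2;
  return gmm[matrix_component_count] * std::exp(-0.5f * value);
}
```

A feature equal to the mean gives a zero exponent, so the result is just the stored common factor.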
+ __global__ void GMMDataTermKernel(const float* image, const float* gmm, float* output, int element_count) {
+   int batch_index = blockIdx.z;
+
+   const float* g_batch_image = image + batch_index * element_count * CHANNEL_COUNT;
+   const float* g_batch_gmm = gmm + batch_index * GMM_COUNT * GMM_COMPONENT_COUNT;
+   float* g_batch_output = output + batch_index * element_count * MIXTURE_COUNT;
+
+   int index = blockIdx.x * blockDim.x + threadIdx.x;
+
+   if (index >= element_count)
+     return;
+
+   float feature[CHANNEL_COUNT];
+
+   for (int i = 0; i < CHANNEL_COUNT; i++) {
+     feature[i] = g_batch_image[index + i * element_count];
+   }
+
+   float weights[MIXTURE_COUNT];
+   float weight_total = 0.0f;
+
+   for (int i = 0; i < MIXTURE_COUNT; i++) {
+     float mixture_weight = 0.0f;
+
+     for (int j = 0; j < MIXTURE_SIZE; j++) {
+       mixture_weight += GMMTerm(feature, &g_batch_gmm[(MIXTURE_COUNT * j + i) * GMM_COMPONENT_COUNT]);
+     }
+
+     weights[i] = mixture_weight;
+     weight_total += mixture_weight;
+   }
+
+   for (int i = 0; i < MIXTURE_COUNT; i++) {
+     // protecting against pixels with 0 in all mixtures
+     float final_weight = weight_total > 0.0f ? weights[i] / weight_total : 0.0f;
+     g_batch_output[index + i * element_count] = final_weight;
+   }
+ }
+
+ #define THREADS 512
+ #define WARPS 16
+ #define BLOCK (WARPS << 5)
+ #define LOAD 4
+
+ void GMMInitialize(
+     const float* image,
+     int* alpha,
+     float* gmm,
+     float* scratch_mem,
+     unsigned int batch_count,
+     unsigned int element_count) {
+   unsigned int block_count = TILE(element_count, BLOCK * LOAD);
+
+   float* block_gmm_scratch = scratch_mem;
+   GMMSplit_t* gmm_split_scratch = (GMMSplit_t*)scratch_mem;
+
+   unsigned int gmm_N = MIXTURE_COUNT * MIXTURE_SIZE;
+
+   for (unsigned int k = MIXTURE_COUNT; k < gmm_N; k += MIXTURE_COUNT) {
+     for (unsigned int i = 0; i < k; ++i) {
+       CovarianceReductionKernel<WARPS, LOAD>
+           <<<{block_count, 1, batch_count}, BLOCK>>>(i, image, alpha, block_gmm_scratch, element_count);
+     }
+
+     CovarianceFinalizationKernel<WARPS, false><<<{k, 1, batch_count}, BLOCK>>>(block_gmm_scratch, gmm, block_count);
+
+     GMMFindSplit<<<{1, 1, batch_count}, dim3(BLOCK_SIZE, MIXTURE_COUNT)>>>(gmm_split_scratch, k / MIXTURE_COUNT, gmm);
+     GMMDoSplit<<<{TILE(element_count, BLOCK_SIZE * DO_SPLIT_DEGENERACY), 1, batch_count}, BLOCK_SIZE>>>(
+         gmm_split_scratch, (k / MIXTURE_COUNT) << 4, image, alpha, element_count);
+   }
+ }
+
+ void GMMUpdate(
+     const float* image,
+     int* alpha,
+     float* gmm,
+     float* scratch_mem,
+     unsigned int batch_count,
+     unsigned int element_count) {
+   unsigned int block_count = TILE(element_count, BLOCK * LOAD);
+
+   float* block_gmm_scratch = scratch_mem;
+
+   unsigned int gmm_N = MIXTURE_COUNT * MIXTURE_SIZE;
+
+   for (unsigned int i = 0; i < gmm_N; ++i) {
+     CovarianceReductionKernel<WARPS, LOAD>
+         <<<{block_count, 1, batch_count}, BLOCK>>>(i, image, alpha, block_gmm_scratch, element_count);
+   }
+
+   CovarianceFinalizationKernel<WARPS, true><<<{gmm_N, 1, batch_count}, BLOCK>>>(block_gmm_scratch, gmm, block_count);
+
+   GMMcommonTerm<<<{1, 1, batch_count}, dim3(BLOCK_SIZE, MIXTURE_COUNT)>>>(gmm);
+ }
+
+ void GMMDataTerm(
+     const float* image,
+     const float* gmm,
+     float* output,
+     unsigned int batch_count,
+     unsigned int element_count) {
+   dim3 block(BLOCK_SIZE, 1);
+   dim3 grid(TILE(element_count, BLOCK_SIZE), 1, batch_count);
+
+   GMMDataTermKernel<<<grid, block>>>(image, gmm, output, element_count);
+ }
+
+ void learn_cuda(
+     const float* input,
+     const int* labels,
+     float* gmm,
+     float* scratch_memory,
+     unsigned int batch_count,
+     unsigned int element_count) {
+   int* alpha = (int*)scratch_memory;
+   float* scratch_mem = scratch_memory + batch_count * element_count;
+
+   cudaMemcpyAsync(alpha, labels, batch_count * element_count * sizeof(int), cudaMemcpyDeviceToDevice);
+
+   GMMInitialize(input, alpha, gmm, scratch_mem, batch_count, element_count);
+   GMMUpdate(input, alpha, gmm, scratch_mem, batch_count, element_count);
+ }
+
+ void apply_cuda(
+     const float* gmm,
+     const float* input,
+     float* output,
+     unsigned int batch_count,
+     unsigned int element_count) {
+   GMMDataTerm(input, gmm, output, batch_count, element_count);
+ }
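
Note on the packed covariance layout: `GMMTerm` in this file walks the upper triangle of a symmetric matrix stored row by row and doubles every off-diagonal term. The host-side C++ sketch below mirrors that loop so the indexing can be checked without a GPU. It is only an illustration under stated assumptions: `kChannels`, `kPacked`, `packed_quadratic_form`, and `dense_quadratic_form` are hypothetical names that do not appear in the MONAI sources, and the real `CHANNEL_COUNT` is fixed at build time rather than set to 3.

```cpp
// Assumed channel count, for illustration only (stand-in for CHANNEL_COUNT).
constexpr int kChannels = 3;
// Number of entries in the upper triangle, diagonal included.
constexpr int kPacked = kChannels * (kChannels + 1) / 2;

// Evaluate d^T A d where the symmetric matrix A is stored as its upper
// triangle, row by row. Off-diagonal entries are counted twice because
// A[i][j] and A[j][i] share one packed slot.
float packed_quadratic_form(const float* d, const float* packed) {
  float value = 0.0f;
  for (int index = 0, i = 0; i < kChannels; i++) {
    for (int j = i; j < kChannels; j++, index++) {
      float term = d[i] * d[j] * packed[index];
      value += (i == j) ? term : 2.0f * term;
    }
  }
  return value;
}

// Reference implementation over the full (dense) symmetric matrix.
float dense_quadratic_form(const float* d, const float a[kChannels][kChannels]) {
  float value = 0.0f;
  for (int i = 0; i < kChannels; i++) {
    for (int j = 0; j < kChannels; j++) {
      value += d[i] * a[i][j] * d[j];
    }
  }
  return value;
}
```

For `d = (1, 2, 3)` and the packed array `{2, 1, 0, 3, 1, 4}` (the upper triangle of the matrix with rows `2 1 0`, `1 3 1`, `0 1 4`), both functions return 66.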
my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/_extensions/gmm/gmm_cuda_linalg.cuh ADDED
@@ -0,0 +1,144 @@
+ /*
+ Copyright (c) MONAI Consortium
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+ */
+
+ __device__ void to_square(float in[SUB_MATRIX_COMPONENT_COUNT], float out[CHANNEL_COUNT][CHANNEL_COUNT]) {
+   for (int index = 0, i = 0; i < CHANNEL_COUNT; i++) {
+     for (int j = i; j < CHANNEL_COUNT; j++, index++) {
+       out[i][j] = in[index];
+       out[j][i] = in[index];
+     }
+   }
+ }
+
+ __device__ void to_triangle(float in[CHANNEL_COUNT][CHANNEL_COUNT], float out[SUB_MATRIX_COMPONENT_COUNT]) {
+   for (int index = 0, i = 0; i < CHANNEL_COUNT; i++) {
+     for (int j = i; j < CHANNEL_COUNT; j++, index++) {
+       out[index] = in[j][i];
+     }
+   }
+ }
+
+ __device__ void cholesky(float in[CHANNEL_COUNT][CHANNEL_COUNT], float out[CHANNEL_COUNT][CHANNEL_COUNT]) {
+   for (int i = 0; i < CHANNEL_COUNT; i++) {
+     for (int j = 0; j < i + 1; j++) {
+       float sum = 0.0f;
+
+       for (int k = 0; k < j; k++) {
+         sum += out[i][k] * out[j][k];
+       }
+
+       if (i == j) {
+         out[i][j] = sqrtf(in[i][i] - sum);
+       } else {
+         out[i][j] = (in[i][j] - sum) / out[j][j];
+       }
+     }
+   }
+ }
+
+ __device__ float chol_det(float in[CHANNEL_COUNT][CHANNEL_COUNT]) {
+   float det = 1.0f;
+
+   for (int i = 0; i < CHANNEL_COUNT; i++) {
+     det *= in[i][i];
+   }
+
+   return det * det;
+ }
+
+ __device__ void chol_inv(float in[CHANNEL_COUNT][CHANNEL_COUNT], float out[CHANNEL_COUNT][CHANNEL_COUNT]) {
+   // Invert cholesky matrix
+   for (int i = 0; i < CHANNEL_COUNT; i++) {
+     in[i][i] = 1.0f / (in[i][i] + 0.0001f);
+
+     for (int j = 0; j < i; j++) {
+       float sum = 0.0f;
+
+       for (int k = j; k < i; k++) {
+         sum += in[i][k] * in[k][j];
+       }
+
+       in[i][j] = -in[i][i] * sum;
+     }
+   }
+
+   // Dot with transpose of self
+   for (int i = 0; i < CHANNEL_COUNT; i++) {
+     for (int j = 0; j < CHANNEL_COUNT; j++) {
+       out[i][j] = 0.0f;
+
+       for (int k = max(i, j); k < CHANNEL_COUNT; k++) {
+         out[i][j] += in[k][i] * in[k][j];
+       }
+     }
+   }
+ }
+
+ __device__ void normalize(float* v) {
+   float norm = 0.0f;
+
+   for (int i = 0; i < CHANNEL_COUNT; i++) {
+     norm += v[i] * v[i];
+   }
+
+   norm = 1.0f / sqrtf(norm);
+
+   for (int i = 0; i < CHANNEL_COUNT; i++) {
+     v[i] *= norm;
+   }
+ }
+
+ __device__ float scalar_prod(float* a, float* b) {
+   float product = 0.0f;
+
+   for (int i = 0; i < CHANNEL_COUNT; i++) {
+     product += a[i] * b[i];
+   }
+
+   return product;
+ }
+
+ __device__ void largest_eigenpair(const float* M, float* evec, float* eval) {
+   float scratch[CHANNEL_COUNT];
+
+   for (int i = 0; i < CHANNEL_COUNT; i++) {
+     scratch[i] = i + 1;
+   }
+
+   for (int itr = 0; itr < 10; itr++) {
+     *eval = 0.0f;
+
+     for (int i = 0; i < CHANNEL_COUNT; i++) {
+       int index = i;
+
+       evec[i] = 0.0f;
+
+       for (int j = 0; j < CHANNEL_COUNT; j++) {
+         evec[i] += M[index] * scratch[j];
+
+         if (j < i) {
+           index += CHANNEL_COUNT - (j + 1);
+         } else {
+           index += 1;
+         }
+       }
+
+       *eval = max(*eval, evec[i]);
+     }
+
+     for (int i = 0; i < CHANNEL_COUNT; i++) {
+       evec[i] /= *eval;
+       scratch[i] = evec[i];
+     }
+   }
+ }
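
Note on the Cholesky helpers: `cholesky` and `chol_det` in this header factor a symmetric positive-definite matrix as A = L L^T and then recover det(A) as the squared product of the diagonal of L. The host-side C++ sketch below reproduces those two steps so they can be checked on a CPU. It is a sketch under assumptions, not the library's code: `kDim`, `cholesky_lower`, and `det_from_cholesky` are hypothetical names, `kDim = 3` stands in for `CHANNEL_COUNT`, and unlike the device routine this version clears the strict upper triangle of the output.

```cpp
#include <cmath>

// Assumed matrix size, for illustration only (stand-in for CHANNEL_COUNT).
constexpr int kDim = 3;

// Compute the lower-triangular factor l with a = l * l^T.
// The input must be symmetric positive definite; no check is performed.
void cholesky_lower(const float a[kDim][kDim], float l[kDim][kDim]) {
  // Clear the output so the strict upper triangle is well defined.
  for (int i = 0; i < kDim; i++) {
    for (int j = 0; j < kDim; j++) {
      l[i][j] = 0.0f;
    }
  }

  for (int i = 0; i < kDim; i++) {
    for (int j = 0; j <= i; j++) {
      float sum = 0.0f;
      for (int k = 0; k < j; k++) {
        sum += l[i][k] * l[j][k];
      }
      if (i == j) {
        l[i][j] = std::sqrt(a[i][i] - sum);
      } else {
        l[i][j] = (a[i][j] - sum) / l[j][j];
      }
    }
  }
}

// det(a) = det(l)^2, and det(l) is the product of its diagonal entries.
float det_from_cholesky(const float l[kDim][kDim]) {
  float prod = 1.0f;
  for (int i = 0; i < kDim; i++) {
    prod *= l[i][i];
  }
  return prod * prod;
}
```

For the matrix with rows `4 2 0`, `2 10 3`, `0 3 5`, the factor has rows `2 0 0`, `1 3 0`, `0 1 2`, so the determinant is (2 * 3 * 2)^2 = 144.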
my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/engines/__pycache__/__init__.cpython-38.pyc ADDED
Binary file (855 Bytes)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/engines/__pycache__/evaluator.cpython-38.pyc ADDED
Binary file (19.7 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/engines/__pycache__/multi_gpu_supervised_trainer.cpython-38.pyc ADDED
Binary file (5.46 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/engines/__pycache__/trainer.cpython-38.pyc ADDED
Binary file (16.8 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/engines/__pycache__/utils.cpython-38.pyc ADDED
Binary file (9.26 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/engines/__pycache__/workflow.cpython-38.pyc ADDED
Binary file (12.6 kB)
 
my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/__init__.cpython-38.pyc ADDED
Binary file (1.97 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/checkpoint_saver.cpython-38.pyc ADDED
Binary file (12.6 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/classification_saver.cpython-38.pyc ADDED
Binary file (6.61 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/confusion_matrix.cpython-38.pyc ADDED
Binary file (3.68 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/decollate_batch.cpython-38.pyc ADDED
Binary file (3.58 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/earlystop_handler.cpython-38.pyc ADDED
Binary file (3.8 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/garbage_collector.cpython-38.pyc ADDED
Binary file (2.7 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/hausdorff_distance.cpython-38.pyc ADDED
Binary file (3.3 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/ignite_metric.cpython-38.pyc ADDED
Binary file (4.8 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/lr_schedule_handler.cpython-38.pyc ADDED
Binary file (3.11 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/mean_dice.cpython-38.pyc ADDED
Binary file (2.65 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/metric_logger.cpython-38.pyc ADDED
Binary file (5.36 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/metrics_saver.cpython-38.pyc ADDED
Binary file (7.44 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/mlflow_handler.cpython-38.pyc ADDED
Binary file (8.28 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/nvtx_handlers.cpython-38.pyc ADDED
Binary file (7.54 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/postprocessing.cpython-38.pyc ADDED
Binary file (2.64 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/probability_maps.cpython-38.pyc ADDED
Binary file (4.15 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/regression_metrics.cpython-38.pyc ADDED
Binary file (8.61 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/roc_auc.cpython-38.pyc ADDED
Binary file (2.64 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/smartcache_handler.cpython-38.pyc ADDED
Binary file (2.91 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/stats_handler.cpython-38.pyc ADDED
Binary file (10.4 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/surface_distance.cpython-38.pyc ADDED
Binary file (3.03 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/tensorboard_handlers.cpython-38.pyc ADDED
Binary file (18.3 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/handlers/__pycache__/validation_handler.cpython-38.pyc ADDED
Binary file (2.82 kB)
 
my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/inferers/__pycache__/__init__.cpython-38.pyc ADDED
Binary file (363 Bytes)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/inferers/__pycache__/inferer.cpython-38.pyc ADDED
Binary file (13.2 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/inferers/__pycache__/utils.cpython-38.pyc ADDED
Binary file (11.4 kB)
 
my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/metrics/__pycache__/__init__.cpython-38.pyc ADDED
Binary file (1.43 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/metrics/__pycache__/confusion_matrix.cpython-38.pyc ADDED
Binary file (12.1 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/metrics/__pycache__/cumulative_average.cpython-38.pyc ADDED
Binary file (2.17 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/metrics/__pycache__/froc.cpython-38.pyc ADDED
Binary file (4.98 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/metrics/__pycache__/generalized_dice.cpython-38.pyc ADDED
Binary file (6.43 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/metrics/__pycache__/hausdorff_distance.cpython-38.pyc ADDED
Binary file (7.58 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/metrics/__pycache__/meandice.cpython-38.pyc ADDED
Binary file (6.28 kB)

my_container_sandbox/workspace/anaconda3/lib/python3.8/site-packages/monai/metrics/__pycache__/metric.cpython-38.pyc ADDED
Binary file (14.7 kB)