Logs for PR #1046 (2026-02-25T21:25:10.363732+00:00):

=== STATUS: Programs executed successfully: main_mandelbrot, main_sum ===
=== main_mandelbrot stdout (exit code: -11 (segfault after execution)) ===
Found 1 GPUs in 0.31047 sec (CUDA: 0.120406 sec, OpenCL: 0.0375735 sec, Vulkan: 0.152432 sec)
Available devices:
Device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using OpenCL API...
______________________________________________________
Evaluating algorithm #1/3: CPU
algorithm times (in seconds) - 1 values (min=3.32178 10%=3.32178 median=3.32178 90%=3.32178 max=3.32178)
Mandelbrot effective algorithm GFlops: 3.01044 GFlops
saving image to 'mandelbrot CPU.bmp'...
CPU vs CPU average results difference: 0%
______________________________________________________
Evaluating algorithm #2/3: CPU with OpenMP
OpenMP threads: x4 threads
algorithm times (in seconds) - 10 values (min=1.02463 10%=1.02675 median=1.03189 90%=1.03494 max=1.03494)
Mandelbrot effective algorithm GFlops: 9.69094 GFlops
saving image to 'mandelbrot CPU with OpenMP.bmp'...
CPU with OpenMP vs CPU average results difference: 0%
______________________________________________________
Evaluating algorithm #3/3: GPU
Kernels compilation done in 0.0589184 seconds
algorithm times (in seconds) - 10 values (min=0.00427635 10%=0.00427861 median=0.00428206 90%=0.0632552 max=0.0632552)
Mandelbrot effective algorithm GFlops: 2335.33 GFlops
saving image to 'mandelbrot GPU.bmp'...
GPU vs CPU average results difference: 0.942446%
=== main_sum stdout (exit code: -11 (segfault after execution)) ===
Found 1 GPUs in 0.289337 sec (CUDA: 0.124282 sec, OpenCL: 0.0400632 sec, Vulkan: 0.124931 sec)
Available devices:
Device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using OpenCL API...
PCI-E median bandwidth: 8.39736 GB/s
______________________________________________________
Evaluating algorithm #1/6: CPU
algorithm times (in seconds) - 10 values (min=0.0362301 10%=0.0365077 median=0.0370137 90%=0.0376394 max=0.0376394)
sum median effective algorithm bandwidth: 10.0646 GB/s
______________________________________________________
Evaluating algorithm #2/6: CPU with OpenMP
algorithm times (in seconds) - 10 values (min=0.0157405 10%=0.0161578 median=0.0166301 90%=0.0170072 max=0.0170072)
sum median effective algorithm bandwidth: 22.4009 GB/s
______________________________________________________
Evaluating algorithm #3/6: 01 atomicAdd from each workItem
Kernels compilation done in 0.0540067 seconds
algorithm times (in seconds) - 10 values (min=0.00275269 10%=0.00275309 median=0.00275517 90%=0.0568703 max=0.0568703)
sum median effective algorithm bandwidth: 135.211 GB/s
______________________________________________________
Evaluating algorithm #4/6: 02 atomicAdd but each workItem loads K values
Kernels compilation done in 0.0452495 seconds
algorithm times (in seconds) - 10 values (min=0.00146507 10%=0.00146518 median=0.00146651 90%=0.0468228 max=0.0468228)
sum median effective algorithm bandwidth: 254.024 GB/s
______________________________________________________
Evaluating algorithm #5/6: 03 local memory and atomicAdd from master thread
Kernels compilation done in 0.0814909 seconds
algorithm times (in seconds) - 10 values (min=0.00789574 10%=0.00789604 median=0.00789708 90%=0.089494 max=0.089494)
sum median effective algorithm bandwidth: 47.173 GB/s
______________________________________________________
Evaluating algorithm #6/6: 04 local reduction
Kernels compilation done in 0.0469491 seconds
algorithm times (in seconds) - 10 values (min=0.00740907 10%=0.00741016 median=0.0074135 90%=0.0544471 max=0.0544471)
sum median effective algorithm bandwidth: 50.2501 GB/s