Logs for PR #1010 (2026-01-21T20:42:14.739451+00:00):

=== STATUS: Programs executed successfully: main_mandelbrot, main_sum ===
=== main_mandelbrot stdout (exit code: -11 (segfault after execution)) ===
Found 1 GPUs in 8.54059 sec (CUDA: 0.115785 sec, OpenCL: 0.70738 sec, Vulkan: 7.71736 sec)
Available devices:
Device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using OpenCL API...
______________________________________________________
Evaluating algorithm #1/3: CPU
algorithm times (in seconds) - 1 values (min=3.40858 10%=3.40858 median=3.40858 90%=3.40858 max=3.40858)
Mandelbrot effective algorithm GFlops: 2.93378 GFlops
saving image to 'mandelbrot CPU.bmp'...
CPU vs CPU average results difference: 0%
______________________________________________________
Evaluating algorithm #2/3: CPU with OpenMP
OpenMP threads: x4 threads
algorithm times (in seconds) - 10 values (min=1.0495 10%=1.0496 median=1.04992 90%=1.06077 max=1.06077)
Mandelbrot effective algorithm GFlops: 9.52457 GFlops
saving image to 'mandelbrot CPU with OpenMP.bmp'...
CPU with OpenMP vs CPU average results difference: 0%
______________________________________________________
Evaluating algorithm #3/3: GPU
Kernels compilation done in 3.58923 seconds
algorithm times (in seconds) - 10 values (min=0.00427744 10%=0.00428059 median=0.00428987 90%=3.59359 max=3.59359)
Mandelbrot effective algorithm GFlops: 2331.07 GFlops
saving image to 'mandelbrot GPU.bmp'...
GPU vs CPU average results difference: 0.942446%
=== main_sum stdout (exit code: -11 (segfault after execution)) ===
Found 1 GPUs in 0.334489 sec (CUDA: 0.127823 sec, OpenCL: 0.0386513 sec, Vulkan: 0.167954 sec)
Available devices:
Device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using OpenCL API...
PCI-E median bandwidth - 8.45462 GB/s
______________________________________________________
Evaluating algorithm #1/6: CPU
algorithm times (in seconds) - 10 values (min=0.0364211 10%=0.036488 median=0.0368963 90%=0.0372249 max=0.0372249)
sum median effective algorithm bandwidth: 10.0967 GB/s
______________________________________________________
Evaluating algorithm #2/6: CPU with OpenMP
algorithm times (in seconds) - 10 values (min=0.016925 10%=0.0169256 median=0.0172708 90%=0.0178131 max=0.0178131)
sum median effective algorithm bandwidth: 21.5699 GB/s
______________________________________________________
Evaluating algorithm #3/6: 01 atomicAdd from each workItem
Kernels compilation done in 0.067585 seconds
algorithm times (in seconds) - 10 values (min=0.0027527 10%=0.00275295 median=0.00275533 90%=0.0704515 max=0.0704515)
sum median effective algorithm bandwidth: 135.203 GB/s
______________________________________________________
Evaluating algorithm #4/6: 02 atomicAdd but each workItem loads K values
Kernels compilation done in 0.0512642 seconds
algorithm times (in seconds) - 10 values (min=0.00146345 10%=0.001464 median=0.00146541 90%=0.0528408 max=0.0528408)
sum median effective algorithm bandwidth: 254.215 GB/s
______________________________________________________
Evaluating algorithm #5/6: 03 local memory and atomicAdd from master thread
Kernels compilation done in 0.0498652 seconds
algorithm times (in seconds) - 10 values (min=0.010679 10%=0.0106874 median=0.0110821 90%=0.0606583 max=0.0606583)
sum median effective algorithm bandwidth: 33.6153 GB/s
______________________________________________________
Evaluating algorithm #6/6: 04 local reduction
Kernels compilation done in 0.0483242 seconds
algorithm times (in seconds) - 10 values (min=0.0239867 10%=0.0239924 median=0.024605 90%=0.0908858 max=0.0908858)
sum median effective algorithm bandwidth: 15.1404 GB/s