A quick thread on OpenCL and GPGPU in general (CUDA and other proprietary SDKs): do you have any feedback on whether it works properly or not?

Currently, with the free driver and Mesa, programs crash when they request OpenCL on the graphics card. The result is usually a segmentation fault, or the program simply does not run (some benchmarks).

Running on the CPU through pocl, on the other hand, works with an i7 4771. However, "beignet", which comes from Intel, does not work for me.

So feel free to share your experience; it will probably help move things forward.

If you need help at least getting your hardware detected, feel free to ask as well.
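
For the detection part, a first check that needs nothing beyond Python's standard library is to look at which implementations are registered with the ICD loader. This is only a sketch: it assumes the usual Linux location `/etc/OpenCL/vendors` (your distribution may use another directory), and the function name `list_icd_entries` is made up for the example.

```python
from pathlib import Path


def list_icd_entries(vendors_dir="/etc/OpenCL/vendors"):
    """Return {icd file name: library named inside it} for one directory."""
    entries = {}
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return entries  # no ICD directory: nothing is registered here
    for icd_file in sorted(directory.glob("*.icd")):
        # Each .icd file is a small text file holding a library name or path.
        entries[icd_file.name] = icd_file.read_text().strip()
    return entries


if __name__ == "__main__":
    found = list_icd_entries()
    if not found:
        print("No .icd file found: the ICD loader has nothing to load")
    for name, library in found.items():
        print(f"{name} -> {library}")
```

If an implementation you installed (pocl, beignet, Mesa, the Nvidia driver) has no entry here, programs going through the ICD loader will most likely not see it.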

Edit:
- So I'll add that CUDA works very well.
- It would be good to have feedback on AMD's proprietary driver and SDK.
- Also on Intel's proprietary SDK.

- Pocl does not work perfectly
- Beignet (for Intel chips) crashes
I have an Nvidia GPU, and CUDA installed.
How do I test this?
You can test it with these small programs:
https://code.google.com/p/ocltoys/
and with:
pyrit benchmark (in the repositories)
clpeak
Blender (for CUDA, I believe you just need to set rendering to the CUDA GPU in the preferences)

The Phoronix benchmark offers others:
ViennaCL
X264-CL

There is also darktable, but I don't know whether it supports CUDA.

As for the problems with Radeon cards and Mesa, it seems that LLVM 3.5 is the cause. To be confirmed...
6 days later
Test done with the Intel HD 4600 and "beignet" (to use the integrated chip): it still crashes.

This seems to be related to LLVM 😢.
I have been using CUDA for some time, with a GeForce GTX 780 (2304 CUDA cores).
For pure computation with NAMD (molecular dynamics) it works wonderfully, and likewise with GROMACS.
As for visualization, the only software I use that offers it is VMD, and there too it works well.

For my Nvidia card I use the driver via akmod-nvidia, the latest version available,
along with the CUDA repository for F20, available from the Nvidia website.
OK, noted.

It would be good to have feedback on AMD's proprietary SDK.

Since I test the development of the free stack, I have nothing to run a proprietary version on, especially as I use the development version of Fedora.

I have reported the bugs on Bugzilla; from the information I have, they have been acknowledged.

If I get hold of the last hard drive my system is missing so that I can have a stable installation, I'll take the opportunity to add tests of the proprietary drivers to the list.
14 days later
a month later
AMD's OpenCL improvements for Blender should arrive with version 2.75.

Can't wait for it to be available so we can test it :-P.
2 years later
Hi VINDICATORs, I'm going to need your help.

I am testing OpenCL with the aim of doing high-performance computing with Python. First question: do I need to install the devel packages, or are the non-devel packages enough? Second question: why doesn't clpeak work :-? ?
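
Since the goal is Python, here is roughly how the platforms and devices can be checked from Python itself. It is a minimal sketch that assumes the third-party `pyopencl` binding is installed (it is not part of the package lists in this thread); the helper name `list_opencl_devices` is made up, and it simply returns an empty list when the binding or the platforms are missing.

```python
def list_opencl_devices():
    """Return (platform name, device name) pairs, or [] if nothing is usable."""
    try:
        import pyopencl as cl  # third-party binding, assumed to be installed
    except ImportError:
        return []
    try:
        platforms = cl.get_platforms()
    except Exception:
        # Raised for example when the ICD loader finds no platform at all.
        return []
    pairs = []
    for platform in platforms:
        try:
            devices = platform.get_devices()
        except Exception:
            devices = []  # a platform may expose no usable device
        for device in devices:
            pairs.append((platform.name, device.name))
    return pairs


if __name__ == "__main__":
    for platform_name, device_name in list_opencl_devices():
        print(f"{platform_name}: {device_name}")
```

If this prints the same devices as clinfo, the Python side sees the same platforms as the rest of the system.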

I have two machines concerned:

First machine, running Fedora 24
CPU: Intel(R) Xeon(R) CPU X5660 @ 2.80GHz, 6 cores + hyperthreading
3 GPUs: GT 730 + GTX 780 + GTX 1060
RAM: 24 GB
Installed packages: cuda-8.0.61-1.x86_64, pocl-0.13-4.fc24.x86_64

Here is the output of clinfo
clinfo: /usr/local/cuda-8.0/targets/x86_64-linux/lib/libOpenCL.so.1: no version information available (required by clinfo)
Number of platforms                               2
  Platform Name                                   NVIDIA CUDA
  Platform Vendor                                 NVIDIA Corporation
  Platform Version                                OpenCL 1.2 CUDA 8.0.0
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer
  Platform Extensions function suffix             NV

  Platform Name                                   Portable Computing Language
  Platform Vendor                                 The pocl project
  Platform Version                                OpenCL 2.0 pocl 0.13, LLVM 3.8.0
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             POCL

  Platform Name                                   NVIDIA CUDA
Number of devices                                 3
  Device Name                                     GeForce GTX 1060 6GB
  Device Vendor                                   NVIDIA Corporation
  Device Vendor ID                                0x10de
  Device Version                                  OpenCL 1.2 CUDA
  Driver Version                                  375.66
  Device OpenCL C Version                         OpenCL C 1.2 
  Device Type                                     GPU
  Device Available                                Yes
  Device Profile                                  FULL_PROFILE
  Device Topology (NV)                            PCI-E, 22:00.0
  Max compute units                               10
  Max clock frequency                             1733MHz
  Compute Capability (NV)                         6.1
  Device Partition                                (core)
    Max number of sub-devices                     1
    Supported partition types                     None
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x64
  Max work group size                             1024
  Compiler Available                              Yes
  Linker Available                                Yes
  Preferred work group size multiple              32
  Warp size (NV)                                  32
  Preferred / native vector sizes                 
    char                                                 1 / 1       
    short                                                1 / 1       
    int                                                  1 / 1       
    long                                                 1 / 1       
    half                                                 0 / 0        (n/a)
    float                                                1 / 1       
    double                                               1 / 1        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Address bits                                    64, Little-Endian
  Global memory size                              6367739904 (5.93GiB)
  Error Correction support                        No
  Max memory allocation                           1591934976 (1.483GiB)
  Unified memory for Host and Device              No
  Integrated memory (NV)                          No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       4096 bits (512 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        163840 (160KiB)
  Global Memory cache line                        128 bytes
  Image support                                   Yes
    Max number of samplers per kernel             32
    Max size for 1D images from buffer            134217728 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             16384x32768 pixels
    Max 3D image size                             16384x16384x16384 pixels
    Max number of read image args                 256
    Max number of write image args                16
  Local memory type                               Local
  Local memory size                               49152 (48KiB)
  Registers per block (NV)                        65536
  Max constant buffer size                        65536 (64KiB)
  Max number of constant args                     9
  Max size of kernel argument                     4352 (4.25KiB)
  Queue properties                                
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Prefer user sync for interop                    No
  Profiling timer resolution                      1000ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Kernel execution timeout (NV)                 No
  Concurrent copy and kernel execution (NV)       Yes
    Number of async copy engines                  2
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer

  Device Name                                     GeForce GTX 780
  Device Vendor                                   NVIDIA Corporation
  Device Vendor ID                                0x10de
  Device Version                                  OpenCL 1.2 CUDA
  Driver Version                                  375.66
  Device OpenCL C Version                         OpenCL C 1.2 
  Device Type                                     GPU
  Device Available                                Yes
  Device Profile                                  FULL_PROFILE
  Device Topology (NV)                            PCI-E, 04:00.0
  Max compute units                               12
  Max clock frequency                             941MHz
  Compute Capability (NV)                         3.5
  Device Partition                                (core)
    Max number of sub-devices                     1
    Supported partition types                     None
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x64
  Max work group size                             1024
  Compiler Available                              Yes
  Linker Available                                Yes
  Preferred work group size multiple              32
  Warp size (NV)                                  32
  Preferred / native vector sizes                 
    char                                                 1 / 1       
    short                                                1 / 1       
    int                                                  1 / 1       
    long                                                 1 / 1       
    half                                                 0 / 0        (n/a)
    float                                                1 / 1       
    double                                               1 / 1        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Address bits                                    64, Little-Endian
  Global memory size                              3167485952 (2.95GiB)
  Error Correction support                        No
  Max memory allocation                           791871488 (755.2MiB)
  Unified memory for Host and Device              No
  Integrated memory (NV)                          No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       4096 bits (512 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        196608 (192KiB)
  Global Memory cache line                        128 bytes
  Image support                                   Yes
    Max number of samplers per kernel             32
    Max size for 1D images from buffer            134217728 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             16384x16384 pixels
    Max 3D image size                             4096x4096x4096 pixels
    Max number of read image args                 256
    Max number of write image args                16
  Local memory type                               Local
  Local memory size                               49152 (48KiB)
  Registers per block (NV)                        65536
  Max constant buffer size                        65536 (64KiB)
  Max number of constant args                     9
  Max size of kernel argument                     4352 (4.25KiB)
  Queue properties                                
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Prefer user sync for interop                    No
  Profiling timer resolution                      1000ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Kernel execution timeout (NV)                 No
  Concurrent copy and kernel execution (NV)       Yes
    Number of async copy engines                  1
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer

  Device Name                                     GeForce GT 730
  Device Vendor                                   NVIDIA Corporation
  Device Vendor ID                                0x10de
  Device Version                                  OpenCL 1.2 CUDA
  Driver Version                                  375.66
  Device OpenCL C Version                         OpenCL C 1.2 
  Device Type                                     GPU
  Device Available                                Yes
  Device Profile                                  FULL_PROFILE
  Device Topology (NV)                            PCI-E, 03:00.0
  Max compute units                               2
  Max clock frequency                             901MHz
  Compute Capability (NV)                         3.5
  Device Partition                                (core)
    Max number of sub-devices                     1
    Supported partition types                     None
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x64
  Max work group size                             1024
  Compiler Available                              Yes
  Linker Available                                Yes
  Preferred work group size multiple              32
  Warp size (NV)                                  32
  Preferred / native vector sizes                 
    char                                                 1 / 1       
    short                                                1 / 1       
    int                                                  1 / 1       
    long                                                 1 / 1       
    half                                                 0 / 0        (n/a)
    float                                                1 / 1       
    double                                               1 / 1        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Address bits                                    64, Little-Endian
  Global memory size                              2098724864 (1.955GiB)
  Error Correction support                        No
  Max memory allocation                           524681216 (500.4MiB)
  Unified memory for Host and Device              No
  Integrated memory (NV)                          No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       4096 bits (512 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        32768 (32KiB)
  Global Memory cache line                        128 bytes
  Image support                                   Yes
    Max number of samplers per kernel             32
    Max size for 1D images from buffer            134217728 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             16384x16384 pixels
    Max 3D image size                             4096x4096x4096 pixels
    Max number of read image args                 256
    Max number of write image args                16
  Local memory type                               Local
  Local memory size                               49152 (48KiB)
  Registers per block (NV)                        65536
  Max constant buffer size                        65536 (64KiB)
  Max number of constant args                     9
  Max size of kernel argument                     4352 (4.25KiB)
  Queue properties                                
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Prefer user sync for interop                    No
  Profiling timer resolution                      1000ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Kernel execution timeout (NV)                 Yes
  Concurrent copy and kernel execution (NV)       Yes
    Number of async copy engines                  1
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer

  Platform Name                                   Portable Computing Language
Number of devices                                 1
  Device Name                                     pthread-Intel(R) Xeon(R) CPU           X5660  @ 2.80GHz
  Device Vendor                                   GenuineIntel
  Device Vendor ID                                0x8086
  Device Version                                  OpenCL 2.0 pocl
  Driver Version                                  0.13
  Device OpenCL C Version                         OpenCL C 2.0
  Device Type                                     CPU, Default
  Device Available                                Yes
  Device Profile                                  FULL_PROFILE
  Max compute units                               12
  Max clock frequency                             2794MHz
  Device Partition                                (core)
    Max number of sub-devices                     12
    Supported partition types                     equally, by counts
  Max work item dimensions                        3
  Max work item sizes                             4096x4096x4096
  Max work group size                             4096
  Compiler Available                              Yes
  Linker Available                                Yes
  Preferred work group size multiple              8
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 8 / 8        (n/a)
    float                                                4 / 4       
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Address bits                                    64, Little-Endian
  Global memory size                              27418066944 (25.54GiB)
  Error Correction support                        No
  Max memory allocation                           27418066944 (25.54GiB)
  Unified memory for Host and Device              Yes
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   Yes
    Fine-grained system sharing                   No
    Atomics                                       Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Preferred alignment for atomics                 
    SVM                                           0 bytes
    Global                                        0 bytes
    Local                                         0 bytes
  Max size for global variable                    0
  Preferred total size of global vars             0
  Global Memory cache type                        Read/Write
  Global Memory cache size                        32768 (32KiB)
  Global Memory cache line                        64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            1713629184 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             32768x32768 pixels
    Max 3D image size                             2048x2048x2048 pixels
    Max number of read image args                 128
    Max number of write image args                128
    Max number of read/write image args           <printDeviceInfo:114: get CL_DEVICE_MAX_READ_WRITE_IMAGE_ARGS : error -30>
  Max number of pipe args                         16
  Max active pipe reservations                    1
  Max pipe packet size                            1024
  Local memory type                               Global
  Local memory size                               27418066944 (25.54GiB)
  Max constant buffer size                        27418066944 (25.54GiB)
  Max number of constant args                     8
  Max size of kernel argument                     1024
  Queue properties (on host)                      
    Out-of-order execution                        No
    Profiling                                     Yes
  Queue properties (on device)                    
    Out-of-order execution                        Yes
    Profiling                                     Yes
    Preferred size                                16384 (16KiB)
    Max size                                      262144 (256KiB)
  Max queues on device                            1
  Max events on device                            1024
  Prefer user sync for interop                    Yes
  Profiling timer resolution                      1ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            Yes
    SPIR versions                                 1.2
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                
  Device Extensions                               cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_spir cl_khr_int64 cl_khr_fp64 cl_khr_int64_base_atomics cl_khr_int64_extended_atomics

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]              Success [NV]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No platform
	NOTE:	your OpenCL library only supports OpenCL 1.2,
		but some installed platforms support OpenCL 2.0.
		Programs using 2.0 features may crash
		or behave unexepectedly
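
The dump above is long, so here is a small sketch that boils clinfo text down to the platform and device names. It relies only on the "Platform Name" and "Device Name" lines as they appear in this output; the function name `summarize_clinfo` and the `SAMPLE` excerpt are just for illustration, and other clinfo versions may format things differently.

```python
import re


def summarize_clinfo(text):
    """Map each platform name to the list of device names printed under it."""
    summary = {}
    current = None
    for line in text.splitlines():
        match = re.match(r"\s*(Platform Name|Device Name)\s{2,}(.+?)\s*$", line)
        if not match:
            continue
        key, value = match.groups()
        if key == "Platform Name":
            current = value
            summary.setdefault(current, [])
        elif current is not None:
            summary[current].append(value)
    return summary


# Excerpt of the lines that matter, copied from the clinfo output.
SAMPLE = """\
  Platform Name                                   NVIDIA CUDA
Number of devices                                 3
  Device Name                                     GeForce GTX 1060 6GB
  Device Name                                     GeForce GTX 780
  Device Name                                     GeForce GT 730
  Platform Name                                   Portable Computing Language
Number of devices                                 1
  Device Name                                     pthread-Intel(R) Xeon(R) CPU           X5660  @ 2.80GHz
"""

if __name__ == "__main__":
    for platform, devices in summarize_clinfo(SAMPLE).items():
        print(platform, "->", devices)
```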
Output of clpeak -p 0
clpeak: /usr/local/cuda-8.0/targets/x86_64-linux/lib/libOpenCL.so.1: no version information available (required by clpeak)
clpeak: /usr/local/cuda-8.0/targets/x86_64-linux/lib/libOpenCL.so.1: no version information available (required by clpeak)

Platform: NVIDIA CUDA
  Device: GeForce GTX 1060 6GB
    Driver version  : 375.66 (Linux x64)
    Compute units   : 10
    Clock frequency : 1733 MHz
Illegal instruction (core dumped)
Output of clpeak -p 1
clpeak: /usr/local/cuda-8.0/targets/x86_64-linux/lib/libOpenCL.so.1: no version information available (required by clpeak)
clpeak: /usr/local/cuda-8.0/targets/x86_64-linux/lib/libOpenCL.so.1: no version information available (required by clpeak)

Platform: Portable Computing Language
  Device: pthread-Intel(R) Xeon(R) CPU           X5660  @ 2.80GHz
    Driver version  : 0.13 (Linux x64)
    Compute units   : 12
    Clock frequency : 2794 MHz
Illegal instruction (core dumped)
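
To repeat this kind of test over several platforms without reading the terminal each time, the way a run ends can be recorded from Python. This is only a sketch using the standard `subprocess` and `signal` modules; the helper names are made up, and the command shown is a harmless stand-in to be replaced by the clpeak call used above.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable string."""
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as return code -N.
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return f"killed by signal {-returncode}"
    return f"exited with status {returncode}"


def run_and_describe(command):
    """Run a command, discard its output, and describe how it ended."""
    completed = subprocess.run(
        command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return describe_exit(completed.returncode)


if __name__ == "__main__":
    # Harmless stand-in; replace with e.g. ["clpeak", "-p", "0"].
    print(run_and_describe([sys.executable, "-c", "pass"]))
```

An "Illegal instruction" crash like the ones above is delivered as SIGILL, so it would be reported as "killed by SIGILL".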
Second machine, running Fedora 25
CPU: Intel Core i7-6500U @ 2.50GHz, 2 cores + hyperthreading
2 GPUs: Intel HD Graphics 520 + AMD Radeon R7 M265
RAM: 16 GB
Installed packages: beignet-1.3.0-4.fc25.x86_64, mesa-libOpenCL-17.0.5-3.fc25.x86_64, pocl-0.14-0.3.git3fef5b5.fc25.x86_64

Here is the output of clinfo
unknown GPU generation
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
Number of platforms                               3
  Platform Name                                   Intel Gen OCL Driver
  Platform Vendor                                 Intel
  Platform Version                                OpenCL 2.0 beignet 1.3
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short cl_khr_gl_sharing
  Platform Extensions function suffix             Intel
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0

  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 Mesa 17.0.5
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA

  Platform Name                                   Portable Computing Language
  Platform Vendor                                 The pocl project
  Platform Version                                OpenCL 2.0 pocl 0.14-pre, LLVM 3.9.1
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             POCL

  Platform Name                                   Intel Gen OCL Driver
Number of devices                                 1
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
  Device Name                                     Intel(R) HD Graphics Skylake ULT GT2
  Device Vendor                                   Intel
  Device Vendor ID                                0x8086
  Device Version                                  OpenCL 2.0 beignet 1.3
  Driver Version                                  1.3
  Device OpenCL C Version                         OpenCL C 2.0 beignet 1.3
  Device Type                                     GPU
  Device Available                                Yes
  Device Profile                                  FULL_PROFILE
  Max compute units                               24
  Max clock frequency                             1000MHz
  Device Partition                                (core)
    Max number of sub-devices                     1
    Supported partition types                     None, None, None
  Max work item dimensions                        3
  Max work item sizes                             512x512x512
  Max work group size                             512
  Compiler Available                              Yes
  Linker Available                                Yes
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
  Preferred work group size multiple              16
  Preferred / native vector sizes                 
    char                                                16 / 8       
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 0 / 8        (cl_khr_fp16)
    float                                                4 / 4       
    double                                               0 / 2        (n/a)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (n/a)
  Address bits                                    64, Little-Endian
  Global memory size                              4294967296 (4GiB)
  Error Correction support                        No
  Max memory allocation                           3221225472 (3GiB)
  Unified memory for Host and Device              Yes
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   No
    Fine-grained system sharing                   No
    Atomics                                       No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Preferred alignment for atomics                 
    SVM                                           0 bytes
    Global                                        0 bytes
    Local                                         0 bytes
  Max size for global variable                    65536 (64KiB)
  Preferred total size of global vars             65536 (64KiB)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        8192 (8KiB)
  Global Memory cache line                        64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            65536 pixels
    Max 1D or 2D image array size                 2048 images
    Base address alignment for 2D image buffers   4096 bytes
    Pitch alignment for 2D image buffers          1 bytes
    Max 2D image size                             8192x8192 pixels
    Max 3D image size                             8192x8192x2048 pixels
    Max number of read image args                 128
    Max number of write image args                8
    Max number of read/write image args           8
  Max number of pipe args                         16
  Max active pipe reservations                    1
  Max pipe packet size                            1024
  Local memory type                               Global
  Local memory size                               65536 (64KiB)
  Max constant buffer size                        134217728 (128MiB)
  Max number of constant args                     8
  Max size of kernel argument                     1024
  Queue properties (on host)                      
    Out-of-order execution                        No
    Profiling                                     Yes
  Queue properties (on device)                    
    Out-of-order execution                        Yes
    Profiling                                     Yes
    Preferred size                                16384 (16KiB)
    Max size                                      262144 (256KiB)
  Max queues on device                            1
  Max events on device                            1024
  Prefer user sync for interop                    Yes
  Profiling timer resolution                      80ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            Yes
    SPIR versions                                 1.2
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                __cl_copy_region_align4;__cl_copy_region_align16;__cl_cpy_region_unalign_same_offset;__cl_copy_region_unalign_dst_offset;__cl_copy_region_unalign_src_offset;__cl_copy_buffer_rect;__cl_copy_image_1d_to_1d;__cl_copy_image_2d_to_2d;__cl_copy_image_3d_to_2d;__cl_copy_image_2d_to_3d;__cl_copy_image_3d_to_3d;__cl_copy_image_2d_to_buffer;__cl_copy_image_3d_to_buffer;__cl_copy_buffer_to_image_2d;__cl_copy_buffer_to_image_3d;__cl_fill_region_unalign;__cl_fill_region_align2;__cl_fill_region_align4;__cl_fill_region_align8_2;__cl_fill_region_align8_4;__cl_fill_region_align8_8;__cl_fill_region_align8_16;__cl_fill_region_align128;__cl_fill_image_1d;__cl_fill_image_1d_array;__cl_fill_image_2d;__cl_fill_image_2d_array;__cl_fill_image_3d;
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short cl_khr_gl_sharing cl_khr_fp16

  Platform Name                                   Clover
Number of devices                                 1
  Device Name                                     AMD OLAND (DRM 2.49.0 / 4.11.4-200.fc25.x86_64, LLVM 3.9.1)
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.1 Mesa 17.0.5
  Driver Version                                  17.0.5
  Device OpenCL C Version                         OpenCL C 1.1 
  Device Type                                     GPU
  Device Available                                Yes
  Device Profile                                  FULL_PROFILE
  Max compute units                               6
  Max clock frequency                             825MHz
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Compiler Available                              Yes
  Preferred work group size multiple              64
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 0 / 0        (n/a)
    float                                                4 / 4       
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Address bits                                    64, Little-Endian
  Global memory size                              2147483648 (2GiB)
  Error Correction support                        No
  Max memory allocation                           1503238553 (1.4GiB)
  Unified memory for Host and Device              Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Global Memory cache type                        None
  Image support                                   No
  Local memory type                               Local
  Local memory size                               32768 (32KiB)
  Max constant buffer size                        1503238553 (1.4GiB)
  Max number of constant args                     16
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        No
    Profiling                                     Yes
  Profiling timer resolution                      0ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_fp64

  Platform Name                                   Portable Computing Language
Number of devices                                 1
  Device Name                                     pthread-Intel(R) Core(TM) i7-6500U CPU @ 2.50GHz
  Device Vendor                                   GenuineIntel
  Device Vendor ID                                0x8086
  Device Version                                  OpenCL 2.0 pocl HSTR: pthread-x86_64-unknown-linux-gnu-haswell
  Driver Version                                  0.14-pre
  Device OpenCL C Version                         OpenCL C 2.0
  Device Type                                     CPU, Default
  Device Available                                Yes
  Device Profile                                  FULL_PROFILE
  Max compute units                               4
  Max clock frequency                             3100MHz
  Device Partition                                (core)
    Max number of sub-devices                     4
    Supported partition types                     equally, by counts
  Max work item dimensions                        3
  Max work item sizes                             4096x4096x4096
  Max work group size                             4096
  Compiler Available                              Yes
  Linker Available                                Yes
  Preferred work group size multiple              8
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 8 / 8        (n/a)
    float                                                4 / 4       
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Address bits                                    64, Little-Endian
  Global memory size                              18717843456 (17.43GiB)
  Error Correction support                        No
  Max memory allocation                           18717843456 (17.43GiB)
  Unified memory for Host and Device              Yes
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   Yes
    Fine-grained system sharing                   No
    Atomics                                       Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Preferred alignment for atomics                 
    SVM                                           0 bytes
    Global                                        0 bytes
    Local                                         0 bytes
  Max size for global variable                    0
  Preferred total size of global vars             0
  Global Memory cache type                        Read/Write
  Global Memory cache size                        32768 (32KiB)
  Global Memory cache line                        64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            1169865216 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             32768x32768 pixels
    Max 3D image size                             2048x2048x2048 pixels
    Max number of read image args                 128
    Max number of write image args                128
    Max number of read/write image args           128
  Max number of pipe args                         16
  Max active pipe reservations                    1
  Max pipe packet size                            1024
  Local memory type                               Global
  Local memory size                               18717843456 (17.43GiB)
  Max constant buffer size                        18717843456 (17.43GiB)
  Max number of constant args                     8
  Max size of kernel argument                     1024
  Queue properties (on host)                      
    Out-of-order execution                        No
    Profiling                                     Yes
  Queue properties (on device)                    
    Out-of-order execution                        Yes
    Profiling                                     Yes
    Preferred size                                16384 (16KiB)
    Max size                                      262144 (256KiB)
  Max queues on device                            1
  Max events on device                            1024
  Prefer user sync for interop                    Yes
  Profiling timer resolution                      1ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            Yes
    SPIR versions                                 1.2
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                
  Device Extensions                               cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_spir cl_khr_int64 cl_khr_fp64 cl_khr_int64_base_atomics cl_khr_int64_extended_atomics

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  Intel Gen OCL Driver
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [Intel]
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
  clCreateContext(NULL, ...) [default]            Success [Intel]
  clCreateContext(NULL, ...) [other]              Success [MESA]
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  Success (1)
    Platform Name                                 Intel Gen OCL Driver
    Device Name                                   Intel(R) HD Graphics Skylake ULT GT2
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 Intel Gen OCL Driver
    Device Name                                   Intel(R) HD Graphics Skylake ULT GT2
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  Success (1)
    Platform Name                                 Intel Gen OCL Driver
    Device Name                                   Intel(R) HD Graphics Skylake ULT GT2
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  Success (1)
    Platform Name                                 Intel Gen OCL Driver
    Device Name                                   Intel(R) HD Graphics Skylake ULT GT2
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 Intel Gen OCL Driver
    Device Name                                   Intel(R) HD Graphics Skylake ULT GT2

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.11
  ICD loader Profile                              OpenCL 2.1
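The clinfo listing above is hard to read because the same five-line beignet/libdrm warning is interleaved dozens of times. Here is a minimal sketch of a filter that keeps the first copy of each warning line and drops the repeats, assuming the log has been saved as text. The function name `strip_repeated_noise` and the prefix list are illustrative choices, not part of clinfo or beignet.

```python
# Hypothetical helper: condense a pasted clinfo/clpeak log by keeping only
# the first occurrence of each known warning line.
NOISE_PREFIXES = (
    "DRM_IOCTL_I915_GEM_APERTURE failed",
    "Assuming 131072kB available aperture size.",
    "May lead to reduced performance or incorrect rendering.",
    "get chip id failed",
    "param: 4, val: 0",
)


def strip_repeated_noise(text):
    """Return text with repeated warning lines removed (first copy kept)."""
    seen = set()
    kept = []
    for line in text.splitlines():
        if line.startswith(NOISE_PREFIXES):
            if line in seen:
                continue  # already shown once, drop the repeat
            seen.add(line)
        kept.append(line)
    return "\n".join(kept)


if __name__ == "__main__":
    import sys
    # Usage (assumed file name): python filter_log.py < clinfo.txt
    print(strip_repeated_noise(sys.stdin.read()))
```

All the platform and device lines pass through unchanged; only the duplicated warnings are removed, so the first warning block still shows what the driver complained about.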
Output of clpeak -p 0
unknown GPU generation
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0

Platform: Intel Gen OCL Driver
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
  Device: Intel(R) HD Graphics Skylake ULT GT2
    Driver version  : 1.3 (Linux x64)
    Compute units   : 24
    Clock frequency : 1000 MHz

    Global memory bandwidth (GBPS)
      float   : 25.40
      float2  : 25.08
      float4  : 25.80
      float8  : 27.26
      float16 : 24.87

    Single-precision compute (GFLOPS)
      float   : 363.73
      float2  : 378.40
      float4  : 377.82
      float8  : 377.00
      float16 : 375.11

    No double precision support! Skipped

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer         : 30.98
      enqueueReadBuffer          : 12.01
      enqueueMapBuffer(for read) : 143165.58
        memcpy from mapped ptr   : 12.03
      enqueueUnmap(after write)  : 186737.72
        memcpy to mapped ptr     : 12.51

    Kernel launch latency : 29.50 us
Output of clpeak -p 1 --> infinite loop, cf. https://bugs.freedesktop.org/show_bug.cgi?id=96897
unknown GPU generation
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0

Platform: Clover
  Device: AMD OLAND (DRM 2.49.0 / 4.11.4-200.fc25.x86_64, LLVM 3.9.1)
    Driver version  : 17.0.5 (Linux x64)
    Compute units   : 6
    Clock frequency : 825 MHz
^C
Output of clpeak -p 2
unknown GPU generation
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0
DRM_IOCTL_I915_GEM_APERTURE failed: No such file or directory
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [13]
param: 4, val: 0

Platform: Portable Computing Language
  Device: pthread-Intel(R) Core(TM) i7-6500U CPU @ 2.50GHz
    Driver version  : 0.14-pre (Linux x64)
    Compute units   : 4
    Clock frequency : 3100 MHz

    Global memory bandwidth (GBPS)
      float   : Illegal instruction (core dumped)
Does anyone have an idea how to use OpenCL with all 3 platforms?
Thanks
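The clpeak outputs above are hard to read because the same driver warnings repeat dozens of times. As a minimal sketch (not part of clpeak, and assuming the output was captured as text), the Python below drops those warning lines and lists the platform/device pairs that clpeak reported. The function names `strip_noise` and `list_devices` are hypothetical, and the list of noise prefixes is simply copied from the logs in this thread.

```python
# Warning lines that repeat in the logs pasted in this thread.
# This list is an assumption based on those logs only; adjust it as needed.
NOISE_PREFIXES = (
    "DRM_IOCTL_I915_GEM_APERTURE failed",
    "Assuming 131072kB available aperture size",
    "May lead to reduced performance or incorrect rendering",
    "get chip id failed",
    "param:",
    "unknown GPU generation",
)


def strip_noise(text):
    """Return the captured output without the repeated driver warnings."""
    kept = [
        line
        for line in text.splitlines()
        if not line.strip().startswith(NOISE_PREFIXES)
    ]
    return "\n".join(kept)


def list_devices(text):
    """Return (platform, device) pairs found in clpeak-style output."""
    pairs = []
    platform = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Platform:"):
            # Remember the current platform until the next "Platform:" line.
            platform = stripped[len("Platform:"):].strip()
        elif stripped.startswith("Device:"):
            pairs.append((platform, stripped[len("Device:"):].strip()))
    return pairs
```

For example, after saving the output to a file (any file name will do), `print(strip_noise(open(path).read()))` shows only the benchmark sections, and `list_devices` makes it quick to see which of the three platforms got as far as reporting a device.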
Phew, this goes back a while.

Otherwise, OpenCL support in the free driver is progressing little by little, but it is not fully functional yet.

I'll come back to this later in the week, as I don't have much time right now.
True, that's quite a necro after 2 years, but I couldn't find a more recent topic on the subject, and it seems to me that this is a rapidly growing field.
Thank you. I'm eagerly awaiting your reply, since I have an intern starting on this. In the meantime I will probably switch to the Intel SDK.
Have a good evening
 ~]# clpeak -p 0

Platform: Clover
  Device: AMD HAWAII (DRM 2.49.0 / 4.11.5-300.fc26.x86_64, LLVM 4.0.0)
    Driver version  : 17.2.0-devel (Linux x64)
    Compute units   : 44
    Clock frequency : 1040 MHz
Huh, on my machine clpeak no longer crashes, but it doesn't print anything more...
I'll try the Intel chip again to see whether things have changed.

Otherwise, here is the link to see where OpenCL support in the free drivers stands: https://dri.freedesktop.org/wiki/GalliumCompute/

Blender 2.80-build https://builder.blender.org/download/ has good support, but not yet with the free driver for Radeon cards. It remains to be seen whether that is the case with Intel.
Yes, clpeak is buggy with AMD GPUs; the bug is tracked at https://bugs.freedesktop.org/show_bug.cgi?id=96897

What is the point of using Blender, apart from testing OpenCL for rendering? Does it bring a compute engine different from pocl, beignet and mesa-libOpenCL, or does it rely on them?
It relies on OpenCL to compute the renders.

The point is that... it's for doing 3D, video/audio editing, 2D, animation, etc. Nothing to do with clpeak or the like.

https://www.blender.org/
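
On whether an application relies on pocl, beignet or mesa-libOpenCL: on Linux, OpenCL applications normally go through an ICD loader, which finds the installed implementations from small `.icd` files, each naming the library to load. Here is a minimal sketch for checking what is registered. It assumes the usual default directory `/etc/OpenCL/vendors` (some loaders allow overriding it), and `list_icds` is a hypothetical helper name, not part of any OpenCL package.

```python
from pathlib import Path


def list_icds(vendors_dir="/etc/OpenCL/vendors"):
    """Map each .icd file name to the library name or path written in it.

    The default directory is an assumption; pass another one if your
    ICD loader is configured differently.
    """
    result = {}
    directory = Path(vendors_dir)
    if not directory.is_dir():
        # No ICD directory: no implementation is registered here.
        return result
    for icd in sorted(directory.glob("*.icd")):
        tokens = icd.read_text().split()
        # An .icd file normally holds a single library name or path.
        result[icd.name] = tokens[0] if tokens else ""
    return result
```

Calling `list_icds()` and printing the result should show one entry per installed implementation. If a platform is missing from that list, applications going through the loader will not see it, whatever the benchmark.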



You have to admit it looks more impressive than staring at numbers :-P!
Sure, it's prettier, but it's still less telling than numbers in a scientific computing environment. 😉
Well, you get numbers too, since the renders should normally take much less time 😉.

That leaves the tests available in the phoronix-test-suite benchmark suite.

But it's not ready yet, since at the moment Vulkan seems to be getting priority 😢.
Thanks for the info and the videos.
Not having managed to get the open-source software working properly, I'm turning to CentOS 7, which lets me install the Intel SDK and driver directly, as well as the amdgpu-pro driver. :-?

Too bad that nobody, either here or on Phoronix, was able to help me solve my problem 🙁