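This article covers the main aspects of working with AMD GPUs at the top (host) level, and describes the limits of the technology and the problems you may run into when using it. If you are interested, read on.
Instead of an introduction
When I started learning to program AMD GPUs, I was asked what I used for it. "ATI CAL," I answered. "Yes, ATI really is CAL," came the reply (a pun: in Russian, "kal" means excrement).
In fact I don't know how the acronym CAL is supposed to be pronounced, but to avoid confusing people I pronounce it with an "O".
For brevity, I will call a program written as described in the first part a kernel. "Kernel" means both the source code of the program and the compiled binary code loaded onto the GPU. I won't give the full text of a program that runs on the GPU through AMD CAL, but I will describe the main steps: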
- Initializing the driver
- Getting information about all supported GPUs
- Allocating and copying memory
- Compiling and loading the kernel on the GPU
- Launching the kernel
- Synchronizing with the CPU
To get started, you need two header files from the AMD APP SDK:
- cal.h - declares the main driver functions; their names start with "cal" (aticalrt.dll library)
- calcl.h - declares the basic functions of the compiler that turns a text kernel into binary code; their names start with "calcl" (aticalcl.dll library)
As you can see, unlike Nvidia CUDA, which has both a Run-time API and a Driver API, AMD offers only a Driver API. So don't forget to link against the appropriate libraries for your application to work.
Most function calls return a value of type CALresult. There are 11 return codes in total. The most important one for us is CAL_RESULT_OK, which equals 0 and indicates that the call completed successfully.
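Because nearly every call returns a CALresult, it can be convenient to route all checks through one small helper. The following is only a sketch under stated assumptions: it takes the status as a plain int so that it compiles without cal.h, it relies solely on the fact above that CAL_RESULT_OK equals 0, and the name cal_failed is hypothetical rather than part of CAL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of CAL: reports a failed call.
 * Assumption: the status is passed as an int and 0 means CAL_RESULT_OK,
 * as stated in the text; no other return code is interpreted here. */
static int cal_failed(int result, const char *call)
{
    assert(call != NULL);
    if (result == 0) {
        return 0; /* success */
    }
    fprintf(stderr, "%s failed with CAL result code %d\n", call, result);
    return 1;
}
```

In real code this could be used as `if (cal_failed(calInit(), "calInit")) return 1;`, keeping the error handling in one place.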
Let's go.
Driver initialization
Rule 1: before you start working with the GPU, initialize the driver with the following call:
    CALresult result = calInit();
Rule 2: after you finish working with the GPU, shut the driver down properly:
    CALresult result = calShutdown();
These two calls must always come in pairs. A program may contain several such pairs, but never use the GPU outside of them: doing so may lead to a hardware exception.
Getting GPU information
Find out the number of supported GPUs (it may be smaller than the total number of AMD GPUs in the system):
    unsigned int deviceCount = 0;
    CALresult result = calDeviceGetCount( &deviceCount );
Throughout this article I will point out where a GPU identifier is used, but the GPU with identifier 0 is used everywhere. In general the identifier takes values from 0 to (deviceCount-1).
Let's look at the information about a GPU:
    unsigned int deviceId = 0; // GPU identifier

    CALdeviceinfo deviceInfo;
    CALresult result = calDeviceGetInfo( &deviceInfo, deviceId );

    CALdeviceattribs deviceAttribs;
    deviceAttribs.struct_size = sizeof( deviceAttribs ); // must be set before the call
    result = calDeviceGetAttribs( &deviceAttribs, deviceId );
The most important thing in the CALdeviceinfo structure is the identifier of the GPU chip. Here it is called the Device Kernel ISA:
    typedef struct CALdeviceinfoRec {
        CALtarget target;              /**< Device Kernel ISA */
        CALuint   maxResource1DWidth;  /**< Maximum resource 1D width */
        CALuint   maxResource2DWidth;  /**< Maximum resource 2D width */
        CALuint   maxResource2DHeight; /**< Maximum resource 2D height */
    } CALdeviceinfo;
The remaining fields of the structure define the maximum size, in each of the two dimensions, of the texture memory that can be allocated on this GPU.
The CALdeviceattribs structure, which holds the GPU attributes, is much more interesting (only some of its fields are shown):
    typedef struct CALdeviceattribsRec {
        CALtarget  target;          /**< Asic identifier (Device Kernel ISA) */
        CALuint    localRAM;        /**< size of the GPU RAM */
        CALuint    wavefrontSize;   /**< size of a wavefront, the analogue of a warp (number of threads) */
        CALuint    numberOfSIMD;    /**< number of SIMD units */
        CALboolean computeShader;   /**< whether Compute Shader is supported */
        CALuint    pitch_alignment; /**< memory alignment for calCreateRes calls */
        /* remaining fields omitted */
    } CALdeviceattribs;
Rule 3: the CALdeviceattribs.pitch_alignment field is measured in memory elements, not in bytes. A memory element is a 1-, 2- or 4-component vector of 8-, 16- or 32-bit registers.
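To make "measured in elements" concrete, here is a minimal sketch of rounding a row width up to an alignment when both are expressed in memory elements. This is an assumption for illustration only: the text does not say that the driver pads rows by simple rounding, the helper name is hypothetical, and the authoritative row pitch is whatever calResMap reports.

```c
#include <assert.h>

/* Hypothetical helper: rounds a row width up to a multiple of the alignment.
 * Both values are in memory elements, not bytes (see Rule 3).
 * Assumption: rows are padded by plain rounding; the real pitch is the one
 * returned by calResMap. */
static unsigned int align_up_elements(unsigned int width, unsigned int alignment)
{
    assert(alignment > 0);
    return ((width + alignment - 1) / alignment) * alignment;
}
```

For example, a width of 100 elements with an alignment of 64 elements rounds up to 128 elements, not 128 bytes.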
Now let's take a closer look at the values that the CALdeviceinfo.target field (and CALdeviceattribs.target) can take:
    /** Device Kernel ISA */
    typedef enum CALtargetEnum {
        CAL_TARGET_600,       /**< R600 GPU ISA */
        CAL_TARGET_610,       /**< RV610 GPU ISA */
        CAL_TARGET_630,       /**< RV630 GPU ISA */
        CAL_TARGET_670,       /**< RV670 GPU ISA */
        CAL_TARGET_7XX,       /**< R700 class GPU ISA */
        CAL_TARGET_770,       /**< RV770 GPU ISA */
        CAL_TARGET_710,       /**< RV710 GPU ISA */
        CAL_TARGET_730,       /**< RV730 GPU ISA */
        CAL_TARGET_CYPRESS,   /**< CYPRESS GPU ISA */
        CAL_TARGET_JUNIPER,   /**< JUNIPER GPU ISA */
        CAL_TARGET_REDWOOD,   /**< REDWOOD GPU ISA */
        CAL_TARGET_CEDAR,     /**< CEDAR GPU ISA */
        CAL_TARGET_RESERVED0,
        CAL_TARGET_RESERVED1,
        CAL_TARGET_WRESTLER,  /**< WRESTLER GPU ISA */
        CAL_TARGET_CAYMAN,    /**< CAYMAN GPU ISA */
        CAL_TARGET_RESERVED2,
        CAL_TARGET_BARTS,     /**< BARTS GPU ISA */
    } CALtarget;
So this field tells you which chip the GPU is built on. Consequently, AMD CAL cannot tell you exactly what your GPU is called in the outside world (for example, Radeon HD 3850). Such a convenient technology... Still, it was interesting to observe that, for example, the Radeon HD 5750 and the Radeon HD 6750 are in fact the same video card! They differ only slightly in memory frequency (within a few percent).
Note one more thing: the list contains no "Evergreen" GPU, which I talked about in the previous part. My guess is that the Evergreen family starts with the Cypress chip (CAL_TARGET_CYPRESS). Everything before it belongs to the previous generation, which does not support the new features (cyclic shift operations, flags, and 64-bit operations).
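If you accept the guess above that everything from CAL_TARGET_CYPRESS onward is Evergreen or newer, the check reduces to a comparison of enum values. The sketch below is an illustration under that assumption only; the helper name is hypothetical and the targets are passed as plain ints so that the block compiles without cal.h.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if `target` is at or after the Cypress
 * entry in the CALtarget enumeration.
 * Assumption (the author's guess, not a documented guarantee): entries
 * listed from CAL_TARGET_CYPRESS onward support the newer features. */
static int is_cypress_or_later(int target, int cypressTarget)
{
    assert(target >= 0 && cypressTarget >= 0);
    return target >= cypressTarget;
}
```

With cal.h available, the call would look like `is_cypress_or_later(deviceInfo.target, CAL_TARGET_CYPRESS)`.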
To go further, we need to create a device handle (device) for talking to the GPU, and a context on it:
    unsigned int deviceId = 0; // GPU identifier

    CALdevice device;
    CALresult result = calDeviceOpen( &device, deviceId );

    CALcontext context;
    result = calCtxCreate( &context, device );
A context is required to work with this GPU from within your application. All work with the GPU goes through the context. As soon as you destroy the context, all resources allocated in it are considered freed and all unfinished tasks on the GPU are forcibly terminated.
When you have finished working with the device, don't forget the paired calls:
    calCtxDestroy( context );
    calDeviceClose( device );
The calls must be made in exactly this order, otherwise you will get a hardware exception.
So now we have created a device and its context.
Memory allocation
To work with memory you have to allocate a resource. According to the documentation, a resource can live in local memory (local memory = stream processor memory) or in remote memory (remote memory = system memory). As far as I understand, remote memory is simply RAM, while local memory is the memory of the GPU itself.
Why would you need remote memory when local memory exists? First, it is needed to share the same memory between several GPUs: remote memory can be allocated once and then used from several GPUs. Second, not every GPU supports direct access to its memory (see "Getting direct access to memory" below).
    CALresource resource;
    unsigned int memoryWidth;
    unsigned int memoryHeight;
    CALformat memoryFormat;
    unsigned int flags;
    CALresult result;

    // Remote memory
    // 1D; &device is an array of GPU handles, 1 is the number of entries in it (1 GPU)
    result = calResAllocRemote1D( &resource, &device, 1, memoryWidth, memoryFormat, flags );
    // 2D
    result = calResAllocRemote2D( &resource, &device, 1, memoryWidth, memoryHeight, memoryFormat, flags );

    // Local memory
    // 1D; the resource is allocated on a single device
    result = calResAllocLocal1D( &resource, device, memoryWidth, memoryFormat, flags );
    // 2D
    result = calResAllocLocal2D( &resource, device, memoryWidth, memoryHeight, memoryFormat, flags );
The width and height of the allocated resource are measured in memory elements.
The memory element itself is described by the memoryFormat parameter:
    /** Data format representation */
    typedef enum CALformatEnum {
        CAL_FORMAT_UNORM_INT8_1,     /**< 1 component, normalized unsigned 8-bit integer value per component */
        CAL_FORMAT_UNORM_INT8_4,     /**< 4 component, normalized unsigned 8-bit integer value per component */
        CAL_FORMAT_UNORM_INT32_1,    /**< 1 component, normalized unsigned 32-bit integer value per component */
        CAL_FORMAT_UNORM_INT32_4,    /**< 4 component, normalized unsigned 32-bit integer value per component */
        CAL_FORMAT_SNORM_INT8_1,     /**< 1 component, normalized signed 8-bit integer value per component */
        CAL_FORMAT_SNORM_INT8_4,     /**< 4 component, normalized signed 8-bit integer value per component */
        CAL_FORMAT_SNORM_INT32_1,    /**< 1 component, normalized signed 32-bit integer value per component */
        CAL_FORMAT_SNORM_INT32_4,    /**< 4 component, normalized signed 32-bit integer value per component */
        CAL_FORMAT_UNSIGNED_INT8_1,  /**< 1 component, unnormalized unsigned 8-bit integer value per component */
        CAL_FORMAT_UNSIGNED_INT8_4,  /**< 4 component, unnormalized unsigned 8-bit integer value per component */
        CAL_FORMAT_SIGNED_INT8_1,    /**< 1 component, unnormalized signed 8-bit integer value per component */
        CAL_FORMAT_SIGNED_INT8_4,    /**< 4 component, unnormalized signed 8-bit integer value per component */
        CAL_FORMAT_UNSIGNED_INT32_1, /**< 1 component, unnormalized unsigned 32-bit integer value per component */
        CAL_FORMAT_UNSIGNED_INT32_4, /**< 4 component, unnormalized unsigned 32-bit integer value per component */
        CAL_FORMAT_SIGNED_INT32_1,   /**< 1 component, unnormalized signed 32-bit integer value per component */
        CAL_FORMAT_SIGNED_INT32_4,   /**< 4 component, unnormalized signed 32-bit integer value per component */
        CAL_FORMAT_UNORM_SHORT_565,  /**< 3 component, normalized 5-6-5 RGB image. */
        CAL_FORMAT_UNORM_SHORT_555,  /**< 4 component, normalized x-5-5-5 xRGB image */
        CAL_FORMAT_UNORM_INT10_3,    /**< 4 component, normalized x-10-10-10 xRGB */
        CAL_FORMAT_FLOAT32_1,        /**< A 1 component, 32-bit float value per component */
        CAL_FORMAT_FLOAT32_4,        /**< A 4 component, 32-bit float value per component */
        CAL_FORMAT_FLOAT64_1,        /**< A 1 component, 64-bit float value per component */
        CAL_FORMAT_FLOAT64_2,        /**< A 2 component, 64-bit float value per component */
    } CALformat;
It is a pity that on older video cards (non-Evergreen) 64-bit operations can be performed only on floating-point data...
Rule 4: the element format only describes how the GPU interprets the data stored in the element. Physically, an element always occupies 16 bytes of memory.
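Rule 4 makes the minimum footprint of a resource easy to estimate. The sketch below assumes only what the rule states (16 bytes per element regardless of format) and ignores any row padding the driver may add; the helper name and the constant name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Per Rule 4: every element occupies 16 bytes, whatever its format. */
enum { ELEMENT_SIZE_BYTES = 16 };

/* Hypothetical helper: minimum number of bytes needed to hold
 * width x height elements. Row padding added by the driver is not
 * included, so the real allocation may be larger. */
static size_t min_resource_bytes(size_t width, size_t height)
{
    assert(width > 0 && height > 0);
    return width * height * ELEMENT_SIZE_BYTES;
}
```

For the 64 x 4 texture used later in the article this gives 4096 bytes, even if each element only carries a single 8-bit value.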
This becomes clear if you recall that in the first part we declared a resource like this:
    dcl_resource_id(0)_type(2d,unnorm)_fmtx(uint)_fmty(uint)_fmtz(uint)_fmtw(uint)
Moreover, according to the AMD IL language specification, the fmtx through fmtw values are mandatory. That is, the following code (which looks as if it could describe a texture whose elements are 1-component vectors) is incorrect:
    dcl_resource_id(0)_type(2d,unnorm)_fmtx(uint)
Rule 5: when allocating a resource, respect the type declared in the kernel. If they do not match, you will not be able to bind the resource to the kernel.
Rule 6: for constant memory the element type must always be float.
It is not clear why this was done, because integer values can be loaded from constant memory perfectly well (which is exactly what our example does).
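One way to put an integer into a float-typed constant element on the host side is to copy its bit pattern rather than convert the value. The following is a sketch of that idea in plain C; the helper names are hypothetical, and it assumes float and uint32_t are both 32 bits wide, which the code asserts.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes a 32-bit unsigned integer into one component
 * of a float-typed constant element without converting the value.
 * memcpy copies the bit pattern and avoids strict-aliasing problems. */
static void store_uint_bits(float *component, uint32_t value)
{
    assert(component != NULL);
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(component, &value, sizeof value);
}

/* Reads the same bit pattern back, e.g. to verify what was written. */
static uint32_t load_uint_bits(const float *component)
{
    uint32_t value;
    assert(component != NULL);
    memcpy(&value, component, sizeof value);
    return value;
}
```

Assigning `component[0] = 64;` instead would store the floating-point encoding of 64.0, which a kernel reading the element as an integer would not see as 64.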
A little more about the flags used when allocating memory:
    /** CAL resource allocation flags **/
    typedef enum CALresallocflagsEnum {
        CAL_RESALLOC_GLOBAL_BUFFER = 1, /**< used for global import/export buffer */
        CAL_RESALLOC_CACHEABLE     = 2, /**< cacheable memory? */
    } CALresallocflags;
I have never used the second flag and do not know when it would be beneficial. Judging by the question mark in the authors' own comment, they do not know either (smile).
The first flag, however, is required to allocate a global buffer ("g[]").
Now let's put the theory into practice. With the example from the previous article in mind, let's allocate the resources for the kernel's launch parameters:
    unsigned int blocks = 4;   // 4 blocks
    unsigned int threads = 64; // 64 threads per block

    // constant memory, cb0
    CALresource constantResource;
    CALresult result = calResAllocLocal1D( &constantResource, device, 1, CAL_FORMAT_FLOAT32_4, 0 );

    // input texture, i0
    CALresource textureResource;
    result = calResAllocLocal2D( &textureResource, device, threads, blocks, CAL_FORMAT_UNSIGNED_INT32_4, 0 );

    // global buffer, g[]
    CALresource globalResource;
    result = calResAllocLocal1D( &globalResource, device, threads * blocks, CAL_FORMAT_UNSIGNED_INT32_4, CAL_RESALLOC_GLOBAL_BUFFER );
When the resources are no longer needed, they must be freed:
    calResFree( constantResource );
    calResFree( textureResource );
    calResFree( globalResource );
Copying memory
Getting direct access to memory
If the GPU supports memory mapping (mapping its memory addresses into the address space of the process), you can obtain a pointer to that memory and work with it like any other memory:
    unsigned int pitch;
    unsigned char* mappedPointer;
    CALresult result = calResMap( (CALvoid**)&mappedPointer, &pitch, resource, 0 );
    // read or write the resource through mappedPointer here
When you have finished working with the memory, the pointer must be released:
    CALresult result = calResUnmap( resource );
Rule 7: when working with GPU memory, always remember that alignment has to be taken into account. This alignment is expressed by the pitch variable.
Rule 8: pitch is measured in elements, not in bytes.
Why do you need to know about this alignment? Unlike RAM, GPU memory is not always a contiguous region. This is especially true when working with textures. Let me explain with an example: if you want to work with a 100x100-element texture and calResMap() returns a pitch of 200, the GPU will in fact work with a 200x100 texture, and only the first 100 elements of each texture row will be taken into account.
Copying into GPU memory with the pitch taken into account can be organized as follows:
    unsigned int pitch;
    unsigned char* mappedPointer;
    unsigned char* dataBuffer; // source data prepared by the caller
    CALresult result = calResMap( (CALvoid**)&mappedPointer, &pitch, resource, 0 );

    unsigned int width;  // resource width, in elements
    unsigned int height; // resource height, in elements
    unsigned int elementSize = 16; // bytes per element (Rule 4)

    if( pitch > width )
    {
        // rows are padded: copy row by row
        for( unsigned int index = 0; index < height; ++index )
        {
            memcpy( mappedPointer + index * pitch * elementSize,
                    dataBuffer + index * width * elementSize,
                    width * elementSize );
        }
    }
    else
    {
        // rows are contiguous: copy everything at once
        memcpy( mappedPointer, dataBuffer, width * height * elementSize );
    }
Naturally, the data in dataBuffer must be prepared with the element type in mind. But remember that an element is always 16 bytes in size.
That is, for an element of format CAL_FORMAT_UNSIGNED_INT16_2, its byte-by-byte representation in memory looks like this:
    // w    - word, 16 bits
    // wi.j - word i, byte j
    // x    - garbage (unused byte)
    [ w0.0 | w0.1 | x | x ][ w1.0 | w1.1 | x | x ][ x | x | x | x ][ x | x | x | x ]
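The layout above can be turned into a small packing routine. This is a sketch under stated assumptions: the helper name is hypothetical, each word is written low byte first (little-endian), and the unused bytes are zeroed here for determinism even though the layout treats them as garbage.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fills one 16-byte element with two 16-bit words,
 * following the layout above (w0 in bytes 0-1, w1 in bytes 4-5).
 * Assumptions: little-endian byte order within each word; the unused
 * bytes are set to zero although their content does not matter. */
static void pack_u16x2(unsigned char element[16], uint16_t w0, uint16_t w1)
{
    assert(element != NULL);
    memset(element, 0, 16);
    element[0] = (unsigned char)(w0 & 0xFF); /* w0.0 */
    element[1] = (unsigned char)(w0 >> 8);   /* w0.1 */
    element[4] = (unsigned char)(w1 & 0xFF); /* w1.0 */
    element[5] = (unsigned char)(w1 >> 8);   /* w1.1 */
}
```

A buffer of such elements can then be copied into the mapped resource row by row, 16 bytes per element.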
Copying data between resources
Data is not copied directly between resources but between their counterparts mapped into a context. The copy operation is asynchronous, so a system object of type CALevent is used to find out when the copy has completed:
    CALresource inputResource;
    CALresource outputResource;
    CALmem inputResourceMem;
    CALmem outputResourceMem;

    // map the resources into the context
    CALresult result = calCtxGetMem( &inputResourceMem, context, inputResource );
    result = calCtxGetMem( &outputResourceMem, context, outputResource );

    // start the asynchronous copy
    CALevent syncEvent;
    result = calMemCopy( &syncEvent, context, inputResourceMem, outputResourceMem, 0 );

    // wait until the copy has completed
    while( calCtxIsEventDone( context, syncEvent ) == CAL_RESULT_PENDING );
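The busy-wait loop above spins forever if the event never completes. One possible variation is to bound the number of polls. The sketch below is generic C rather than CAL code: the callback type, both function names and the countdown stand-in are hypothetical, and in a real program the callback would wrap the `calCtxIsEventDone( context, syncEvent ) == CAL_RESULT_PENDING` check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: returns nonzero while the operation is pending. */
typedef int (*pending_fn)(void *state);

/* Polls until the operation finishes or max_polls is exhausted.
 * Returns 1 if it finished, 0 if the poll budget ran out. */
static int wait_with_limit(pending_fn is_pending, void *state, unsigned long max_polls)
{
    unsigned long i;
    assert(is_pending != NULL);
    for (i = 0; i < max_polls; ++i) {
        if (!is_pending(state)) {
            return 1;
        }
    }
    return 0;
}

/* Stand-in for a real event check: reports "pending" a fixed number of times. */
static int countdown_pending(void *state)
{
    int *remaining = (int *)state;
    if (*remaining > 0) {
        --*remaining;
        return 1;
    }
    return 0;
}
```

Whether a timeout is appropriate, and what to do when it expires, depends on the application; the unbounded loop from the article is the simplest option.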
Compiling and loading the kernel on the GPU
"Koschei's death is in a needle, the needle is in an egg, the egg is in a duck, the duck is in a hare, the hare is in a chest..."
The process of loading a kernel onto the GPU can be described as follows: the source (txt) is compiled into an object (object), one or more objects are then linked into an image (image), the image is loaded into a GPU module (module), and from the module you can obtain a pointer to the kernel's entry point (with this pointer the kernel can be launched for execution).
And now, how it is implemented:
    const char* kernel; // kernel source text

    // find out which GPU we are compiling for
    unsigned int deviceId = 0; // GPU identifier
    CALdeviceinfo deviceInfo;
    CALresult result = calDeviceGetInfo( &deviceInfo, deviceId );

    // compile the source into an object
    CALobject obj;
    result = calclCompile( &obj, CAL_LANGUAGE_IL, kernel, deviceInfo.target );

    // link the object into an image: &obj is an array of objects, 1 is its length
    CALimage image;
    result = calclLink( &image, &obj, 1 );

    // the object is no longer needed once the image has been linked
    result = calclFreeObject( obj );

    // load the image into a module
    CALmodule module;
    result = calModuleLoad( &module, context, image );

    // get the kernel entry point
    CALfunc function;
    result = calModuleGetEntry( &function, context, module, "main" );
Rule 9: a kernel always has exactly one entry point, because after linking there is only one function, "main".
That is, unlike Nvidia CUDA, an AMD CAL kernel can contain only one global function, "main".
As you may have noticed, the compiler can only process source code written in IL.
The image is loaded into a module because it has to be loaded into the context of the selected GPU. Consequently, the compilation process described above has to be carried out for every GPU (except when there are two identical GPUs: compiling and linking once is enough, but the image still has to be loaded into a module for each card).
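The "compile once per kind of GPU" idea can be sketched as a lookup over the target values of the installed devices. This is an illustration only: the helper name is hypothetical, and the targets (the CALdeviceinfo.target value of each device) are passed as plain ints so that the block compiles without cal.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: for device `index`, returns the index of the first
 * device with the same target value. If the result equals `index`, this is
 * the first device of its kind and the kernel has to be compiled and linked
 * for it; otherwise the image built for the returned device can be reused.
 * Either way, the image must still be loaded into a module per device. */
static size_t first_device_with_same_target(const int *targets, size_t index)
{
    size_t i;
    assert(targets != NULL);
    for (i = 0; i < index; ++i) {
        if (targets[i] == targets[index]) {
            return i;
        }
    }
    return index;
}
```

With targets {8, 5, 8, 9}, for instance, only devices 0, 1 and 3 would need a compile and link step, while device 2 could reuse the image of device 0.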
I would like to draw attention to the possibility of linking several objects. Someone may find it useful. In my opinion it can be applied when you have different implementations of the same subroutine: since AMD IL has no preprocessor directives such as #ifdef, these implementations can be moved into separate objects.
After the kernel has finished running on the GPU, the corresponding resources must be released:
    CALresult result = calclFreeImage( image );
    result = calModuleUnload( context, module );
Kernel launch
Setting the kernel launch parameters
So, we have allocated resources, filled the memory and even compiled the kernel. All that remains is to bind the resources to a particular kernel and launch it. To do this, you get a launch parameter from the kernel and set it to the resource mapped into the context.
    const char* memoryName; // name of the memory as declared in the kernel

    // get the kernel parameter by name
    CALname kernelParameter;
    CALresult result = calModuleGetName( &kernelParameter, context, module, memoryName );

    // map the resource into the context
    CALmem resourceMem;
    result = calCtxGetMem( &resourceMem, context, resource );

    // bind the memory to the kernel parameter
    result = calCtxSetMem( context, kernelParameter, resourceMem );
And now we do the same for our example:
    CALname kernelParameter;
    CALmem resourceMem;

    // constant memory
    CALresult result = calModuleGetName( &kernelParameter, context, module, "cb0" );
    result = calCtxGetMem( &resourceMem, context, constantResource );
    result = calCtxSetMem( context, kernelParameter, resourceMem );

    // input texture
    result = calModuleGetName( &kernelParameter, context, module, "i0" );
    result = calCtxGetMem( &resourceMem, context, textureResource );
    result = calCtxSetMem( context, kernelParameter, resourceMem );

    // global buffer
    result = calModuleGetName( &kernelParameter, context, module, "g[]" );
    result = calCtxGetMem( &resourceMem, context, globalResource );
    result = calCtxSetMem( context, kernelParameter, resourceMem );
After the kernel has finished running on the GPU, the resources must be unbound from the kernel. This can be done as follows:
    CALname kernelParameter;

    // constant memory
    CALresult result = calModuleGetName( &kernelParameter, context, module, "cb0" );
    result = calCtxSetMem( context, kernelParameter, 0 );

    // input texture
    result = calModuleGetName( &kernelParameter, context, module, "i0" );
    result = calCtxSetMem( context, kernelParameter, 0 );

    // global buffer
    result = calModuleGetName( &kernelParameter, context, module, "g[]" );
    result = calCtxSetMem( context, kernelParameter, 0 );
Now the kernel knows where to get its data. Only one small thing is left.
Kernel launch
As you remember, in the first part I mentioned PS and CS shaders. Whether the latter is supported can be found out from the GPU attributes (see above).
Launching a PS:
    unsigned int blocks = 4;   // 4 blocks
    unsigned int threads = 64; // 64 threads per block

    CALdomain domain;
    domain.x = 0;
    domain.y = 0;
    domain.width = threads;
    domain.height = blocks;

    CALevent syncEvent;
    CALresult result = calCtxRunProgram( &syncEvent, context, function, &domain );

    // wait until the kernel has finished
    while( calCtxIsEventDone( context, syncEvent ) == CAL_RESULT_PENDING );
Here function is the kernel entry point that we obtained when loading the kernel onto the GPU (see "Compiling and loading the kernel on the GPU" above).
Rule 10: a PS does not know the threads value from the inside, so it has to be passed through memory (in our example this is done through constant memory).
Launching a CS:
    unsigned int blocks = 4;   // 4 blocks
    unsigned int threads = 64; // 64 threads per block

    CALprogramGrid programGrid;
    programGrid.func = function;
    programGrid.flags = 0;
    programGrid.gridBlock.width = threads;
    programGrid.gridBlock.height = 1;
    programGrid.gridBlock.depth = 1;
    programGrid.gridSize.width = blocks;
    programGrid.gridSize.height = 1;
    programGrid.gridSize.depth = 1;

    CALevent syncEvent;
    CALresult result = calCtxRunProgramGrid( &syncEvent, context, &programGrid );

    // wait until the kernel has finished
    while( calCtxIsEventDone( context, syncEvent ) == CAL_RESULT_PENDING );
Rule 11: the threads value must match the value hard-coded in the kernel source. The kernel will be launched in any case, but you may either go beyond the bounds of memory (launching fewer threads than declared in the kernel) or leave part of the input data unprocessed (launching more threads than declared in the kernel).
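Since a mismatch is not reported by the launch call itself, a host-side sanity check before launching can catch it early. The sketch below is a hypothetical helper in plain C: `declaredThreads` stands for the per-block thread count written in the kernel source, which the host has to know by convention because the text describes no way to query it.

```c
#include <assert.h>

/* Hypothetical pre-launch check for Rule 11. Returns 1 when the launch
 * uses the thread count declared in the kernel and the grid covers the
 * input exactly (blocks * threads == elementCount), 0 otherwise. */
static int launch_config_is_consistent(unsigned int elementCount,
                                       unsigned int blocks,
                                       unsigned int threads,
                                       unsigned int declaredThreads)
{
    assert(declaredThreads > 0);
    if (threads != declaredThreads) {
        return 0; /* thread count differs from the kernel source */
    }
    return blocks * threads == elementCount;
}
```

For the example in this article (4 blocks of 64 threads over 256 elements, with 64 declared in the kernel) the check passes.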
Voila! The kernel has been launched and, if everything went well, the processed data is now in the output memory ("g[]"). All that remains is to copy it out (see the "Copying memory" section above).
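Copying the results out is the mirror image of the pitch-aware upload shown in "Copying memory". The following is a sketch in plain C under the same assumptions as that section: the pointer comes from calResMap, pitch and width are in elements, and elementSize is in bytes (16 per Rule 4). The helper name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies `height` rows of `width` elements out of a
 * mapped resource whose rows start `pitch` elements apart, producing a
 * tightly packed buffer on the host. */
static void copy_from_pitched(unsigned char *dst, const unsigned char *mapped,
                              size_t width, size_t height,
                              size_t pitch, size_t elementSize)
{
    size_t row;
    assert(dst != NULL && mapped != NULL);
    assert(pitch >= width);
    for (row = 0; row < height; ++row) {
        memcpy(dst + row * width * elementSize,
               mapped + row * pitch * elementSize,
               width * elementSize);
    }
}
```

After the copy, the resource should be unmapped with calResUnmap, as described earlier.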
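Useful functions
It only remains to mention a few functions that may come in handy in everyday life.
    CALresult result;

    // get the status of the device
    CALdevicestatus status;
    result = calDeviceGetStatus( &status, device );

    // force queued tasks on the GPU to be submitted
    result = calCtxFlush( context );

    // get information about the kernel function (after compilation)
    CALfunc function;
    CALfuncInfo functionInfo;
    result = calModuleGetFuncInfo( &functionInfo, context, module, function );
    /* for example, the resources the kernel uses */

    // description of the last error in aticalrt.dll
    const char* errorString = calGetErrorString();

    // description of the last error in aticalcl.dll (the compiler)
    const char* compilerErrorString = calclGetErrorString();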
Cross-thread synchronization
Unlike Nvidia CUDA, you do not have to perform any extra actions with contexts when you work with the GPU from different threads. There are still some restrictions, though.
Rule 12: none of the CAL compiler functions are thread-safe. Within one application only one thread at a time may work with the compiler.
Rule 13: all functions of the main CAL library that operate on a specific context or device handle (context/device) are thread-safe. All other functions are not thread-safe.
Rule 14: only one application thread at a time may work with a given context.
Conclusion
I have tried to describe the AMD CAL and AMD IL technologies in the most accessible way, so that anyone can write a simple application for an AMD GPU almost from scratch. The main thing is to always remember one golden rule: RTFM!
I hope you found this interesting to read.