Today a huge number of tasks demand high system performance. Physical limits prevent the number of transistors on a processor die from growing indefinitely, and transistor geometry cannot be shrunk without limit: below a certain size, phenomena that are negligible in larger active elements start to appear, and quantum size effects begin to dominate. Transistors stop behaving like transistors.
Moore's law is not the issue here. It was, and remains, a law about cost; the growth in transistor count per die is more a consequence of it. So to raise the capability of computing systems we have to look elsewhere: to multiprocessors and multicomputers. This approach relies on a large number of processing elements, with subtasks executed independently on each computing device.
Parallel processing methods:
Source of parallelism | Speedup | Programmer effort | Popularity |
---|---|---|---|
Many cores | 2x-128x | Moderate | High |
Many machines | 1x-Infinity | Moderately high | High |
Vectorization | 2x-8x | Moderate | Low |
Graphics adapters | 128x-2048x | High | Low |
Coprocessors | 40x-80x | Moderately high | Very low |
There are many ways to make a system more efficient, and they differ considerably. One is to use vector processors, which can increase computation speed substantially. Unlike scalar processors, which process one data element per instruction (SISD), vector processors can process several data elements per instruction (SIMD). Most modern processors are scalar, yet many of the tasks they handle are computation-heavy: video and audio processing, graphics, scientific computing, and so on. To speed up such computations, processor manufacturers began building streaming SIMD extensions into their devices.
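As a result, with the right programming approach, vector data processing became available on ordinary processors. The existing extensions are MMX, SSE, and AVX. They let you use additional processor capabilities to speed up the processing of large data arrays. Vectorization also delivers a speedup without explicit parallelism: the parallelism exists from the data-processing point of view, but the programmer does not have to design special algorithms to prevent races or handle synchronization, and the development style is no different from synchronous code. You get acceleration with little effort, almost for free. There is no magic in it.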
What is SSE?
SSE (Streaming SIMD Extensions) is a SIMD (Single Instruction, Multiple Data) instruction set. It was first introduced in the Pentium III in 1999, and over time the instruction set was extended with more complex operations. SSE adds a set of instructions and eight 128-bit registers to the processor architecture, xmm0 through xmm7 (sixteen registers on x86-64).
Initially these registers could only be used for single-precision computation (that is, type float). Since SSE2 they can be used for any primitive data type. On a standard 32-bit machine, one register can therefore store and process in parallel:
- 2 double
- 2 long long (64-bit integers)
- 4 float
- 4 int
- 8 short
- 16 char
With AVX you work with 256-bit registers, so a single instruction operates on more values. Registers 512 bits wide already exist as well.
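The lane counts listed above follow directly from dividing 16 bytes by the element size. The sketch below is only an illustration under that assumption: `lanes128` and `add4_int32` are hypothetical helper names, not part of any library, and the integer addition uses the SSE2 intrinsic `_mm_add_epi32` with unaligned loads and stores so that no alignment is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <emmintrin.h>  // SSE2: __m128i and the integer intrinsics

// Number of elements of type T that fit in one 128-bit XMM register.
template <typename T>
constexpr std::size_t lanes128() { return 16 / sizeof(T); }

// Adds four 32-bit integers with a single SSE2 instruction.
// Hypothetical helper: unaligned loads/stores are used on purpose,
// so the caller does not have to align the arrays.
void add4_int32(const std::int32_t* a, const std::int32_t* b, std::int32_t* out) {
    assert(a != nullptr && b != nullptr && out != nullptr);
    __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), _mm_add_epi32(va, vb));
}
```

Calling `add4_int32` on two arrays of four `int32_t` values writes the four element-wise sums to `out`.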
To begin, we will write a C++ example (skip it if you are not interested): a program that sums two arrays of 8 float elements.
C++ vectorization example
In C++, SSE is exposed through low-level intrinsics, functions that mirror assembly instructions. For example, __m128 _mm_add_ps(__m128 a, __m128 b); maps to the assembly instruction ADDPS operand1, operand2, and __m128 _mm_add_ss(__m128 a, __m128 b); maps to ADDSS operand1, operand2. The two do almost the same thing, adding array elements, but slightly differently. _mm_add_ps adds the registers in full, so that:
- r0 := a0 + b0
- r1 := a1 + b1
- r2 := a2 + b2
- r3 := a3 + b3
Here the whole __m128 register is the set r0-r3. _mm_add_ss, by contrast, adds only the lowest element of the register, so that:
- r0 := a0 + b0
- r1 := a1
- r2 := a2
- r3 := a3
The other intrinsics follow the same pattern: subtraction, division, square root, minimum, maximum, and other operations.
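The difference between the packed and the scalar form can be checked with a small sketch. The wrapper names `add_packed` and `add_scalar` are hypothetical; the intrinsics themselves (`_mm_add_ps`, `_mm_add_ss`, `_mm_loadu_ps`, `_mm_storeu_ps`) come from `xmmintrin.h`.

```cpp
#include <array>
#include <cassert>
#include <xmmintrin.h>  // SSE intrinsics

// Packed add (ADDPS): all four lanes are summed.
std::array<float, 4> add_packed(const float* a, const float* b) {
    assert(a != nullptr && b != nullptr);
    std::array<float, 4> r{};
    _mm_storeu_ps(r.data(), _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
    return r;
}

// Scalar add (ADDSS): only lane 0 is summed, lanes 1-3 are copied from a.
std::array<float, 4> add_scalar(const float* a, const float* b) {
    assert(a != nullptr && b != nullptr);
    std::array<float, 4> r{};
    _mm_storeu_ps(r.data(), _mm_add_ss(_mm_loadu_ps(a), _mm_loadu_ps(b)));
    return r;
}
```

For a = {1, 2, 3, 4} and b = {10, 20, 30, 40}, the packed form yields {11, 22, 33, 44}, while the scalar form yields {11, 2, 3, 4}.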
When writing a program you work with the 128-bit register types: __m128 for float, __m128d for double, and __m128i for int, short, and char. You do not have to declare arrays of type __m128; you can take a float array and cast a pointer to it to __m128*.
A few constraints must be kept in mind:
- float data loaded into and stored from __m128 objects must be aligned on a 16-byte boundary
- some intrinsics require an argument to be a compile-time integer constant, because of the nature of the instruction
- the result of an arithmetic operation acting on two NaN arguments is undefined
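The alignment constraint applies to the aligned load and store intrinsics (`_mm_load_ps`, `_mm_store_ps`); the unaligned variants (`_mm_loadu_ps`, `_mm_storeu_ps`) accept any address. A minimal sketch of checking an address before choosing between them is given below. `is_aligned16` and `load4` are hypothetical helper names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <xmmintrin.h>  // SSE intrinsics

// True if p lies on a 16-byte boundary.
bool is_aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Loads four floats, using the aligned form only when that is safe.
__m128 load4(const float* p) {
    assert(p != nullptr);
    return is_aligned16(p) ? _mm_load_ps(p) : _mm_loadu_ps(p);
}
```

In real code the branch is usually avoided by guaranteeing alignment up front (for example with alignas) and always using the aligned form.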
That is enough theory. Let us look at a sample program that uses SSE.
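#include <cstdlib>      // system
#include <iostream>
#include <malloc.h>     // _aligned_malloc, _aligned_free (MSVC)
#include <xmmintrin.h>  // SSE intrinsics

int main()
{
    const size_t N = 8;
    alignas(16) float a[] = { 41982.0, 81.5091, 3.14, 42.666, 54776.45, 342.4556, 6756.2344, 4563.789 };
    alignas(16) float b[] = { 85989.111, 156.5091, 3.14, 42.666, 1006.45, 9999.4546, 0.2344, 7893.789 };

    // View the float arrays as arrays of 128-bit vectors (4 floats each).
    __m128* a_simd = reinterpret_cast<__m128*>(a);
    __m128* b_simd = reinterpret_cast<__m128*>(b);

    // Dynamically allocate 16-byte-aligned storage for the result.
    auto size = sizeof(float);
    void* ptr = _aligned_malloc(N * size, 16);
    float* c = reinterpret_cast<float*>(ptr);

    // Each iteration adds 4 floats, so N / 4 iterations cover the arrays.
    for (size_t i = 0; i < N / 4; i++, a_simd++, b_simd++, c += 4)
        _mm_store_ps(c, _mm_add_ps(*a_simd, *b_simd));
    c -= N;  // rewind to the start of the result array

    std::cout.precision(10);
    for (size_t i = 0; i < N; i++)
        std::cout << c[i] << std::endl;

    _aligned_free(ptr);
    system("PAUSE");  // Windows only: keeps the console window open
    return 0;
}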
- alignas(#) is the standard, portable C++ way to set a custom alignment for variables and user-defined types. It was introduced in C++11 and is supported in Visual Studio 2015. An alternative is the __declspec(align(#)) declarator. Both control alignment for statically allocated memory. If you need alignment for a dynamic allocation, use void* _aligned_malloc(size_t size, size_t alignment).
- Next, reinterpret_cast, which converts a pointer to a pointer of any other type, is used to convert the pointers to arrays a and b to type __m128*.
- After that, aligned memory is allocated dynamically with the function mentioned above: _aligned_malloc(N * sizeof(float), 16);
- The required number of bytes is computed from the element count and the size of the type, and 16 is the alignment value, which must be a power of two. The pointer to this memory is then cast to another pointer type so that the memory can be treated as an array of float.
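`_aligned_malloc` is specific to the Microsoft runtime. As a hedged alternative, the sketch below uses `_mm_malloc` and `_mm_free`, which GCC and Clang make available through `xmmintrin.h` (an assumption about the toolchain; this is not what the article's program uses), and wraps the pointer in a `std::unique_ptr` so the memory is released automatically. `MmFree`, `AlignedFloats` and `make_aligned_floats` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <xmmintrin.h>  // SSE intrinsics; also provides _mm_malloc/_mm_free on GCC and Clang

// Deleter that releases memory obtained from _mm_malloc.
struct MmFree {
    void operator()(float* p) const { _mm_free(p); }
};
using AlignedFloats = std::unique_ptr<float[], MmFree>;

// Allocates n floats on a 16-byte boundary (hypothetical helper).
// The result is empty if the allocation fails.
AlignedFloats make_aligned_floats(std::size_t n) {
    assert(n > 0);
    return AlignedFloats(static_cast<float*>(_mm_malloc(n * sizeof(float), 16)));
}
```

With this wrapper there is no need to rewind the pointer and free it by hand: the buffer is released when the `AlignedFloats` object goes out of scope.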
So everything is ready to work with SSE. Next, a loop sums the array elements. The approach relies on pointer arithmetic. Since a_simd, b_simd, and c are pointers, incrementing one advances it in memory by sizeof(T). Take the dynamic array c: c[0] and *c yield the same value, because c points to the first element of the array. Incrementing c moves the pointer forward by 4 bytes, so it now points to the second element. By incrementing and decrementing a pointer you can thus move back and forth through the array. You must keep the array size in mind, though, because it is easy to step past its bounds into memory that is not yours. The a_simd and b_simd pointers work the same way, except that incrementing one advances it by 128 bits, which in float terms skips 4 elements of a or b. In effect a_simd and a (like b_simd and b) point to the same region of memory; they just process it differently because of the size of the pointed-to type.
for (int i = 0; i < N / 4; i++, a_simd++, b_simd++, c += 4)
    _mm_store_ps(c, _mm_add_ps(*a_simd, *b_simd));
Now it is clear why the pointers change this way in the loop. Each iteration adds 4 elements and stores the result from register xmm0 (in this program) to the address held in c. With this approach the source data is not modified; the sum lives in a register and is moved to whichever operand needs it. This can improve performance when operands have to be reused.
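The loop above assumes the element count is a multiple of 4. A common variation, sketched here under the assumption that alignment is not guaranteed (hence the unaligned load and store intrinsics), processes the bulk of the data four elements at a time and finishes the remainder with ordinary scalar code. `add_arrays` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <xmmintrin.h>  // SSE intrinsics

// out[i] = a[i] + b[i] for i in [0, n); n need not be a multiple of 4.
void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
    assert(n == 0 || (a != nullptr && b != nullptr && out != nullptr));
    std::size_t i = 0;
    // Vector part: four floats per iteration.
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
    // Scalar tail: the 0 to 3 elements that do not fill a whole vector.
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Indexing with an offset instead of advancing three pointers also removes the need to rewind the result pointer afterwards.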
Consider the code the compiler generates for the _mm_add_ps call:
mov    eax,dword ptr [b_simd]       ;// load the address held in b_simd into eax
mov    ecx,dword ptr [a_simd]       ;// load the address held in a_simd into ecx
movups xmm0,xmmword ptr [ecx]       ;// load 4 floats from the address in ecx into xmm0; xmm0 = {a[i], a[i+1], a[i+2], a[i+3]}
addps  xmm0,xmmword ptr [eax]       ;// packed add: xmm0 = xmm0 + b_simd
                                    ;// xmm0[0] = xmm0[0] + b_simd[0]
                                    ;// xmm0[1] = xmm0[1] + b_simd[1]
                                    ;// xmm0[2] = xmm0[2] + b_simd[2]
                                    ;// xmm0[3] = xmm0[3] + b_simd[3]
movaps xmmword ptr [ebp-190h],xmm0  ;// spill the result to a temporary on the stack
movaps xmm0,xmmword ptr [ebp-190h]  ;// reload it into xmm0
mov    edx,dword ptr [c]            ;// load the address held in c into edx
movaps xmmword ptr [edx],xmm0       ;// store xmm0 at the address in edx; an xmmword, like __m128, is a 128-bit word holding 4 floats
As the code shows, a single addps instruction processes 4 values at once. This is implemented and supported by the processor hardware; the operating system takes no part in processing these values, so performance improves without any extra outside overhead.
One detail deserves attention: in this example the compiler used the movups instruction, which does not require its operand to be aligned on a 16-byte boundary. It follows that array a did not strictly need to be aligned. Array b, however, must be aligned; otherwise the addps operation would fault when reading memory, because it adds a register to a 128-bit memory operand. Another compiler or environment may emit different instructions, so it is better to align every operand involved in such operations. In any case, it avoids memory problems.
Another reason to align: when we work with array elements (and not only with them), we are in fact always working with cache lines, which are 64 bytes in size. SSE and AVX vectors always fall within a single cache line if they are aligned to 16 and 32 bytes, respectively. If the data is not aligned, one more, "extra" cache line may well have to be loaded. This hurts performance considerably, and it gets worse if the array elements, and therefore memory, are accessed non-sequentially.
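The cache-line argument comes down to simple address arithmetic, sketched below. The 64-byte line size is taken from the text and is an assumption about the target CPU; `lines_touched` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLine = 64;  // assumed cache line size, as stated in the text

// Number of cache lines touched by `bytes` bytes starting at address `addr`.
// `bytes` must be greater than zero.
constexpr std::size_t lines_touched(std::uintptr_t addr, std::size_t bytes) {
    return (addr + bytes - 1) / kCacheLine - addr / kCacheLine + 1;
}
```

A 16-byte SSE vector at offset 48 within a line stays inside that line, while the same vector at offset 56 straddles two lines; a 32-byte AVX vector at offset 32 fits in one line, but at offset 40 it needs two.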
SIMD support in .NET
JIT support for SIMD was first announced on the .NET blog in April 2014, when the developers presented a new preview version of RyuJIT that provided SIMD functionality. The reason for adding it was the rather high popularity of the request for SIMD support in C#. The initial set of supported types was small and the functionality was limited. At first only the SSE instruction set was supported, with AVX promised for the release. Later updates added new SIMD-enabled types and new methods for working with them, and in recent versions this amounts to an extensive and convenient library for hardware-accelerated data processing.
This approach makes life easier for developers, who do not have to write CPU-dependent code. Instead, the CLR abstracts the hardware by providing a virtual runtime that translates code into machine instructions either at run time (JIT) or at install time (NGEN). By leaving code generation to the CLR, you can run the same MSIL code on different computers with different processors without giving up optimizations specific to each particular CPU.
Currently, .NET exposes this technology through the System.Numerics namespace (shipped in the System.Numerics.Vectors library), a set of vector types that can take advantage of SIMD hardware acceleration. Hardware acceleration can yield substantial performance gains in mathematical, scientific, and graphics programming. It contains the following types:
- Vector - a collection of static convenience methods for working with generic vectors
- Matrix3x2 - represents a 3x2 matrix
- Matrix4x4 - represents a 4x4 matrix
- Plane - represents a plane in three-dimensional space
- Quaternion - represents a vector used to encode three-dimensional physical rotations
- Vector<T> - represents a vector of a specified numeric type, suited to low-level optimization of parallel algorithms
- Vector2 - represents a vector with two single-precision floating-point values
- Vector3 - represents a vector with three single-precision floating-point values
- Vector4 - represents a vector with four single-precision floating-point values
The Vector class provides methods for adding and comparing vectors, finding their minimum and maximum, and many other transformations, and these operations use SIMD. The other types also support hardware acceleration and include transformations specific to them: multiplication for matrices, for example, or the Euclidean distance between points for vectors.
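To give a feel for what an operation such as the distance between two points involves at the instruction level, here is a hedged C++ sketch using SSE intrinsics. It is not the code .NET generates and makes no claim about the library's internals; `horizontal_sum` and `distance4` are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>
#include <xmmintrin.h>  // SSE intrinsics

// Sum of the four lanes of v.
float horizontal_sum(__m128 v) {
    __m128 hi = _mm_movehl_ps(v, v);    // {v2, v3, v2, v3}
    __m128 sum2 = _mm_add_ps(v, hi);    // lane 0 = v0 + v2, lane 1 = v1 + v3
    __m128 lane1 = _mm_shuffle_ps(sum2, sum2, _MM_SHUFFLE(1, 1, 1, 1));  // broadcast lane 1
    return _mm_cvtss_f32(_mm_add_ss(sum2, lane1));  // lane 0 = v0 + v1 + v2 + v3
}

// Euclidean distance between two 4-component points.
float distance4(const float* p, const float* q) {
    assert(p != nullptr && q != nullptr);
    __m128 d = _mm_sub_ps(_mm_loadu_ps(p), _mm_loadu_ps(q));  // component-wise difference
    return std::sqrt(horizontal_sum(_mm_mul_ps(d, d)));       // sqrt of the sum of squares
}
```

The subtraction and the squaring each take one packed instruction for all four components; only the final reduction needs a couple of shuffles.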
C# example program
So what do you need to use this technology? First, the RyuJIT compiler and .NET 4.6. On lower versions, System.Numerics.Vectors will not install through NuGet. That said, with the library already installed, I lowered the target version and everything worked fine. Second, you need to build for x64: clear the "Prefer 32-bit" option in the project properties, after which you can build for Any CPU.
Listing:
using System;
using System.Numerics;

class Program
{
    static void Main(string[] args)
    {
        const Int32 N = 8;
        Single[] a = { 41982.0F, 81.5091F, 3.14F, 42.666F, 54776.45F, 342.4556F, 6756.2344F, 4563.789F };
        Single[] b = { 85989.111F, 156.5091F, 3.14F, 42.666F, 1006.45F, 9999.4546F, 0.2344F, 7893.789F };
        Single[] c = new Single[N];

        // Count is the number of elements per hardware vector:
        // with 128-bit registers, 16 for byte, 4 for float, 2 for double, and so on.
        for (int i = 0; i < N; i += Vector<Single>.Count)
        {
            var aSimd = new Vector<Single>(a, i);  // load Count elements starting at index i
            var bSimd = new Vector<Single>(b, i);
            Vector<Single> cSimd = aSimd + bSimd;  // or: Vector<Single> cSimd = Vector.Add(bSimd, aSimd);
            cSimd.CopyTo(c, i);                    // copy the result into c starting at index i
        }

        for (int i = 0; i < a.Length; i++)
        {
            Console.WriteLine(c[i]);
        }
        Console.ReadKey();
    }
}
Overall, the C++ and .NET approaches are quite similar: the source data has to be converted or copied, and the result copied into the destination array. The C# approach is much simpler, though; a lot is done for you, and you just use it. You do not have to think about data alignment, allocate memory, or do any of that statically or dynamically with special operators. On the other hand, pointers give you finer control over what is happening, along with more responsibility for it.
Inside the loop, everything happens just as in the C++ loop, and I do not mean the pointers: the computation is the same. On the first iteration (with Vector<Single>.Count equal to 4), the first 4 elements of the source arrays are loaded into the aSimd and bSimd structures, summed, and stored in the destination array. On the next iteration, the offset is used to load the next 4 elements and sum them. It is that quick and simple. Consider the code the compiler generates for the statement var cSimd = aSimd + bSimd:
addps xmm0,xmm1
The only difference from the C++ version is that here both operands are registers, whereas there a register was added to a memory operand. The registers are loaded when aSimd and bSimd are initialized. Overall, comparing the code produced by the C++ and .NET compilers shows no particular difference, and performance is roughly equal, although the pointer-based variant still runs faster. Note that SIMD instructions are generated only when code optimization is enabled. In a Debug build you will not see them in the disassembly, because the operations are implemented as function calls; in a Release build with optimization enabled, the instructions appear in explicit (inlined) form.
Conclusion
What do we have?
- In many cases vectorization improves performance by a factor of 4 to 8
- Complex algorithms take some ingenuity, but you cannot get anywhere without that
- System.Numerics.Vectors currently covers only part of the SIMD instructions. A more serious approach requires C++
- There are many other ways besides vectorization: proper use of the cache, multithreading, heterogeneous computing, careful memory management (so the garbage collector does not break a sweat), and so on
In a brief Twitter exchange with Sasha Goldstein (one of the authors of the book "Application Optimization on the .NET Platform"), who has looked into hardware acceleration in .NET, I asked how he sees the SIMD support implemented in .NET and how it compares with C++. He replied: "Without a doubt, you can do more in C++ than in C#. But in C# you really do get cross-processor support, for example automatic selection between SSE4 and AVX." On the whole, this is good news: with very little effort you can get the most performance out of a system by using every available hardware resource.
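In C++ the choice of instruction set that the JIT makes automatically has to be made by hand, typically by querying the CPU at run time and dispatching to a suitable code path. A minimal sketch follows, assuming GCC or Clang on x86, where the `__builtin_cpu_supports` builtin is available; on other toolchains this sketch simply reports "scalar". `best_simd_level` is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>

// Reports the widest instruction set the current CPU supports,
// out of the few levels checked here.
// Assumption: GCC or Clang targeting x86; elsewhere the function returns "scalar".
std::string best_simd_level() {
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    if (__builtin_cpu_supports("avx2")) return "avx2";
    if (__builtin_cpu_supports("avx")) return "avx";
    if (__builtin_cpu_supports("sse4.1")) return "sse4.1";
    if (__builtin_cpu_supports("sse2")) return "sse2";
#endif
    return "scalar";
}
```

A real program would use the result once at startup to select, for example, a function pointer to an SSE or AVX implementation of the same routine.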
For me this is a very good opportunity to write high-performance programs. In my thesis on modeling physical processes, at least, I achieved efficiency mainly by creating a number of threads and by using heterogeneous computing, with both CUDA and C++ AMP. Development targets the Universal Windows Platform on Windows 10, where I am very much drawn to WinRT, which lets you write programs in both C# and C++/CX. In C++ I mainly write the core for the heavy computations (with Boost), while in C# I manipulate the data and build the interface. Naturally, moving data across the binary ABI boundary between the two languages has a cost (though not a large one), which calls for a more careful design of the C++ library. In my case, however, data is transferred only when necessary and fairly rarely, just to display results.
When I need to manipulate the data in C#, I convert it to .NET types so as not to work with WinRT types, which already improves processing performance in C#. For example, when several thousand or tens of thousands of elements have to be processed, or the processing has no special requirements, the data can be computed in C# without involving the library (which handles 3 to 10 million structure instances, sometimes in a single iteration). So the hardware acceleration approach both simplifies the task and makes it faster.
Sources used for this article
- Translation of the .NET blog article on the new SIMD-enabled JIT
- .NET blog post about System.Numerics.Vector
- What is alignment, and how does it affect your programs?
- Intel blog post on data alignment for loop vectorization
Finally, thanks to Sasha Goldstein for the help and information provided.