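This publication is a translation of the first part of the article "Characterization and Optimization Methodology Applied to Stencil Computations" by Intel engineers. This part is devoted to performance analysis and to building a roofline model, using a fairly common compute kernel as the example. The model makes it possible to estimate how much room there is for optimizing an application on a given platform.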
The next part describes which optimizations were applied to approach the expected performance. The optimization techniques covered in the article include, for example:
- scalable parallelization (cooperative thread blocking);
- increasing effective memory bandwidth (cache blocking, register reuse);
- improving processor performance (vectorization, loop redistribution).
The third part of the article describes an algorithm that automatically selects the best parameters for building and running the application. These parameters are usually tied to changes in the program source code (for example, loop blocking sizes), to compiler parameters (the loop unrolling factor), and to characteristics of the computer system (for example, cache sizes). The resulting algorithm turned out to be faster than traditional "heavyweight" search techniques. Going from the simplest implementation to the most optimized one, performance improved 6-fold on the Intel Xeon E5-2697 v2 processor and about 3-fold on the first-generation Intel Xeon Phi coprocessor. In addition, the auto-tuning method selects the best launch parameters for a given input data set.
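Figure 1. Roofline model of Iso3DFD on a dual-socket Ivy Bridge system (2S E5-2697 v2). The red and light green lines show the theoretical and achievable upper bounds of the platform, respectively. The horizontal blue lines show the maxima weighted by the specific imbalance between additions and multiplications (#ADD; #MUL), and the brown horizontal line shows the bound given by the memory bandwidth measured with the Stream triad benchmark. The dark green vertical line corresponds to the arithmetic intensity of the Iso3DFD kernel. Its intersections with the other lines give the corresponding achievable limits.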
Abstract
This article describes a methodology for characterizing and optimizing a 3D finite difference (3DFD) algorithm used to solve the acoustic wave equation in an isotropic medium with constant or variable density (Iso3DFD). (Translator's note: characterization here means identifying the characteristics.) Starting from the simplest 3DFD implementation, we show how to estimate the best performance the algorithm can reach by characterizing it on a particular computing system.
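Introduction
Finite differences in the time domain are a widely used technique for modeling waves, for example in the analysis of wave phenomena and in seismic exploration. The method is common in seismic imaging techniques such as reverse time migration and full waveform inversion. Its variants differ in whether the wave is treated as acoustic or elastic, whether the propagation medium is anisotropic, and whether the density varies.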
As is well known, the choice of a particular numerical scheme for approximating the partial derivatives has a strong effect on the performance of the implementation [1]. In particular, it affects the arithmetic intensity of the 3DFD algorithm, that is, the number of floating-point operations per machine word transferred. Arithmetic intensity can in turn be related to expected performance through the roofline model [2]. This approach makes it possible to judge the performance of an implementation against the maximum achievable on a given computing system. In other words, the roofline model sets the limits on the performance gain that can be obtained by optimizing the source code. Once an implementation reaches that level, performance can be improved further only by changing the algorithm itself.
For any computer, the specifications define peak values for the floating-point rate (FLOP/s) and for data transfer to and from memory (memory bandwidth). The corresponding maximum achievable values can be obtained by running standard benchmarks such as LINPACK [3] and STREAM triad [4].
The first part of this article aims to estimate the maximum achievable performance of the Iso3DFD kernel on a dual-socket server and on a coprocessor. It then describes several techniques that can have a significant effect on performance. As usual, such optimizations may require some effort and changes to the source code. Finally, it presents an auxiliary tool for finding a good, if not necessarily optimal, set of parameters for compiling and running the application.
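Figure 2. Roofline model of Iso3DFD on the Xeon Phi 7120P coprocessor. The red and light green lines show the theoretical and achievable upper bounds of the platform, respectively. The horizontal blue lines show the maxima weighted by the specific imbalance between additions and multiplications (#ADD; #MUL), and the brown horizontal line shows the bound given by the memory bandwidth measured with the Stream triad benchmark. The dark green vertical line corresponds to the arithmetic intensity of the Iso3DFD kernel. Its intersections with the other lines give the corresponding achievable limits.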
Performance evaluation
The Iso3DFD kernel solves the isotropic acoustic wave equation with a scheme that is 16th order in space and 2nd order in time. A standard implementation of this 3DFD kernel typically reaches less than 10% of the peak floating-point rate (FLOP/s) of the system. Below we consider how to obtain the roofline model [2] of the Iso3DFD kernel on the CPU and on the Xeon Phi coprocessor. To find the maximum performance of this application, the following are needed:
- Theoretical peak performance and memory bandwidth: 2420 GFLOP/s in single precision and 352 GB/s for the Intel Xeon Phi 7120A; 1036 GFLOP/s in single precision and 119 GB/s for two Intel Xeon E5-2697 v2 processors with 1866 MHz DDR3 memory.
- Values measured with the Linpack (or GEMM) and STREAM triad benchmarks, which give the corresponding maximum achievable figures on each platform: 2178 GFLOP/s and 200 GB/s for the Intel Xeon Phi 7120A; 930 GFLOP/s and 100 GB/s for two Intel Xeon E5-2697 v2 processors with 1866 MHz DDR3 memory.
- The arithmetic intensity of the application, computed from the number of floating-point additions and multiplications (ADD, MUL) and the number of bytes transferred to and from memory by loads and stores (LOAD, STORE).
The last item is derived under the assumption that the system has a cache of unlimited size and bandwidth and zero data access latency. This defines a kind of perfect memory subsystem in which any array is loaded in full even when only one of its elements is needed.
Several other factors can also affect the overall performance of an application built around the 3DFD kernel: the choice of boundary conditions, the I/O scheme used for time reversal, and the parallel programming technology or model. The analysis presented here does not take boundary conditions or I/O into account. The parallel implementation uses domain decomposition across a distributed system with the MPI standard, together with thread parallelism within a compute node using OpenMP. This article considers the computation on a subdomain on a single node of the computing system.
Arithmetic intensity of the platform
The test system consists of two Xeon E5-2697 v2 CPUs (2S-E5) with 12 cores each, running at 2.7 GHz with turbo mode disabled. These processors support the AVX instruction set extension with 256-bit vector registers. Each such instruction can operate on eight single-precision (32-bit) floating-point numbers at once, per CPU clock cycle. The theoretical peak performance is therefore 2.7 (GHz) x 8 (SP FP) x 2 (ADD/MUL) x 12 (cores) x 2 (CPUs) = 1036.8 GFLOP/s. Peak bandwidth is computed from the memory frequency (1866 MHz), the number of memory channels (4), and the number of bytes transferred per clock cycle (8), which for the dual-processor 2S-E5 node gives 1866 x 4 x 8 x 2 (CPUs) = 119 GB/s.
To characterize the behavior of an application, we also need realistic, achievable values of bandwidth and performance. As a first approximation, we assume that the performance of a real application is limited either by memory bandwidth (memory-bound), estimated with Stream triad, or by processor speed (FLOP/s-bound, also called compute-bound or CPU-bound), estimated with the Linpack benchmark. The choice of these two benchmarks is somewhat arbitrary, and they may deviate from an ideal estimate, but they are certainly a better approximation than the theoretical peak values of the system.
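The two theoretical peaks above are plain products of hardware parameters, which can be sketched in C as follows. The function and parameter names are illustrative and not taken from the original article; the inputs are the 2S-E5 values quoted in the text.

```c
/* Theoretical peak in GFLOP/s:
   clock (GHz) x SIMD lanes x FP pipelines (ADD + MUL) x cores x sockets. */
double peak_gflops(double ghz, int simd_lanes, int fp_pipes, int cores, int sockets)
{
    return ghz * simd_lanes * fp_pipes * cores * sockets;
}

/* Theoretical peak bandwidth in GB/s:
   transfer rate (MHz) x channels x bytes per transfer x sockets, converted from MB/s. */
double peak_bandwidth_gbs(double mem_mhz, int channels, int bytes_per_transfer, int sockets)
{
    return mem_mhz * channels * bytes_per_transfer * sockets / 1000.0;
}
```

Calling `peak_gflops(2.7, 8, 2, 12, 2)` reproduces the 1036.8 GFLOP/s figure, and `peak_bandwidth_gbs(1866.0, 4, 8, 2)` gives roughly 119.4 GB/s, which the text rounds to 119 GB/s.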
On the 2S-E5 system, Linpack delivers 930 GFLOP/s and Stream triad 100 GB/s. The arithmetic intensity (AI) at the theoretical and at the achievable maxima can then be computed as follows:
AI (theoretical, CPU) = 1036.8 / 119 = 8.7 FLOP/byte
AI (achievable, CPU) = 930 / 100 = 9.3 FLOP/byte
These values can be used to characterize any compute kernel: if the arithmetic intensity of a kernel is above 9.3 FLOP/byte, the kernel is limited by processor speed (CPU-bound); if it is below, the kernel is limited by memory bandwidth (memory-bound).
The same calculation for the Xeon Phi uses 2178 GFLOP/s from Linpack and 200 GB/s from Stream triad. The theoretical peak values are 2420 GFLOP/s and 352 GB/s. The arithmetic intensities are therefore:
AI (theoretical, Phi) = 2420.5 / 352 = 6.87 FLOP/byte
AI (achievable, Phi) = 2178 / 200 = 10.89 FLOP/byte
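The platform intensities above are ratios of compute rate to bandwidth, and comparing a kernel's intensity against them yields the bound classification. A minimal sketch, with hypothetical function names chosen for illustration:

```c
/* Platform arithmetic intensity (machine balance) in FLOP/byte:
   compute rate in GFLOP/s divided by memory bandwidth in GB/s. */
double machine_balance(double gflops, double gbs)
{
    return gflops / gbs;
}

/* Returns 1 if a kernel with the given arithmetic intensity is limited by
   processor speed (CPU-bound) on a platform with the given balance,
   and 0 if it is limited by memory bandwidth (memory-bound). */
int is_compute_bound(double kernel_ai, double balance)
{
    return kernel_ai > balance;
}
```

For example, `machine_balance(930.0, 100.0)` gives the 9.3 FLOP/byte threshold of the 2S-E5, and any kernel whose intensity falls below that value is classified as memory-bound on that system.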
Arithmetic intensity of the compute kernel
The roofline model requires the arithmetic intensity of the particular application. It can be obtained by counting the arithmetic operations and memory accesses, either by inspecting the code or with special tools that read the hardware counters of the system. In the innermost loop of a standard finite-difference scheme [5] there are 4 loads (coeff, prev, next, vel), 1 store (next), 51 additions (index calculations are not counted), and 27 multiplications (Figure 3).
    /* Outer loops walk over cache blocks of size n1_Tblock x n2_Tblock x n3_Tblock. */
    for (int bz = HALF_LENGTH; bz < n3; bz += n3_Tblock)
      for (int by = HALF_LENGTH; by < n2; by += n2_Tblock)
        for (int bx = HALF_LENGTH; bx < n1; bx += n1_Tblock) {
          int izEnd = MIN(bz + n3_Tblock, n3);
          int iyEnd = MIN(by + n2_Tblock, n2);
          int ixEnd = MIN(n1_Tblock, n1 - bx);
          for (int iz = bz; iz < izEnd; iz++) {
            for (int iy = by; iy < iyEnd; iy++) {
              /* Row pointers into the next, previous, and velocity grids. */
              float* next = ptr_next_base + iz*n1n2 + iy*n1 + bx;
              float* prev = ptr_prev_base + iz*n1n2 + iy*n1 + bx;
              float* vel  = ptr_vel_base  + iz*n1n2 + iy*n1 + bx;
              for (int ix = 0; ix < ixEnd; ix++) {
                /* Symmetric stencil along x, y, and z. */
                float value = 0.0f;
                value += prev[ix] * coeff[0];
                for (int ir = 1; ir <= HALF_LENGTH; ir++) {
                  value += coeff[ir] * (prev[ix + ir]      + prev[ix - ir]);
                  value += coeff[ir] * (prev[ix + ir*n1]   + prev[ix - ir*n1]);
                  value += coeff[ir] * (prev[ix + ir*n1n2] + prev[ix - ir*n1n2]);
                }
                /* Second-order update in time. */
                next[ix] = 2.0f * prev[ix] - next[ix] + value * vel[ix];
              }
            }
          }
        }
Figure 3. Source code of the compute kernel with cache blocking.
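The counts of 51 additions and 27 multiplications can be checked by tallying the floating-point operations in the innermost loop body of Figure 3. The sketch below does this for an arbitrary stencil half-length; the struct and function names are hypothetical, and HALF_LENGTH = 8 is an assumption, chosen because it is the value that reproduces the counts given in the text.

```c
/* Floating-point operation counts for one grid-point update. */
typedef struct {
    int adds;
    int muls;
} OpCount;

/* Tallies the additions and multiplications in the innermost loop body
   of Figure 3 (index arithmetic is ignored, as in the text). */
OpCount count_stencil_ops(int half_length)
{
    OpCount c = {0, 0};

    /* value += prev[ix] * coeff[0]; */
    c.muls += 1;
    c.adds += 1;

    for (int ir = 1; ir <= half_length; ir++) {
        /* Three statements per radius; each has one neighbor sum,
           one multiplication, and one accumulation into value. */
        c.muls += 3;
        c.adds += 6;
    }

    /* next[ix] = 2.0f * prev[ix] - next[ix] + value * vel[ix]; */
    c.muls += 2;
    c.adds += 2;

    return c;
}
```

With `count_stencil_ops(8)` the tally is 51 additions and 27 multiplications, where the subtraction in the time update is counted as an addition.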
Arithmetic intensity is computed with the formula:
AI = (#ADD + #MUL) / ((#LOAD + #STORE) x word size)    (1)
This gives an arithmetic intensity of (51 + 27) / ((4 + 1) x 4 bytes) = 3.9 FLOP/byte. Multiplying it by the theoretical bandwidth of each platform gives a first estimate of the maximum performance achievable for this algorithm: 1372.8 GFLOP/s on the Xeon Phi and 464.1 GFLOP/s on the 2S-E5. However, the theoretical peak performance assumes that two pipelines (one for ADD, one for MUL) are used in parallel, which is impossible for this kernel because of the imbalance between additions and multiplications, so the code cannot reach this estimate. The achievable maximum therefore has to be scaled by the following factor:
(#ADD + #MUL) / (2 x max(#ADD, #MUL))    (2)
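Assuming one 256-bit AVX SIMD unit, this factor reflects the ratio between the total number of floating-point operations possible per cycle (16 ops/cycle) and the maximum number of additions or multiplications that can be executed (8 ops/cycle each). For this kernel the factor is (51 + 27) / (2 x 51), or about 0.76. It yields a theoretical estimate of peak performance that accounts for the imbalance between additions and multiplications.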
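Formulas (1) and (2) can be evaluated directly from the operation counts. The following is a small sketch with illustrative function names; it assumes that every load and store moves one word of `word_bytes` bytes, as the perfect-cache model in the text implies.

```c
/* Formula (1): floating-point operations per byte moved. */
double arithmetic_intensity(int adds, int muls, int loads, int stores, int word_bytes)
{
    return (double)(adds + muls) / ((double)(loads + stores) * word_bytes);
}

/* Formula (2): fraction of peak reachable when the ADD and MUL pipelines
   are unevenly loaded. */
double imbalance_factor(int adds, int muls)
{
    int max_ops = adds > muls ? adds : muls;
    return (double)(adds + muls) / (2.0 * max_ops);
}
```

For the kernel in Figure 3, `arithmetic_intensity(51, 27, 4, 1, 4)` evaluates to 3.9 FLOP/byte and `imbalance_factor(51, 27)` to about 0.765. Scaling the first estimates by this factor gives the upper bounds quoted for the two platforms.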
Figures 1 and 2 show the roofline models, with the resulting upper bounds of 354.9 GFLOP/s for the 2S-E5 and 1049.8 GFLOP/s for the Xeon Phi.
A more realistic roofline model is obtained by multiplying the arithmetic intensity of the kernel by the bandwidth measured with the Stream triad benchmark, which gives 390 GFLOP/s for the 2S-E5 and 780 GFLOP/s for the Xeon Phi. Taking the imbalance between additions and multiplications into account with (2), as shown by the red dotted line, gives an even more realistic model. The new upper bounds are about 298 GFLOP/s on the 2S-E5 and 596 GFLOP/s on the Xeon Phi. Because the model relies on a perfect cache, these values should still be regarded as rough estimates of the maximum achievable performance. As shown in [2], the resulting roofline can be refined by adding further characteristics of the computing system, such as cache effects and limits.
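The bounds quoted in this section can be reproduced with one small function. This is a sketch under stated assumptions rather than the authors' tooling: the function name is hypothetical, and applying the imbalance factor to the lower of the two ceilings is a simplification that matches the article's arithmetic, since every case considered here is bandwidth-limited.

```c
/* Roofline bound in GFLOP/s: the lower of the compute ceiling and the
   bandwidth ceiling (bandwidth x arithmetic intensity), scaled by the
   ADD/MUL imbalance factor from formula (2). */
double roofline_gflops(double compute_peak, double bandwidth_gbs,
                       double kernel_ai, double imbalance)
{
    double memory_ceiling = bandwidth_gbs * kernel_ai;
    double ceiling = memory_ceiling < compute_peak ? memory_ceiling : compute_peak;
    return ceiling * imbalance;
}
```

With the Linpack and Stream triad values, an intensity of 3.9 FLOP/byte, and a factor of 78/102, the function returns about 298 GFLOP/s for the 2S-E5 and about 596 GFLOP/s for the Xeon Phi. Passing an imbalance of 1.0 gives the 390 and 780 GFLOP/s values instead.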
To be continued...
References
- [1] D. Imbert, K. Immadouedine, P. Thierry, H. Chauris, and L. Borges, "Tips and tricks for finite difference and i/o-less FWI," in Expanded Abstracts, Soc. Expl. Geophys., 2011, pp. 3174-3178.
- [2] S. Williams, A. Waterman, and D. Patterson, "Roofline: an insightful visual performance model for multicore architectures," Communications of the ACM - A Direct Path to Dependable Software, vol. 52, pp. 65-76, April 2009.
- [3] J. Dongarra, P. Luszczek, and A. Petitet, "The LINPACK benchmark: past, present and future," Concurrency and Computation: Practice and Experience, vol. 15, no. 9, pp. 803-820, 2003, doi: 10.1002/cpe.728.
- [4] J. D. McCalpin, "STREAM: sustainable memory bandwidth in high performance computers," University of Virginia, Charlottesville, Virginia, Tech. Rep., 1991-2007, a continually updated technical report. www.cs.virginia.edu/stream
- [5] L. Borges, "Experiences in developing seismic imaging code for Intel Xeon Phi coprocessor," 2012, software.intel.com/en-us/blogs/2012/10/26/experiences-in-developing-seismic-imaging-code-for-intel-xeon-phi-coprocessor
- [6] J. H. Holland, "Genetic algorithms and the optimal allocation of trials," SIAM Journal on Computing, vol. 2, no. 2, pp. 88-105, 1973.