Libraries that implement the MPI standard (Message Passing Interface) are the most common mechanism for organizing computation on a cluster. MPI passes messages between nodes (servers), but nothing stops you from running several MPI processes on a single node to make use of its multiple cores. HPC applications are often written this way because it is simpler. While the number of cores per node was small, the "pure MPI" approach caused no problems. Today, however, core counts run into the tens, or even the hundreds in the case of Intel Xeon Phi coprocessors. In that situation, running dozens of processes on one machine is no longer entirely efficient.
The point is that MPI processes communicate through a network interface (although within one machine it is implemented over shared memory). This involves redundant copying of data between multiple buffers and increased memory consumption.
For parallel computing inside a single shared-memory machine, threads and the distribution of tasks among threads are a much better fit. Here the most popular choice in the HPC world is the OpenMP standard.
It would seem straightforward: use OpenMP inside a node and MPI for communication between nodes. But it is not that simple. Using two frameworks (MPI and OpenMP) instead of one not only makes programming more complicated, it also does not always deliver the desired performance gain, at least not right away. You have to decide how to split the computation between MPI and OpenMP and, in some cases, solve problems specific to each level.
This article does not cover how to write hybrid applications; that information is not hard to find. Instead, we look at how to analyze hybrid applications with the Intel Parallel Studio tools, choosing the optimal configuration and eliminating bottlenecks at the different levels.
For the tests we use the NASA Parallel Benchmark.
- CPU: Intel Xeon processor E5-2697 v2 @ 2.70GHz, 2 sockets, 12 cores each
- OS: RHEL 7.0 x64
- Intel Parallel Studio XE 2016 Cluster Edition
- Compiler: Intel® Compiler 16.0
- MPI: Intel MPI Library 5.1.1.109
- Workload: NPB 3.3.1, the "CG - Conjugate Gradient, irregular memory access and communication" module, class B
The benchmark is already implemented as a hybrid, and the number of MPI processes and OpenMP threads can be configured. For communication between nodes there is clearly no alternative to MPI (within the application). The intrigue is in what to run inside a single node: MPI or OpenMP.
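As a rough illustration of what "configuring the number of MPI processes and OpenMP threads" means at launch time, the sketch below only prints a launch line for a given split and refuses splits that do not use all cores. The helper name `hybrid_cmd`, the default of 24 cores, and the binary names are assumptions made for this example. `-genv` is the Intel MPI (Hydra) option for passing an environment variable to all ranks, and `OMP_NUM_THREADS` is the standard OpenMP variable for the thread count.

```shell
# Print (do not run) a launch line for a given MPI x OpenMP split.
# hybrid_cmd, the 24-core default and the binary names are illustrative
# assumptions, not part of the benchmark or of Intel MPI.
hybrid_cmd() {
    local procs=$1 threads=$2 binary=$3 cores=${4:-24}
    # Reject splits that do not use exactly the available cores.
    if [ $((procs * threads)) -ne "$cores" ]; then
        echo "error: ${procs}x${threads} does not use exactly ${cores} cores" >&2
        return 1
    fi
    echo "mpirun -genv OMP_NUM_THREADS ${threads} -n ${procs} ${binary}"
}

hybrid_cmd 24 1 ./bt-mz.B.24   # pure MPI: 24 processes, 1 thread each
hybrid_cmd 6 4 ./my_hybrid_app # 6 processes, 4 threads each (placeholder binary)
```

Printing the command rather than executing it keeps the sketch runnable on a machine without MPI installed; on a real cluster the printed line would be run directly.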
MPI Performance Snapshot
We have 24 cores at our disposal. Let's start with the traditional approach: MPI only, with 24 MPI processes of 1 thread each. To analyze the program we use a new tool released in the latest version of Intel Parallel Studio, MPI Performance Snapshot. All it takes is adding the "-mps" switch to the mpirun launch line.
source /opt/intel/vtune_amplifier_xe/amplxe-vars.sh
source /opt/intel/itac/9.1.1.017/intel64/bin/mpsvars.sh --vtune
mpirun -mps -n 24 ./bt-mz.B.24
mps -g stats.txt app_stat.txt _mps
The first two lines set up the required environment, and the third launches the program under MPS profiling. The last line generates a report in HTML format. Without -g, the report is printed to the console, which is convenient for a quick look on the cluster, but the HTML version is nicer to read.
MPS provides a top-level performance assessment. Its runtime overhead is very small, so an application can be evaluated quickly even at large scale (it has been tested on 32,000 processes).
To begin, look at the shares of MPI time and computation time. 32% of the time is spent in MPI, and most of that is due to load imbalance: some processes wait while others are still computing. The block on the right contains the assessment. MPI time is marked HIGH, meaning too much is being spent on communication. There is also a reference to another tool for detailed analysis of MPI problems, Intel Trace Analyzer and Collector (ITAC). No problems are highlighted for OpenMP, which is not surprising, since we effectively disabled it.
MPS also reports hardware performance metrics: GFLOPS, CPI, and the "Memory Bound" metric, which is an overall assessment of memory performance. It also reports memory consumption per MPI process, both maximum and average.
Intel Trace Analyzer and Collector
MPS showed that the main problem in the "24x1" configuration is MPI. To find out why, we collect an ITAC profile.
source /opt/intel/itac/9.1.1.017/intel64/bin/itacvars.sh
mpirun -trace -n 24 ./bt-mz.B.24
Open the trace in the ITAC GUI (I used the Windows version). The Quantitative Timeline graph clearly shows that the share of MPI is large and that communication is distributed with a certain periodicity. The topmost graph shows periodic bursts of MPI activity.
If you select several of these bursts on the Event Timeline, you can see that the communication is distributed unevenly. The processes with ranks 0 to 4 spend more time computing, while the processes with ranks 15 to 23 spend more time waiting. The load imbalance is obvious.
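To put a number on this kind of imbalance, one common definition is the share of time the average rank idles relative to the slowest rank, (max - mean) / max. The sketch below computes it with awk. The function name `imbalance_pct` and the sample times are hypothetical, not measurements from this run, and this is not necessarily the formula that ITAC or MPS uses.

```shell
# Imbalance as (max - mean) / max over per-rank compute times, in percent.
# One common definition; not necessarily what ITAC or MPS reports.
imbalance_pct() {
    printf '%s\n' "$@" | awk '
        { sum += $1; if (NR == 1 || $1 > max) max = $1 }
        END { printf "%.1f\n", (max - sum / NR) * 100 / max }'
}

# Hypothetical per-rank compute times in seconds (not measured data).
imbalance_pct 10 10 8 6   # prints 15.0
```

A value of 0 means every rank computed for the same time; the larger the value, the more time ranks spend waiting for the slowest one.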
The Message Profile graph shows exactly which processes exchange messages and where communication takes the longest.
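For example, messages between the processes with ranks 17 and 5, 16 and 0, 18 and 7, and so on take longer than the others. After zooming further into the Event Timeline, you can click the black line at rank 17 to see the details of the transfer: sender and receiver, message size, and the send and receive calls.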
The Performance Assistant panel lists the specific problems the tool has detected in the selected region, for example "Late Sender".
MPI imbalance can be caused not only by flaws in the communication scheme but also by problems in the useful computation itself, when some processes compute more slowly than others. If you want to know what the application spends its time on inside one of the processes, and what problems might be there, ITAC can generate a command line that launches Intel VTune Amplifier for that rank (the 2nd, for instance).
We will return to VTune Amplifier later. In any case, ITAC offers many ways to study MPI communication in detail, but our task is to choose the optimal balance between OpenMP and MPI. So there is no need to fix the MPI communication for 24 ranks right away; we can try the other options first.
Other options
Empirically, the 12x2 and 6x4 distributions turned out to work better than the others. Even 2 OpenMP threads per process are significantly faster than 2 MPI processes. However, as the number of threads grows, the run time starts to increase again: 2x12 is even worse than pure MPI, and 1x24 is not worth showing at all. The culprit is work imbalance, because the work does not distribute well across a large number of OpenMP threads. The 2x12 option has as much as 30% imbalance.
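For reference, the candidate configurations for such a sweep are simply the process-by-thread splits that exactly fill the available cores. A minimal sketch that enumerates them, assuming 24 cores and one thread per core (the helper name `list_configs` is made up for this example):

```shell
# List every "processes x threads" split that uses exactly $1 cores,
# from pure MPI (cores x 1) down to a single process (1 x cores).
list_configs() {
    local cores=$1 procs
    procs=$cores
    while [ "$procs" -ge 1 ]; do
        if [ $((cores % procs)) -eq 0 ]; then
            echo "${procs}x$((cores / procs))"
        fi
        procs=$((procs - 1))
    done
}

list_configs 24
```

For 24 cores this yields eight splits, from 24x1 to 1x24. Whether a particular benchmark build accepts every process count is a separate question that this sketch does not check.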
We could stop here, since a compromise has been reached at 12x2 or 6x4. But we can dig deeper and investigate the OpenMP scaling problem.
VTune Amplifier
Intel VTune Amplifier XE is the best fit for detailed analysis of OpenMP problems, and we have already covered it in detail.
source /opt/intel/vtune_amplifier_xe/amplxe-vars.sh
mpirun -gtool "amplxe-cl -c advanced_hotspots -r my_result:1" -n 24 ./bt-mz.B.24
For running analyzers such as VTune Amplifier and Intel Advisor XE, the gtool option syntax (Intel MPI only) has become very convenient. It is embedded in the launch line of the MPI application and lets you run the analysis only on selected processes (only rank 1 in our example).
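The rank selection is the part after the colon inside the quoted gtool string. The sketch below only builds and prints such a launch line for a chosen rank set; the helper name `gtool_cmd` is an assumption for this example, and the comma-separated rank list in the second call reflects my understanding of the Intel MPI documentation, so check it against your version.

```shell
# Build (print only) an mpirun line that profiles selected ranks with VTune.
# Arguments: rank set, result directory, number of processes, binary.
gtool_cmd() {
    local ranks=$1 result_dir=$2 nprocs=$3 binary=$4
    echo "mpirun -gtool \"amplxe-cl -c advanced_hotspots -r ${result_dir}:${ranks}\" -n ${nprocs} ${binary}"
}

gtool_cmd 1 my_result 24 ./bt-mz.B.24      # same line as in the article
gtool_cmd 0,12 my_result 24 ./bt-mz.B.24   # assumed syntax: ranks 0 and 12
```

Profiling one or two representative ranks instead of all 24 keeps the amount of collected data manageable.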
Let's look at the profile of the "2 MPI processes, 12 OpenMP threads" option. In one of the most expensive parallel loops, 0.23 seconds out of 1.5 is imbalance. Further along in the table you can see that the scheduling type is static, so no work is redistributed. In addition, the loop has only 41 iterations, and the neighboring loops have 10 to 20 iterations each. With 12 threads, each thread therefore gets only 3 or 4 iterations. Apparently that is not enough for effective load balancing.
With 2 to 4 threads, each thread gets more work, and the relative time of active waiting caused by the imbalance goes down. The "6x4" profile confirms this: the imbalance is much lower.
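The effect of the thread count on a short statically scheduled loop can be checked with simple arithmetic. The sketch below assumes the 41-iteration loop mentioned above, equal cost per iteration, and an even static split in which some threads get the rounded-up share and the rest the rounded-down share; the exact split is up to the OpenMP implementation. The helper name `static_split` is made up for this example.

```shell
# For an even static split of ITERS iterations over THREADS threads, print
# the largest and smallest per-thread share and the percentage of the loop
# time that a thread with the smallest share spends idle.
# Assumes every iteration costs the same.
static_split() {
    local iters=$1 threads=$2 min max
    min=$((iters / threads))                      # rounded-down share
    max=$(( (iters + threads - 1) / threads ))    # rounded-up share
    echo "threads=${threads} max_iters=${max} min_iters=${min} idle_pct=$(( (max - min) * 100 / max ))"
}

static_split 41 12   # threads=12 max_iters=4 min_iters=3 idle_pct=25
static_split 41 4    # threads=4 max_iters=11 min_iters=10 idle_pct=9
```

Under these assumptions a one-iteration difference costs a quarter of the loop time at 12 threads but under a tenth at 4 threads, which is in line with the lower imbalance seen at smaller thread counts.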
In addition, Intel VTune Amplifier 2016 shows MPI time: the "MPI Communication Spinning" column and the yellow marks on the timeline. You can collect VTune profiles for several processes on one node at once and observe MPI spinning alongside the OpenMP metrics in each of them.
Intel Advisor XE
Moving down the levels of parallelism, from cluster scale (MPI) to the threads of a single node (OpenMP), we arrive at data parallelism inside a single thread: vectorization based on SIMD instructions. There may be serious optimization potential here as well, but it is no accident that we get to it last. The problems at the MPI and OpenMP levels need to be solved first, because potentially more can be gained there. There have already been two articles about Advisor (the first and the second), so here I will limit myself to the launch line.
source /opt/intel/advisor_xe/advixe-vars.sh
mpirun -gtool "advixe-cl -collect survey --project-dir ./my_proj:1" -n 2 ./bt-mz.2
Next, we analyze the vectorization of the code as described earlier. Advisor is an important part of the ecosystem for analyzing cluster MPI programs. Besides a detailed study of code vectorization, Advisor helps prototype multithreaded execution and check memory access patterns.
Summary
Intel Parallel Studio provides four tools for analyzing the performance of hybrid HPC applications.
- MPI Performance Snapshot (cluster level): quick performance assessment, minimal overhead, profiling of up to 32,000 MPI processes, quick assessment of MPI and OpenMP imbalance, general performance metrics (GFLOPS, CPI).
- Intel Trace Analyzer and Collector (cluster level): detailed study of MPI, identification of communication patterns, localization of specific bottlenecks.
- Intel VTune Amplifier XE (single-node level): detailed profile with source code and stacks, OpenMP problems such as imbalance, analysis of cache and memory usage, and more.
- Intel® Advisor XE (single-node level): analysis of vector instruction usage and identification of the reasons for inefficiency, prototyping of multithreaded execution, analysis of memory access patterns.