Part 1: Introduction and the Lexer
Part 2: Implementing a Parser and AST
Part 3: LLVM IR Code Generation
Part 4: Adding JIT and Optimizer Support
Part 5: Extending the Language: Control Flow
Part 6: Extending the Language: User-defined Operators
Part 7: Extending the Language: Mutable Variables
Part 8: Compiling to Object Code
Part 9: Adding Debug Information
Part 10: Conclusion and Other Useful LLVM Tidbits

7.1. Introduction
Welcome to Chapter 7 of the guide "Creating a Programming Language with LLVM". In chapters 1 through 6 we built a simple but complete functional programming language. Along the way we learned some parsing techniques, how to build and represent an AST, how to build LLVM IR, how to optimize the resulting code, and how to JIT-compile it.
Kaleidoscope is interesting as a functional language, and the fact that it is functional makes LLVM IR generation simple. In particular, a functional language makes it very easy to build LLVM IR directly in SSA form. Since LLVM requires its input code to be in SSA form, this is a very nice property, but it is often unclear to newcomers how to generate code for an imperative language with mutable variables.
The short (and happy) summary of this chapter is that your front end does not need to build SSA form itself: LLVM provides highly tuned and well-tested support for this.
7.2. Why Is This a Hard Problem?
To understand why mutable variables complicate SSA construction, consider this very simple C example:

int G, H;
int test(_Bool Condition) {
  int X;
  if (Condition)
    X = G;
  else
    X = H;
  return X;
}

In this example the variable X has a value that depends on the path taken through the program. Because there are two possible values for X before the return instruction, a PHI node is inserted to merge them. The LLVM IR for this example looks like this:

@G = weak global i32 0   ; type of @G is i32*
@H = weak global i32 0   ; type of @H is i32*

define i32 @test(i1 %Condition) {
entry:
  br i1 %Condition, label %cond_true, label %cond_false

cond_true:
  %X.0 = load i32, i32* @G
  br label %cond_next

cond_false:
  %X.1 = load i32, i32* @H
  br label %cond_next

cond_next:
  %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
  ret i32 %X.2
}

In this example, the loads from the global variables G and H are explicit in the LLVM IR, and they live in the then/else branches of the if statement (cond_true/cond_false). To merge the incoming values, the X.2 phi node in the cond_next block selects the right value based on where control flow is coming from: if it comes from the cond_false block, X.2 gets the value of X.1.
If control flow comes from cond_true, it gets the value of X.0. This chapter does not explain the details of SSA form. For more information, see one of the many online references.
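The selection rule of the phi node can be mimicked in plain C++. The following is only an illustrative sketch, not LLVM code: the names `Pred`, `phiX2`, and the three-argument `test` are hypothetical, and the globals are passed in as parameters to keep the example self-contained.

```cpp
// Hypothetical labels for the two predecessors of the cond_next block.
enum class Pred { CondTrue, CondFalse };

// Models: %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
// The result depends only on the edge through which control arrived.
int phiX2(Pred cameFrom, int x0, int x1) {
  return cameFrom == Pred::CondTrue ? x0 : x1;
}

// Models the whole @test function while tracking the incoming edge explicitly.
int test(bool condition, int g, int h) {
  Pred cameFrom;
  int x0 = 0, x1 = 0;
  if (condition) {
    x0 = g;                       // cond_true:  %X.0 = load @G
    cameFrom = Pred::CondTrue;
  } else {
    x1 = h;                       // cond_false: %X.1 = load @H
    cameFrom = Pred::CondFalse;
  }
  return phiX2(cameFrom, x0, x1); // cond_next
}
```

Calling `test(true, 10, 20)` takes the cond_true edge, so the modeled phi yields the value that was loaded from G.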
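The question for this chapter is: who places the phi nodes when assignments to mutable variables are lowered? The issue is that LLVM requires its IR to be in SSA form; there is no "non-SSA" mode. However, SSA construction requires non-trivial algorithms and data structures, so it would be inconvenient and wasteful for every front end to reproduce this logic.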
7.3. Memory in LLVM
The trick is that while LLVM requires all register values to be in SSA form, it does not require (or permit) memory objects to be in SSA form. In the example above, note that the loads from G and H are direct accesses to G and H: they are not renamed or versioned. This differs from some other compiler systems, which try to version memory objects. In LLVM, instead of encoding dataflow analysis of memory into the IR, it is handled by analysis passes that are computed on demand.
The basic idea is that we can create a stack variable (which lives in memory, because it is on the stack) for every mutable object in a function. To see why this helps, we first need to talk about how LLVM represents stack variables.
In LLVM, all memory accesses are explicit load/store instructions, and the IR is carefully designed so that no "address-of" operator is needed. Note that the type of the global variables @G and @H is actually "i32*", even though the variables themselves are declared as "i32". This means that @G defines space for an i32 in the global data area, but its name actually refers to the address of that space. Stack variables work the same way, except that instead of being declared with global variable definitions, they are declared with the alloca instruction:

define i32 @example() {
entry:
  %X = alloca i32           ; type of %X is i32*.
  ...
  %tmp = load i32, i32* %X  ; load the stack value %X from the stack.
  %tmp2 = add i32 %tmp, 1   ; increment it
  store i32 %tmp2, i32* %X  ; store it back
  ...

This code shows how to declare and manipulate a stack variable in LLVM IR. Stack memory allocated with the alloca instruction is fully general: you can pass the address of the stack slot to functions, store it in other variables, and so on. Using the alloca technique, the example above can be rewritten to avoid the PHI node:

@G = weak global i32 0   ; type of @G is i32*
@H = weak global i32 0   ; type of @H is i32*

define i32 @test(i1 %Condition) {
entry:
  %X = alloca i32           ; type of %X is i32*.
  br i1 %Condition, label %cond_true, label %cond_false

cond_true:
  %X.0 = load i32, i32* @G
  store i32 %X.0, i32* %X   ; Update X
  br label %cond_next

cond_false:
  %X.1 = load i32, i32* @H
  store i32 %X.1, i32* %X   ; Update X
  br label %cond_next

cond_next:
  %X.2 = load i32, i32* %X  ; Read X
  ret i32 %X.2
}
With this, we have found a way to handle arbitrary mutable variables without creating PHI nodes at all:
- Each mutable variable becomes a stack allocation.
- Each read of the variable becomes a load from the stack.
- Each update of the variable becomes a store to the stack.
- Taking the address of a variable just uses the stack address directly.
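As a rough C++ analogy of these rules (a sketch only; `testWithSlot` and the initial values of `G` and `H` are made up for illustration), the variable X can be kept in an explicit memory slot that is written on each path and read once at the end:

```cpp
int G = 1, H = 2;  // stand-ins for the globals @G and @H (values assumed)

int testWithSlot(bool condition) {
  int slot;          // %X = alloca i32: the mutable variable is a stack slot
  int *X = &slot;    // the name %X denotes the address of that slot
  if (condition) {
    int x0 = G;      // %X.0 = load @G
    *X = x0;         // an update of the variable is a store to the slot
  } else {
    int x1 = H;      // %X.1 = load @H
    *X = x1;         // an update of the variable is a store to the slot
  }
  int x2 = *X;       // a read of the variable is a load from the slot
  return x2;
}
```

No merge operation is needed at the join point, because both paths wrote to the same memory location.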
This solves our immediate problem but introduces another one: we now generate a lot of stack traffic for very simple and common operations, which is a major performance problem. Fortunately, the LLVM optimizer has a highly tuned optimization pass named "mem2reg" that handles this case. It promotes allocas like this into SSA registers and inserts PHI nodes where needed. If you run the example above through this pass, you get:

$ llvm-as < example.ll | opt -mem2reg | llvm-dis
@G = weak global i32 0
@H = weak global i32 0

define i32 @test(i1 %Condition) {
entry:
  br i1 %Condition, label %cond_true, label %cond_false

cond_true:
  %X.0 = load i32, i32* @G
  br label %cond_next

cond_false:
  %X.1 = load i32, i32* @H
  br label %cond_next

cond_next:
  %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
  ret i32 %X.01
}

The mem2reg pass implements the standard "iterated dominance frontier" algorithm for constructing SSA form and has a number of optimizations that speed up (very common) degenerate cases. The mem2reg optimization pass is the answer to the question of how to deal with mutable variables, and we strongly recommend that you rely on it. Note that mem2reg only works on variables under certain conditions:
- mem2reg is alloca-driven: it looks for alloca instructions (stack allocations) and promotes them if it can handle them. It does not apply to global variables or heap allocations.
- mem2reg only looks for alloca instructions in the entry block of the function. Being in the entry block guarantees that the alloca is executed only once, which makes analysis simpler.
- mem2reg only promotes allocas whose uses are direct loads and stores. If the address of the stack variable is passed to a function, or if any unusual pointer arithmetic is involved, the alloca is not promoted.
- mem2reg only works on allocas of first-class values (such as pointers, scalars, and vectors), and only if the array size of the allocation is 1 (or omitted in the .ll file). mem2reg cannot promote structs or arrays to registers. Note that the "sroa" pass is more powerful and can promote structs, "unions", and arrays in many cases.
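The conditions above can be summarized as a single predicate. The sketch below works on a hypothetical, heavily simplified description of an allocation (`AllocaModel`, `UseKind`, `ValueKind`, and `looksPromotable` are invented for this illustration and are not part of the LLVM API); the check that LLVM actually performs is more detailed.

```cpp
#include <vector>

// Hypothetical, simplified description of how an allocation is used.
enum class UseKind { Load, Store, PassedToCall, PointerArithmetic };
enum class ValueKind { Scalar, Pointer, Vector, Struct, Array };

struct AllocaModel {
  bool inEntryBlock;          // is the alloca in the function's entry block?
  ValueKind kind;             // kind of the allocated value
  unsigned arraySize;         // number of elements allocated
  std::vector<UseKind> uses;  // every use of the resulting address
};

// Returns true if the modeled allocation satisfies all listed conditions.
bool looksPromotable(const AllocaModel &a) {
  if (!a.inEntryBlock)
    return false;  // only entry-block allocas are considered
  if (a.kind == ValueKind::Struct || a.kind == ValueKind::Array)
    return false;  // only first-class values
  if (a.arraySize != 1)
    return false;  // only single-element allocations
  for (UseKind u : a.uses)
    if (u != UseKind::Load && u != UseKind::Store)
      return false;  // the address escapes or is computed with
  return true;
}
```

A variable that is only ever loaded and stored passes the check, while one whose address is handed to a call does not.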
All of these properties are easy to satisfy for most imperative languages, and we illustrate this below with Kaleidoscope. The final question you may be asking is: should I bother with all of this in my front end? Wouldn't it be better to build SSA directly and avoid the mem2reg optimization pass? In short, we strongly recommend that you use this technique for building SSA form unless there is an extremely good reason not to.
This technique is:
- Proven and well tested: clang uses this technique for local mutable variables. As a result, the most common LLVM clients use it to handle the bulk of their variables. You can be sure that bugs are found fast and fixed early.
- Extremely fast: mem2reg has a number of special cases that make it fast in common cases as well as fully general. For example, it has fast paths for variables that are only used in a single block, variables that have only one assignment point, good heuristics to avoid inserting unneeded phi nodes, and so on.
- Needed for debug info generation: debug information in LLVM relies on having the address of the variable exposed so that debug info can be attached to it. This technique fits that style of debug info very naturally.
Finally, it makes it much easier to get a front end up and running, and it is simple to implement. Let's extend Kaleidoscope with mutable variables now!
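7.4. Mutable Variables in Kaleidoscope
Now that we understand the problem we want to solve, let's see what it looks like in the context of our little Kaleidoscope language. We are going to add two features:
- The ability to mutate variables with the '=' operator.
- The ability to define new variables.
The first item is what this chapter is really about, but so far we only have variables for incoming function arguments and for induction variables, and redefining those only goes so far. In addition, the ability to define new variables is useful regardless of whether they can be mutated. Here is a motivating example that shows how these features could be used:

# Define ':' for sequencing: as a low-precedence operator that ignores operands
# and just returns the RHS.
def binary : 1 (x y) y;

# Recursive fib, we could do this before.
def fib(x)
  if (x < 3) then
    1
  else
    fib(x-1)+fib(x-2);

# Iterative fib.
def fibi(x)
  var a = 1, b = 1, c in
  (for i = 3, i < x in
     c = a + b :
     a = b :
     b = c) :
  b;

# Call it.
fibi(10);

To mutate variables, we have to change our existing variables to use the "alloca trick". Once that is done, we will add the new operator and then extend Kaleidoscope to support new variable definitions.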
7.5. Adjusting Existing Variables for Mutation
The symbol table in Kaleidoscope is managed at code generation time by the "NamedValues" map. This map currently holds a pointer to the LLVM "Value*" that contains the double value of the named variable. To support mutation, we need to change this slightly so that NamedValues holds the memory location of the variable. Note that this change is a refactoring: it changes the structure of the code, but does not (by itself) change the behavior of the compiler. All of these changes are isolated in the Kaleidoscope code generator.
At this point in Kaleidoscope's development, it supports variables in only two cases: incoming function arguments and the induction variable of "for" loops. For consistency, we will allow mutation of these variables in the same way as user-defined variables. This means that they need memory locations as well.
To start the transformation of Kaleidoscope, we change the NamedValues map so that it maps to AllocaInst* instead of Value*. Once we do this, the C++ compiler will tell us which parts of the code need to be updated:

static std::map<std::string, AllocaInst*> NamedValues;

Also, since we will need to create these allocas, we use a helper function that ensures that they are created in the entry block of the function:

/// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
/// the function. This is used for mutable variables etc.
static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
                                          const std::string &VarName) {
  IRBuilder<> TmpB(&TheFunction->getEntryBlock(),
                   TheFunction->getEntryBlock().begin());
  return TmpB.CreateAlloca(Type::getDoubleTy(TheContext), 0,
                           VarName.c_str());
}

This slightly odd-looking code creates an IRBuilder object that points at the first instruction (.begin()) of the entry block. It then creates an alloca with the expected name and returns it. Because all values in Kaleidoscope are doubles, there is no need to pass in a type.
With this in place, the first functional change we make concerns variable references. In the new scheme, variables live on the stack, so code that generates a reference to a variable actually needs to produce a load from its stack slot:

Value *VariableExprAST::codegen() {
  // Look this variable up in the function.
  Value *V = NamedValues[Name];
  if (!V)
    return LogErrorV("Unknown variable name");

  // Load the value.
  return Builder.CreateLoad(V, Name.c_str());
}

As you can see, this is quite straightforward. Now we need to update the places in the code that define variables so that they set up the alloca. We start with ForExprAST::codegen() (see the full code listing for the unabridged code):

Function *TheFunction = Builder.GetInsertBlock()->getParent();

// Create an alloca for the variable in the entry block.
AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);

// Emit the start code first, without 'variable' in scope.
Value *StartVal = Start->codegen();
if (!StartVal)
  return nullptr;

// Store the value into the alloca.
Builder.CreateStore(StartVal, Alloca);
...

// Compute the end condition.
Value *EndCond = End->codegen();
if (!EndCond)
  return nullptr;

// Reload, increment, and restore the alloca. This handles the case where
// the body of the loop mutates the variable.
Value *CurVar = Builder.CreateLoad(Alloca);
Value *NextVar = Builder.CreateFAdd(CurVar, StepVal, "nextvar");
Builder.CreateStore(NextVar, Alloca);
...

This code is virtually identical to the code we had before we allowed mutable variables. The big difference is that we no longer have to construct a PHI node, and we use load/store to access the variable as needed.
To support mutable function arguments, we need to create allocas for them as well. The code for this is also quite simple:

Function *FunctionAST::codegen() {
  ...
  Builder.SetInsertPoint(BB);

  // Record the function arguments in the NamedValues map.
  NamedValues.clear();
  for (auto &Arg : TheFunction->args()) {
    // Create an alloca for this variable.
    AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, Arg.getName());

    // Store the initial value into the alloca.
    Builder.CreateStore(&Arg, Alloca);

    // Add arguments to variable symbol table.
    NamedValues[Arg.getName()] = Alloca;
  }

  if (Value *RetVal = Body->codegen()) {
    ...

For each argument, we create an alloca, store the incoming value into the alloca, and register the alloca as the memory location of the argument. This code runs in FunctionAST::codegen() right after the entry block of the function has been set up.
The final missing piece is adding the mem2reg pass, which lets us get good code generation once again:

// Promote allocas to registers.
TheFPM->add(createPromoteMemoryToRegisterPass());
// Do simple "peephole" optimizations and bit-twiddling optimizations.
TheFPM->add(createInstructionCombiningPass());
// Reassociate expressions.
TheFPM->add(createReassociatePass());
...
It is interesting to see what the code looks like before and after the mem2reg optimization runs. For example, here is the code for our recursive fib function before and after. Before the optimization:

define double @fib(double %x) {
entry:
  %x1 = alloca double
  store double %x, double* %x1
  %x2 = load double, double* %x1
  %cmptmp = fcmp ult double %x2, 3.000000e+00
  %booltmp = uitofp i1 %cmptmp to double
  %ifcond = fcmp one double %booltmp, 0.000000e+00
  br i1 %ifcond, label %then, label %else

then:       ; preds = %entry
  br label %ifcont

else:       ; preds = %entry
  %x3 = load double, double* %x1
  %subtmp = fsub double %x3, 1.000000e+00
  %calltmp = call double @fib(double %subtmp)
  %x4 = load double, double* %x1
  %subtmp5 = fsub double %x4, 2.000000e+00
  %calltmp6 = call double @fib(double %subtmp5)
  %addtmp = fadd double %calltmp, %calltmp6
  br label %ifcont

ifcont:     ; preds = %else, %then
  %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
  ret double %iftmp
}

There is only one variable here (x, the input argument), but you can still see the extremely simple-minded code generation strategy we are using. In the entry block, an alloca is created and the initial input value is stored into it. Each reference to the variable reloads it from the stack. Also note that we did not modify the if/then/else expression, so it still inserts a PHI node. We could create an alloca for it, but it is actually easier to create a PHI node, so we keep doing that.
Here is the code after the mem2reg pass runs:

define double @fib(double %x) {
entry:
  %cmptmp = fcmp ult double %x, 3.000000e+00
  %booltmp = uitofp i1 %cmptmp to double
  %ifcond = fcmp one double %booltmp, 0.000000e+00
  br i1 %ifcond, label %then, label %else

then:
  br label %ifcont

else:
  %subtmp = fsub double %x, 1.000000e+00
  %calltmp = call double @fib(double %subtmp)
  %subtmp5 = fsub double %x, 2.000000e+00
  %calltmp6 = call double @fib(double %subtmp5)
  %addtmp = fadd double %calltmp, %calltmp6
  br label %ifcont

ifcont:     ; preds = %else, %then
  %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
  ret double %iftmp
}

This is a trivial case for mem2reg, since the variable is never redefined. The point of showing it is to ease any worry about emitting such obviously inefficient code. After the rest of the optimizers run, we get:

define double @fib(double %x) {
entry:
  %cmptmp = fcmp ult double %x, 3.000000e+00
  %booltmp = uitofp i1 %cmptmp to double
  %ifcond = fcmp ueq double %booltmp, 0.000000e+00
  br i1 %ifcond, label %else, label %ifcont

else:
  %subtmp = fsub double %x, 1.000000e+00
  %calltmp = call double @fib(double %subtmp)
  %subtmp5 = fsub double %x, 2.000000e+00
  %calltmp6 = call double @fib(double %subtmp5)
  %addtmp = fadd double %calltmp, %calltmp6
  ret double %addtmp

ifcont:
  ret double 1.000000e+00
}

Here we can see that the simplifycfg pass decided to clone the return instruction into the end of the "else" block. This allowed it to eliminate some branches and the PHI node.
Now that all symbol table references use stack variables, we can add the assignment operator.
7.6. New Assignment Operator
With our current framework, adding a new assignment operator is really simple. We parse it just like any other binary operator, but handle it internally (instead of letting the user define it). The first step is to give it a precedence:

int main() {
  // Install standard binary operators.
  // 1 is lowest precedence.
  BinopPrecedence['='] = 2;
  BinopPrecedence['<'] = 10;
  BinopPrecedence['+'] = 20;
  BinopPrecedence['-'] = 20;
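To see the effect of giving '=' such a low precedence, the following stand-alone sketch applies the same precedence-climbing idea to a string of single-character operands and operators and returns the grouping as a parenthesized string. It is not the tutorial's parser: `precedenceOf`, `parseRHS`, and `group` are hypothetical helpers, and the input is assumed to be well formed and free of spaces.

```cpp
#include <cstddef>
#include <map>
#include <string>

// The precedence values from the table above; 1 is the lowest precedence.
static const std::map<char, int> Precedence = {
    {'=', 2}, {'<', 10}, {'+', 20}, {'-', 20}};

// Returns the precedence of c, or -1 if c is not a known binary operator.
static int precedenceOf(char c) {
  auto it = Precedence.find(c);
  return it == Precedence.end() ? -1 : it->second;
}

// Precedence climbing over single-character operands (assumes valid input).
static std::string parseRHS(const std::string &src, std::size_t &pos,
                            int minPrec, std::string lhs) {
  while (true) {
    int prec = pos < src.size() ? precedenceOf(src[pos]) : -1;
    if (prec < minPrec)
      return lhs;                    // operator binds too loosely: stop here
    char op = src[pos++];            // consume the operator
    std::string rhs(1, src[pos++]);  // consume a one-character operand
    int next = pos < src.size() ? precedenceOf(src[pos]) : -1;
    if (prec < next)                 // the next operator binds tighter
      rhs = parseRHS(src, pos, prec + 1, rhs);
    lhs = "(" + lhs + op + rhs + ")";
  }
}

// Returns src with explicit parentheses showing how it is grouped.
std::string group(const std::string &src) {
  std::size_t pos = 0;
  std::string lhs(1, src[pos++]);
  return parseRHS(src, pos, 0, lhs);
}
```

With this table, `group("x=a<b+c")` returns `(x=(a<(b+c)))`: the whole right-hand side is grouped before the assignment is applied.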
Now that the parser knows the precedence of the binary operator, it takes care of all the parsing and AST generation. We only need to implement code generation for the assignment operator. It looks like this:

Value *BinaryExprAST::codegen() {
  // Special case '=' because we don't want to emit the LHS as an expression.
  if (Op == '=') {
    // Assignment requires the LHS to be an identifier.
    VariableExprAST *LHSE = dynamic_cast<VariableExprAST*>(LHS.get());
    if (!LHSE)
      return LogErrorV("destination of '=' must be a variable");

Unlike the other binary operators, the assignment operator does not follow the "emit LHS, emit RHS, do the computation" model. It is therefore handled as a special case before the other binary operators. The other unusual thing is that the LHS must be a variable. Writing "(x+1) = expr" is invalid; only expressions such as "x = expr" are allowed.

    // Codegen the RHS.
    Value *Val = RHS->codegen();
    if (!Val)
      return nullptr;

    // Look up the name.
    Value *Variable = NamedValues[LHSE->getName()];
    if (!Variable)
      return LogErrorV("Unknown variable name");

    Builder.CreateStore(Val, Variable);
    return Val;
  }
  ...

Once we have the variable, generating code for the assignment is straightforward: we emit the RHS of the assignment, create a store, and return the computed value. Returning a value allows chained assignments such as "X = (Y = Z)".
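The reason chaining works can be shown with a tiny interpreter-style model. This is a sketch under simplifying assumptions, not the code generator: `Slots` and `assign` are hypothetical, values are computed immediately instead of emitting IR, and unlike the real code an unknown name is simply created rather than reported as an error.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the stack slots: variable name -> current value.
static std::map<std::string, double> Slots;

// Mirrors the shape of the '=' case: store the already evaluated right-hand
// side into the variable's slot, then return that same value.
double assign(const std::string &name, double rhs) {
  Slots[name] = rhs;  // plays the role of Builder.CreateStore(Val, Variable)
  return rhs;         // plays the role of "return Val;"
}
```

Because `assign` returns the stored value, `assign("X", assign("Y", 4.0))` sets both slots to 4.0, which is the behavior of "X = (Y = Z)".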
Now that we have an assignment operator, we can mutate loop variables and arguments. For example, we can now write code like this:

# Function to print a double.
extern printd(x);

# Define ':' for sequencing: as a low-precedence operator that ignores operands
# and just returns the RHS.
def binary : 1 (x y) y;

def test(x)
  printd(x) :
  x = 4 :
  printd(x);

test(123);

When run, this example prints "123" and then "4", showing that we really did change the value of the variable! We have now reached our stated goal: making this work requires SSA construction in the general case. However, to be really useful, we also want the ability to define our own local variables.
7.7. User-defined Local Variables
Adding "var" and "in" is just like any other extension we have made to Kaleidoscope: we extend the lexer, the parser, the AST, and the code generator. The first step in adding the var/in construct is to extend the lexer. As before, this is simple, and the code looks like this:

enum Token {
  ...
  // var definition
  tok_var = -13
  ...
}
...
static int gettok() {
...
    if (IdentifierStr == "in")
      return tok_in;
    if (IdentifierStr == "binary")
      return tok_binary;
    if (IdentifierStr == "unary")
      return tok_unary;
    if (IdentifierStr == "var")
      return tok_var;
    return tok_identifier;
...

The next step is to define the AST node that we will construct. For var/in, it looks like this:

/// VarExprAST - Expression class for var/in
class VarExprAST : public ExprAST {
  std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;
  std::unique_ptr<ExprAST> Body;

public:
  VarExprAST(std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames,
             std::unique_ptr<ExprAST> Body)
    : VarNames(std::move(VarNames)), Body(std::move(Body)) {}

  Value *codegen() override;
};

var/in allows a list of names to be defined all at once, and each name can optionally have an initializer value. We store this information in the VarNames vector. In addition, var/in has a body, and the body can access the variables defined by the var/in.
With this in place, we can define the parser pieces. The first thing we do is add it as a primary expression:

/// primary
///   ::= identifierexpr
///   ::= numberexpr
///   ::= parenexpr
///   ::= ifexpr
///   ::= forexpr
///   ::= varexpr
static std::unique_ptr<ExprAST> ParsePrimary() {
  switch (CurTok) {
  default:
    return LogError("unknown token when expecting an expression");
  case tok_identifier:
    return ParseIdentifierExpr();
  case tok_number:
    return ParseNumberExpr();
  case '(':
    return ParseParenExpr();
  case tok_if:
    return ParseIfExpr();
  case tok_for:
    return ParseForExpr();
  case tok_var:
    return ParseVarExpr();
  }
}

Next we define ParseVarExpr:

/// varexpr ::= 'var' identifier ('=' expression)?
//                    (',' identifier ('=' expression)?)* 'in' expression
static std::unique_ptr<ExprAST> ParseVarExpr() {
  getNextToken(); // eat the var.

  std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;

  // At least one variable name is required.
  if (CurTok != tok_identifier)
    return LogError("expected identifier after var");

The first part of this code parses the list of identifier/expression pairs into the local VarNames vector.

  while (1) {
    std::string Name = IdentifierStr;
    getNextToken(); // eat identifier.

    // Read the optional initializer.
    std::unique_ptr<ExprAST> Init;
    if (CurTok == '=') {
      getNextToken(); // eat the '='.

      Init = ParseExpression();
      if (!Init)
        return nullptr;
    }

    VarNames.push_back(std::make_pair(Name, std::move(Init)));

    // End of var list, exit loop.
    if (CurTok != ',')
      break;
    getNextToken(); // eat the ','.

    if (CurTok != tok_identifier)
      return LogError("expected identifier list after var");
  }

Once all the variables have been parsed, we parse the body and create the AST node:

  // At this point, we have to have 'in'.
  if (CurTok != tok_in)
    return LogError("expected 'in' keyword after 'var'");
  getNextToken(); // eat 'in'.

  auto Body = ParseExpression();
  if (!Body)
    return nullptr;

  return llvm::make_unique<VarExprAST>(std::move(VarNames), std::move(Body));
}
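Now that we can parse and represent the code, we need to support emitting LLVM IR for it. The code starts like this:

Value *VarExprAST::codegen() {
  std::vector<AllocaInst *> OldBindings;

  Function *TheFunction = Builder.GetInsertBlock()->getParent();

  // Register all variables and emit their initializer.
  for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
    const std::string &VarName = VarNames[i].first;
    ExprAST *Init = VarNames[i].second.get();

Basically, the code loops over all the variables and installs them one at a time. For each variable we put into the symbol table, we remember the previous value that we replace in OldBindings.

    // Emit the initializer before adding the variable to scope, this prevents
    // the initializer from referencing the variable itself, and permits stuff
    // like this:
    //  var a = 1 in
    //    var a = a in ...   # refers to outer 'a'.
    Value *InitVal;
    if (Init) {
      InitVal = Init->codegen();
      if (!InitVal)
        return nullptr;
    } else { // If not specified, use 0.0.
      InitVal = ConstantFP::get(TheContext, APFloat(0.0));
    }

    AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
    Builder.CreateStore(InitVal, Alloca);

    // Remember the old variable binding so that we can restore the binding
    // when we unrecurse.
    OldBindings.push_back(NamedValues[VarName]);

    // Remember this binding.
    NamedValues[VarName] = Alloca;
  }

There are more comments here than code. The basic idea is that we emit the initializer, create the alloca, and then update the symbol table to point to it. Once all the variables are installed in the symbol table, we evaluate the body of the var/in expression:

  // Codegen the body, now that all vars are in scope.
  Value *BodyVal = Body->codegen();
  if (!BodyVal)
    return nullptr;

Finally, before returning, we restore the previous variable bindings:

  // Pop all our variables from scope.
  for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
    NamedValues[VarNames[i].first] = OldBindings[i];

  // Return the body computation.
  return BodyVal;
}

The end result of all of this is that we get properly scoped variable definitions, and we even (trivially) allow them to be mutated.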
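The save/install/restore pattern behind this scoping can be tried out in isolation. The sketch below is a simplified model, not the code generator: `Bindings` and `withVars` are hypothetical, the map holds plain doubles instead of allocas, previously unbound names are erased instead of being reset to null, and bindings are restored in reverse order so that a name listed twice is also handled.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

static std::map<std::string, double> Bindings;  // stand-in for NamedValues

// Installs the given variables, evaluates body, then restores old bindings.
double withVars(const std::vector<std::pair<std::string, double>> &vars,
                const std::function<double()> &body) {
  std::vector<std::pair<bool, double>> old;  // (was bound?, previous value)
  for (const auto &v : vars) {
    auto it = Bindings.find(v.first);
    old.push_back(it == Bindings.end() ? std::make_pair(false, 0.0)
                                       : std::make_pair(true, it->second));
    Bindings[v.first] = v.second;  // the new binding shadows the old one
  }

  double result = body();  // the body sees the new bindings

  // Undo the bindings in reverse order of installation.
  for (std::size_t i = vars.size(); i-- > 0;) {
    if (old[i].first)
      Bindings[vars[i].first] = old[i].second;
    else
      Bindings.erase(vars[i].first);
  }
  return result;
}
```

If `a` is bound to 1.0 and the body runs inside `withVars({{"a", 2.0}}, ...)`, the body observes 2.0, and afterwards `a` is 1.0 again.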
With this, we have completed what we set out to do. The iterative fib example from the introduction compiles and runs just fine. The mem2reg pass optimizes all of our stack variables into SSA registers, inserting PHI nodes where needed, and our front end remains simple: no complex algorithms or computations are required.
7.8. Full Code Listing
Here is the complete source listing for our running example, extended with mutable variables and var/in support. To build this example, use:

# Compile
clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy
# Run
./toy

Source code:
#include "llvm/ADT/APFloat.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/LegacyPassManager.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/Type.h"
#include "llvm/IR/Verifier.h"
#include "llvm/Support/TargetSelect.h"
#include "llvm/Target/TargetMachine.h"
#include "llvm/Transforms/Scalar.h"
#include "llvm/Transforms/Scalar/GVN.h"
#include "../include/KaleidoscopeJIT.h"
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using namespace llvm;
using namespace llvm::orc;

//===----------------------------------------------------------------------===//
// Lexer
//===----------------------------------------------------------------------===//

// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
// of these for known things.
enum Token {
  tok_eof = -1,

  // commands
  tok_def = -2,
  tok_extern = -3,

  // primary
  tok_identifier = -4,
  tok_number = -5,

  // control
  tok_if = -6,
  tok_then = -7,
  tok_else = -8,
  tok_for = -9,
  tok_in = -10,

  // operators
  tok_binary = -11,
  tok_unary = -12,

  // var definition
  tok_var = -13
};

static std::string IdentifierStr; // Filled in if tok_identifier
static double NumVal;             // Filled in if tok_number

/// gettok - Return the next token from standard input.
static int gettok() {
  static int LastChar = ' ';

  // Skip any whitespace.
  while (isspace(LastChar))
    LastChar = getchar();

  if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
    IdentifierStr = LastChar;
    while (isalnum((LastChar = getchar())))
      IdentifierStr += LastChar;

    if (IdentifierStr == "def")
      return tok_def;
    if (IdentifierStr == "extern")
      return tok_extern;
    if (IdentifierStr == "if")
      return tok_if;
    if (IdentifierStr == "then")
      return tok_then;
    if (IdentifierStr == "else")
      return tok_else;
    if (IdentifierStr == "for")
      return tok_for;
    if (IdentifierStr == "in")
      return tok_in;
    if (IdentifierStr == "binary")
      return tok_binary;
    if (IdentifierStr == "unary")
      return tok_unary;
    if (IdentifierStr == "var")
      return tok_var;
    return tok_identifier;
  }

  if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+
    std::string NumStr;
    do {
      NumStr += LastChar;
      LastChar = getchar();
    } while (isdigit(LastChar) || LastChar == '.');

    NumVal = strtod(NumStr.c_str(), nullptr);
    return tok_number;
  }

  if (LastChar == '#') {
    // Comment until end of line.
    do
      LastChar = getchar();
    while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');

    if (LastChar != EOF)
      return gettok();
  }

  // Check for end of file. Don't eat the EOF.
  if (LastChar == EOF)
    return tok_eof;

  // Otherwise, just return the character as its ascii value.
  int ThisChar = LastChar;
  LastChar = getchar();
  return ThisChar;
}

//===----------------------------------------------------------------------===//
// Abstract Syntax Tree (aka Parse Tree)
//===----------------------------------------------------------------------===//

namespace {

/// ExprAST - Base class for all expression nodes.
class ExprAST {
public:
  virtual ~ExprAST() = default;

  virtual Value *codegen() = 0;
};

/// NumberExprAST - Expression class for numeric literals like "1.0".
class NumberExprAST : public ExprAST {
  double Val;

public:
  NumberExprAST(double Val) : Val(Val) {}

  Value *codegen() override;
};

/// VariableExprAST - Expression class for referencing a variable, like "a".
class VariableExprAST : public ExprAST {
  std::string Name;

public:
  VariableExprAST(const std::string &Name) : Name(Name) {}

  Value *codegen() override;
  const std::string &getName() const { return Name; }
};

/// UnaryExprAST - Expression class for a unary operator.
class UnaryExprAST : public ExprAST {
  char Opcode;
  std::unique_ptr<ExprAST> Operand;

public:
  UnaryExprAST(char Opcode, std::unique_ptr<ExprAST> Operand)
      : Opcode(Opcode), Operand(std::move(Operand)) {}

  Value *codegen() override;
};

/// BinaryExprAST - Expression class for a binary operator.
class BinaryExprAST : public ExprAST {
  char Op;
  std::unique_ptr<ExprAST> LHS, RHS;

public:
  BinaryExprAST(char Op, std::unique_ptr<ExprAST> LHS,
                std::unique_ptr<ExprAST> RHS)
      : Op(Op), LHS(std::move(LHS)), RHS(std::move(RHS)) {}

  Value *codegen() override;
};

/// CallExprAST - Expression class for function calls.
class CallExprAST : public ExprAST {
  std::string Callee;
  std::vector<std::unique_ptr<ExprAST>> Args;

public:
  CallExprAST(const std::string &Callee,
              std::vector<std::unique_ptr<ExprAST>> Args)
      : Callee(Callee), Args(std::move(Args)) {}

  Value *codegen() override;
};

/// IfExprAST - Expression class for if/then/else.
class IfExprAST : public ExprAST {
  std::unique_ptr<ExprAST> Cond, Then, Else;

public:
  IfExprAST(std::unique_ptr<ExprAST> Cond, std::unique_ptr<ExprAST> Then,
            std::unique_ptr<ExprAST> Else)
      : Cond(std::move(Cond)), Then(std::move(Then)), Else(std::move(Else)) {}

  Value *codegen() override;
};

/// ForExprAST - Expression class for for/in.
class ForExprAST : public ExprAST {
  std::string VarName;
  std::unique_ptr<ExprAST> Start, End, Step, Body;

public:
  ForExprAST(const std::string &VarName, std::unique_ptr<ExprAST> Start,
             std::unique_ptr<ExprAST> End, std::unique_ptr<ExprAST> Step,
             std::unique_ptr<ExprAST> Body)
      : VarName(VarName), Start(std::move(Start)), End(std::move(End)),
        Step(std::move(Step)), Body(std::move(Body)) {}

  Value *codegen() override;
};

/// VarExprAST - Expression class for var/in
class VarExprAST : public ExprAST {
  std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;
  std::unique_ptr<ExprAST> Body;

public:
  VarExprAST(
      std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames,
      std::unique_ptr<ExprAST> Body)
      : VarNames(std::move(VarNames)), Body(std::move(Body)) {}

  Value *codegen() override;
};

/// PrototypeAST - This class represents the "prototype" for a function,
/// which captures its name, and its argument names (thus implicitly the number
/// of arguments the function takes), as well as if it is an operator.
class PrototypeAST {
  std::string Name;
  std::vector<std::string> Args;
  bool IsOperator;
  unsigned Precedence; // Precedence if a binary op.

public:
  PrototypeAST(const std::string &Name, std::vector<std::string> Args,
               bool IsOperator = false, unsigned Prec = 0)
      : Name(Name), Args(std::move(Args)), IsOperator(IsOperator),
        Precedence(Prec) {}

  Function *codegen();
  const std::string &getName() const { return Name; }

  bool isUnaryOp() const { return IsOperator && Args.size() == 1; }
  bool isBinaryOp() const { return IsOperator && Args.size() == 2; }

  char getOperatorName() const {
    assert(isUnaryOp() || isBinaryOp());
    return Name[Name.size() - 1];
  }

  unsigned getBinaryPrecedence() const { return Precedence; }
};

/// FunctionAST - This class represents a function definition itself.
class FunctionAST {
  std::unique_ptr<PrototypeAST> Proto;
  std::unique_ptr<ExprAST> Body;

public:
  FunctionAST(std::unique_ptr<PrototypeAST> Proto,
              std::unique_ptr<ExprAST> Body)
      : Proto(std::move(Proto)), Body(std::move(Body)) {}

  Function *codegen();
};

} // end anonymous namespace

//===----------------------------------------------------------------------===//
// Parser
//===----------------------------------------------------------------------===//

/// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
/// token the parser is looking at. getNextToken reads another token from the
/// lexer and updates CurTok with its results.
static int CurTok;
static int getNextToken() { return CurTok = gettok(); }

/// BinopPrecedence - This holds the precedence for each binary operator that is
/// defined.
static std::map<char, int> BinopPrecedence;

/// GetTokPrecedence - Get the precedence of the pending binary operator token.
static int GetTokPrecedence() {
  if (!isascii(CurTok))
    return -1;

  // Make sure it's a declared binop.
  int TokPrec = BinopPrecedence[CurTok];
  if (TokPrec <= 0)
    return -1;
  return TokPrec;
}

/// LogError* - These are little helper functions for error handling.
std::unique_ptr<ExprAST> LogError(const char *Str) {
  fprintf(stderr, "Error: %s\n", Str);
  return nullptr;
}

std::unique_ptr<PrototypeAST> LogErrorP(const char *Str) {
  LogError(Str);
  return nullptr;
}

static std::unique_ptr<ExprAST> ParseExpression();

/// numberexpr ::= number
static std::unique_ptr<ExprAST> ParseNumberExpr() {
  auto Result = llvm::make_unique<NumberExprAST>(NumVal);
  getNextToken(); // consume the number
  return std::move(Result);
}

/// parenexpr ::= '(' expression ')'
static std::unique_ptr<ExprAST> ParseParenExpr() {
  getNextToken(); // eat (.
  auto V = ParseExpression();
  if (!V)
    return nullptr;

  if (CurTok != ')')
    return LogError("expected ')'");
  getNextToken(); // eat ).
  return V;
}

/// identifierexpr
///   ::= identifier
///   ::= identifier '(' expression* ')'
static std::unique_ptr<ExprAST> ParseIdentifierExpr() {
  std::string IdName = IdentifierStr;

  getNextToken(); // eat identifier.

  if (CurTok != '(') // Simple variable ref.
    return llvm::make_unique<VariableExprAST>(IdName);

  // Call.
  getNextToken(); // eat (
  std::vector<std::unique_ptr<ExprAST>> Args;
  if (CurTok != ')') {
    while (true) {
      if (auto Arg = ParseExpression())
        Args.push_back(std::move(Arg));
      else
        return nullptr;

      if (CurTok == ')')
        break;

      if (CurTok != ',')
        return LogError("Expected ')' or ',' in argument list");
      getNextToken();
    }
  }

  // Eat the ')'.
  getNextToken();

  return llvm::make_unique<CallExprAST>(IdName, std::move(Args));
}

/// ifexpr ::= 'if' expression 'then' expression 'else' expression
static std::unique_ptr<ExprAST> ParseIfExpr() {
  getNextToken(); // eat the if.
  // condition.
  auto Cond = ParseExpression();
  if (!Cond)
    return nullptr;

  if (CurTok != tok_then)
    return LogError("expected then");
  getNextToken(); // eat the then

  auto Then = ParseExpression();
  if (!Then)
    return nullptr;

  if (CurTok != tok_else)
    return LogError("expected else");

  getNextToken();

  auto Else = ParseExpression();
  if (!Else)
    return nullptr;

  return llvm::make_unique<IfExprAST>(std::move(Cond), std::move(Then),
                                      std::move(Else));
}

/// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
static std::unique_ptr<ExprAST> ParseForExpr() {
  getNextToken(); // eat the for.

  if (CurTok != tok_identifier)
    return LogError("expected identifier after for");

  std::string IdName = IdentifierStr;
  getNextToken(); // eat identifier.

  if (CurTok != '=')
    return LogError("expected '=' after for");
  getNextToken(); // eat '='.

  auto Start = ParseExpression();
  if (!Start)
    return nullptr;
  if (CurTok != ',')
    return LogError("expected ',' after for start value");
  getNextToken();

  auto End = ParseExpression();
  if (!End)
    return nullptr;

  // The step value is optional.
  std::unique_ptr<ExprAST> Step;
  if (CurTok == ',') {
    getNextToken();
    Step = ParseExpression();
    if (!Step)
      return nullptr;
  }

  if (CurTok != tok_in)
    return LogError("expected 'in' after for");
  getNextToken(); // eat 'in'.

  auto Body = ParseExpression();
  if (!Body)
    return nullptr;

  return llvm::make_unique<ForExprAST>(IdName, std::move(Start), std::move(End),
                                       std::move(Step), std::move(Body));
}

/// varexpr ::= 'var' identifier ('=' expression)?
//                    (',' identifier ('=' expression)?)* 'in' expression
static std::unique_ptr<ExprAST> ParseVarExpr() {
  getNextToken(); // eat the var.

  std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;

  // At least one variable name is required.
  if (CurTok != tok_identifier)
    return LogError("expected identifier after var");

  while (true) {
    std::string Name = IdentifierStr;
    getNextToken(); // eat identifier.

    // Read the optional initializer.
    std::unique_ptr<ExprAST> Init = nullptr;
    if (CurTok == '=') {
      getNextToken(); // eat the '='.
      Init = ParseExpression();
      if (!Init)
        return nullptr;
    }

    VarNames.push_back(std::make_pair(Name, std::move(Init)));

    // End of var list, exit loop.
    if (CurTok != ',')
      break;
    getNextToken(); // eat the ','.

    if (CurTok != tok_identifier)
      return LogError("expected identifier list after var");
  }

  // At this point, we have to have 'in'.
  if (CurTok != tok_in)
    return LogError("expected 'in' keyword after 'var'");
  getNextToken(); // eat 'in'.

  auto Body = ParseExpression();
  if (!Body)
    return nullptr;

  return llvm::make_unique<VarExprAST>(std::move(VarNames), std::move(Body));
}

/// primary
///   ::= identifierexpr
///   ::= numberexpr
///   ::= parenexpr
///   ::= ifexpr
///   ::= forexpr
///   ::= varexpr
static std::unique_ptr<ExprAST> ParsePrimary() {
  switch (CurTok) {
  default:
    return LogError("unknown token when expecting an expression");
  case tok_identifier:
    return ParseIdentifierExpr();
  case tok_number:
    return ParseNumberExpr();
  case '(':
    return ParseParenExpr();
  case tok_if:
    return ParseIfExpr();
  case tok_for:
    return ParseForExpr();
  case tok_var:
    return ParseVarExpr();
  }
}

/// unary
///   ::= primary
///   ::= '!' unary
static std::unique_ptr<ExprAST> ParseUnary() {
  // If the current token is not an operator, it must be a primary expr.
  if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
    return ParsePrimary();

  // If this is a unary operator, read it.
  int Opc = CurTok;
  getNextToken();
  if (auto Operand = ParseUnary())
    return llvm::make_unique<UnaryExprAST>(Opc, std::move(Operand));
  return nullptr;
}

/// binoprhs
///   ::= ('+' unary)*
static std::unique_ptr<ExprAST> ParseBinOpRHS(int ExprPrec,
                                              std::unique_ptr<ExprAST> LHS) {
  // If this is a binop, find its precedence.
  while (true) {
    int TokPrec = GetTokPrecedence();

    // If this is a binop that binds at least as tightly as the current binop,
    // consume it, otherwise we are done.
    if (TokPrec < ExprPrec)
      return LHS;

    // Okay, we know this is a binop.
    int BinOp = CurTok;
    getNextToken(); // eat binop

    // Parse the unary expression after the binary operator.
    auto RHS = ParseUnary();
    if (!RHS)
      return nullptr;

    // If BinOp binds less tightly with RHS than the operator after RHS, let
    // the pending operator take RHS as its LHS.
    int NextPrec = GetTokPrecedence();
    if (TokPrec < NextPrec) {
      RHS = ParseBinOpRHS(TokPrec + 1, std::move(RHS));
      if (!RHS)
        return nullptr;
    }

    // Merge LHS/RHS.
    LHS =
        llvm::make_unique<BinaryExprAST>(BinOp, std::move(LHS), std::move(RHS));
  }
}

/// expression
///   ::= unary binoprhs
///
static std::unique_ptr<ExprAST> ParseExpression() {
  auto LHS = ParseUnary();
  if (!LHS)
    return nullptr;

  return ParseBinOpRHS(0, std::move(LHS));
}

/// prototype
///   ::= id '(' id* ')'
///   ::= binary LETTER number? (id, id)
///   ::= unary LETTER (id)
static std::unique_ptr<PrototypeAST> ParsePrototype() {
  std::string FnName;

  unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
  unsigned BinaryPrecedence = 30;

  switch (CurTok) {
  default:
    return LogErrorP("Expected function name in prototype");
  case tok_identifier:
    FnName = IdentifierStr;
    Kind = 0;
    getNextToken();
    break;
  case tok_unary:
    getNextToken();
    if (!isascii(CurTok))
      return LogErrorP("Expected unary operator");
    FnName = "unary";
    FnName += (char)CurTok;
    Kind = 1;
    getNextToken();
    break;
  case tok_binary:
    getNextToken();
    if (!isascii(CurTok))
      return LogErrorP("Expected binary operator");
    FnName = "binary";
    FnName += (char)CurTok;
    Kind = 2;
    getNextToken();

    // Read the precedence if present.
    if (CurTok == tok_number) {
      if (NumVal < 1 || NumVal > 100)
        return LogErrorP("Invalid precedence: must be 1..100");
      BinaryPrecedence = (unsigned)NumVal;
      getNextToken();
    }
    break;
  }

  if (CurTok != '(')
    return LogErrorP("Expected '(' in prototype");

  std::vector<std::string> ArgNames;
  while (getNextToken() == tok_identifier)
    ArgNames.push_back(IdentifierStr);
  if (CurTok != ')')
    return LogErrorP("Expected ')' in prototype");

  // success.
  getNextToken(); // eat ')'.

  // Verify that an operator was given the right number of operand names.
  if (Kind && ArgNames.size() != Kind)
    return LogErrorP("Invalid number of operands for operator");

  return llvm::make_unique<PrototypeAST>(FnName, ArgNames, Kind != 0,
                                         BinaryPrecedence);
}

/// definition ::= 'def' prototype expression
static std::unique_ptr<FunctionAST> ParseDefinition() {
  getNextToken(); // eat def.
  auto Proto = ParsePrototype();
  if (!Proto)
    return nullptr;

  if (auto E = ParseExpression())
    return llvm::make_unique<FunctionAST>(std::move(Proto), std::move(E));
  return nullptr;
}

/// toplevelexpr ::= expression
static std::unique_ptr<FunctionAST> ParseTopLevelExpr() {
  if (auto E = ParseExpression()) {
    // Make an anonymous proto.
    auto Proto = llvm::make_unique<PrototypeAST>("__anon_expr",
                                                 std::vector<std::string>());
    return llvm::make_unique<FunctionAST>(std::move(Proto), std::move(E));
  }
  return nullptr;
}

/// external ::= 'extern' prototype
static std::unique_ptr<PrototypeAST> ParseExtern() {
  getNextToken(); // eat extern.
  return ParsePrototype();
}

//===----------------------------------------------------------------------===//
// Code Generation
//===----------------------------------------------------------------------===//

static LLVMContext TheContext;
static IRBuilder<> Builder(TheContext);
static std::unique_ptr<Module> TheModule;
static std::map<std::string, AllocaInst *> NamedValues;
static std::unique_ptr<legacy::FunctionPassManager> TheFPM;
static std::unique_ptr<KaleidoscopeJIT> TheJIT;
static std::map<std::string, std::unique_ptr<PrototypeAST>> FunctionProtos;

Value *LogErrorV(const char *Str) {
  LogError(Str);
  return nullptr;
}

Function *getFunction(std::string Name) {
  // First, see if the function has already been added to the current module.
  if (auto *F = TheModule->getFunction(Name))
    return F;

  // If not, check whether we can codegen the declaration from some existing
  // prototype.
  auto FI = FunctionProtos.find(Name);
  if (FI != FunctionProtos.end())
    return FI->second->codegen();

  // If no existing prototype exists, return null.
  return nullptr;
}

/// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
/// the function. This is used for mutable variables etc.
static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
                                          const std::string &VarName) {
  IRBuilder<> TmpB(&TheFunction->getEntryBlock(),
                   TheFunction->getEntryBlock().begin());
  return TmpB.CreateAlloca(Type::getDoubleTy(TheContext), nullptr, VarName);
}

Value *NumberExprAST::codegen() {
  return ConstantFP::get(TheContext, APFloat(Val));
}

Value *VariableExprAST::codegen() {
  // Look this variable up in the function.
  Value *V = NamedValues[Name];
  if (!V)
    return LogErrorV("Unknown variable name");

  // Load the value from the variable's stack slot.
  return Builder.CreateLoad(V, Name.c_str());
}

Value *UnaryExprAST::codegen() {
  Value *OperandV = Operand->codegen();
  if (!OperandV)
    return nullptr;

  Function *F = getFunction(std::string("unary") + Opcode);
  if (!F)
    return LogErrorV("Unknown unary operator");

  return Builder.CreateCall(F, OperandV, "unop");
}

Value *BinaryExprAST::codegen() {
  // Special case '=' because we don't want to emit the LHS as an expression.
  if (Op == '=') {
    // Assignment requires the LHS to be an identifier.
    // This assumes we're building without RTTI, because LLVM builds that way
    // by default. If you build LLVM with RTTI, this can be changed to a
    // dynamic_cast for automatic error checking.
    VariableExprAST *LHSE = static_cast<VariableExprAST *>(LHS.get());
    if (!LHSE)
      return LogErrorV("destination of '=' must be a variable");

    // Codegen the RHS.
    Value *Val = RHS->codegen();
    if (!Val)
      return nullptr;

    // Look up the name.
    Value *Variable = NamedValues[LHSE->getName()];
    if (!Variable)
      return LogErrorV("Unknown variable name");

    Builder.CreateStore(Val, Variable);
    return Val;
  }

  Value *L = LHS->codegen();
  Value *R = RHS->codegen();
  if (!L || !R)
    return nullptr;

  switch (Op) {
  case '+':
    return Builder.CreateFAdd(L, R, "addtmp");
  case '-':
    return Builder.CreateFSub(L, R, "subtmp");
  case '*':
    return Builder.CreateFMul(L, R, "multmp");
  case '<':
    L = Builder.CreateFCmpULT(L, R, "cmptmp");
    // Convert bool 0/1 to double 0.0 or 1.0
    return Builder.CreateUIToFP(L, Type::getDoubleTy(TheContext), "booltmp");
  default:
    break;
  }

  // If it wasn't a builtin binary operator, it must be a user defined one.
  // Emit a call to it.
  Function *F = getFunction(std::string("binary") + Op);
  assert(F && "binary operator not found!");

  Value *Ops[] = {L, R};
  return Builder.CreateCall(F, Ops, "binop");
}

Value *CallExprAST::codegen() {
  // Look up the name in the global module table.
  Function *CalleeF = getFunction(Callee);
  if (!CalleeF)
    return LogErrorV("Unknown function referenced");

  // Report an error if the argument count does not match.
  if (CalleeF->arg_size() != Args.size())
    return LogErrorV("Incorrect # arguments passed");

  std::vector<Value *> ArgsV;
  for (unsigned i = 0, e = Args.size(); i != e; ++i) {
    ArgsV.push_back(Args[i]->codegen());
    if (!ArgsV.back())
      return nullptr;
  }

  return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
}

Value *IfExprAST::codegen() {
  Value *CondV = Cond->codegen();
  if (!CondV)
    return nullptr;

  // Convert condition to a bool by comparing non-equal to 0.0.
  CondV = Builder.CreateFCmpONE(
      CondV, ConstantFP::get(TheContext, APFloat(0.0)), "ifcond");

  Function *TheFunction = Builder.GetInsertBlock()->getParent();

  // Create blocks for the then and else cases. Insert the 'then' block at the
  // end of the function.
  BasicBlock *ThenBB = BasicBlock::Create(TheContext, "then", TheFunction);
  BasicBlock *ElseBB = BasicBlock::Create(TheContext, "else");
  BasicBlock *MergeBB = BasicBlock::Create(TheContext, "ifcont");

  Builder.CreateCondBr(CondV, ThenBB, ElseBB);

  // Emit then value.
  Builder.SetInsertPoint(ThenBB);

  Value *ThenV = Then->codegen();
  if (!ThenV)
    return nullptr;

  Builder.CreateBr(MergeBB);
  // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
  ThenBB = Builder.GetInsertBlock();

  // Emit else block.
  TheFunction->getBasicBlockList().push_back(ElseBB);
  Builder.SetInsertPoint(ElseBB);

  Value *ElseV = Else->codegen();
  if (!ElseV)
    return nullptr;

  Builder.CreateBr(MergeBB);
  // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
  ElseBB = Builder.GetInsertBlock();

  // Emit merge block.
  TheFunction->getBasicBlockList().push_back(MergeBB);
  Builder.SetInsertPoint(MergeBB);
  PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(TheContext), 2, "iftmp");

  PN->addIncoming(ThenV, ThenBB);
  PN->addIncoming(ElseV, ElseBB);
  return PN;
}

// Output for-loop as:
//   var = alloca double
//   ...
//   start = startexpr
//   store start -> var
//   goto loop
// loop:
//   ...
//   bodyexpr
//   ...
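// loopend:
//   step = stepexpr
//   endcond = endexpr
//
//   curvar = load var
//   nextvar = curvar + step
//   store nextvar -> var
//   br endcond, loop, endloop
// outloop:
Value *ForExprAST::codegen() {
  Function *TheFunction = Builder.GetInsertBlock()->getParent();

  // Create an alloca for the variable in the entry block.
  AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);

  // Emit the start code first, without 'variable' in scope.
  Value *StartVal = Start->codegen();
  if (!StartVal)
    return nullptr;

  // Store the value into the alloca.
  Builder.CreateStore(StartVal, Alloca);

  // Make the new basic block for the loop header, inserting after current
  // block.
  BasicBlock *LoopBB = BasicBlock::Create(TheContext, "loop", TheFunction);

  // Insert an explicit fall through from the current block to the LoopBB.
  Builder.CreateBr(LoopBB);

  // Start insertion in LoopBB.
  Builder.SetInsertPoint(LoopBB);

  // Within the loop, the variable is bound to the alloca (no PHI node is
  // needed). If it shadows an existing variable, we have to restore it, so
  // save it now.
  AllocaInst *OldVal = NamedValues[VarName];
  NamedValues[VarName] = Alloca;

  // Emit the body of the loop. This, like any other expr, can change the
  // current BB. Note that we ignore the value computed by the body, but don't
  // allow an error.
  if (!Body->codegen())
    return nullptr;

  // Emit the step value.
  Value *StepVal = nullptr;
  if (Step) {
    StepVal = Step->codegen();
    if (!StepVal)
      return nullptr;
  } else {
    // If not specified, use 1.0.
    StepVal = ConstantFP::get(TheContext, APFloat(1.0));
  }

  // Compute the end condition.
  Value *EndCond = End->codegen();
  if (!EndCond)
    return nullptr;

  // Reload, increment, and restore the alloca. This handles the case where
  // the body of the loop mutates the variable.
  Value *CurVar = Builder.CreateLoad(Alloca, VarName.c_str());
  Value *NextVar = Builder.CreateFAdd(CurVar, StepVal, "nextvar");
  Builder.CreateStore(NextVar, Alloca);

  // Convert condition to a bool by comparing non-equal to 0.0.
  EndCond = Builder.CreateFCmpONE(
      EndCond, ConstantFP::get(TheContext, APFloat(0.0)), "loopcond");

  // Create the "after loop" block and insert it.
  BasicBlock *AfterBB =
      BasicBlock::Create(TheContext, "afterloop", TheFunction);

  // Insert the conditional branch into the end of LoopEndBB.
  Builder.CreateCondBr(EndCond, LoopBB, AfterBB);

  // Any new code will be inserted in AfterBB.
  Builder.SetInsertPoint(AfterBB);

  // Restore the unshadowed variable.
  if (OldVal)
    NamedValues[VarName] = OldVal;
  else
    NamedValues.erase(VarName);

  // for expr always returns 0.0.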
  return Constant::getNullValue(Type::getDoubleTy(TheContext));
}

Value *VarExprAST::codegen() {
  std::vector<AllocaInst *> OldBindings;

  Function *TheFunction = Builder.GetInsertBlock()->getParent();

  // Register all variables and emit their initializer.
  for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
    const std::string &VarName = VarNames[i].first;
    ExprAST *Init = VarNames[i].second.get();

    // Emit the initializer before adding the variable to scope. This prevents
    // the initializer from referencing the variable itself, and permits stuff
    // like this:
    //  var a = 1 in
    //    var a = a in ...   # refers to outer 'a'.
    Value *InitVal;
    if (Init) {
      InitVal = Init->codegen();
      if (!InitVal)
        return nullptr;
    } else { // If not specified, use 0.0.
      InitVal = ConstantFP::get(TheContext, APFloat(0.0));
    }

    AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
    Builder.CreateStore(InitVal, Alloca);

    // Remember the old variable binding so that we can restore the binding
    // when we unrecurse.
    OldBindings.push_back(NamedValues[VarName]);

    // Remember this binding.
    NamedValues[VarName] = Alloca;
  }

  // Codegen the body, now that all vars are in scope.
  Value *BodyVal = Body->codegen();
  if (!BodyVal)
    return nullptr;

  // Pop all our variables from scope.
  for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
    NamedValues[VarNames[i].first] = OldBindings[i];

  // Return the body computation.
  return BodyVal;
}

Function *PrototypeAST::codegen() {
  // Make the function type: double(double,double) etc.
  std::vector<Type *> Doubles(Args.size(), Type::getDoubleTy(TheContext));
  FunctionType *FT =
      FunctionType::get(Type::getDoubleTy(TheContext), Doubles, false);

  Function *F =
      Function::Create(FT, Function::ExternalLinkage, Name, TheModule.get());

  // Set names for all arguments.
  unsigned Idx = 0;
  for (auto &Arg : F->args())
    Arg.setName(Args[Idx++]);

  return F;
}

Function *FunctionAST::codegen() {
  // Transfer ownership of the prototype to the FunctionProtos map, but keep a
  // reference to it for use below.
  auto &P = *Proto;
  FunctionProtos[Proto->getName()] = std::move(Proto);
  Function *TheFunction = getFunction(P.getName());
  if (!TheFunction)
    return nullptr;

  // If this is an operator, install it.
  if (P.isBinaryOp())
    BinopPrecedence[P.getOperatorName()] = P.getBinaryPrecedence();

  // Create a new basic block to start insertion into.
  BasicBlock *BB = BasicBlock::Create(TheContext, "entry", TheFunction);
  Builder.SetInsertPoint(BB);

  // Record the function arguments in the NamedValues map.
  NamedValues.clear();
  for (auto &Arg : TheFunction->args()) {
    // Create an alloca for this variable.
    AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, Arg.getName());

    // Store the initial value into the alloca.
    Builder.CreateStore(&Arg, Alloca);

    // Add arguments to variable symbol table.
    NamedValues[Arg.getName()] = Alloca;
  }

  if (Value *RetVal = Body->codegen()) {
    // Finish off the function.
    Builder.CreateRet(RetVal);

    // Validate the generated code, checking for consistency.
    verifyFunction(*TheFunction);

    // Run the optimizer on the function.
    TheFPM->run(*TheFunction);

    return TheFunction;
  }

  // Error reading body, remove function.
  TheFunction->eraseFromParent();

  if (P.isBinaryOp())
    BinopPrecedence.erase(P.getOperatorName());
  return nullptr;
}

//===----------------------------------------------------------------------===//
// Top-Level parsing and JIT Driver
//===----------------------------------------------------------------------===//

static void InitializeModuleAndPassManager() {
  // Open a new module.
  TheModule = llvm::make_unique<Module>("my cool jit", TheContext);
  TheModule->setDataLayout(TheJIT->getTargetMachine().createDataLayout());

  // Create a new pass manager attached to it.
  TheFPM = llvm::make_unique<legacy::FunctionPassManager>(TheModule.get());

  // Promote allocas to registers.
  TheFPM->add(createPromoteMemoryToRegisterPass());
  // Do simple "peephole" optimizations and bit-twiddling optimizations.
  TheFPM->add(createInstructionCombiningPass());
  // Reassociate expressions.
  TheFPM->add(createReassociatePass());
  // Eliminate common subexpressions.
  TheFPM->add(createGVNPass());
  // Simplify the control flow graph (deleting unreachable blocks, etc).
  TheFPM->add(createCFGSimplificationPass());

  TheFPM->doInitialization();
}

static void HandleDefinition() {
  if (auto FnAST = ParseDefinition()) {
    if (auto *FnIR = FnAST->codegen()) {
      fprintf(stderr, "Read function definition:");
      FnIR->print(errs());
      fprintf(stderr, "\n");
      TheJIT->addModule(std::move(TheModule));
      InitializeModuleAndPassManager();
    }
  } else {
    // Skip token for error recovery.
    getNextToken();
  }
}

static void HandleExtern() {
  if (auto ProtoAST = ParseExtern()) {
    if (auto *FnIR = ProtoAST->codegen()) {
      fprintf(stderr, "Read extern: ");
      FnIR->print(errs());
      fprintf(stderr, "\n");
      FunctionProtos[ProtoAST->getName()] = std::move(ProtoAST);
    }
  } else {
    // Skip token for error recovery.
    getNextToken();
  }
}

static void HandleTopLevelExpression() {
  // Evaluate a top-level expression into an anonymous function.
  if (auto FnAST = ParseTopLevelExpr()) {
    if (FnAST->codegen()) {
      // JIT the module containing the anonymous expression, keeping a handle
      // so we can free it later.
      auto H = TheJIT->addModule(std::move(TheModule));
      InitializeModuleAndPassManager();

      // Search the JIT for the __anon_expr symbol.
      auto ExprSymbol = TheJIT->findSymbol("__anon_expr");
      assert(ExprSymbol && "Function not found");

      // Get the symbol's address and cast it to the right type (takes no
      // arguments, returns a double) so we can call it as a native function.
      double (*FP)() = (double (*)())(intptr_t)cantFail(ExprSymbol.getAddress());
      fprintf(stderr, "Evaluated to %f\n", FP());

      // Delete the anonymous expression module from the JIT.
      TheJIT->removeModule(H);
    }
  } else {
    // Skip token for error recovery.
    getNextToken();
  }
}

/// top ::= definition | external | expression | ';'
static void MainLoop() {
  while (true) {
    fprintf(stderr, "ready> ");
    switch (CurTok) {
    case tok_eof:
      return;
    case ';': // ignore top-level semicolons.
      getNextToken();
      break;
    case tok_def:
      HandleDefinition();
      break;
    case tok_extern:
      HandleExtern();
      break;
    default:
      HandleTopLevelExpression();
      break;
    }
  }
}

//===----------------------------------------------------------------------===//
// "Library" functions that can be "extern'd" from user code.
//===----------------------------------------------------------------------===//

#ifdef LLVM_ON_WIN32
#define DLLEXPORT __declspec(dllexport)
#else
#define DLLEXPORT
#endif

/// putchard - putchar that takes a double and returns 0.
extern "C" DLLEXPORT double putchard(double X) {
  fputc((char)X, stderr);
  return 0;
}

/// printd - printf that takes a double, prints it as "%f\n", and returns 0.
extern "C" DLLEXPORT double printd(double X) {
  fprintf(stderr, "%f\n", X);
  return 0;
}

//===----------------------------------------------------------------------===//
// Main driver code.
//===----------------------------------------------------------------------===//

int main() {
  InitializeNativeTarget();
  InitializeNativeTargetAsmPrinter();
  InitializeNativeTargetAsmParser();

  // Install standard binary operators.
  // 1 is lowest precedence.
  BinopPrecedence['='] = 2;
  BinopPrecedence['<'] = 10;
  BinopPrecedence['+'] = 20;
  BinopPrecedence['-'] = 20;
  BinopPrecedence['*'] = 40; // highest.

  // Prime the first token.
  fprintf(stderr, "ready> ");
  getNextToken();

  TheJIT = llvm::make_unique<KaleidoscopeJIT>();

  InitializeModuleAndPassManager();

  // Run the main "interpreter loop" now.
  MainLoop();

  return 0;
}
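
The save-and-restore of NamedValues in ForExprAST::codegen and VarExprAST::codegen is the part of the listing that is easiest to get wrong. The following is only a minimal sketch of that pattern, not code from the tutorial: Scope, withBinding and lookup are hypothetical names, and plain doubles stand in for AllocaInst pointers so that no LLVM headers are needed.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for NamedValues: maps a name to the storage currently
// bound to it. The tutorial stores AllocaInst*; a const double* is used here.
static std::map<std::string, const double *> Scope;

// Hypothetical helper: binds Name to Storage while Body runs, then restores
// whatever binding (if any) was visible before, as ForExprAST::codegen does.
template <typename Fn>
double withBinding(const std::string &Name, const double *Storage, Fn Body) {
  assert(Storage && "a binding needs storage to point at");

  // Save the shadowed binding, if there is one.
  auto It = Scope.find(Name);
  const double *Old = (It != Scope.end()) ? It->second : nullptr;

  Scope[Name] = Storage;
  double Result = Body();

  // Restore the outer binding, or drop the name if it was unbound before.
  if (Old)
    Scope[Name] = Old;
  else
    Scope.erase(Name);
  return Result;
}

// Hypothetical lookup mirroring VariableExprAST::codegen: "load" the value
// from the storage bound to Name, or return Fallback if Name is unbound.
double lookup(const std::string &Name, double Fallback) {
  auto It = Scope.find(Name);
  return (It != Scope.end() && It->second) ? *It->second : Fallback;
}
```

Nesting a second withBinding for the same name inside the body of the first makes lookup see the inner storage within the nested body and the outer storage again afterwards; once the outer call returns, the name is unbound. VarExprAST::codegen in the listing differs slightly: it writes the saved pointer back unconditionally, so a name that was previously unbound is left mapped to null instead of being erased.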