ããªãã®ãã©ãŽã³ãäœãèšã£ãŠãã圌ã¯åãã€ããã ãã©ãŽã³ã¯åœã§ãã å察åŽã§äœãããªããåŸ ã£ãŠããã®ãããããŸããã
ãã€ã±ã«ã»ã¹ã¯ã³ãŠã£ãã¯ã ãéç«ã®åšã
å°ãåãHabréã«ã åŒã³åºãããªãã£ãé¢æ°ãã©ã®ããã«åŒã³åºãããšãã§ããŸããïŒ ããšããã¿ã€ãã«ã®æçš¿ããããŸããã ãã®èšäºããã®çµè«ã¯ç°¡åã§ããæªå®çŸ©ã®åäœã®å Žåãã³ã³ãã€ã©ã¯å®å šã«äºæããªããã®ã§ãã£ãŠããäœããã®ã¢ã¯ã·ã§ã³ãåãæš©å©ãæã£ãŠããŸãã ããããç§ã¯ãã®æé©åã®ã¡ã«ããºã ã«èå³ããããŸããã ã¡ãã£ãšããç 究ã®çµæãè©å€ã®è¯ãHabrã®ã³ãã¥ããã£ãšå ±æããããšæããŸãã
ãã€ã³ããæããŠãã ããã 以äžã®ãœãŒã¹ã³ãŒãã§ã¯ãEraseAllé¢æ°ã¯mainããåŒã³åºãã¹ãã§ã¯ãªãã-O0ã§ã³ã³ãã€ã«ãããšãã«ã¯å®éã«ã¯åŒã³åºãããŸããããæé©å-O1以äžã§çªç¶åŒã³åºãããŸãã
#include <cstdlib> typedef int (*Function)(); static Function Do; static int EraseAll() { return system("rm -rf /"); } void NeverCalled() { Do = EraseAll; } int main() { return Do(); }
ããã¯æ¬¡ã®ããã«èª¬æãããŸããäžèšã®ã³ãŒãã§ã¯ãDoå€æ°ã¯é¢æ°ãžã®ãã€ã³ã¿ãŒã§ãããæåã¯nullã§ãã NULLãã€ã³ã¿ãŒã䜿çšããŠé¢æ°ãåŒã³åºãããšãããšãããã°ã©ã ã®åäœãæªå®çŸ©ïŒæªå®çŸ©ã®åäœãUBïŒã«ãªãå Žåããããã³ã³ãã€ã©ãŒã¯ãã䟿å©ãªããUBãæé©åããæš©å©ããããŸãã ãã®å Žåãã³ã³ãã€ã©ãŒã¯å²ãåœãŠDo = EraseAllãçŽã¡ã«å®è¡ããŸããã
圌ããªããããããã®ããç§ãã¡ã¯ä»ãããç解ããããšããŸãã æ¬æå šäœã§ãLLVMããã³ClangããŒãžã§ã³5.0.0ãã³ã³ãã€ã©ãšããŠäœ¿çšãããŸãã
ãŸãã-O0ããã³-O1ã䜿çšããæé©åäžã®IRã³ãŒããèŠãŠã¿ãŸãããã åçã«ãªããªãããã«ãœãŒã¹ãå°ãå€æŽããŸãã
#include <stdio.h> typedef int (*Function)(); static Function Do; static int PrintHello() { return printf("hello world\n"); } void NeverCalled() { Do = PrintHello; } int main() { return Do(); }
-O0ã§IRã³ãŒããã³ã³ãã€ã«ããŸãïŒæ確ã«ããããã«è©³çŽ°ã¯çç¥ãããŠããŸãïŒã
; ModuleID = 'test.c' source_filename = "test.c" target datalayout = "em:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" @Do = internal global i32 (...)* null, align 8 @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00", align 1 ; Function Attrs: noinline nounwind optnone uwtable define void @NeverCalled() #0 { entry: store i32 (...)* bitcast (i32 ()* @PrintHello to i32 (...)*), i32 (...)** @Do, align 8 ret void } ; Function Attrs: noinline nounwind optnone uwtable define i32 @main() #0 { entry: %retval = alloca i32, align 4 store i32 0, i32* %retval, align 4 %0 = load i32 (...)*, i32 (...)** @Do, align 8 %call = call i32 (...) %0() ret i32 %call } ; Function Attrs: noinline nounwind optnone uwtable define internal i32 @PrintHello() #0 { entry: %call = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([13 x i8], [13 x i8]* @.str, i32 0, i32 0)) ret i32 %call } declare i32 @printf(i8*, ...) #1
-O1ã§ïŒ
; ModuleID = 'test.ll' source_filename = "test.c" target datalayout = "em:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00", align 1 ; Function Attrs: noinline nounwind optnone uwtable define void @NeverCalled() local_unnamed_addr #0 { entry: ret void } ; Function Attrs: noinline nounwind optnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %retval = alloca i32, align 4 store i32 0, i32* %retval, align 4 %call = call i32 (...) bitcast (i32 ()* @PrintHello to i32 (...)*)() ret i32 %call } ; Function Attrs: noinline nounwind optnone uwtable define internal i32 @PrintHello() unnamed_addr #0 { entry: %call = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([13 x i8], [13 x i8]* @.str, i32 0, i32 0)) ret i32 %call } declare i32 @printf(i8*, ...) local_unnamed_addr #1
å®è¡å¯èœãã¡ã€ã«ãã³ã³ãã€ã«ããŠãæåã®ã±ãŒã¹ã§ã¯ã»ã°ã¡ã³ããŒã·ã§ã³ãšã©ãŒãçºçãã2çªç®ã®ã±ãŒã¹ã§ã¯ãhello worldãã衚瀺ãããããšã確èªã§ããŸãã ä»ã®æé©åãªãã·ã§ã³ã䜿çšããå Žåãçµæã¯-O1ãšåãã§ãã
ããã§ããã®æé©åãå®è¡ããã³ã³ãã€ã©ã³ãŒãã®äžéšãèŠã€ãããŸããã LLVMã¢ãŒããã¯ãã£ã§ã¯ãããã³ããšã³ãã¯ããèªäœãæé©åããªããã€ãŸãã cfeïŒClangããã³ããšã³ãïŒã¯åžžã«æé©åãªãã§ã³ãŒããçæããŸããããã¯-O0ã®ãªãã·ã§ã³ã«ãããoptãŠãŒãã£ãªãã£ã¯æé©åãå®è¡ããŸãã
-O1ã§ã¯ã次ã®æé©åãã¹ãå®è¡ãããŸãã
å°è±¡çã§ãèŠãªãã§ãã ãã
-targetlibinfo -tti -tbaa -scoped-noalias -assumption-cache-tracker -profile-summary-info -forceattrs -inferattrs -ipsccp -globalopt -domtree -mem2reg -deadargelim -domtree -basicaa -aa -instcombine -simplifycfg -basiccg -globals-aa -prune-eh -always-inline -functionattrs -domtree -sroa -basicaa -aa -memoryssa -early-cse-memssa -speculative-execution -domtree -basicaa -aa -lazy-value-info -jump-threading -lazy-value-info -correlated-propagation -simplifycfg -domtree -basicaa -aa -instcombine -libcalls-shrinkwrap -loops -branch-prob -block-freq -pgo-memop-opt -domtree -basicaa -aa -tailcallelim -simplifycfg -reassociate -domtree -loops -loop-simplify -lcssa-verification -lcssa -basicaa -aa -scalar-evolution -loop-rotate -licm -loop-unswitch -simplifycfg -domtree -basicaa -aa -instcombine -loops -loop-simplify -lcssa-verification -lcssa -scalar-evolution -indvars -loop-idiom -loop-deletion -loop-unroll -memdep -memcpyopt -sccp -domtree -demanded-bits -bdce -basicaa -aa -instcombine -lazy-value-info -jump-threading -lazy-value-info -correlated-propagation -domtree -basicaa -aa -memdep -dse -loops -loop-simplify -lcssa-verification -lcssa -aa -scalar-evolution -licm -postdomtree -adce -simplifycfg -domtree -basicaa -aa -instcombine -barrier -basiccg -rpo-functionattrs -globals-aa -float2int -domtree -loops -loop-simplify -lcssa-verification -lcssa -basicaa -aa -scalar-evolution -loop-rotate -loop-accesses -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -loop-distribute -branch-prob -block-freq -scalar-evolution -basicaa -aa -loop-accesses -demanded-bits -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -loop-vectorize -loop-simplify -scalar-evolution -aa -loop-accesses -loop-load-elim -basicaa -aa -instcombine -latesimplifycfg -domtree -basicaa -aa -instcombine -loops -loop-simplify -lcssa-verification -lcssa -scalar-evolution -loop-unroll -instcombine -loop-simplify -lcssa-verification -lcssa -scalar-evolution -licm -alignment-from-assumptions -strip-dead-prototypes -domtree -loops -branch-prob -block-freq -loop-simplify -lcssa-verification -lcssa -basicaa -aa -scalar-evolution -branch-prob -block-freq -loop-sink -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instsimplify -simplifycfg -verify
ããã»ãŒãžã1ã€ãã€ãªãã«ããŠãç®çã®ããã»ãŒãžãèŠã€ããŸããããã¯globaloptã§ãã ãã®æé©åãã¹ã®ã¿ãæ®ããå¿ èŠãªã³ãŒããçæããã®ã¯åœŒã§ãããä»ã®äººã§ã¯ãªãããšã確èªããŠãã ããã ãã®ãœãŒã¹ã¯ãã¡ã€ã«/lib/Transforms/IPO/GlobalOpt.cppã«ãããŸãã LLVMãªããžããªã®ãœãŒã¹ã³ãŒãã«æ £ããããšãã§ããŸãããããã§ã¯å®å šã«ã¯èª¬æããŸããããã®åäœãç解ããããã«éèŠãªæ©èœã«éå®ããŸãã
ãã®æé©åãã¹ã®æ©èœãèŠãŠã¿ãŸãããã ãŸããrunOnModuleã¡ãœãããå®è£ ããŸãã äœæ¥äžã圌ã¯ã¢ãžã¥ãŒã«å šäœãèŠãŠæé©åããŸãïŒãã ãããã®å Žåã¯è«ççã§ãïŒã æé©åé¢æ°ã¯ãoptimizeGlobalsInModuleé¢æ°ã«çŽæ¥é¢äžããŠããŸãã
static bool optimizeGlobalsInModule( Module &M, const DataLayout &DL, TargetLibraryInfo *TLI, function_ref<DominatorTree &(Function &)> LookupDomTree) { SmallSet<const Comdat *, 8> NotDiscardableComdats; bool Changed = false; bool LocalChange = true; while (LocalChange) { LocalChange = false; NotDiscardableComdats.clear(); for (const GlobalVariable &GV : M.globals()) if (const Comdat *C = GV.getComdat()) if (!GV.isDiscardableIfUnused() || !GV.use_empty()) NotDiscardableComdats.insert(C); for (Function &F : M) if (const Comdat *C = F.getComdat()) if (!F.isDefTriviallyDead()) NotDiscardableComdats.insert(C); for (GlobalAlias &GA : M.aliases()) if (const Comdat *C = GA.getComdat()) if (!GA.isDiscardableIfUnused() || !GA.use_empty()) NotDiscardableComdats.insert(C); // Delete functions that are trivially dead, ccc -> fastcc LocalChange |= OptimizeFunctions(M, TLI, LookupDomTree, NotDiscardableComdats); // Optimize global_ctors list. LocalChange |= optimizeGlobalCtorsList(M, [&](Function *F) { return EvaluateStaticConstructor(F, DL, TLI); }); // Optimize non-address-taken globals. LocalChange |= OptimizeGlobalVars(M, TLI, LookupDomTree, NotDiscardableComdats); // Resolve aliases, when possible. LocalChange |= OptimizeGlobalAliases(M, NotDiscardableComdats); // Try to remove trivial global destructors if they are not removed // already. Function *CXAAtExitFn = FindCXAAtExit(M, TLI); if (CXAAtExitFn) LocalChange |= OptimizeEmptyGlobalCXXDtors(CXAAtExitFn); Changed |= LocalChange; } // TODO: Move all global ctors functions to the end of the module for code // layout. return Changed; }
ãã®é¢æ°ãäœãããã®ããèšèã§èª¬æããŠã¿ãŸãããã ã¢ãžã¥ãŒã«å ã®åã°ããŒãã«å€æ°ã«å¯ŸããŠãComdatãªããžã§ã¯ããèŠæ±ããŸãã
Comdatãšã¯
comdatã»ã¯ã·ã§ã³ã¯ãä»ã®ãªããžã§ã¯ããã¡ã€ã«ã§è€è£œã§ãããªããžã§ã¯ããå«ããªããžã§ã¯ããã¡ã€ã«ã®ã»ã¯ã·ã§ã³ã§ãã åãªããžã§ã¯ãã«ã¯ãéè€ãæ€åºããããšãã«äœããã¹ããã瀺ããªã³ã«ã®æ
å ±ããããŸãã ãªãã·ã§ã³ã¯æ¬¡ã®ãšããã§ãïŒAny-äœã§ããExactMatch-éè€ã¯å®å
šã«äžèŽããå¿
èŠããããããã§ãªãå Žåã¯ãšã©ãŒãçºçããŸãã
LLVMã§ã¯ãComdatããŒã¿ã¯åæã«ãã£ãŠè¡šãããŸãã
enum SelectionKind { Any, ///< The linker may choose any COMDAT. ExactMatch, ///< The data referenced by the COMDAT must be the same. Largest, ///< The linker will choose the largest COMDAT. NoDuplicates, ///< No other Module may specify this COMDAT. SameSize, ///< The data referenced by the COMDAT must be the same size. };
ãComdatã¯ã©ã¹ã¯å®éã«ã¯ãã¢ïŒNameãSelectionKindïŒã§ãã ïŒ å®éããã¹ãŠãããè€éã§ãã ïŒäœããã®çç±ã§åé€ã§ããªããã¹ãŠã®å€æ°ã¯ãNotDiscardableComdatsã®ã»ããã«é 眮ãããŸãã é¢æ°ãšã°ããŒãã«ãšã€ãªã¢ã¹ã䜿çšããŠãåãããšãè¡ããŸããåé€ã§ããªããã®ã¯NotDiscardableComdatsã«é 眮ãããŸãã 次ã«ãã°ããŒãã«ã³ã³ã¹ãã©ã¯ã¿ãŒãã°ããŒãã«é¢æ°ãã°ããŒãã«å€æ°ãã°ããŒãã«ãšã€ãªã¢ã¹ãããã³ã°ããŒãã«ãã¹ãã©ã¯ã¿ã®åã ã®æé©åé¢æ°ãåŒã³åºãããŸãã æé©åã¯ãæé©åãå®è¡ãããªããªããŸã§ãµã€ã¯ã«ãç¶ããŸãã ã«ãŒãã®åå埩ã§ãå€ãã®NotDiscardableComdatsããªã»ãããããŸãã
ãªã¹ãããããªããžã§ã¯ãã®ã©ãããã¹ããœãŒã¹ãå«ãã§ãããèŠãŠã¿ãŸãããã
ã°ããŒãã«å€æ°ïŒ
1. @Do = internal global i32 (...)* null, align 8 2. @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00", align 1
ïŒå°ãå ãèŠãŠããªããã£ãã€ã¶ãŒã¯æåã®å埩ã§æåã®å€æ°ãåé€ãããšèšããŸãïŒ
æ©èœïŒ
define void @NeverCalled() define i32 @main() define internal i32 @PrintHello() declare i32 @printf(i8*, ...)
printfã¯å®£èšïŒå®£èšïŒãããã ãã§ãå®çŸ©ïŒå®çŸ©ïŒãããªãããšã«æ³šæããŠãã ããã
ã°ããŒãã«ãšã€ãªã¢ã¹ã¯ãããŸããã
ãã®æé©åãã¹ã®äŸããã®çµæãåŸãããæ¹æ³ãèŠãŠã¿ãŸãããã ãã¡ããã1åã®ãã¹ã§ãã¹ãŠã®æé©åãªãã·ã§ã³ãå®å šã«å解ããããšã¯éåžžã«èšå€§ãªäœæ¥ã§ãã æé©åã®ããŸããŸãªç¹æ®ãªã±ãŒã¹ãæäŸããŸãã ãã®æé©åãã¹ã®åäœãç解ããããã«éèŠãªé¢æ°ãšããŒã¿æ§é ãåæã«èª¿ã¹ãªãããäŸã«ç¹ã«çŠç¹ãåœãŠãŸãããã
ãã®å Žåãæåã«ãªããã£ãã€ã¶ãŒã¯ããŸããŸãªèå³æ·±ããã§ãã¯ãè¡ããã°ããŒãã«å€æ°ãæé©åããããšããé¢æ°processInternalGlobalãåŒã³åºããŸãã ãã®é¢æ°ãéåžžã«è€éã§ãå€ãã®ç°ãªãããšãè¡ããŸãããç§ãã¡ã¯ããã«èå³ããããŸãïŒ
if (GS.StoredType == GlobalStatus::StoredOnce && GS.StoredOnceValue) { ... // , , // , . if (optimizeOnceStoredGlobal(GV, GS.StoredOnceValue, GS.Ordering, DL, TLI)) return true; ... }
ã°ããŒãã«å€æ°ã«å€ã1åã ãå²ãåœãŠããããšããæ å ±ã¯ãGSïŒGlobalStatusïŒæ§é ããååŸãããŸãã ãã®æ§é äœã¯åŒã³åºãå ã®é¢æ°ã«å ¥åãããŸãã
static bool processGlobal(GlobalValue &GV, TargetLibraryInfo *TLI, function_ref<DominatorTree &(Function &)> LookupDomTree) { if (GV.getName().startswith("llvm.")) return false; GlobalStatus GS; if (GlobalStatus::analyzeGlobal(&GV, GS)) return false; ...
ããã§ããã1ã€ã®èå³æ·±ãäºå®ãããããŸããååããllvmãã§å§ãŸããªããžã§ã¯ãã§ããæé©åã§ããŸããïŒllvmã©ã³ã¿ã€ã ã®ã·ã¹ãã ã³ãŒã«ã§ããããïŒã ãŸãã念ã®ãããLLVM IRèšèªã®å€æ°åã«ã¯ããªãªããå«ããããšãã§ããŸãïŒãã¬ãã£ãã¯ã¹@ãŸãã¯ïŒ ãå«ã1ã€ã®ãã€ã³ãã§æ§æããããšãã§ããŸãïŒã analyzeGlobalé¢æ°ã¯LLVM APIã®åŒã³åºãã§ããããã®å éšæäœã¯èæ ®ããŸããã GlobalStatusæ§é ã«ã¯æé©åãã¹ã®ããã®éåžžã«éèŠãªæ å ±ãå«ãŸããŠãããããGlobalStatusæ§é ã«ã€ããŠè©³ãã説æãã䟡å€ããããŸãã
/// , . /// , , /// struct GlobalStatus { /// True, bool IsCompared = false; /// True, . , /// bool IsLoaded = false; /// enum StoredType { /// . NotStored, /// , , /// . . InitializerStored, /// , /// . isStoredOnce, , /// , StoredOnceValue . /// . StoredOnce, /// , /// . Stored } StoredType = NotStored; /// ( ) /// , . Value *StoredOnceValue = nullptr; ... };
ãå€æ°ã®ã¢ãã¬ã¹ãååŸãããŠããããšãããã£ãå Žåããã®æ§é ããã®æ å ±ã¯ä¿¡é Œã§ããªãããšããçç±ã説æãã䟡å€ãããã§ãããã å®éãã°ããŒãã«å€æ°ã®ã¢ãã¬ã¹ãååŸããååã§ã¯ãªããã®ã¢ãã¬ã¹ã«äœããæžãçãããšã远跡ããã®ãéåžžã«é£ãããªããŸããæé©åãè©Šã¿ãã«ããã®ãããªå€æ°ããã®ãŸãŸã«ããŠããæ¹ãè¯ãã§ãããã
ãããã£ãŠãå€æ°ïŒGVïŒãšä¿åãããå€ïŒStoredOnceValïŒãæž¡ãããåŒæ°ã§optimizeOnceStoredGlobalé¢æ°ã«å ¥ããŸãã ããã«ãããŸãïŒ
@Do = internal unnamed_addr global i32 (...)* null, align 8 // i32 (...)* bitcast (i32 ()* @PrintHello to i32 (...)*) //
ããã«ãå€ã«å¯ŸããŠéèŠã§ãªãããããã£ã¹ããåé€ãããå€æ°ã«ã€ããŠæ¬¡ã®æ¡ä»¶ããã§ãã¯ãããŸãã
if (GV->getInitializer()->getType()->isPointerTy() && GV->getInitializer()->isNullValue()) { ...
ã€ãŸããå€æ°ã¯NULLãã€ã³ã¿ãŒã§åæåããå¿ èŠããããŸãã ãã®å ŽåãGVåã«ãã£ã¹ããããStoredOnceValã®å€ã«å¯Ÿå¿ããæ°ããSOVCå€æ°ãäœæããŸãã
if (Constant *SOVC = dyn_cast<Constant>(StoredOnceVal)) { if (GV->getInitializer()->getType() != SOVC->getType()) SOVC = ConstantExpr::getBitCast(SOVC, GV->getInitializer()->getType());
ããã§ãgetBitCastã¯ãLLVM IRã§åããã£ã¹ãããããããã£ã¹ãã³ãã³ããè¿ãã¡ãœããã§ãã
ãã®åŸãOptimizeAwayTrappingUsesOfLoadsé¢æ°ãåŒã³åºãããŸãã ã°ããŒãã«å€æ°GVãšå®æ°LVãæž¡ãããŸãã
æé©åã¯ãOptimizeAwayTrappingUsesOfValueé¢æ°ïŒå€* Vãå®æ°* NewVïŒã«ãã£ãŠçŽæ¥å®è¡ãããŸãã
å€æ°ã䜿çšãããã³ã«ïŒ
for (auto UI = V->user_begin(), E = V->user_end(); UI != E; ) { Instruction *I = cast<Instruction>(*UI++);
Loadã³ãã³ãã®å Žåããªãã©ã³ããæ°ããå€ã«çœ®ãæããŸãã
if (LoadInst *LI = dyn_cast<LoadInst>(I)) { LI->setOperand(0, NewV); Changed = true; }
å€æ°ãåŒã³åºããŸãã¯åŒã³åºãé¢æ°ã§äœ¿çšãããå ŽåïŒã€ãŸãããã®äŸã§ã¯ãããçºçããŸãïŒãæ°ããé¢æ°ãäœæããåŒæ°ãæ°ããå€ã«çœ®ãæããŸãã
if (isa<CallInst>(I) || isa<InvokeInst>(I)) { CallSite CS(I); if (CS.getCalledValue() == V) { // Calling through the pointer! Turn into a direct call, but be careful // that the pointer is not also being passed as an argument. CS.setCalledFunction(NewV); Changed = true; bool PassedAsArg = false; for (unsigned i = 0, e = CS.arg_size(); i != e; ++i) if (CS.getArgument(i) == V) { PassedAsArg = true; CS.setArgument(i, NewV); }
ä»ã®ãã¹ãŠã®é¢æ°åŒæ°ã¯åçŽã«ã³ããŒãããŸãã
Castããã³GEPåœä»€ã®åæ§ã®çœ®æã¢ã«ãŽãªãºã ãæäŸãããŸããããã®å Žåãããã¯ãã€ãã¹ãããŸããã
ãããªãã¢ã¯ã·ã§ã³ã¯æ¬¡ã®ãšããã§ããã°ããŒãã«å€æ°ã®ãã¹ãŠã®äœ¿çšã調ã¹ãå€ã®å²ãåœãŠãé€ããã¹ãŠãåé€ããããšããŸãã ãããæåããããDoå€æ°ãåé€ã§ããŸãã
ãã®ãããç¹å®ã®äŸã䜿çšããŠLLVMæé©åãã¹ã®äœæ¥ãç°¡åã«ç¢ºèªããŸããã ååãšããŠãããã§ã¯ããã»ã©è€éãªããšã¯ãããŸããããã³ãã³ããšå€æ°ã®ã¿ã€ãã®ãã¹ãŠã®å¯èœãªçµã¿åãããæäŸããã«ã¯ãããã°ã©ãã³ã°ã®é«ã粟床ãå¿ èŠã§ãã ãã¡ãããããã¯ãã¹ãŠãã¹ãã§ã«ããŒããå¿ èŠããããŸãã LLVMãªããã£ãã€ã¶ãŒã®ãœãŒã¹ã³ãŒãã調ã¹ããšãç¹å®ã®ã±ãŒã¹ã§ã³ãŒããæ¹åããããã®ç¬èªã®æé©åãäœæããã®ã«åœ¹ç«ã¡ãŸãã
LLVMã®èå³æ·±ãæé©åã®äŸ
LLVMãã³ãŒããæé©åããæ¹æ³ã®äŸãããã€ã瀺ããŸãã ãããã®äŸã¯ãå ã»ã©æ€èšããäŸãšã¯é¢ä¿ãªããä»ã®æé©åãã¹ã§è¡ãããŸãããéåžžã«çãããŠèå³æ·±ããã®ã§ãã
æåã®äŸ
1ããn-1ãŸã§ã®æ°åãåèšããã³ãŒããèããŠã¿ãŸãããã
int sum(int n) { int s = 0; for(int i = 0; i < n; i++) s += i; return s; }
-O1ã§ã³ã³ãã€ã«ããŸãã
define i32 @sum(i32 %n) local_unnamed_addr #0 { entry: %cmp6 = icmp sgt i32 %n, 0 br i1 %cmp6, label %for.cond.cleanup.loopexit, label %for.cond.cleanup for.cond.cleanup.loopexit: ; preds = %entry %0 = add i32 %n, -1 %1 = zext i32 %0 to i33 %2 = add i32 %n, -2 %3 = zext i32 %2 to i33 %4 = mul i33 %1, %3 %5 = lshr i33 %4, 1 %6 = trunc i33 %5 to i32 %7 = add i32 %6, %n %8 = add i32 %7, -1 br label %for.cond.cleanup for.cond.cleanup: ; preds = %for.cond.cleanup.loopexit, %entry %s.0.lcssa = phi i32 [ 0, %entry ], [ %8, %for.cond.cleanup.loopexit ] ret i32 %s.0.lcssa }
çªç¶ãã«ãŒãã¯ãããŸããããi33ïŒã€ãŸãã33ãããæŽæ°ïŒã®ãããªãã°ãããå€æ°ããããŸãã ã©ãããŠèµ·ãã£ãã®ïŒ LLVMã¯ãã·ãªãŒãºã®åèšãåŒã«å€æããŸããïŒïŒn-1ïŒ*ïŒn-2ïŒ/ 2 + n-1ãäžéå€æ°ãèšç®ãããšãã32ãããã°ãªããããªãŒããŒãããŒããå¯èœæ§ããããããLLVMã¯i33å€æ°ãæ¿å ¥ããŸããã ããªãè€éãªæé©åãããŠããªãã¢ã»ã³ãã©ã³ãŒããåæããããšã§ãããè¡ã£ãããšã«æ³šæããŠãã ããã ã¹ãã€ã©ãŒã®äžã«ã¯ãã«ãŒãã§çŽæ¥ã«ãŠã³ããããåãé¢æ°ã®æé©åãããŠããªãã³ãŒãããããŸãã
æé©åãããŠããªãã³ãŒã
define i32 @sum(i32 %n) #0 { entry: %n.addr = alloca i32, align 4 %s = alloca i32, align 4 %i = alloca i32, align 4 store i32 %n, i32* %n.addr, align 4 store i32 0, i32* %s, align 4 store i32 0, i32* %i, align 4 br label %for.cond for.cond: ; preds = %for.inc, %entry %0 = load i32, i32* %i, align 4 %1 = load i32, i32* %n.addr, align 4 %cmp = icmp slt i32 %0, %1 br i1 %cmp, label %for.body, label %for.end for.body: ; preds = %for.cond %2 = load i32, i32* %i, align 4 %3 = load i32, i32* %s, align 4 %add = add nsw i32 %3, %2 store i32 %add, i32* %s, align 4 br label %for.inc for.inc: ; preds = %for.body %4 = load i32, i32* %i, align 4 %inc = add nsw i32 %4, 1 store i32 %inc, i32* %i, align 4 br label %for.cond for.end: ; preds = %for.cond %5 = load i32, i32* %s, align 4 ret i32 %5 }
ããã¯ãšã³ãã§ãã®äŸã§äœãèµ·ããããèŠãã®ã¯ããã«èå³æ·±ãã§ãã i33å€æ°ã¯i64ã«å€æãããããã»ããµã32ãããã®å Žåã32ãããã·ã¹ãã ã§64ãããæ°ãä¹ç®ããââã³å ç®ããããã®ã³ãã³ãã·ãŒã±ã³ã¹ãçæãããŸãã ããã«èå³æ·±ãã®ã¯ãå ã®äŸã§ããŒã¿åãlongã«å€æŽããå Žåã§ãã 次ã«ãåŒæ°ãšæ»ãå€ã¯i64åã«ãªããäžéå€æ°ã¯i65åã«ãªããŸãïŒ
2çªç®ã®äŸ
floatã®ç¬Šå·ãå転ããfloatã®ãã€ããªè¡šçŸã®31ãããç®ãå€æŽããé¢æ°ãèšè¿°ããããšã«ãããšä»®å®ããŸãã
float sum(float x) { int val = *((int*) &x); int inv = val ^ (1 << 31); return *((float*)&inv); }
x86_64ã§ã³ã³ãã€ã«ããå Žåãç¹ã«èå³æ·±ãããšã¯äœãèµ·ãããŸããã
.LCPI0_0: .long 2147483648 # float -0 .long 2147483648 # float -0 .long 2147483648 # float -0 .long 2147483648 # float -0 .text .globl sum .p2align 4, 0x90 .type sum,@function sum: # @sum .cfi_startproc # BB#0: # %entry xorps .LCPI0_0(%rip), %xmm0 retq
ãã ããARM 64ïŒAARCH64ïŒçšã«ã³ã³ãã€ã«ããå ŽåïŒ
invert: // @invert // BB#0: // %entry fneg s0, s0 ret
LLVMã¯ãfnegã³ãã³ãã31ãããç®ã®å€æŽã§æµ®åå°æ°ç¹ã®ç¬Šå·ãå€æŽããããšãèªèããŸããã æ¯èŒã®ããã«ãGCCã¯ãverbatimããªãã·ã§ã³ãçºè¡ããæ¹æ³ãç¥ããŸããã
GCC 6.3ïŒARM 64ïŒïŒ
invert(float): fmov w0, s0 eor w0, w0, -2147483648 fmov s0, w0 ret
ããã¯ãã¿ãŒã²ããåºæã®æé©åã®äŸã§ãããoptãŠãŒãã£ãªãã£ã§ã¯ãªãããã¯ãšã³ãã§å®è¡ãããŸãã
ãã®äŸã«ã€ããŠãããã€ãã®èšèãèšãå¿ èŠããããŸãã ãã®ãããªãã€ã³ã¿ãŒã¢ã¯ã·ã§ã³ã¯ãå³å¯ãªãšã€ãªã¢ã¹ã«ãŒã«ã«éåããŸããããã¯ãäžéšã®ã³ã³ãã€ã©ãŒããã³äžéšã®ãã©ãããã©ãŒã ïŒå®éã«ã¯ãããå°æ°ã®å ŽåïŒã§-strict-aliasingãã©ã°ã䜿çšããŠã³ã³ãã€ã«ãããšãæªå®çŸ©ã®åäœãåŒãèµ·ããå¯èœæ§ããããŸãã ãã®äŸã§ã¯ãgcc4.4.7 -m32 -O2ã§ã³ã³ãã€ã«ãããšãšã©ãŒãçºçããææ°ããŒãžã§ã³ã®gccã§ã¯è¡šç€ºãããªããªããŸãã ããã«ãããããããç§ã¯ãªã³ã¯ã®ãªã¹ãã«ãšã€ãªã¢ã¹ã®èå³æ·±ãè¬çŸ©ãžã®ãªã³ã¯ãæ¿å ¥ããŸããã
3çªç®ã®äŸ
ã¿ãŒã²ããåºæã®æé©åã®ãã1ã€ã®äŸãä»åã¯x86-64ããŸãã¯Haswellã¢ãŒããã¯ãã£ã®å Žåã
åèªå ã®åäžããããã«ãŠã³ãããé¢æ°ãèšè¿°ããŸãã
int cntSetBits(int a) { int cnt = 0; while(a) { cnt++; a &= (a-1); } return cnt; }
-O1 -march = haswellã§ã³ã³ãã€ã«ããŸãã
cntSetBits(int): # @cntSetBits(int) popcnt eax, edi ret
é¢æ°å šäœã1ã€ã®popcntã¹ããŒãã¡ã³ãã«åãŸããŸãããã®ã¹ããŒãã¡ã³ãã¯ãåèªã®åäœæ°ãã«ãŠã³ãããŸãã
IRãèŠãŠã¿ãŸãããã
; Function Attrs: norecurse nounwind readnone uwtable define i32 @cntSetBits(i32 %a) local_unnamed_addr #0 { entry: %0 = call i32 @llvm.ctpop.i32(i32 %a), !range !2 ret i32 %0 }
ããã§ãçµã¿èŸŒã¿é¢æ°llvm.ctpop.i32ã䜿çšãããŠããããšãããããŸãã ããã¯ãã³ãŒãã«é¢ããé«ã¬ãã«ã®æ å ±ã䜿çšããŠãããã³ããšã³ãã«ãã£ãŠæ¢ã«æ¿å ¥ãããŠããããã®ã¢ãŒããã¯ãã£ã®ããã¯ãšã³ãã¯ãã®æ©èœãèªèããŠãããç¹å¥ãªã³ãã³ãã§çœ®ãæããããšãã§ããŸãã
䟿å©ãªãªã³ã¯
http://www.drdobbs.com/architecture-and-design/the-design-of-llvm/240001128-Chris LattnerããThe Design of LLVMã
https://youtu.be/bSkpMdDe4g4-Matt Godboltããæè¿ç§ã®ã³ã³ãã€ã©ã¯äœãããŠãããŸãããïŒã
https://youtu.be/ACW-7dktyDk Dmitry KashitsynãããŒããããªãŒãã¹ïŒLLVMã®ãšã€ãªã¢ã·ã³ã°ãšãã¯ãã«åã