Analysis of the performance tasks from JBreak (part 4)

Analysis of the fourth and final task:



// Variant 1
public double octaPow(double a) {
    return Math.pow(a, 8);
}

// Variant 2
public double octaPow(double a) {
    return a * a * a * a * a * a * a * a;
}

// Variant 3
public double octaPow(double a) {
    return Math.pow(Math.pow(Math.pow(a, 2), 2), 2);
}

// Variant 4
public double octaPow(double a) {
    a *= a;
    a *= a;
    return a * a;
}
      
      





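Task statement (simplified):

Determine which of the methods are faster and which are slower (JRE 1.8.0_161).
Below the cut: benchmarks, some assembly, and an analysis of the optimizations performed by the JVM.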



Other posts in the series: part 1, part 2, and part 3.



Commentary on the task



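As is well known, floating-point arithmetic has a bad reputation:

  1. It is complex and implementation-dependent.
  2. It is not associative.
  3. It produces counterintuitive results.
  4. In most cases, comparing results with == is meaningless.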


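With that in mind, it is important to understand that the proposed methods can produce different numerical results. That difference is a matter of arithmetic, not of performance.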



Some examples
  public static void main(String[] args) {
      double value = 1e15;
      double delta = 0.0001;
      System.out.println(value + delta == value); // true

      double a = 1.010101;
      double b = 101.0101;
      double c = 10101.01;
      System.out.println((a * b) * c != a * (b * c)); // true
  }
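Since == is rarely the right tool for comparing computed doubles, a tolerance-based check is the usual alternative. The sketch below is one possible way to do it, measuring the distance in units in the last place (ulps). The class DoubleCompare and the helper nearlyEqual are hypothetical names introduced here for illustration, and the tolerance of 4 ulps is an arbitrary assumption, not a recommendation.

```java
class DoubleCompare {
    // Hypothetical helper: true when a and b differ by at most maxUlps
    // units in the last place. NaN never compares equal to anything.
    static boolean nearlyEqual(double a, double b, int maxUlps) {
        if (Double.isNaN(a) || Double.isNaN(b)) {
            return false;
        }
        if (a == b) {
            return true; // identical values, equal infinities, +0.0 and -0.0
        }
        if (Double.isInfinite(a) || Double.isInfinite(b)) {
            return false; // an infinity is never "near" a different value
        }
        double tolerance = maxUlps * Math.max(Math.ulp(a), Math.ulp(b));
        return Math.abs(a - b) <= tolerance;
    }

    public static void main(String[] args) {
        double a = 1.010101;
        double b = 101.0101;
        double c = 10101.01;
        double left = (a * b) * c;
        double right = a * (b * c);
        System.out.println("exact equality:  " + (left == right));
        System.out.println("within 4 ulps:   " + nearlyEqual(left, right, 4));
    }
}
```

The right tolerance depends on how many rounding steps the computation involves, so the value passed as maxUlps has to be chosen per use case.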
      
      







Obviously wrong answers



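This task involved four variants of the algorithm, so there were more possible answers.

All the variants are the same, because Java has a cool JIT compiler!

The second or the fourth variant is the fastest, because it is just plain multiplication.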

Detailed analysis of the methods under study



  public double mathOctaPow(double a) {
      return Math.pow(a, 8);
  }

  public double plainOctaPow(double a) {
      return a * a * a * a * a * a * a * a;
  }

  public double trickyMathOctaPow(double a) {
      return Math.pow(Math.pow(Math.pow(a, 2), 2), 2);
  }

  public double trickyPlainOctaPow(double a) {
      a *= a;
      a *= a;
      return a * a;
  }
      
      





The disassembled code was printed using the following set of JVM flags:



 -XX:+UnlockDiagnosticVMOptions -XX:CompileCommand=print,<>.<> -XX:PrintAssemblyOptions=intel
      
      



plainOctaPow





Let's start with the simplest case, plainOctaPow. In fact, the code



 a * a * a * a * a * a * a * a
      
      





is equivalent to the code



 ((((((a * a) * a) * a) * a) * a) * a) * a
      
      





because the multiplication operator is left-associative.



In essence, the JIT compiler (C1) compiled this code into the following instruction sequence (on entry, the xmm0 register holds just the value of the double a parameter).



  0x0000000002c96a3e: vmovapd xmm1, xmm0
  0x0000000002c96a42: vmulsd xmm1, xmm1, xmm0
  0x0000000002c96a46: vmulsd xmm1, xmm1, xmm0
  0x0000000002c96a4a: vmulsd xmm1, xmm1, xmm0
  0x0000000002c96a4e: vmulsd xmm1, xmm1, xmm0
  0x0000000002c96a52: vmulsd xmm1, xmm1, xmm0
  0x0000000002c96a56: vmulsd xmm1, xmm1, xmm0
  0x0000000002c96a5a: vmulsd xmm1, xmm1, xmm0
  0x0000000002c96a5e: vmovapd xmm0, xmm1
      
      





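Brief instruction reference
vmovapd xmm1, xmm2

Moves aligned packed double-precision floating-point values from register xmm2 to register xmm1 (from here on, a double-precision floating-point value is simply called a double). An XMM register is 128 bits wide, so it holds up to two doubles at once. The instruction also supports the YMM and ZMM registers, which are 256 and 512 bits wide respectively.

vmulsd xmm1, xmm2, xmm3

Multiplies the low double in register xmm2 by the low double in register xmm3 and stores the result in the low half of register xmm1. Unlike vmovapd, this is a scalar instruction: it multiplies a single pair of doubles per execution. Its packed counterpart, vmulpd, multiplies two pairs of doubles at once in XMM registers, and four or eight pairs in YMM or ZMM registers respectively.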


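The instruction sequence matches the source code exactly: the intermediate result is multiplied by a, step by step. In this case the compiler cannot optimize the resulting code any further, because regrouping the multiplications would violate left-associativity, and floating-point multiplication is not associative, so a regrouped product could round differently.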



trickyPlainOctaPow





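Recall that obtaining a bit-for-bit equivalent result is not a requirement of the task. We can therefore try to optimize the code ourselves by reducing the number of operations, for example by replacing the chain of multiplications with three squaring operations.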



The code of the trickyPlainOctaPow() method is compiled into essentially the following instruction sequence:



  0x0000000002b501be: vmovapd xmm1, xmm0
  0x0000000002b501c2: vmulsd xmm1, xmm1, xmm0
  0x0000000002b501c6: vmovapd xmm0, xmm1
  0x0000000002b501ca: vmulsd xmm0, xmm0, xmm1
  0x0000000002b501ce: vmovapd xmm1, xmm0
  0x0000000002b501d2: vmulsd xmm1, xmm1, xmm0
  0x0000000002b501d6: vmovapd xmm0, xmm1
      
      





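As you can see, the total number of operations has gone down: instead of seven multiplications we now have three multiplications plus two vmovapd instructions that prepare the second operand of the next multiplication. Taking the nominal latency of each instruction into account, the resulting code is roughly twice as fast.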
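The price of the faster variant is that the two groupings round differently, which is exactly why the JIT compiler is not allowed to make this transformation on its own. The sketch below reuses the two method bodies from the article and compares their results over a range of inputs. The class OctaPowCompare, the helpers countMismatches and maxUlpDistance, and the sampled range are assumptions made for illustration only.

```java
class OctaPowCompare {
    // Seven left-to-right multiplications, as in plainOctaPow.
    static double plainOctaPow(double a) {
        return a * a * a * a * a * a * a * a;
    }

    // Three squarings, as in trickyPlainOctaPow.
    static double trickyPlainOctaPow(double a) {
        a *= a;
        a *= a;
        return a * a;
    }

    // Hypothetical helper: counts sampled inputs for which the two
    // variants return results that are not bit-for-bit identical.
    static int countMismatches(double from, double step, int steps) {
        int mismatches = 0;
        for (int i = 0; i < steps; i++) {
            double a = from + i * step;
            long plainBits = Double.doubleToLongBits(plainOctaPow(a));
            long trickyBits = Double.doubleToLongBits(trickyPlainOctaPow(a));
            if (plainBits != trickyBits) {
                mismatches++;
            }
        }
        return mismatches;
    }

    // Hypothetical helper: largest observed difference between the two
    // variants, expressed in ulps of the plain result.
    static double maxUlpDistance(double from, double step, int steps) {
        double worst = 0.0;
        for (int i = 0; i < steps; i++) {
            double a = from + i * step;
            double plain = plainOctaPow(a);
            double tricky = trickyPlainOctaPow(a);
            worst = Math.max(worst, Math.abs(plain - tricky) / Math.ulp(plain));
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println("inputs with different bits: " + countMismatches(1.0, 0.001, 1000));
        System.out.println("largest distance in ulps:   " + maxUlpDistance(1.0, 0.001, 1000));
    }
}
```

Any differences stay within a few ulps, so both variants are equally valid answers to the task; they are just not interchangeable from the compiler's point of view.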



mathOctaPow





Let's look inside the implementation of the Math.pow() method:



  public static double pow(double a, double b) { return StrictMath.pow(a, b); }
      
      





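The first thing to note is that the exponent, passed as the second argument, is of type double. For this reason the implementation of the function cannot be as simple as ordinary repeated multiplication.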
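For contrast, this is roughly what a power function could look like if the exponent were restricted to an int: a short loop of multiplications using exponentiation by squaring. This is only a sketch. The class IntPow and the method powInt are hypothetical and are not part of the JDK, and the sketch makes no attempt to match the rounding or special-case behavior of Math.pow(), which must also handle fractional exponents, NaN, infinities, and negative bases.

```java
class IntPow {
    // Hypothetical helper, not a JDK method: raises base to an integer
    // power using exponentiation by squaring.
    static double powInt(double base, int exponent) {
        // Widen to long so that Integer.MIN_VALUE can be negated safely.
        long n = Math.abs((long) exponent);
        double result = 1.0;
        double factor = base;
        while (n > 0) {
            if ((n & 1) == 1) {
                result *= factor; // this bit of the exponent is set
            }
            factor *= factor;     // base^1, base^2, base^4, base^8, ...
            n >>= 1;
        }
        return exponent < 0 ? 1.0 / result : result;
    }

    public static void main(String[] args) {
        System.out.println(powInt(2.0, 8));
        System.out.println(powInt(1.5, 8));
        System.out.println(powInt(2.0, -1));
    }
}
```

For an exponent of 8 the loop performs the same three squarings as trickyPlainOctaPow, plus one final multiplication by 1.0.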



At the same time, StrictMath.pow() is a native method:



  public static native double pow(double a, double b);
      
      





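In practical terms this means that calling Math.pow() would amount to calling a native method through JNI. On the other hand, the JDK makes extensive use of intrinsics (see the full list of HotSpot intrinsics). Among them is _dpow, an intrinsic that replaces the Math.pow() call.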



The latter means that after warm-up, once the code has been compiled by the JIT compiler, the mathOctaPow() method will contain the intrinsic's code for computing the power.



Assembly code of the mathOctaPow() method
  0x0000000002aaacd0: vmovsd xmm1,QWORD PTR [rip+0xffffffffffffff68] # 0x0000000002aaac40 ; {section_word} 0x0000000002aaacd8: vmovsd QWORD PTR [rsp],xmm1 0x0000000002aaacdd: fld QWORD PTR [rsp] 0x0000000002aaace0: vmovsd QWORD PTR [rsp],xmm0 0x0000000002aaace5: fld QWORD PTR [rsp] 0x0000000002aaace8: movabs rax,0x6c4ba7d0 ; {external_word} 0x0000000002aaacf2: fld QWORD PTR [rax] 0x0000000002aaacf4: fucomip st,st(2) 0x0000000002aaacf6: jp 0x0000000002aaad0f 0x0000000002aaacfc: jne 0x0000000002aaad0f 0x0000000002aaad02: fxch st(1) 0x0000000002aaad04: ffree st(0) 0x0000000002aaad06: fincstp 0x0000000002aaad08: fmul st,st(0) 0x0000000002aaad0a: jmp 0x0000000002aab166 0x0000000002aaad0f: fldz 0x0000000002aaad11: fucomip st,st(1) 0x0000000002aaad13: ja 0x0000000002aaad96 0x0000000002aaad19: fld st(1) 0x0000000002aaad1b: fld st(1) 0x0000000002aaad1d: sub rsp,0x8 0x0000000002aaad21: fstcw WORD PTR [rsp] 0x0000000002aaad25: mov eax,DWORD PTR [rsp] 0x0000000002aaad28: or eax,0x300 0x0000000002aaad2e: push rax 0x0000000002aaad2f: fldcw WORD PTR [rsp] 0x0000000002aaad32: pop rax 0x0000000002aaad33: fyl2x 0x0000000002aaad35: sub rsp,0x8 0x0000000002aaad39: fld st(0) 0x0000000002aaad3b: frndint 0x0000000002aaad3d: fsubr st(1),st 0x0000000002aaad3f: fistp DWORD PTR [rsp] 0x0000000002aaad42: f2xm1 0x0000000002aaad44: fld1 0x0000000002aaad46: faddp st(1),st 0x0000000002aaad48: mov eax,DWORD PTR [rsp] 0x0000000002aaad4b: mov ecx,0xfffff800 0x0000000002aaad50: add eax,0x3ff 0x0000000002aaad56: mov edx,eax 0x0000000002aaad58: shl eax,0x14 0x0000000002aaad5b: add edx,0x1 0x0000000002aaad5e: cmove eax,ecx 0x0000000002aaad61: cmp edx,0x1 0x0000000002aaad64: cmove eax,ecx 0x0000000002aaad67: test ecx,edx 0x0000000002aaad69: cmovne eax,ecx 0x0000000002aaad6c: mov DWORD PTR [rsp+0x4],eax 0x0000000002aaad70: mov DWORD PTR [rsp],0x0 0x0000000002aaad77: fmul QWORD PTR [rsp] 0x0000000002aaad7a: add rsp,0x8 0x0000000002aaad7e: fldcw WORD PTR [rsp] 0x0000000002aaad81: add rsp,0x8 
0x0000000002aaad85: fucomi st,st(0) 0x0000000002aaad87: jp 0x0000000002aaae36 0x0000000002aaad8d: ffree st(2) 0x0000000002aaad8f: ffree st(1) 0x0000000002aaad91: jmp 0x0000000002aab166 0x0000000002aaad96: fld st(1) 0x0000000002aaad98: frndint 0x0000000002aaad9a: fucomi st,st(2) 0x0000000002aaad9c: jne 0x0000000002aaae36 0x0000000002aaada2: sub rsp,0x8 0x0000000002aaada6: fistp QWORD PTR [rsp] 0x0000000002aaada9: fld st(1) 0x0000000002aaadab: fld st(1) 0x0000000002aaadad: fabs 0x0000000002aaadaf: sub rsp,0x8 0x0000000002aaadb3: fstcw WORD PTR [rsp] 0x0000000002aaadb7: mov eax,DWORD PTR [rsp] 0x0000000002aaadba: or eax,0x300 0x0000000002aaadc0: push rax 0x0000000002aaadc1: fldcw WORD PTR [rsp] 0x0000000002aaadc4: pop rax 0x0000000002aaadc5: fyl2x 0x0000000002aaadc7: sub rsp,0x8 0x0000000002aaadcb: fld st(0) 0x0000000002aaadcd: frndint 0x0000000002aaadcf: fsubr st(1),st 0x0000000002aaadd1: fistp DWORD PTR [rsp] 0x0000000002aaadd4: f2xm1 0x0000000002aaadd6: fld1 0x0000000002aaadd8: faddp st(1),st 0x0000000002aaadda: mov eax,DWORD PTR [rsp] 0x0000000002aaaddd: mov ecx,0xfffff800 0x0000000002aaade2: add eax,0x3ff 0x0000000002aaade8: mov edx,eax 0x0000000002aaadea: shl eax,0x14 0x0000000002aaaded: add edx,0x1 0x0000000002aaadf0: cmove eax,ecx 0x0000000002aaadf3: cmp edx,0x1 0x0000000002aaadf6: cmove eax,ecx 0x0000000002aaadf9: test ecx,edx 0x0000000002aaadfb: cmovne eax,ecx 0x0000000002aaadfe: mov DWORD PTR [rsp+0x4],eax 0x0000000002aaae02: mov DWORD PTR [rsp],0x0 0x0000000002aaae09: fmul QWORD PTR [rsp] 0x0000000002aaae0c: add rsp,0x8 0x0000000002aaae10: fldcw WORD PTR [rsp] 0x0000000002aaae13: add rsp,0x8 0x0000000002aaae17: fucomi st,st(0) 0x0000000002aaae19: pop rax 0x0000000002aaae1a: jp 0x0000000002aaae36 0x0000000002aaae20: ffree st(2) 0x0000000002aaae22: ffree st(1) 0x0000000002aaae24: test eax,0x1 0x0000000002aaae29: je 0x0000000002aab166 0x0000000002aaae2f: fchs 0x0000000002aaae31: jmp 0x0000000002aab166 0x0000000002aaae36: ffree st(0) 0x0000000002aaae38: 
fincstp 0x0000000002aaae3a: mov QWORD PTR [rsp-0x28],rsp 0x0000000002aaae3f: sub rsp,0x80 0x0000000002aaae46: mov QWORD PTR [rsp+0x78],rax 0x0000000002aaae4b: mov QWORD PTR [rsp+0x70],rcx 0x0000000002aaae50: mov QWORD PTR [rsp+0x68],rdx 0x0000000002aaae55: mov QWORD PTR [rsp+0x60],rbx 0x0000000002aaae5a: mov QWORD PTR [rsp+0x50],rbp 0x0000000002aaae5f: mov QWORD PTR [rsp+0x48],rsi 0x0000000002aaae64: mov QWORD PTR [rsp+0x40],rdi 0x0000000002aaae69: mov QWORD PTR [rsp+0x38],r8 0x0000000002aaae6e: mov QWORD PTR [rsp+0x30],r9 0x0000000002aaae73: mov QWORD PTR [rsp+0x28],r10 0x0000000002aaae78: mov QWORD PTR [rsp+0x20],r11 0x0000000002aaae7d: mov QWORD PTR [rsp+0x18],r12 0x0000000002aaae82: mov QWORD PTR [rsp+0x10],r13 0x0000000002aaae87: mov QWORD PTR [rsp+0x8],r14 0x0000000002aaae8c: mov QWORD PTR [rsp],r15 0x0000000002aaae90: sub rsp,0x100 0x0000000002aaae97: vextractf128 XMMWORD PTR [rsp],ymm0,0x1 0x0000000002aaae9e: vextractf128 XMMWORD PTR [rsp+0x10],ymm1,0x1 0x0000000002aaaea6: vextractf128 XMMWORD PTR [rsp+0x20],ymm2,0x1 0x0000000002aaaeae: vextractf128 XMMWORD PTR [rsp+0x30],ymm3,0x1 0x0000000002aaaeb6: vextractf128 XMMWORD PTR [rsp+0x40],ymm4,0x1 0x0000000002aaaebe: vextractf128 XMMWORD PTR [rsp+0x50],ymm5,0x1 0x0000000002aaaec6: vextractf128 XMMWORD PTR [rsp+0x60],ymm6,0x1 0x0000000002aaaece: vextractf128 XMMWORD PTR [rsp+0x70],ymm7,0x1 0x0000000002aaaed6: vextractf128 XMMWORD PTR [rsp+0x80],ymm8,0x1 0x0000000002aaaee1: vextractf128 XMMWORD PTR [rsp+0x90],ymm9,0x1 0x0000000002aaaeec: vextractf128 XMMWORD PTR [rsp+0xa0],ymm10,0x1 0x0000000002aaaef7: vextractf128 XMMWORD PTR [rsp+0xb0],ymm11,0x1 0x0000000002aaaf02: vextractf128 XMMWORD PTR [rsp+0xc0],ymm12,0x1 0x0000000002aaaf0d: vextractf128 XMMWORD PTR [rsp+0xd0],ymm13,0x1 0x0000000002aaaf18: vextractf128 XMMWORD PTR [rsp+0xe0],ymm14,0x1 0x0000000002aaaf23: vextractf128 XMMWORD PTR [rsp+0xf0],ymm15,0x1 0x0000000002aaaf2e: sub rsp,0x100 0x0000000002aaaf35: vmovdqu XMMWORD PTR [rsp],xmm0 0x0000000002aaaf3a: 
vmovdqu XMMWORD PTR [rsp+0x10],xmm1 0x0000000002aaaf40: vmovdqu XMMWORD PTR [rsp+0x20],xmm2 0x0000000002aaaf46: vmovdqu XMMWORD PTR [rsp+0x30],xmm3 0x0000000002aaaf4c: vmovdqu XMMWORD PTR [rsp+0x40],xmm4 0x0000000002aaaf52: vmovdqu XMMWORD PTR [rsp+0x50],xmm5 0x0000000002aaaf58: vmovdqu XMMWORD PTR [rsp+0x60],xmm6 0x0000000002aaaf5e: vmovdqu XMMWORD PTR [rsp+0x70],xmm7 0x0000000002aaaf64: vmovdqu XMMWORD PTR [rsp+0x80],xmm8 0x0000000002aaaf6d: vmovdqu XMMWORD PTR [rsp+0x90],xmm9 0x0000000002aaaf76: vmovdqu XMMWORD PTR [rsp+0xa0],xmm10 0x0000000002aaaf7f: vmovdqu XMMWORD PTR [rsp+0xb0],xmm11 0x0000000002aaaf88: vmovdqu XMMWORD PTR [rsp+0xc0],xmm12 0x0000000002aaaf91: vmovdqu XMMWORD PTR [rsp+0xd0],xmm13 0x0000000002aaaf9a: vmovdqu XMMWORD PTR [rsp+0xe0],xmm14 0x0000000002aaafa3: vmovdqu XMMWORD PTR [rsp+0xf0],xmm15 0x0000000002aaafac: sub rsp,0x10 0x0000000002aaafb0: fstp QWORD PTR [rsp] 0x0000000002aaafb3: fstp QWORD PTR [rsp+0x8] 0x0000000002aaafb7: vmovsd xmm0,QWORD PTR [rsp] 0x0000000002aaafbc: vmovsd xmm1,QWORD PTR [rsp+0x8] 0x0000000002aaafc2: sub rsp,0x20 0x0000000002aaafc6: test esp,0xf 0x0000000002aaafcc: je 0x0000000002aaafe4 0x0000000002aaafd2: sub rsp,0x8 0x0000000002aaafd6: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002aaafdb: add rsp,0x8 0x0000000002aaafdf: jmp 0x0000000002aaafe9 0x0000000002aaafe4: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002aaafe9: add rsp,0x20 0x0000000002aaafed: vmovsd QWORD PTR [rsp],xmm0 0x0000000002aaaff2: fld QWORD PTR [rsp] 0x0000000002aaaff5: add rsp,0x10 0x0000000002aaaff9: vmovdqu xmm0,XMMWORD PTR [rsp] 0x0000000002aaaffe: vmovdqu xmm1,XMMWORD PTR [rsp+0x10] 0x0000000002aab004: vmovdqu xmm2,XMMWORD PTR [rsp+0x20] 0x0000000002aab00a: vmovdqu xmm3,XMMWORD PTR [rsp+0x30] 0x0000000002aab010: vmovdqu xmm4,XMMWORD PTR [rsp+0x40] 0x0000000002aab016: vmovdqu xmm5,XMMWORD PTR [rsp+0x50] 0x0000000002aab01c: vmovdqu xmm6,XMMWORD PTR [rsp+0x60] 0x0000000002aab022: vmovdqu xmm7,XMMWORD PTR [rsp+0x70] 0x0000000002aab028: 
vmovdqu xmm8,XMMWORD PTR [rsp+0x80] 0x0000000002aab031: vmovdqu xmm9,XMMWORD PTR [rsp+0x90] 0x0000000002aab03a: vmovdqu xmm10,XMMWORD PTR [rsp+0xa0] 0x0000000002aab043: vmovdqu xmm11,XMMWORD PTR [rsp+0xb0] 0x0000000002aab04c: vmovdqu xmm12,XMMWORD PTR [rsp+0xc0] 0x0000000002aab055: vmovdqu xmm13,XMMWORD PTR [rsp+0xd0] 0x0000000002aab05e: vmovdqu xmm14,XMMWORD PTR [rsp+0xe0] 0x0000000002aab067: vmovdqu xmm15,XMMWORD PTR [rsp+0xf0] 0x0000000002aab070: add rsp,0x100 0x0000000002aab077: vinsertf128 ymm0,ymm0,XMMWORD PTR [rsp],0x1 0x0000000002aab07e: vinsertf128 ymm1,ymm1,XMMWORD PTR [rsp+0x10],0x1 0x0000000002aab086: vinsertf128 ymm2,ymm2,XMMWORD PTR [rsp+0x20],0x1 0x0000000002aab08e: vinsertf128 ymm3,ymm3,XMMWORD PTR [rsp+0x30],0x1 0x0000000002aab096: vinsertf128 ymm4,ymm4,XMMWORD PTR [rsp+0x40],0x1 0x0000000002aab09e: vinsertf128 ymm5,ymm5,XMMWORD PTR [rsp+0x50],0x1 0x0000000002aab0a6: vinsertf128 ymm6,ymm6,XMMWORD PTR [rsp+0x60],0x1 0x0000000002aab0ae: vinsertf128 ymm7,ymm7,XMMWORD PTR [rsp+0x70],0x1 0x0000000002aab0b6: vinsertf128 ymm8,ymm8,XMMWORD PTR [rsp+0x80],0x1 0x0000000002aab0c1: vinsertf128 ymm9,ymm9,XMMWORD PTR [rsp+0x90],0x1 0x0000000002aab0cc: vinsertf128 ymm10,ymm10,XMMWORD PTR [rsp+0xa0],0x1 0x0000000002aab0d7: vinsertf128 ymm11,ymm11,XMMWORD PTR [rsp+0xb0],0x1 0x0000000002aab0e2: vinsertf128 ymm12,ymm12,XMMWORD PTR [rsp+0xc0],0x1 0x0000000002aab0ed: vinsertf128 ymm13,ymm13,XMMWORD PTR [rsp+0xd0],0x1 0x0000000002aab0f8: vinsertf128 ymm14,ymm14,XMMWORD PTR [rsp+0xe0],0x1 0x0000000002aab103: vinsertf128 ymm15,ymm15,XMMWORD PTR [rsp+0xf0],0x1 0x0000000002aab10e: add rsp,0x100 0x0000000002aab115: mov r15,QWORD PTR [rsp] 0x0000000002aab119: mov r14,QWORD PTR [rsp+0x8] 0x0000000002aab11e: mov r13,QWORD PTR [rsp+0x10] 0x0000000002aab123: mov r12,QWORD PTR [rsp+0x18] 0x0000000002aab128: mov r11,QWORD PTR [rsp+0x20] 0x0000000002aab12d: mov r10,QWORD PTR [rsp+0x28] 0x0000000002aab132: mov r9,QWORD PTR [rsp+0x30] 0x0000000002aab137: mov r8,QWORD PTR [rsp+0x38] 
0x0000000002aab13c: mov rdi,QWORD PTR [rsp+0x40] 0x0000000002aab141: mov rsi,QWORD PTR [rsp+0x48] 0x0000000002aab146: mov rbp,QWORD PTR [rsp+0x50] 0x0000000002aab14b: mov rbx,QWORD PTR [rsp+0x60] 0x0000000002aab150: mov rdx,QWORD PTR [rsp+0x68] 0x0000000002aab155: mov rcx,QWORD PTR [rsp+0x70] 0x0000000002aab15a: mov rax,QWORD PTR [rsp+0x78] 0x0000000002aab15f: add rsp,0x80 0x0000000002aab166: fstp QWORD PTR [rsp] 0x0000000002aab169: vmovsd xmm0,QWORD PTR [rsp] ;*invokestatic pow ; - ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark::mathOctaPow@4 (line 55)
      
      





Here the first instruction writes the constant 8.0 into register xmm1, while the value of a is already in register xmm0. What follows is the body of the intrinsic.
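Two x87 instructions in the listing hint at the underlying math: fyl2x computes y * log2(x) and f2xm1 computes 2^x - 1, which together give the identity x^y = 2^(y * log2(x)). In addition, the listing appears to begin by comparing the exponent with a constant and, on equality, simply squaring the base (fmul st,st(0)), which suggests a fast path for an exponent of 2.0. The sketch below mirrors that structure in plain Java. It is an illustration of the identity only, not the intrinsic's actual implementation: the class PowViaLogs and its methods are hypothetical, it supports positive bases only, and it is less accurate than Math.pow().

```java
class PowViaLogs {
    private static final double LN2 = Math.log(2.0);

    // log2 and exp2 are built from natural log/exp here, because
    // java.lang.Math has no base-2 versions of these functions.
    static double log2(double x) {
        return Math.log(x) / LN2;
    }

    static double exp2(double x) {
        return Math.exp(x * LN2);
    }

    // Hypothetical sketch of x^y = 2^(y * log2(x)) for x > 0.
    static double powViaLogs(double x, double y) {
        if (y == 2.0) {
            return x * x; // assumed fast path, as the listing seems to suggest
        }
        if (!(x > 0.0)) {
            throw new IllegalArgumentException("this sketch supports positive bases only");
        }
        return exp2(y * log2(x));
    }

    public static void main(String[] args) {
        System.out.println("sketch:   " + powViaLogs(1.5, 8.0));
        System.out.println("Math.pow: " + Math.pow(1.5, 8.0));
    }
}
```

Even in this simplified form, a single call involves a logarithm and an exponential, which is far more work than three multiplications.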



trickyMathOctaPow





Surprisingly, instead of one expensive Math.pow() call we got three. The JIT compiler replaced the body of the trickyMathOctaPow() method with three consecutive copies of the _dpow implementation.



Three sequentially inlined copies of _dpow
  0x0000000002a70b14: vmovsd xmm1,QWORD PTR [rip+0xffffffffffffff44] # 0x0000000002a70a60 ; {section_word} 0x0000000002a70b1c: vmovsd QWORD PTR [rsp],xmm1 0x0000000002a70b21: fld QWORD PTR [rsp] 0x0000000002a70b24: vmovsd QWORD PTR [rsp],xmm0 0x0000000002a70b29: fld QWORD PTR [rsp] 0x0000000002a70b2c: movabs rax,0x6c4ba7d0 ; {external_word} 0x0000000002a70b36: fld QWORD PTR [rax] 0x0000000002a70b38: fucomip st,st(2) 0x0000000002a70b3a: jp 0x0000000002a70b53 0x0000000002a70b40: jne 0x0000000002a70b53 0x0000000002a70b46: fxch st(1) 0x0000000002a70b48: ffree st(0) 0x0000000002a70b4a: fincstp 0x0000000002a70b4c: fmul st,st(0) 0x0000000002a70b4e: jmp 0x0000000002a70faa 0x0000000002a70b53: fldz 0x0000000002a70b55: fucomip st,st(1) 0x0000000002a70b57: ja 0x0000000002a70bda 0x0000000002a70b5d: fld st(1) 0x0000000002a70b5f: fld st(1) 0x0000000002a70b61: sub rsp,0x8 0x0000000002a70b65: fstcw WORD PTR [rsp] 0x0000000002a70b69: mov eax,DWORD PTR [rsp] 0x0000000002a70b6c: or eax,0x300 0x0000000002a70b72: push rax 0x0000000002a70b73: fldcw WORD PTR [rsp] 0x0000000002a70b76: pop rax 0x0000000002a70b77: fyl2x 0x0000000002a70b79: sub rsp,0x8 0x0000000002a70b7d: fld st(0) 0x0000000002a70b7f: frndint 0x0000000002a70b81: fsubr st(1),st 0x0000000002a70b83: fistp DWORD PTR [rsp] 0x0000000002a70b86: f2xm1 0x0000000002a70b88: fld1 0x0000000002a70b8a: faddp st(1),st 0x0000000002a70b8c: mov eax,DWORD PTR [rsp] 0x0000000002a70b8f: mov ecx,0xfffff800 0x0000000002a70b94: add eax,0x3ff 0x0000000002a70b9a: mov edx,eax 0x0000000002a70b9c: shl eax,0x14 0x0000000002a70b9f: add edx,0x1 0x0000000002a70ba2: cmove eax,ecx 0x0000000002a70ba5: cmp edx,0x1 0x0000000002a70ba8: cmove eax,ecx 0x0000000002a70bab: test ecx,edx 0x0000000002a70bad: cmovne eax,ecx 0x0000000002a70bb0: mov DWORD PTR [rsp+0x4],eax 0x0000000002a70bb4: mov DWORD PTR [rsp],0x0 0x0000000002a70bbb: fmul QWORD PTR [rsp] 0x0000000002a70bbe: add rsp,0x8 0x0000000002a70bc2: fldcw WORD PTR [rsp] 0x0000000002a70bc5: add rsp,0x8 
0x0000000002a70bc9: fucomi st,st(0) 0x0000000002a70bcb: jp 0x0000000002a70c7a 0x0000000002a70bd1: ffree st(2) 0x0000000002a70bd3: ffree st(1) 0x0000000002a70bd5: jmp 0x0000000002a70faa 0x0000000002a70bda: fld st(1) 0x0000000002a70bdc: frndint 0x0000000002a70bde: fucomi st,st(2) 0x0000000002a70be0: jne 0x0000000002a70c7a 0x0000000002a70be6: sub rsp,0x8 0x0000000002a70bea: fistp QWORD PTR [rsp] 0x0000000002a70bed: fld st(1) 0x0000000002a70bef: fld st(1) 0x0000000002a70bf1: fabs 0x0000000002a70bf3: sub rsp,0x8 0x0000000002a70bf7: fstcw WORD PTR [rsp] 0x0000000002a70bfb: mov eax,DWORD PTR [rsp] 0x0000000002a70bfe: or eax,0x300 0x0000000002a70c04: push rax 0x0000000002a70c05: fldcw WORD PTR [rsp] 0x0000000002a70c08: pop rax 0x0000000002a70c09: fyl2x 0x0000000002a70c0b: sub rsp,0x8 0x0000000002a70c0f: fld st(0) 0x0000000002a70c11: frndint 0x0000000002a70c13: fsubr st(1),st 0x0000000002a70c15: fistp DWORD PTR [rsp] 0x0000000002a70c18: f2xm1 0x0000000002a70c1a: fld1 0x0000000002a70c1c: faddp st(1),st 0x0000000002a70c1e: mov eax,DWORD PTR [rsp] 0x0000000002a70c21: mov ecx,0xfffff800 0x0000000002a70c26: add eax,0x3ff 0x0000000002a70c2c: mov edx,eax 0x0000000002a70c2e: shl eax,0x14 0x0000000002a70c31: add edx,0x1 0x0000000002a70c34: cmove eax,ecx 0x0000000002a70c37: cmp edx,0x1 0x0000000002a70c3a: cmove eax,ecx 0x0000000002a70c3d: test ecx,edx 0x0000000002a70c3f: cmovne eax,ecx 0x0000000002a70c42: mov DWORD PTR [rsp+0x4],eax 0x0000000002a70c46: mov DWORD PTR [rsp],0x0 0x0000000002a70c4d: fmul QWORD PTR [rsp] 0x0000000002a70c50: add rsp,0x8 0x0000000002a70c54: fldcw WORD PTR [rsp] 0x0000000002a70c57: add rsp,0x8 0x0000000002a70c5b: fucomi st,st(0) 0x0000000002a70c5d: pop rax 0x0000000002a70c5e: jp 0x0000000002a70c7a 0x0000000002a70c64: ffree st(2) 0x0000000002a70c66: ffree st(1) 0x0000000002a70c68: test eax,0x1 0x0000000002a70c6d: je 0x0000000002a70faa 0x0000000002a70c73: fchs 0x0000000002a70c75: jmp 0x0000000002a70faa 0x0000000002a70c7a: ffree st(0) 0x0000000002a70c7c: 
fincstp 0x0000000002a70c7e: mov QWORD PTR [rsp-0x28],rsp 0x0000000002a70c83: sub rsp,0x80 0x0000000002a70c8a: mov QWORD PTR [rsp+0x78],rax 0x0000000002a70c8f: mov QWORD PTR [rsp+0x70],rcx 0x0000000002a70c94: mov QWORD PTR [rsp+0x68],rdx 0x0000000002a70c99: mov QWORD PTR [rsp+0x60],rbx 0x0000000002a70c9e: mov QWORD PTR [rsp+0x50],rbp 0x0000000002a70ca3: mov QWORD PTR [rsp+0x48],rsi 0x0000000002a70ca8: mov QWORD PTR [rsp+0x40],rdi 0x0000000002a70cad: mov QWORD PTR [rsp+0x38],r8 0x0000000002a70cb2: mov QWORD PTR [rsp+0x30],r9 0x0000000002a70cb7: mov QWORD PTR [rsp+0x28],r10 0x0000000002a70cbc: mov QWORD PTR [rsp+0x20],r11 0x0000000002a70cc1: mov QWORD PTR [rsp+0x18],r12 0x0000000002a70cc6: mov QWORD PTR [rsp+0x10],r13 0x0000000002a70ccb: mov QWORD PTR [rsp+0x8],r14 0x0000000002a70cd0: mov QWORD PTR [rsp],r15 0x0000000002a70cd4: sub rsp,0x100 0x0000000002a70cdb: vextractf128 XMMWORD PTR [rsp],ymm0,0x1 0x0000000002a70ce2: vextractf128 XMMWORD PTR [rsp+0x10],ymm1,0x1 0x0000000002a70cea: vextractf128 XMMWORD PTR [rsp+0x20],ymm2,0x1 0x0000000002a70cf2: vextractf128 XMMWORD PTR [rsp+0x30],ymm3,0x1 0x0000000002a70cfa: vextractf128 XMMWORD PTR [rsp+0x40],ymm4,0x1 0x0000000002a70d02: vextractf128 XMMWORD PTR [rsp+0x50],ymm5,0x1 0x0000000002a70d0a: vextractf128 XMMWORD PTR [rsp+0x60],ymm6,0x1 0x0000000002a70d12: vextractf128 XMMWORD PTR [rsp+0x70],ymm7,0x1 0x0000000002a70d1a: vextractf128 XMMWORD PTR [rsp+0x80],ymm8,0x1 0x0000000002a70d25: vextractf128 XMMWORD PTR [rsp+0x90],ymm9,0x1 0x0000000002a70d30: vextractf128 XMMWORD PTR [rsp+0xa0],ymm10,0x1 0x0000000002a70d3b: vextractf128 XMMWORD PTR [rsp+0xb0],ymm11,0x1 0x0000000002a70d46: vextractf128 XMMWORD PTR [rsp+0xc0],ymm12,0x1 0x0000000002a70d51: vextractf128 XMMWORD PTR [rsp+0xd0],ymm13,0x1 0x0000000002a70d5c: vextractf128 XMMWORD PTR [rsp+0xe0],ymm14,0x1 0x0000000002a70d67: vextractf128 XMMWORD PTR [rsp+0xf0],ymm15,0x1 0x0000000002a70d72: sub rsp,0x100 0x0000000002a70d79: vmovdqu XMMWORD PTR [rsp],xmm0 0x0000000002a70d7e: 
vmovdqu XMMWORD PTR [rsp+0x10],xmm1 0x0000000002a70d84: vmovdqu XMMWORD PTR [rsp+0x20],xmm2 0x0000000002a70d8a: vmovdqu XMMWORD PTR [rsp+0x30],xmm3 0x0000000002a70d90: vmovdqu XMMWORD PTR [rsp+0x40],xmm4 0x0000000002a70d96: vmovdqu XMMWORD PTR [rsp+0x50],xmm5 0x0000000002a70d9c: vmovdqu XMMWORD PTR [rsp+0x60],xmm6 0x0000000002a70da2: vmovdqu XMMWORD PTR [rsp+0x70],xmm7 0x0000000002a70da8: vmovdqu XMMWORD PTR [rsp+0x80],xmm8 0x0000000002a70db1: vmovdqu XMMWORD PTR [rsp+0x90],xmm9 0x0000000002a70dba: vmovdqu XMMWORD PTR [rsp+0xa0],xmm10 0x0000000002a70dc3: vmovdqu XMMWORD PTR [rsp+0xb0],xmm11 0x0000000002a70dcc: vmovdqu XMMWORD PTR [rsp+0xc0],xmm12 0x0000000002a70dd5: vmovdqu XMMWORD PTR [rsp+0xd0],xmm13 0x0000000002a70dde: vmovdqu XMMWORD PTR [rsp+0xe0],xmm14 0x0000000002a70de7: vmovdqu XMMWORD PTR [rsp+0xf0],xmm15 0x0000000002a70df0: sub rsp,0x10 0x0000000002a70df4: fstp QWORD PTR [rsp] 0x0000000002a70df7: fstp QWORD PTR [rsp+0x8] 0x0000000002a70dfb: vmovsd xmm0,QWORD PTR [rsp] 0x0000000002a70e00: vmovsd xmm1,QWORD PTR [rsp+0x8] 0x0000000002a70e06: sub rsp,0x20 0x0000000002a70e0a: test esp,0xf 0x0000000002a70e10: je 0x0000000002a70e28 0x0000000002a70e16: sub rsp,0x8 0x0000000002a70e1a: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002a70e1f: add rsp,0x8 0x0000000002a70e23: jmp 0x0000000002a70e2d 0x0000000002a70e28: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002a70e2d: add rsp,0x20 0x0000000002a70e31: vmovsd QWORD PTR [rsp],xmm0 0x0000000002a70e36: fld QWORD PTR [rsp] 0x0000000002a70e39: add rsp,0x10 0x0000000002a70e3d: vmovdqu xmm0,XMMWORD PTR [rsp] 0x0000000002a70e42: vmovdqu xmm1,XMMWORD PTR [rsp+0x10] 0x0000000002a70e48: vmovdqu xmm2,XMMWORD PTR [rsp+0x20] 0x0000000002a70e4e: vmovdqu xmm3,XMMWORD PTR [rsp+0x30] 0x0000000002a70e54: vmovdqu xmm4,XMMWORD PTR [rsp+0x40] 0x0000000002a70e5a: vmovdqu xmm5,XMMWORD PTR [rsp+0x50] 0x0000000002a70e60: vmovdqu xmm6,XMMWORD PTR [rsp+0x60] 0x0000000002a70e66: vmovdqu xmm7,XMMWORD PTR [rsp+0x70] 0x0000000002a70e6c: 
vmovdqu xmm8,XMMWORD PTR [rsp+0x80] 0x0000000002a70e75: vmovdqu xmm9,XMMWORD PTR [rsp+0x90] 0x0000000002a70e7e: vmovdqu xmm10,XMMWORD PTR [rsp+0xa0] 0x0000000002a70e87: vmovdqu xmm11,XMMWORD PTR [rsp+0xb0] 0x0000000002a70e90: vmovdqu xmm12,XMMWORD PTR [rsp+0xc0] 0x0000000002a70e99: vmovdqu xmm13,XMMWORD PTR [rsp+0xd0] 0x0000000002a70ea2: vmovdqu xmm14,XMMWORD PTR [rsp+0xe0] 0x0000000002a70eab: vmovdqu xmm15,XMMWORD PTR [rsp+0xf0] 0x0000000002a70eb4: add rsp,0x100 0x0000000002a70ebb: vinsertf128 ymm0,ymm0,XMMWORD PTR [rsp],0x1 0x0000000002a70ec2: vinsertf128 ymm1,ymm1,XMMWORD PTR [rsp+0x10],0x1 0x0000000002a70eca: vinsertf128 ymm2,ymm2,XMMWORD PTR [rsp+0x20],0x1 0x0000000002a70ed2: vinsertf128 ymm3,ymm3,XMMWORD PTR [rsp+0x30],0x1 0x0000000002a70eda: vinsertf128 ymm4,ymm4,XMMWORD PTR [rsp+0x40],0x1 0x0000000002a70ee2: vinsertf128 ymm5,ymm5,XMMWORD PTR [rsp+0x50],0x1 0x0000000002a70eea: vinsertf128 ymm6,ymm6,XMMWORD PTR [rsp+0x60],0x1 0x0000000002a70ef2: vinsertf128 ymm7,ymm7,XMMWORD PTR [rsp+0x70],0x1 0x0000000002a70efa: vinsertf128 ymm8,ymm8,XMMWORD PTR [rsp+0x80],0x1 0x0000000002a70f05: vinsertf128 ymm9,ymm9,XMMWORD PTR [rsp+0x90],0x1 0x0000000002a70f10: vinsertf128 ymm10,ymm10,XMMWORD PTR [rsp+0xa0],0x1 0x0000000002a70f1b: vinsertf128 ymm11,ymm11,XMMWORD PTR [rsp+0xb0],0x1 0x0000000002a70f26: vinsertf128 ymm12,ymm12,XMMWORD PTR [rsp+0xc0],0x1 0x0000000002a70f31: vinsertf128 ymm13,ymm13,XMMWORD PTR [rsp+0xd0],0x1 0x0000000002a70f3c: vinsertf128 ymm14,ymm14,XMMWORD PTR [rsp+0xe0],0x1 0x0000000002a70f47: vinsertf128 ymm15,ymm15,XMMWORD PTR [rsp+0xf0],0x1 0x0000000002a70f52: add rsp,0x100 0x0000000002a70f59: mov r15,QWORD PTR [rsp] 0x0000000002a70f5d: mov r14,QWORD PTR [rsp+0x8] 0x0000000002a70f62: mov r13,QWORD PTR [rsp+0x10] 0x0000000002a70f67: mov r12,QWORD PTR [rsp+0x18] 0x0000000002a70f6c: mov r11,QWORD PTR [rsp+0x20] 0x0000000002a70f71: mov r10,QWORD PTR [rsp+0x28] 0x0000000002a70f76: mov r9,QWORD PTR [rsp+0x30] 0x0000000002a70f7b: mov r8,QWORD PTR [rsp+0x38] 
0x0000000002a70f80: mov rdi,QWORD PTR [rsp+0x40] 0x0000000002a70f85: mov rsi,QWORD PTR [rsp+0x48] 0x0000000002a70f8a: mov rbp,QWORD PTR [rsp+0x50] 0x0000000002a70f8f: mov rbx,QWORD PTR [rsp+0x60] 0x0000000002a70f94: mov rdx,QWORD PTR [rsp+0x68] 0x0000000002a70f99: mov rcx,QWORD PTR [rsp+0x70] 0x0000000002a70f9e: mov rax,QWORD PTR [rsp+0x78] 0x0000000002a70fa3: add rsp,0x80 0x0000000002a70faa: fstp QWORD PTR [rsp] 0x0000000002a70fad: vmovsd xmm0,QWORD PTR [rsp] ;*invokestatic pow ; - ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark::trickyMathOctaPow@4 (line 63) 0x0000000002a70fb2: vmovsd xmm1,QWORD PTR [rip+0xfffffffffffffaae] # 0x0000000002a70a68 ; {section_word} 0x0000000002a70fba: vmovsd QWORD PTR [rsp],xmm1 0x0000000002a70fbf: fld QWORD PTR [rsp] 0x0000000002a70fc2: vmovsd QWORD PTR [rsp],xmm0 0x0000000002a70fc7: fld QWORD PTR [rsp] 0x0000000002a70fca: movabs rax,0x6c4ba7d0 ; {external_word} 0x0000000002a70fd4: fld QWORD PTR [rax] 0x0000000002a70fd6: fucomip st,st(2) 0x0000000002a70fd8: jp 0x0000000002a70ff1 0x0000000002a70fde: jne 0x0000000002a70ff1 0x0000000002a70fe4: fxch st(1) 0x0000000002a70fe6: ffree st(0) 0x0000000002a70fe8: fincstp 0x0000000002a70fea: fmul st,st(0) 0x0000000002a70fec: jmp 0x0000000002a71448 0x0000000002a70ff1: fldz 0x0000000002a70ff3: fucomip st,st(1) 0x0000000002a70ff5: ja 0x0000000002a71078 0x0000000002a70ffb: fld st(1) 0x0000000002a70ffd: fld st(1) 0x0000000002a70fff: sub rsp,0x8 0x0000000002a71003: fstcw WORD PTR [rsp] 0x0000000002a71007: mov eax,DWORD PTR [rsp] 0x0000000002a7100a: or eax,0x300 0x0000000002a71010: push rax 0x0000000002a71011: fldcw WORD PTR [rsp] 0x0000000002a71014: pop rax 0x0000000002a71015: fyl2x 0x0000000002a71017: sub rsp,0x8 0x0000000002a7101b: fld st(0) 0x0000000002a7101d: frndint 0x0000000002a7101f: fsubr st(1),st 0x0000000002a71021: fistp DWORD PTR [rsp] 0x0000000002a71024: f2xm1 0x0000000002a71026: fld1 0x0000000002a71028: faddp st(1),st 0x0000000002a7102a: mov eax,DWORD PTR [rsp] 0x0000000002a7102d: 
mov ecx,0xfffff800 0x0000000002a71032: add eax,0x3ff 0x0000000002a71038: mov edx,eax 0x0000000002a7103a: shl eax,0x14 0x0000000002a7103d: add edx,0x1 0x0000000002a71040: cmove eax,ecx 0x0000000002a71043: cmp edx,0x1 0x0000000002a71046: cmove eax,ecx 0x0000000002a71049: test ecx,edx 0x0000000002a7104b: cmovne eax,ecx 0x0000000002a7104e: mov DWORD PTR [rsp+0x4],eax 0x0000000002a71052: mov DWORD PTR [rsp],0x0 0x0000000002a71059: fmul QWORD PTR [rsp] 0x0000000002a7105c: add rsp,0x8 0x0000000002a71060: fldcw WORD PTR [rsp] 0x0000000002a71063: add rsp,0x8 0x0000000002a71067: fucomi st,st(0) 0x0000000002a71069: jp 0x0000000002a71118 0x0000000002a7106f: ffree st(2) 0x0000000002a71071: ffree st(1) 0x0000000002a71073: jmp 0x0000000002a71448 0x0000000002a71078: fld st(1) 0x0000000002a7107a: frndint 0x0000000002a7107c: fucomi st,st(2) 0x0000000002a7107e: jne 0x0000000002a71118 0x0000000002a71084: sub rsp,0x8 0x0000000002a71088: fistp QWORD PTR [rsp] 0x0000000002a7108b: fld st(1) 0x0000000002a7108d: fld st(1) 0x0000000002a7108f: fabs 0x0000000002a71091: sub rsp,0x8 0x0000000002a71095: fstcw WORD PTR [rsp] 0x0000000002a71099: mov eax,DWORD PTR [rsp] 0x0000000002a7109c: or eax,0x300 0x0000000002a710a2: push rax 0x0000000002a710a3: fldcw WORD PTR [rsp] 0x0000000002a710a6: pop rax 0x0000000002a710a7: fyl2x 0x0000000002a710a9: sub rsp,0x8 0x0000000002a710ad: fld st(0) 0x0000000002a710af: frndint 0x0000000002a710b1: fsubr st(1),st 0x0000000002a710b3: fistp DWORD PTR [rsp] 0x0000000002a710b6: f2xm1 0x0000000002a710b8: fld1 0x0000000002a710ba: faddp st(1),st 0x0000000002a710bc: mov eax,DWORD PTR [rsp] 0x0000000002a710bf: mov ecx,0xfffff800 0x0000000002a710c4: add eax,0x3ff 0x0000000002a710ca: mov edx,eax 0x0000000002a710cc: shl eax,0x14 0x0000000002a710cf: add edx,0x1 0x0000000002a710d2: cmove eax,ecx 0x0000000002a710d5: cmp edx,0x1 0x0000000002a710d8: cmove eax,ecx 0x0000000002a710db: test ecx,edx 0x0000000002a710dd: cmovne eax,ecx 0x0000000002a710e0: mov DWORD PTR [rsp+0x4],eax 
0x0000000002a710e4: mov DWORD PTR [rsp],0x0 0x0000000002a710eb: fmul QWORD PTR [rsp] 0x0000000002a710ee: add rsp,0x8 0x0000000002a710f2: fldcw WORD PTR [rsp] 0x0000000002a710f5: add rsp,0x8 0x0000000002a710f9: fucomi st,st(0) 0x0000000002a710fb: pop rax 0x0000000002a710fc: jp 0x0000000002a71118 0x0000000002a71102: ffree st(2) 0x0000000002a71104: ffree st(1) 0x0000000002a71106: test eax,0x1 0x0000000002a7110b: je 0x0000000002a71448 0x0000000002a71111: fchs 0x0000000002a71113: jmp 0x0000000002a71448 0x0000000002a71118: ffree st(0) 0x0000000002a7111a: fincstp 0x0000000002a7111c: mov QWORD PTR [rsp-0x28],rsp 0x0000000002a71121: sub rsp,0x80 0x0000000002a71128: mov QWORD PTR [rsp+0x78],rax 0x0000000002a7112d: mov QWORD PTR [rsp+0x70],rcx 0x0000000002a71132: mov QWORD PTR [rsp+0x68],rdx 0x0000000002a71137: mov QWORD PTR [rsp+0x60],rbx 0x0000000002a7113c: mov QWORD PTR [rsp+0x50],rbp 0x0000000002a71141: mov QWORD PTR [rsp+0x48],rsi 0x0000000002a71146: mov QWORD PTR [rsp+0x40],rdi 0x0000000002a7114b: mov QWORD PTR [rsp+0x38],r8 0x0000000002a71150: mov QWORD PTR [rsp+0x30],r9 0x0000000002a71155: mov QWORD PTR [rsp+0x28],r10 0x0000000002a7115a: mov QWORD PTR [rsp+0x20],r11 0x0000000002a7115f: mov QWORD PTR [rsp+0x18],r12 0x0000000002a71164: mov QWORD PTR [rsp+0x10],r13 0x0000000002a71169: mov QWORD PTR [rsp+0x8],r14 0x0000000002a7116e: mov QWORD PTR [rsp],r15 0x0000000002a71172: sub rsp,0x100 0x0000000002a71179: vextractf128 XMMWORD PTR [rsp],ymm0,0x1 0x0000000002a71180: vextractf128 XMMWORD PTR [rsp+0x10],ymm1,0x1 0x0000000002a71188: vextractf128 XMMWORD PTR [rsp+0x20],ymm2,0x1 0x0000000002a71190: vextractf128 XMMWORD PTR [rsp+0x30],ymm3,0x1 0x0000000002a71198: vextractf128 XMMWORD PTR [rsp+0x40],ymm4,0x1 0x0000000002a711a0: vextractf128 XMMWORD PTR [rsp+0x50],ymm5,0x1 0x0000000002a711a8: vextractf128 XMMWORD PTR [rsp+0x60],ymm6,0x1 0x0000000002a711b0: vextractf128 XMMWORD PTR [rsp+0x70],ymm7,0x1 0x0000000002a711b8: vextractf128 XMMWORD PTR [rsp+0x80],ymm8,0x1 
0x0000000002a711c3: vextractf128 XMMWORD PTR [rsp+0x90],ymm9,0x1 0x0000000002a711ce: vextractf128 XMMWORD PTR [rsp+0xa0],ymm10,0x1 0x0000000002a711d9: vextractf128 XMMWORD PTR [rsp+0xb0],ymm11,0x1 0x0000000002a711e4: vextractf128 XMMWORD PTR [rsp+0xc0],ymm12,0x1 0x0000000002a711ef: vextractf128 XMMWORD PTR [rsp+0xd0],ymm13,0x1 0x0000000002a711fa: vextractf128 XMMWORD PTR [rsp+0xe0],ymm14,0x1 0x0000000002a71205: vextractf128 XMMWORD PTR [rsp+0xf0],ymm15,0x1 0x0000000002a71210: sub rsp,0x100 0x0000000002a71217: vmovdqu XMMWORD PTR [rsp],xmm0 0x0000000002a7121c: vmovdqu XMMWORD PTR [rsp+0x10],xmm1 0x0000000002a71222: vmovdqu XMMWORD PTR [rsp+0x20],xmm2 0x0000000002a71228: vmovdqu XMMWORD PTR [rsp+0x30],xmm3 0x0000000002a7122e: vmovdqu XMMWORD PTR [rsp+0x40],xmm4 0x0000000002a71234: vmovdqu XMMWORD PTR [rsp+0x50],xmm5 0x0000000002a7123a: vmovdqu XMMWORD PTR [rsp+0x60],xmm6 0x0000000002a71240: vmovdqu XMMWORD PTR [rsp+0x70],xmm7 0x0000000002a71246: vmovdqu XMMWORD PTR [rsp+0x80],xmm8 0x0000000002a7124f: vmovdqu XMMWORD PTR [rsp+0x90],xmm9 0x0000000002a71258: vmovdqu XMMWORD PTR [rsp+0xa0],xmm10 0x0000000002a71261: vmovdqu XMMWORD PTR [rsp+0xb0],xmm11 0x0000000002a7126a: vmovdqu XMMWORD PTR [rsp+0xc0],xmm12 0x0000000002a71273: vmovdqu XMMWORD PTR [rsp+0xd0],xmm13 0x0000000002a7127c: vmovdqu XMMWORD PTR [rsp+0xe0],xmm14 0x0000000002a71285: vmovdqu XMMWORD PTR [rsp+0xf0],xmm15 0x0000000002a7128e: sub rsp,0x10 0x0000000002a71292: fstp QWORD PTR [rsp] 0x0000000002a71295: fstp QWORD PTR [rsp+0x8] 0x0000000002a71299: vmovsd xmm0,QWORD PTR [rsp] 0x0000000002a7129e: vmovsd xmm1,QWORD PTR [rsp+0x8] 0x0000000002a712a4: sub rsp,0x20 0x0000000002a712a8: test esp,0xf 0x0000000002a712ae: je 0x0000000002a712c6 0x0000000002a712b4: sub rsp,0x8 0x0000000002a712b8: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002a712bd: add rsp,0x8 0x0000000002a712c1: jmp 0x0000000002a712cb 0x0000000002a712c6: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002a712cb: add rsp,0x20 
0x0000000002a712cf: vmovsd QWORD PTR [rsp],xmm0 0x0000000002a712d4: fld QWORD PTR [rsp] 0x0000000002a712d7: add rsp,0x10 0x0000000002a712db: vmovdqu xmm0,XMMWORD PTR [rsp] 0x0000000002a712e0: vmovdqu xmm1,XMMWORD PTR [rsp+0x10] 0x0000000002a712e6: vmovdqu xmm2,XMMWORD PTR [rsp+0x20] 0x0000000002a712ec: vmovdqu xmm3,XMMWORD PTR [rsp+0x30] 0x0000000002a712f2: vmovdqu xmm4,XMMWORD PTR [rsp+0x40] 0x0000000002a712f8: vmovdqu xmm5,XMMWORD PTR [rsp+0x50] 0x0000000002a712fe: vmovdqu xmm6,XMMWORD PTR [rsp+0x60] 0x0000000002a71304: vmovdqu xmm7,XMMWORD PTR [rsp+0x70] 0x0000000002a7130a: vmovdqu xmm8,XMMWORD PTR [rsp+0x80] 0x0000000002a71313: vmovdqu xmm9,XMMWORD PTR [rsp+0x90] 0x0000000002a7131c: vmovdqu xmm10,XMMWORD PTR [rsp+0xa0] 0x0000000002a71325: vmovdqu xmm11,XMMWORD PTR [rsp+0xb0] 0x0000000002a7132e: vmovdqu xmm12,XMMWORD PTR [rsp+0xc0] 0x0000000002a71337: vmovdqu xmm13,XMMWORD PTR [rsp+0xd0] 0x0000000002a71340: vmovdqu xmm14,XMMWORD PTR [rsp+0xe0] 0x0000000002a71349: vmovdqu xmm15,XMMWORD PTR [rsp+0xf0] 0x0000000002a71352: add rsp,0x100 0x0000000002a71359: vinsertf128 ymm0,ymm0,XMMWORD PTR [rsp],0x1 0x0000000002a71360: vinsertf128 ymm1,ymm1,XMMWORD PTR [rsp+0x10],0x1 0x0000000002a71368: vinsertf128 ymm2,ymm2,XMMWORD PTR [rsp+0x20],0x1 0x0000000002a71370: vinsertf128 ymm3,ymm3,XMMWORD PTR [rsp+0x30],0x1 0x0000000002a71378: vinsertf128 ymm4,ymm4,XMMWORD PTR [rsp+0x40],0x1 0x0000000002a71380: vinsertf128 ymm5,ymm5,XMMWORD PTR [rsp+0x50],0x1 0x0000000002a71388: vinsertf128 ymm6,ymm6,XMMWORD PTR [rsp+0x60],0x1 0x0000000002a71390: vinsertf128 ymm7,ymm7,XMMWORD PTR [rsp+0x70],0x1 0x0000000002a71398: vinsertf128 ymm8,ymm8,XMMWORD PTR [rsp+0x80],0x1 0x0000000002a713a3: vinsertf128 ymm9,ymm9,XMMWORD PTR [rsp+0x90],0x1 0x0000000002a713ae: vinsertf128 ymm10,ymm10,XMMWORD PTR [rsp+0xa0],0x1 0x0000000002a713b9: vinsertf128 ymm11,ymm11,XMMWORD PTR [rsp+0xb0],0x1 0x0000000002a713c4: vinsertf128 ymm12,ymm12,XMMWORD PTR [rsp+0xc0],0x1 0x0000000002a713cf: vinsertf128 
ymm13,ymm13,XMMWORD PTR [rsp+0xd0],0x1 0x0000000002a713da: vinsertf128 ymm14,ymm14,XMMWORD PTR [rsp+0xe0],0x1 0x0000000002a713e5: vinsertf128 ymm15,ymm15,XMMWORD PTR [rsp+0xf0],0x1 0x0000000002a713f0: add rsp,0x100 0x0000000002a713f7: mov r15,QWORD PTR [rsp] 0x0000000002a713fb: mov r14,QWORD PTR [rsp+0x8] 0x0000000002a71400: mov r13,QWORD PTR [rsp+0x10] 0x0000000002a71405: mov r12,QWORD PTR [rsp+0x18] 0x0000000002a7140a: mov r11,QWORD PTR [rsp+0x20] 0x0000000002a7140f: mov r10,QWORD PTR [rsp+0x28] 0x0000000002a71414: mov r9,QWORD PTR [rsp+0x30] 0x0000000002a71419: mov r8,QWORD PTR [rsp+0x38] 0x0000000002a7141e: mov rdi,QWORD PTR [rsp+0x40] 0x0000000002a71423: mov rsi,QWORD PTR [rsp+0x48] 0x0000000002a71428: mov rbp,QWORD PTR [rsp+0x50] 0x0000000002a7142d: mov rbx,QWORD PTR [rsp+0x60] 0x0000000002a71432: mov rdx,QWORD PTR [rsp+0x68] 0x0000000002a71437: mov rcx,QWORD PTR [rsp+0x70] 0x0000000002a7143c: mov rax,QWORD PTR [rsp+0x78] 0x0000000002a71441: add rsp,0x80 0x0000000002a71448: fstp QWORD PTR [rsp] 0x0000000002a7144b: vmovsd xmm0,QWORD PTR [rsp] ;*invokestatic pow ; - ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark::trickyMathOctaPow@10 (line 63) 0x0000000002a71450: vmovsd xmm1,QWORD PTR [rip+0xfffffffffffff618] # 0x0000000002a70a70 ; {section_word} 0x0000000002a71458: vmovsd QWORD PTR [rsp],xmm1 0x0000000002a7145d: fld QWORD PTR [rsp] 0x0000000002a71460: vmovsd QWORD PTR [rsp],xmm0 0x0000000002a71465: fld QWORD PTR [rsp] 0x0000000002a71468: movabs rax,0x6c4ba7d0 ; {external_word} 0x0000000002a71472: fld QWORD PTR [rax] 0x0000000002a71474: fucomip st,st(2) 0x0000000002a71476: jp 0x0000000002a7148f 0x0000000002a7147c: jne 0x0000000002a7148f 0x0000000002a71482: fxch st(1) 0x0000000002a71484: ffree st(0) 0x0000000002a71486: fincstp 0x0000000002a71488: fmul st,st(0) 0x0000000002a7148a: jmp 0x0000000002a718e6 0x0000000002a7148f: fldz 0x0000000002a71491: fucomip st,st(1) 0x0000000002a71493: ja 0x0000000002a71516 0x0000000002a71499: fld st(1) 0x0000000002a7149b: 
fld st(1) 0x0000000002a7149d: sub rsp,0x8 0x0000000002a714a1: fstcw WORD PTR [rsp] 0x0000000002a714a5: mov eax,DWORD PTR [rsp] 0x0000000002a714a8: or eax,0x300 0x0000000002a714ae: push rax 0x0000000002a714af: fldcw WORD PTR [rsp] 0x0000000002a714b2: pop rax 0x0000000002a714b3: fyl2x 0x0000000002a714b5: sub rsp,0x8 0x0000000002a714b9: fld st(0) 0x0000000002a714bb: frndint 0x0000000002a714bd: fsubr st(1),st 0x0000000002a714bf: fistp DWORD PTR [rsp] 0x0000000002a714c2: f2xm1 0x0000000002a714c4: fld1 0x0000000002a714c6: faddp st(1),st 0x0000000002a714c8: mov eax,DWORD PTR [rsp] 0x0000000002a714cb: mov ecx,0xfffff800 0x0000000002a714d0: add eax,0x3ff 0x0000000002a714d6: mov edx,eax 0x0000000002a714d8: shl eax,0x14 0x0000000002a714db: add edx,0x1 0x0000000002a714de: cmove eax,ecx 0x0000000002a714e1: cmp edx,0x1 0x0000000002a714e4: cmove eax,ecx 0x0000000002a714e7: test ecx,edx 0x0000000002a714e9: cmovne eax,ecx 0x0000000002a714ec: mov DWORD PTR [rsp+0x4],eax 0x0000000002a714f0: mov DWORD PTR [rsp],0x0 0x0000000002a714f7: fmul QWORD PTR [rsp] 0x0000000002a714fa: add rsp,0x8 0x0000000002a714fe: fldcw WORD PTR [rsp] 0x0000000002a71501: add rsp,0x8 0x0000000002a71505: fucomi st,st(0) 0x0000000002a71507: jp 0x0000000002a715b6 0x0000000002a7150d: ffree st(2) 0x0000000002a7150f: ffree st(1) 0x0000000002a71511: jmp 0x0000000002a718e6 0x0000000002a71516: fld st(1) 0x0000000002a71518: frndint 0x0000000002a7151a: fucomi st,st(2) 0x0000000002a7151c: jne 0x0000000002a715b6 0x0000000002a71522: sub rsp,0x8 0x0000000002a71526: fistp QWORD PTR [rsp] 0x0000000002a71529: fld st(1) 0x0000000002a7152b: fld st(1) 0x0000000002a7152d: fabs 0x0000000002a7152f: sub rsp,0x8 0x0000000002a71533: fstcw WORD PTR [rsp] 0x0000000002a71537: mov eax,DWORD PTR [rsp] 0x0000000002a7153a: or eax,0x300 0x0000000002a71540: push rax 0x0000000002a71541: fldcw WORD PTR [rsp] 0x0000000002a71544: pop rax 0x0000000002a71545: fyl2x 0x0000000002a71547: sub rsp,0x8 0x0000000002a7154b: fld st(0) 0x0000000002a7154d: 
frndint 0x0000000002a7154f: fsubr st(1),st 0x0000000002a71551: fistp DWORD PTR [rsp] 0x0000000002a71554: f2xm1 0x0000000002a71556: fld1 0x0000000002a71558: faddp st(1),st 0x0000000002a7155a: mov eax,DWORD PTR [rsp] 0x0000000002a7155d: mov ecx,0xfffff800 0x0000000002a71562: add eax,0x3ff 0x0000000002a71568: mov edx,eax 0x0000000002a7156a: shl eax,0x14 0x0000000002a7156d: add edx,0x1 0x0000000002a71570: cmove eax,ecx 0x0000000002a71573: cmp edx,0x1 0x0000000002a71576: cmove eax,ecx 0x0000000002a71579: test ecx,edx 0x0000000002a7157b: cmovne eax,ecx 0x0000000002a7157e: mov DWORD PTR [rsp+0x4],eax 0x0000000002a71582: mov DWORD PTR [rsp],0x0 0x0000000002a71589: fmul QWORD PTR [rsp] 0x0000000002a7158c: add rsp,0x8 0x0000000002a71590: fldcw WORD PTR [rsp] 0x0000000002a71593: add rsp,0x8 0x0000000002a71597: fucomi st,st(0) 0x0000000002a71599: pop rax 0x0000000002a7159a: jp 0x0000000002a715b6 0x0000000002a715a0: ffree st(2) 0x0000000002a715a2: ffree st(1) 0x0000000002a715a4: test eax,0x1 0x0000000002a715a9: je 0x0000000002a718e6 0x0000000002a715af: fchs 0x0000000002a715b1: jmp 0x0000000002a718e6 0x0000000002a715b6: ffree st(0) 0x0000000002a715b8: fincstp 0x0000000002a715ba: mov QWORD PTR [rsp-0x28],rsp 0x0000000002a715bf: sub rsp,0x80 0x0000000002a715c6: mov QWORD PTR [rsp+0x78],rax 0x0000000002a715cb: mov QWORD PTR [rsp+0x70],rcx 0x0000000002a715d0: mov QWORD PTR [rsp+0x68],rdx 0x0000000002a715d5: mov QWORD PTR [rsp+0x60],rbx 0x0000000002a715da: mov QWORD PTR [rsp+0x50],rbp 0x0000000002a715df: mov QWORD PTR [rsp+0x48],rsi 0x0000000002a715e4: mov QWORD PTR [rsp+0x40],rdi 0x0000000002a715e9: mov QWORD PTR [rsp+0x38],r8 0x0000000002a715ee: mov QWORD PTR [rsp+0x30],r9 0x0000000002a715f3: mov QWORD PTR [rsp+0x28],r10 0x0000000002a715f8: mov QWORD PTR [rsp+0x20],r11 0x0000000002a715fd: mov QWORD PTR [rsp+0x18],r12 0x0000000002a71602: mov QWORD PTR [rsp+0x10],r13 0x0000000002a71607: mov QWORD PTR [rsp+0x8],r14 0x0000000002a7160c: mov QWORD PTR [rsp],r15 0x0000000002a71610: sub 
rsp,0x100 0x0000000002a71617: vextractf128 XMMWORD PTR [rsp],ymm0,0x1 0x0000000002a7161e: vextractf128 XMMWORD PTR [rsp+0x10],ymm1,0x1 0x0000000002a71626: vextractf128 XMMWORD PTR [rsp+0x20],ymm2,0x1 0x0000000002a7162e: vextractf128 XMMWORD PTR [rsp+0x30],ymm3,0x1 0x0000000002a71636: vextractf128 XMMWORD PTR [rsp+0x40],ymm4,0x1 0x0000000002a7163e: vextractf128 XMMWORD PTR [rsp+0x50],ymm5,0x1 0x0000000002a71646: vextractf128 XMMWORD PTR [rsp+0x60],ymm6,0x1 0x0000000002a7164e: vextractf128 XMMWORD PTR [rsp+0x70],ymm7,0x1 0x0000000002a71656: vextractf128 XMMWORD PTR [rsp+0x80],ymm8,0x1 0x0000000002a71661: vextractf128 XMMWORD PTR [rsp+0x90],ymm9,0x1 0x0000000002a7166c: vextractf128 XMMWORD PTR [rsp+0xa0],ymm10,0x1 0x0000000002a71677: vextractf128 XMMWORD PTR [rsp+0xb0],ymm11,0x1 0x0000000002a71682: vextractf128 XMMWORD PTR [rsp+0xc0],ymm12,0x1 0x0000000002a7168d: vextractf128 XMMWORD PTR [rsp+0xd0],ymm13,0x1 0x0000000002a71698: vextractf128 XMMWORD PTR [rsp+0xe0],ymm14,0x1 0x0000000002a716a3: vextractf128 XMMWORD PTR [rsp+0xf0],ymm15,0x1 0x0000000002a716ae: sub rsp,0x100 0x0000000002a716b5: vmovdqu XMMWORD PTR [rsp],xmm0 0x0000000002a716ba: vmovdqu XMMWORD PTR [rsp+0x10],xmm1 0x0000000002a716c0: vmovdqu XMMWORD PTR [rsp+0x20],xmm2 0x0000000002a716c6: vmovdqu XMMWORD PTR [rsp+0x30],xmm3 0x0000000002a716cc: vmovdqu XMMWORD PTR [rsp+0x40],xmm4 0x0000000002a716d2: vmovdqu XMMWORD PTR [rsp+0x50],xmm5 0x0000000002a716d8: vmovdqu XMMWORD PTR [rsp+0x60],xmm6 0x0000000002a716de: vmovdqu XMMWORD PTR [rsp+0x70],xmm7 0x0000000002a716e4: vmovdqu XMMWORD PTR [rsp+0x80],xmm8 0x0000000002a716ed: vmovdqu XMMWORD PTR [rsp+0x90],xmm9 0x0000000002a716f6: vmovdqu XMMWORD PTR [rsp+0xa0],xmm10 0x0000000002a716ff: vmovdqu XMMWORD PTR [rsp+0xb0],xmm11 0x0000000002a71708: vmovdqu XMMWORD PTR [rsp+0xc0],xmm12 0x0000000002a71711: vmovdqu XMMWORD PTR [rsp+0xd0],xmm13 0x0000000002a7171a: vmovdqu XMMWORD PTR [rsp+0xe0],xmm14 0x0000000002a71723: vmovdqu XMMWORD PTR [rsp+0xf0],xmm15 
0x0000000002a7172c: sub rsp,0x10 0x0000000002a71730: fstp QWORD PTR [rsp] 0x0000000002a71733: fstp QWORD PTR [rsp+0x8] 0x0000000002a71737: vmovsd xmm0,QWORD PTR [rsp] 0x0000000002a7173c: vmovsd xmm1,QWORD PTR [rsp+0x8] 0x0000000002a71742: sub rsp,0x20 0x0000000002a71746: test esp,0xf 0x0000000002a7174c: je 0x0000000002a71764 0x0000000002a71752: sub rsp,0x8 0x0000000002a71756: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002a7175b: add rsp,0x8 0x0000000002a7175f: jmp 0x0000000002a71769 0x0000000002a71764: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002a71769: add rsp,0x20 0x0000000002a7176d: vmovsd QWORD PTR [rsp],xmm0 0x0000000002a71772: fld QWORD PTR [rsp] 0x0000000002a71775: add rsp,0x10 0x0000000002a71779: vmovdqu xmm0,XMMWORD PTR [rsp] 0x0000000002a7177e: vmovdqu xmm1,XMMWORD PTR [rsp+0x10] 0x0000000002a71784: vmovdqu xmm2,XMMWORD PTR [rsp+0x20] 0x0000000002a7178a: vmovdqu xmm3,XMMWORD PTR [rsp+0x30] 0x0000000002a71790: vmovdqu xmm4,XMMWORD PTR [rsp+0x40] 0x0000000002a71796: vmovdqu xmm5,XMMWORD PTR [rsp+0x50] 0x0000000002a7179c: vmovdqu xmm6,XMMWORD PTR [rsp+0x60] 0x0000000002a717a2: vmovdqu xmm7,XMMWORD PTR [rsp+0x70] 0x0000000002a717a8: vmovdqu xmm8,XMMWORD PTR [rsp+0x80] 0x0000000002a717b1: vmovdqu xmm9,XMMWORD PTR [rsp+0x90] 0x0000000002a717ba: vmovdqu xmm10,XMMWORD PTR [rsp+0xa0] 0x0000000002a717c3: vmovdqu xmm11,XMMWORD PTR [rsp+0xb0] 0x0000000002a717cc: vmovdqu xmm12,XMMWORD PTR [rsp+0xc0] 0x0000000002a717d5: vmovdqu xmm13,XMMWORD PTR [rsp+0xd0] 0x0000000002a717de: vmovdqu xmm14,XMMWORD PTR [rsp+0xe0] 0x0000000002a717e7: vmovdqu xmm15,XMMWORD PTR [rsp+0xf0] 0x0000000002a717f0: add rsp,0x100 0x0000000002a717f7: vinsertf128 ymm0,ymm0,XMMWORD PTR [rsp],0x1 0x0000000002a717fe: vinsertf128 ymm1,ymm1,XMMWORD PTR [rsp+0x10],0x1 0x0000000002a71806: vinsertf128 ymm2,ymm2,XMMWORD PTR [rsp+0x20],0x1 0x0000000002a7180e: vinsertf128 ymm3,ymm3,XMMWORD PTR [rsp+0x30],0x1 0x0000000002a71816: vinsertf128 ymm4,ymm4,XMMWORD PTR [rsp+0x40],0x1 
0x0000000002a7181e: vinsertf128 ymm5,ymm5,XMMWORD PTR [rsp+0x50],0x1 0x0000000002a71826: vinsertf128 ymm6,ymm6,XMMWORD PTR [rsp+0x60],0x1 0x0000000002a7182e: vinsertf128 ymm7,ymm7,XMMWORD PTR [rsp+0x70],0x1 0x0000000002a71836: vinsertf128 ymm8,ymm8,XMMWORD PTR [rsp+0x80],0x1 0x0000000002a71841: vinsertf128 ymm9,ymm9,XMMWORD PTR [rsp+0x90],0x1 0x0000000002a7184c: vinsertf128 ymm10,ymm10,XMMWORD PTR [rsp+0xa0],0x1 0x0000000002a71857: vinsertf128 ymm11,ymm11,XMMWORD PTR [rsp+0xb0],0x1 0x0000000002a71862: vinsertf128 ymm12,ymm12,XMMWORD PTR [rsp+0xc0],0x1 0x0000000002a7186d: vinsertf128 ymm13,ymm13,XMMWORD PTR [rsp+0xd0],0x1 0x0000000002a71878: vinsertf128 ymm14,ymm14,XMMWORD PTR [rsp+0xe0],0x1 0x0000000002a71883: vinsertf128 ymm15,ymm15,XMMWORD PTR [rsp+0xf0],0x1 0x0000000002a7188e: add rsp,0x100 0x0000000002a71895: mov r15,QWORD PTR [rsp] 0x0000000002a71899: mov r14,QWORD PTR [rsp+0x8] 0x0000000002a7189e: mov r13,QWORD PTR [rsp+0x10] 0x0000000002a718a3: mov r12,QWORD PTR [rsp+0x18] 0x0000000002a718a8: mov r11,QWORD PTR [rsp+0x20] 0x0000000002a718ad: mov r10,QWORD PTR [rsp+0x28] 0x0000000002a718b2: mov r9,QWORD PTR [rsp+0x30] 0x0000000002a718b7: mov r8,QWORD PTR [rsp+0x38] 0x0000000002a718bc: mov rdi,QWORD PTR [rsp+0x40] 0x0000000002a718c1: mov rsi,QWORD PTR [rsp+0x48] 0x0000000002a718c6: mov rbp,QWORD PTR [rsp+0x50] 0x0000000002a718cb: mov rbx,QWORD PTR [rsp+0x60] 0x0000000002a718d0: mov rdx,QWORD PTR [rsp+0x68] 0x0000000002a718d5: mov rcx,QWORD PTR [rsp+0x70] 0x0000000002a718da: mov rax,QWORD PTR [rsp+0x78] 0x0000000002a718df: add rsp,0x80 0x0000000002a718e6: fstp QWORD PTR [rsp] 0x0000000002a718e9: vmovsd xmm0,QWORD PTR [rsp] ;*invokestatic pow ; - ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark::trickyMathOctaPow@16 (line 63)
      
      





However, the implementation of the _dpow intrinsic has an interesting feature: it handles a "special case". Below is a snippet from library_call.cpp in the OpenJDK 8 source code:



 //------------------------------inline_pow-------------------------------------
 // Inline power instructions, if possible.
 bool LibraryCallKit::inline_pow() {
   // Pseudocode for pow
   // if (y == 2) {
   //   return x * x;
   // } else {
   //   if (x <= 0.0) {
   //     long longy = (long)y;
   //     if ((double)longy == y) { // if y is long
   //       if (y + 1 == y) longy = 0; // huge number: even
   //       result = ((1&longy) == 0)?-DPow(abs(x), y):DPow(abs(x), y);
   //     } else {
   //       result = NaN;
   //     }
   //   } else {
   //     result = DPow(x,y);
   //   }
   //   if (result != result)? {
   //     result = uncommon_trap() or runtime_call();
   //   }
   //   return result;
   // }
   /* code omitted */
 }
      
      





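The HotSpot developers handled one particular case separately: squaring a number. For that case, the code substituted by the JIT compiler simply executes x * x. Let's find this check in the disassembled code, using the first call, Math.pow(a, 2), as an example: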







  0x0000000002a70b14: vmovsd xmm1,QWORD PTR [rip+0xffffffffffffff44]  ; load 2.0 into xmm1
  0x0000000002a70b1c: vmovsd QWORD PTR [rsp],xmm1
  0x0000000002a70b21: fld QWORD PTR [rsp]          ; push 2.0 onto the FPU register stack
  0x0000000002a70b24: vmovsd QWORD PTR [rsp],xmm0
  0x0000000002a70b29: fld QWORD PTR [rsp]          ; push a onto the FPU register stack
  0x0000000002a70b2c: movabs rax,0x6c4ba7d0
  0x0000000002a70b36: fld QWORD PTR [rax]          ; push the constant 2.0 onto the FPU register stack
  0x0000000002a70b38: fucomip st,st(2)             ; compare the constant 2.0 with the exponent 2.0
  0x0000000002a70b3a: jp 0x0000000002a70b53
  0x0000000002a70b40: jne 0x0000000002a70b53
  0x0000000002a70b46: fxch st(1)                   ; only a remains on top of the FPU stack
  0x0000000002a70b48: ffree st(0)
  0x0000000002a70b4a: fincstp
  0x0000000002a70b4c: fmul st,st(0)                ; multiply a by a
  0x0000000002a70b4e: jmp 0x0000000002a70faa
  ; code omitted
  0x0000000002a70faa: fstp QWORD PTR [rsp]
  0x0000000002a70fad: vmovsd xmm0,QWORD PTR [rsp]  ; xmm0 now holds a * a
  ; code omitted
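The squaring shortcut seen in the pseudocode and in the disassembly can be mirrored in plain Java. The sketch below is only an illustration: the class PowFastPathSketch and its methods are hypothetical names, and it reproduces just the y == 2 branch, not HotSpot's actual intrinsic.

```java
// Hypothetical sketch: mirrors only the "y == 2" fast path from the pseudocode.
final class PowFastPathSketch {

    // Returns x * x when the exponent is exactly 2.0; otherwise defers to Math.pow.
    static double pow(double x, double y) {
        if (y == 2.0) {
            return x * x;        // same shortcut as "if (y == 2) return x * x"
        }
        return Math.pow(x, y);   // general case, left to the library
    }

    // Three chained squarings, ((a^2)^2)^2 = a^8, shaped like trickyMathOctaPow.
    static double octaPow(double a) {
        return pow(pow(pow(a, 2), 2), 2);
    }

    public static void main(String[] args) {
        System.out.println(octaPow(2.0));
    }
}
```

With this shortcut every call in the chain reduces to a single multiplication, which is why the nested Math.pow variant can end up doing the same work as three plain squarings.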
      
      





Benchmark



Benchmark code:



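 import java.util.concurrent.TimeUnit;

 import org.openjdk.jmh.annotations.Benchmark;
 import org.openjdk.jmh.annotations.BenchmarkMode;
 import org.openjdk.jmh.annotations.Fork;
 import org.openjdk.jmh.annotations.Measurement;
 import org.openjdk.jmh.annotations.Mode;
 import org.openjdk.jmh.annotations.OutputTimeUnit;
 import org.openjdk.jmh.annotations.Scope;
 import org.openjdk.jmh.annotations.Setup;
 import org.openjdk.jmh.annotations.State;
 import org.openjdk.jmh.annotations.Warmup;
 import org.openjdk.jmh.infra.Blackhole;

 @Fork(value = 3, warmups = 0)
 @Warmup(iterations = 5, time = 1_000, timeUnit = TimeUnit.MILLISECONDS)
 @Measurement(iterations = 10, time = 1_000, timeUnit = TimeUnit.MILLISECONDS)
 @OutputTimeUnit(value = TimeUnit.NANOSECONDS)
 @BenchmarkMode(Mode.AverageTime)
 @State(Scope.Benchmark)
 public class MathBenchmark {
     // Input value; kept in a field so the JIT cannot constant-fold it.
     public double a;

     @Setup
     public void setup() {
         a = 1234567.890;
     }

     @Benchmark
     public void mathOctaPowBenchmark(Blackhole bh) {
         bh.consume(mathOctaPow(a));
     }

     @Benchmark
     public void plainOctaPowBenchmark(Blackhole bh) {
         bh.consume(plainOctaPow(a));
     }

     @Benchmark
     public void trickyMathOctaPowBenchmark(Blackhole bh) {
         bh.consume(trickyMathOctaPow(a));
     }

     @Benchmark
     public void trickyPlainOctaPowBenchmark(Blackhole bh) {
         bh.consume(trickyPlainOctaPow(a));
     }

     public double mathOctaPow(double a) {
         return Math.pow(a, 8);
     }

     public double plainOctaPow(double a) {
         return a * a * a * a * a * a * a * a;
     }

     public double trickyMathOctaPow(double a) {
         return Math.pow(Math.pow(Math.pow(a, 2), 2), 2);
     }

     public double trickyPlainOctaPow(double a) {
         a *= a;
         a *= a;
         return a * a;
     }
 }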
      
      





Results:



 Benchmark                                  Mode  Cnt   Score   Error  Units
 MathBenchmark.mathOctaPowBenchmark         avgt   30  76,041 ± 0,428  ns/op
 MathBenchmark.plainOctaPowBenchmark        avgt   30   4,174 ± 0,027  ns/op
 MathBenchmark.trickyMathOctaPowBenchmark   avgt   30   3,010 ± 0,014  ns/op
 MathBenchmark.trickyPlainOctaPowBenchmark  avgt   30   3,011 ± 0,015  ns/op
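One way to read the gap between plainOctaPow and trickyPlainOctaPow is to count multiplications: the left-associative product needs seven, while repeated squaring needs three. The sketch below makes that count explicit; MultiplyCountSketch and its counter are hypothetical helpers for illustration, and the operation count alone is not claimed to explain the measured timings.

```java
// Hypothetical sketch: counts the multiplications each variant performs.
final class MultiplyCountSketch {

    // Illustration-only counter, incremented on every multiplication.
    static int multiplications;

    static double mul(double x, double y) {
        multiplications++;
        return x * y;
    }

    // Left-associative chain (((a * a) * a) ... ) * a, as in plainOctaPow.
    static double plain(double a) {
        double r = a;
        for (int i = 0; i < 7; i++) {
            r = mul(r, a);
        }
        return r;
    }

    // Repeated squaring, as in trickyPlainOctaPow.
    static double tricky(double a) {
        a = mul(a, a);
        a = mul(a, a);
        return mul(a, a);
    }

    public static void main(String[] args) {
        multiplications = 0;
        plain(2.0);
        System.out.println("plain:  " + multiplications + " multiplications");

        multiplications = 0;
        tricky(2.0);
        System.out.println("tricky: " + multiplications + " multiplications");
    }
}
```

Because floating-point multiplication is not associative, the two variants may differ in the last bits for general inputs even though both compute the eighth power.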
      
      





Full benchmark output
 # JMH version: 1.20 # VM version: JDK 1.8.0_161, VM 25.161-b12 # VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe # VM options: <none> # Warmup: 5 iterations, 1000 ms each # Measurement: 10 iterations, 1000 ms each # Timeout: 10 min per iteration # Threads: 1 thread, will synchronize iterations # Benchmark mode: Average time, time/op # Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.mathOctaPowBenchmark # Run progress: 0,00% complete, ETA 00:03:00 # Fork: 1 of 3 # Warmup Iteration 1: 77,026 ns/op # Warmup Iteration 2: 76,561 ns/op # Warmup Iteration 3: 77,623 ns/op # Warmup Iteration 4: 76,192 ns/op # Warmup Iteration 5: 76,012 ns/op Iteration 1: 75,947 ns/op Iteration 2: 75,739 ns/op Iteration 3: 75,864 ns/op Iteration 4: 76,179 ns/op Iteration 5: 75,934 ns/op Iteration 6: 75,783 ns/op Iteration 7: 75,820 ns/op Iteration 8: 75,898 ns/op Iteration 9: 75,798 ns/op Iteration 10: 76,053 ns/op # Run progress: 8,33% complete, ETA 00:02:48 # Fork: 2 of 3 # Warmup Iteration 1: 75,975 ns/op # Warmup Iteration 2: 76,008 ns/op # Warmup Iteration 3: 75,867 ns/op # Warmup Iteration 4: 76,061 ns/op # Warmup Iteration 5: 75,710 ns/op Iteration 1: 75,874 ns/op Iteration 2: 75,862 ns/op Iteration 3: 76,080 ns/op Iteration 4: 75,948 ns/op Iteration 5: 75,848 ns/op Iteration 6: 75,883 ns/op Iteration 7: 76,004 ns/op Iteration 8: 75,790 ns/op Iteration 9: 75,894 ns/op Iteration 10: 75,847 ns/op # Run progress: 16,67% complete, ETA 00:02:33 # Fork: 3 of 3 # Warmup Iteration 1: 75,778 ns/op # Warmup Iteration 2: 75,850 ns/op # Warmup Iteration 3: 75,878 ns/op # Warmup Iteration 4: 76,025 ns/op # Warmup Iteration 5: 76,450 ns/op Iteration 1: 75,791 ns/op Iteration 2: 75,941 ns/op Iteration 3: 75,652 ns/op Iteration 4: 75,795 ns/op Iteration 5: 75,906 ns/op Iteration 6: 78,971 ns/op Iteration 7: 76,055 ns/op Iteration 8: 75,736 ns/op Iteration 9: 75,816 ns/op Iteration 10: 77,537 ns/op Result 
"ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.mathOctaPowBenchmark": 76,041 ±(99.9%) 0,428 ns/op [Average] (min, avg, max) = (75,652, 76,041, 78,971), stdev = 0,640 CI (99.9%): [75,614, 76,469] (assumes normal distribution) # JMH version: 1.20 # VM version: JDK 1.8.0_161, VM 25.161-b12 # VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe # VM options: <none> # Warmup: 5 iterations, 1000 ms each # Measurement: 10 iterations, 1000 ms each # Timeout: 10 min per iteration # Threads: 1 thread, will synchronize iterations # Benchmark mode: Average time, time/op # Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.plainOctaPowBenchmark # Run progress: 25,00% complete, ETA 00:02:17 # Fork: 1 of 3 # Warmup Iteration 1: 4,622 ns/op # Warmup Iteration 2: 4,406 ns/op # Warmup Iteration 3: 4,169 ns/op # Warmup Iteration 4: 4,163 ns/op # Warmup Iteration 5: 4,153 ns/op Iteration 1: 4,141 ns/op Iteration 2: 4,144 ns/op Iteration 3: 4,141 ns/op Iteration 4: 4,141 ns/op Iteration 5: 4,149 ns/op Iteration 6: 4,136 ns/op Iteration 7: 4,143 ns/op Iteration 8: 4,136 ns/op Iteration 9: 4,140 ns/op Iteration 10: 4,134 ns/op # Run progress: 33,33% complete, ETA 00:02:02 # Fork: 2 of 3 # Warmup Iteration 1: 4,567 ns/op # Warmup Iteration 2: 4,267 ns/op # Warmup Iteration 3: 4,162 ns/op # Warmup Iteration 4: 4,155 ns/op # Warmup Iteration 5: 4,157 ns/op Iteration 1: 4,157 ns/op Iteration 2: 4,151 ns/op Iteration 3: 4,161 ns/op Iteration 4: 4,175 ns/op Iteration 5: 4,136 ns/op Iteration 6: 4,154 ns/op Iteration 7: 4,192 ns/op Iteration 8: 4,206 ns/op Iteration 9: 4,203 ns/op Iteration 10: 4,180 ns/op # Run progress: 41,67% complete, ETA 00:01:47 # Fork: 3 of 3 # Warmup Iteration 1: 4,569 ns/op # Warmup Iteration 2: 4,204 ns/op # Warmup Iteration 3: 4,172 ns/op # Warmup Iteration 4: 4,151 ns/op # Warmup Iteration 5: 4,159 ns/op Iteration 1: 4,141 ns/op Iteration 2: 4,175 ns/op Iteration 3: 4,182 ns/op Iteration 4: 4,205 ns/op Iteration 5: 4,246 ns/op Iteration 
6: 4,186 ns/op Iteration 7: 4,273 ns/op Iteration 8: 4,240 ns/op Iteration 9: 4,169 ns/op Iteration 10: 4,270 ns/op Result "ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.plainOctaPowBenchmark": 4,174 ±(99.9%) 0,027 ns/op [Average] (min, avg, max) = (4,134, 4,174, 4,273), stdev = 0,040 CI (99.9%): [4,147, 4,201] (assumes normal distribution) # JMH version: 1.20 # VM version: JDK 1.8.0_161, VM 25.161-b12 # VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe # VM options: <none> # Warmup: 5 iterations, 1000 ms each # Measurement: 10 iterations, 1000 ms each # Timeout: 10 min per iteration # Threads: 1 thread, will synchronize iterations # Benchmark mode: Average time, time/op # Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyMathOctaPowBenchmark # Run progress: 50,00% complete, ETA 00:01:31 # Fork: 1 of 3 # Warmup Iteration 1: 3,396 ns/op # Warmup Iteration 2: 3,237 ns/op # Warmup Iteration 3: 3,156 ns/op # Warmup Iteration 4: 3,020 ns/op # Warmup Iteration 5: 3,001 ns/op Iteration 1: 2,995 ns/op Iteration 2: 3,012 ns/op Iteration 3: 3,014 ns/op Iteration 4: 2,997 ns/op Iteration 5: 3,025 ns/op Iteration 6: 3,015 ns/op Iteration 7: 3,004 ns/op Iteration 8: 2,999 ns/op Iteration 9: 3,033 ns/op Iteration 10: 3,003 ns/op # Run progress: 58,33% complete, ETA 00:01:16 # Fork: 2 of 3 # Warmup Iteration 1: 3,409 ns/op # Warmup Iteration 2: 3,230 ns/op # Warmup Iteration 3: 3,057 ns/op # Warmup Iteration 4: 3,027 ns/op # Warmup Iteration 5: 3,010 ns/op Iteration 1: 3,001 ns/op Iteration 2: 3,001 ns/op Iteration 3: 3,023 ns/op Iteration 4: 3,097 ns/op Iteration 5: 3,017 ns/op Iteration 6: 2,997 ns/op Iteration 7: 3,017 ns/op Iteration 8: 3,011 ns/op Iteration 9: 2,998 ns/op Iteration 10: 2,991 ns/op # Run progress: 66,67% complete, ETA 00:01:01 # Fork: 3 of 3 # Warmup Iteration 1: 3,476 ns/op # Warmup Iteration 2: 3,188 ns/op # Warmup Iteration 3: 2,998 ns/op # Warmup Iteration 4: 2,984 ns/op # Warmup Iteration 5: 3,023 ns/op Iteration 1: 
2,999 ns/op Iteration 2: 3,004 ns/op Iteration 3: 2,998 ns/op Iteration 4: 3,059 ns/op Iteration 5: 3,001 ns/op Iteration 6: 3,006 ns/op Iteration 7: 3,002 ns/op Iteration 8: 2,994 ns/op Iteration 9: 3,005 ns/op Iteration 10: 2,989 ns/op Result "ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyMathOctaPowBenchmark": 3,010 ±(99.9%) 0,014 ns/op [Average] (min, avg, max) = (2,989, 3,010, 3,097), stdev = 0,022 CI (99.9%): [2,996, 3,025] (assumes normal distribution) # JMH version: 1.20 # VM version: JDK 1.8.0_161, VM 25.161-b12 # VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe # VM options: <none> # Warmup: 5 iterations, 1000 ms each # Measurement: 10 iterations, 1000 ms each # Timeout: 10 min per iteration # Threads: 1 thread, will synchronize iterations # Benchmark mode: Average time, time/op # Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyPlainOctaPowBenchmark # Run progress: 75,00% complete, ETA 00:00:45 # Fork: 1 of 3 # Warmup Iteration 1: 3,353 ns/op # Warmup Iteration 2: 3,169 ns/op # Warmup Iteration 3: 2,985 ns/op # Warmup Iteration 4: 3,004 ns/op # Warmup Iteration 5: 3,018 ns/op Iteration 1: 2,994 ns/op Iteration 2: 2,986 ns/op Iteration 3: 2,986 ns/op Iteration 4: 3,041 ns/op Iteration 5: 3,000 ns/op Iteration 6: 2,993 ns/op Iteration 7: 2,999 ns/op Iteration 8: 3,001 ns/op Iteration 9: 3,024 ns/op Iteration 10: 2,995 ns/op # Run progress: 83,33% complete, ETA 00:00:30 # Fork: 2 of 3 # Warmup Iteration 1: 3,371 ns/op # Warmup Iteration 2: 3,190 ns/op # Warmup Iteration 3: 3,010 ns/op # Warmup Iteration 4: 2,992 ns/op # Warmup Iteration 5: 2,995 ns/op Iteration 1: 2,993 ns/op Iteration 2: 3,007 ns/op Iteration 3: 2,999 ns/op Iteration 4: 3,006 ns/op Iteration 5: 2,992 ns/op Iteration 6: 3,009 ns/op Iteration 7: 3,013 ns/op Iteration 8: 3,012 ns/op Iteration 9: 3,010 ns/op Iteration 10: 3,000 ns/op # Run progress: 91,67% complete, ETA 00:00:15 # Fork: 3 of 3 # Warmup Iteration 1: 3,388 ns/op # Warmup Iteration 
2: 3,239 ns/op # Warmup Iteration 3: 3,046 ns/op # Warmup Iteration 4: 3,146 ns/op # Warmup Iteration 5: 3,008 ns/op Iteration 1: 3,023 ns/op Iteration 2: 3,048 ns/op Iteration 3: 3,039 ns/op Iteration 4: 3,094 ns/op Iteration 5: 3,024 ns/op Iteration 6: 3,004 ns/op Iteration 7: 2,991 ns/op Iteration 8: 3,025 ns/op Iteration 9: 3,006 ns/op Iteration 10: 3,006 ns/op Result "ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyPlainOctaPowBenchmark": 3,011 ±(99.9%) 0,015 ns/op [Average] (min, avg, max) = (2,986, 3,011, 3,094), stdev = 0,023 CI (99.9%): [2,996, 3,026] (assumes normal distribution) # Run complete. Total time: 00:03:03 Benchmark Mode Cnt Score Error Units MathBenchmark.mathOctaPowBenchmark avgt 30 76,041 ± 0,428 ns/op MathBenchmark.plainOctaPowBenchmark avgt 30 4,174 ± 0,027 ns/op MathBenchmark.trickyMathOctaPowBenchmark avgt 30 3,010 ± 0,014 ns/op MathBenchmark.trickyPlainOctaPowBenchmark avgt 30 3,011 ± 0,015 ns/op
      
      





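Our reasoning is confirmed by the benchmark results: the difference between using Math.pow(a, 2) and (a * a) turned out to be insignificant. To demonstrate how effective the _dpow intrinsic is, we can run the same benchmark with the intrinsic disabled: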







 Benchmark                                  Mode  Cnt    Score   Error  Units
 MathBenchmark.mathOctaPowBenchmark         avgt   30  195,222 ± 0,850  ns/op
 MathBenchmark.plainOctaPowBenchmark        avgt   30    4,183 ± 0,030  ns/op
 MathBenchmark.trickyMathOctaPowBenchmark   avgt   30   41,158 ± 0,381  ns/op
 MathBenchmark.trickyPlainOctaPowBenchmark  avgt   30    3,081 ± 0,032  ns/op
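To put the two runs side by side, the slowdown caused by disabling the intrinsic can be computed directly from the reported scores. The sketch below does this arithmetic; the class SlowdownSketch is a hypothetical helper, and the numbers are copied from the two result tables (with decimal points instead of the decimal commas used in the JMH output).

```java
import java.util.Locale;

// Hypothetical sketch: slowdown factors computed from the scores reported above.
final class SlowdownSketch {

    // Ratio of the score without the _dpow intrinsic to the score with it.
    static double ratio(double withoutIntrinsic, double withIntrinsic) {
        return withoutIntrinsic / withIntrinsic;
    }

    public static void main(String[] args) {
        // Scores in ns/op: (intrinsic disabled, intrinsic enabled).
        double mathOctaPow = ratio(195.222, 76.041);
        double trickyMathOctaPow = ratio(41.158, 3.010);

        System.out.println(String.format(Locale.ROOT,
                "mathOctaPow slowdown:       %.1fx", mathOctaPow));
        System.out.println(String.format(Locale.ROOT,
                "trickyMathOctaPow slowdown: %.1fx", trickyMathOctaPow));
    }
}
```

On these numbers mathOctaPow becomes roughly 2.6 times slower and trickyMathOctaPow roughly 13.7 times slower, while the two variants that never call Math.pow stay essentially unchanged.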
      
      





Full benchmark output
# JMH version: 1.20

# VM version: JDK 1.8.0_161, VM 25.161-b12

# VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe

# VM options: -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dpow

# Warmup: 5 iterations, 1000 ms each

# Measurement: 10 iterations, 1000 ms each

# Timeout: 10 min per iteration

# Threads: 1 thread, will synchronize iterations

# Benchmark mode: Average time, time/op

# Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.mathOctaPowBenchmark



# Run progress: 0,00% complete, ETA 00:03:00

# Fork: 1 of 3

# Warmup Iteration 1: 194,013 ns/op

# Warmup Iteration 2: 197,926 ns/op

# Warmup Iteration 3: 197,374 ns/op

# Warmup Iteration 4: 197,242 ns/op

# Warmup Iteration 5: 202,265 ns/op

Iteration 1: 198,168 ns/op

Iteration 2: 198,107 ns/op

Iteration 3: 197,629 ns/op

Iteration 4: 195,174 ns/op

Iteration 5: 194,771 ns/op

Iteration 6: 194,804 ns/op

Iteration 7: 194,732 ns/op

Iteration 8: 194,932 ns/op

Iteration 9: 194,964 ns/op

Iteration 10: 194,774 ns/op



# Run progress: 8,33% complete, ETA 00:02:48

# Fork: 2 of 3

# Warmup Iteration 1: 200,032 ns/op

# Warmup Iteration 2: 200,323 ns/op

# Warmup Iteration 3: 195,602 ns/op

# Warmup Iteration 4: 194,705 ns/op

# Warmup Iteration 5: 194,277 ns/op

Iteration 1: 194,657 ns/op

Iteration 2: 195,459 ns/op

Iteration 3: 199,108 ns/op

Iteration 4: 195,154 ns/op

Iteration 5: 195,208 ns/op

Iteration 6: 194,692 ns/op

Iteration 7: 194,406 ns/op

Iteration 8: 194,979 ns/op

Iteration 9: 194,950 ns/op

Iteration 10: 194,234 ns/op



# Run progress: 16,67% complete, ETA 00:02:33

# Fork: 3 of 3

# Warmup Iteration 1: 193,094 ns/op

# Warmup Iteration 2: 192,849 ns/op

# Warmup Iteration 3: 195,101 ns/op

# Warmup Iteration 4: 195,456 ns/op

# Warmup Iteration 5: 194,698 ns/op

Iteration 1: 194,806 ns/op

Iteration 2: 194,887 ns/op

Iteration 3: 194,863 ns/op

Iteration 4: 195,134 ns/op

Iteration 5: 194,379 ns/op

Iteration 6: 193,851 ns/op

Iteration 7: 194,085 ns/op

Iteration 8: 194,743 ns/op

Iteration 9: 194,486 ns/op

Iteration 10: 194,508 ns/op



Result "ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.mathOctaPowBenchmark":

195,222 ±(99.9%) 0,850 ns/op [Average]

(min, avg, max) = (193,851, 195,222, 199,108), stdev = 1,272

CI (99.9%): [194,372, 196,071] (assumes normal distribution)



# JMH version: 1.20

# VM version: JDK 1.8.0_161, VM 25.161-b12

# VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe

# VM options: -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dpow

# Warmup: 5 iterations, 1000 ms each

# Measurement: 10 iterations, 1000 ms each

# Timeout: 10 min per iteration

# Threads: 1 thread, will synchronize iterations

# Benchmark mode: Average time, time/op

# Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.plainOctaPowBenchmark



# Run progress: 25,00% complete, ETA 00:02:17

# Fork: 1 of 3

# Warmup Iteration 1: 4,569 ns/op

# Warmup Iteration 2: 4,238 ns/op

# Warmup Iteration 3: 4,167 ns/op

# Warmup Iteration 4: 4,211 ns/op

# Warmup Iteration 5: 4,267 ns/op

Iteration 1: 4,185 ns/op

Iteration 2: 4,280 ns/op

Iteration 3: 4,186 ns/op

Iteration 4: 4,202 ns/op

Iteration 5: 4,193 ns/op

Iteration 6: 4,360 ns/op

Iteration 7: 4,191 ns/op

Iteration 8: 4,181 ns/op

Iteration 9: 4,176 ns/op

Iteration 10: 4,170 ns/op



# Run progress: 33,33% complete, ETA 00:02:02

# Fork: 2 of 3

# Warmup Iteration 1: 4,573 ns/op

# Warmup Iteration 2: 4,218 ns/op

# Warmup Iteration 3: 4,176 ns/op

# Warmup Iteration 4: 4,155 ns/op

# Warmup Iteration 5: 4,279 ns/op

Iteration 1: 4,251 ns/op

Iteration 2: 4,207 ns/op

Iteration 3: 4,175 ns/op

Iteration 4: 4,174 ns/op

Iteration 5: 4,182 ns/op

Iteration 6: 4,196 ns/op

Iteration 7: 4,169 ns/op

Iteration 8: 4,164 ns/op

Iteration 9: 4,175 ns/op

Iteration 10: 4,157 ns/op



# Run progress: 41,67% complete, ETA 00:01:47

# Fork: 3 of 3

# Warmup Iteration 1: 4,561 ns/op

# Warmup Iteration 2: 4,193 ns/op

# Warmup Iteration 3: 4,139 ns/op

# Warmup Iteration 4: 4,152 ns/op

# Warmup Iteration 5: 4,154 ns/op

Iteration 1: 4,141 ns/op

Iteration 2: 4,144 ns/op

Iteration 3: 4,157 ns/op

Iteration 4: 4,141 ns/op

Iteration 5: 4,162 ns/op

Iteration 6: 4,135 ns/op

Iteration 7: 4,166 ns/op

Iteration 8: 4,156 ns/op

Iteration 9: 4,160 ns/op

Iteration 10: 4,144 ns/op



Result "ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.plainOctaPowBenchmark":

4,183 ±(99.9%) 0,030 ns/op [Average]

(min, avg, max) = (4,135, 4,183, 4,360), stdev = 0,045

CI (99.9%): [4,152, 4,213] (assumes normal distribution)



# JMH version: 1.20

# VM version: JDK 1.8.0_161, VM 25.161-b12

# VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe

# VM options: -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dpow

# Warmup: 5 iterations, 1000 ms each

# Measurement: 10 iterations, 1000 ms each

# Timeout: 10 min per iteration

# Threads: 1 thread, will synchronize iterations

# Benchmark mode: Average time, time/op

# Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyMathOctaPowBenchmark



# Run progress: 50,00% complete, ETA 00:01:31

# Fork: 1 of 3

# Warmup Iteration 1: 41,544 ns/op

# Warmup Iteration 2: 41,150 ns/op

# Warmup Iteration 3: 41,312 ns/op

# Warmup Iteration 4: 41,196 ns/op

# Warmup Iteration 5: 41,002 ns/op

Iteration 1: 43,681 ns/op

Iteration 2: 41,183 ns/op

Iteration 3: 41,598 ns/op

Iteration 4: 41,703 ns/op

Iteration 5: 41,365 ns/op

Iteration 6: 41,210 ns/op

Iteration 7: 41,380 ns/op

Iteration 8: 41,413 ns/op

Iteration 9: 41,481 ns/op

Iteration 10: 41,763 ns/op



# Run progress: 58,33% complete, ETA 00:01:16

# Fork: 2 of 3

# Warmup Iteration 1: 41,665 ns/op

# Warmup Iteration 2: 40,970 ns/op

# Warmup Iteration 3: 40,872 ns/op

# Warmup Iteration 4: 40,926 ns/op

# Warmup Iteration 5: 40,794 ns/op

Iteration 1: 41,103 ns/op

Iteration 2: 40,991 ns/op

Iteration 3: 40,859 ns/op

Iteration 4: 41,046 ns/op

Iteration 5: 41,241 ns/op

Iteration 6: 40,711 ns/op

Iteration 7: 40,571 ns/op

Iteration 8: 40,928 ns/op

Iteration 9: 40,662 ns/op

Iteration 10: 40,911 ns/op



# Run progress: 66,67% complete, ETA 00:01:01

# Fork: 3 of 3

# Warmup Iteration 1: 42,068 ns/op

# Warmup Iteration 2: 41,017 ns/op

# Warmup Iteration 3: 41,260 ns/op

# Warmup Iteration 4: 41,147 ns/op

# Warmup Iteration 5: 40,777 ns/op

Iteration 1: 41,060 ns/op

Iteration 2: 40,881 ns/op

Iteration 3: 41,014 ns/op

Iteration 4: 40,826 ns/op

Iteration 5: 40,977 ns/op

Iteration 6: 40,837 ns/op

Iteration 7: 41,023 ns/op

Iteration 8: 40,749 ns/op

Iteration 9: 40,959 ns/op

Iteration 10: 40,611 ns/op



Result «ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyMathOctaPowBenchmark»:

41,158 ±(99.9%) 0,381 ns/op [Average]

(min, avg, max) = (40,571, 41,158, 43,681), stdev = 0,570

CI (99.9%): [40,777, 41,538] (assumes normal distribution)



# JMH version: 1.20

# VM version: JDK 1.8.0_161, VM 25.161-b12

# VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe

# VM options: -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dpow

# Warmup: 5 iterations, 1000 ms each

# Measurement: 10 iterations, 1000 ms each

# Timeout: 10 min per iteration

# Threads: 1 thread, will synchronize iterations

# Benchmark mode: Average time, time/op

# Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyPlainOctaPowBenchmark



# Run progress: 75,00% complete, ETA 00:00:45

# Fork: 1 of 3

# Warmup Iteration 1: 3,384 ns/op

# Warmup Iteration 2: 3,214 ns/op

# Warmup Iteration 3: 3,063 ns/op

# Warmup Iteration 4: 3,051 ns/op

# Warmup Iteration 5: 3,073 ns/op

Iteration 1: 3,090 ns/op

Iteration 2: 3,045 ns/op

Iteration 3: 3,054 ns/op

Iteration 4: 3,074 ns/op

Iteration 5: 3,058 ns/op

Iteration 6: 3,059 ns/op

Iteration 7: 3,075 ns/op

Iteration 8: 3,092 ns/op

Iteration 9: 3,155 ns/op

Iteration 10: 3,089 ns/op



# Run progress: 83,33% complete, ETA 00:00:30

# Fork: 2 of 3

# Warmup Iteration 1: 3,442 ns/op

# Warmup Iteration 2: 3,315 ns/op

# Warmup Iteration 3: 3,027 ns/op

# Warmup Iteration 4: 3,031 ns/op

# Warmup Iteration 5: 3,051 ns/op

Iteration 1: 3,032 ns/op

Iteration 2: 3,051 ns/op

Iteration 3: 3,050 ns/op

Iteration 4: 3,076 ns/op

Iteration 5: 3,067 ns/op

Iteration 6: 3,018 ns/op

Iteration 7: 3,034 ns/op

Iteration 8: 3,017 ns/op

Iteration 9: 3,041 ns/op

Iteration 10: 3,023 ns/op



# Run progress: 91,67% complete, ETA 00:00:15

# Fork: 3 of 3

# Warmup Iteration 1: 3,415 ns/op

# Warmup Iteration 2: 3,276 ns/op

# Warmup Iteration 3: 3,344 ns/op

# Warmup Iteration 4: 3,226 ns/op

# Warmup Iteration 5: 3,072 ns/op

Iteration 1: 3,150 ns/op

Iteration 2: 3,132 ns/op

Iteration 3: 3,172 ns/op

Iteration 4: 3,101 ns/op

Iteration 5: 3,053 ns/op

Iteration 6: 3,061 ns/op

Iteration 7: 3,106 ns/op

Iteration 8: 3,150 ns/op

Iteration 9: 3,097 ns/op

Iteration 10: 3,204 ns/op



Result «ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyPlainOctaPowBenchmark»:

3,081 ±(99.9%) 0,032 ns/op [Average]

(min, avg, max) = (3,017, 3,081, 3,204), stdev = 0,048

CI (99.9%): [3,049, 3,113] (assumes normal distribution)



# Run complete. Total time: 00:03:03



Benchmark                                  Mode  Cnt    Score   Error  Units
MathBenchmark.mathOctaPowBenchmark         avgt   30  195,222 ± 0,850  ns/op
MathBenchmark.plainOctaPowBenchmark        avgt   30    4,183 ± 0,030  ns/op
MathBenchmark.trickyMathOctaPowBenchmark   avgt   30   41,158 ± 0,381  ns/op
MathBenchmark.trickyPlainOctaPowBenchmark  avgt   30    3,081 ± 0,032  ns/op


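With the _dpow intrinsic disabled, we see the result of an honest call to the native method StrictMath.pow(). Interestingly, three chained calls to StrictMath.pow(x, 2) are still much faster than a single StrictMath.pow(x, 8): about 41 ns/op versus about 195 ns/op. This suggests that the native implementation also has a special case for squaring.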
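The squaring shortcut can be checked without a benchmark harness. Below is a minimal sketch, not the author's code: the class StrictPowSquareCheck and both of its helpers are hypothetical names introduced for illustration. It compares results only, not timings, and it assumes that StrictMath.pow follows the fdlibm reference algorithm, which, as far as we can tell, answers an exponent of exactly 2 with x * x.

```java
// Hypothetical helper class, not part of the original benchmark project.
class StrictPowSquareCheck {

    // Raises a to the 8th power by squaring three times via StrictMath.pow.
    static double octaPowBySquaring(double a) {
        return StrictMath.pow(StrictMath.pow(StrictMath.pow(a, 2), 2), 2);
    }

    // Counts the sample inputs for which StrictMath.pow(x, 2) differs from x * x.
    static int countSquareMismatches(double[] samples) {
        int mismatches = 0;
        for (double x : samples) {
            if (Double.compare(StrictMath.pow(x, 2), x * x) != 0) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // Arbitrary sample values; some are taken from the examples earlier in the article.
        double[] samples = {0.1, 1.010101, 3.0, 101.0101, 1e15};
        System.out.println("mismatches: " + countSquareMismatches(samples));
        System.out.println("2^8 by squaring: " + octaPowBySquaring(2.0));
    }
}
```

If the assumption about the special case holds, the mismatch count stays at zero for any finite input, because the library then performs the same single multiplication as the plain expression.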



Conclusion



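The story of how the _dpow intrinsic is implemented deserves a chapter of its own. Judging by the changes in the OpenJDK repository, the intrinsic keeps changing from release to release, and the developers keep forgetting about special cases. Andrey Pangin (apangin) talked about this at the Joker 2016 conference in his talk "Myths and facts about slow Java".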



Correct answer



Options 3 and 4 are equally fast, thanks to a special case in the intrinsic's implementation that essentially reduces Math.pow(x, 2) to x * x.



Option 2 is slower because it performs more operations: seven multiplications instead of three.



Option 1 is by far the slowest: even though the intrinsic is used, it still goes through the complex general logic for raising a number to a power of type double.
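To make the operation counts concrete, here is a small sketch that puts the four options side by side and checks that they agree numerically. The class name OctaPowVariants and the relativeDifference helper are hypothetical and exist only for illustration; the method bodies mirror the task's four options. The sketch compares results, not speed. The timings quoted in this article come from the JMH runs.

```java
// Hypothetical illustration class; method bodies mirror the task's four options.
class OctaPowVariants {

    // Option 1: one library call with exponent 8.
    static double mathOctaPow(double a) {
        return Math.pow(a, 8);
    }

    // Option 2: seven multiplications, evaluated left to right.
    static double plainOctaPow(double a) {
        return a * a * a * a * a * a * a * a;
    }

    // Option 3: three nested library calls with exponent 2.
    static double trickyMathOctaPow(double a) {
        return Math.pow(Math.pow(Math.pow(a, 2), 2), 2);
    }

    // Option 4: three multiplications via repeated squaring.
    static double trickyPlainOctaPow(double a) {
        a *= a;       // a^2
        a *= a;       // a^4
        return a * a; // a^8
    }

    // Relative difference; comparing doubles with == is rarely meaningful.
    static double relativeDifference(double x, double y) {
        double scale = Math.max(Math.abs(x), Math.abs(y));
        return scale == 0.0 ? 0.0 : Math.abs(x - y) / scale;
    }

    public static void main(String[] args) {
        double a = 1.010101; // arbitrary sample value
        double reference = mathOctaPow(a);
        System.out.println("option 2 vs 1: " + relativeDifference(reference, plainOctaPow(a)));
        System.out.println("option 3 vs 1: " + relativeDifference(reference, trickyMathOctaPow(a)));
        System.out.println("option 4 vs 1: " + relativeDifference(reference, trickyPlainOctaPow(a)));
    }
}
```

The options may differ in the last few bits because floating-point multiplication is not associative, so the comparison uses a relative tolerance rather than ==.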

Statistics



Two conference participants gave the correct answer. Another five answers were partially correct. In total, 32 answers were submitted.



P.S.



All the code is on GitHub: jbreak2018-pow-perf-tests


