☝️ 👎 🎅🏾 ConcurrentReferenceHashMapの作成を高速化する 🤾🏼 🚬 🛀🏽

こんにちは。この記事では、少しの労力でorg.springframework.util.ConcurrentReferenceHashMap

の作成を高速化する方法について説明します。

パフォーマンスの向上に興味がありますか？ようこそ

知能

もちろん、測定から始めて、何が改善されるかを正確に理解しようとします。これを行うには、JMH 1.21、JDK 8、JDK 11、およびasync-profilerを使用します。

空の辞書を作成するのにどれくらいかかるかを知るために、簡単な体験をしました：

 @Benchmark public Object original() { return new ConcurrentReferenceHashMap(); }

プロファイルは次のようになります。

 55.21% 2429743 osuConcurrentReferenceHashMap.calculateShift 20.30% 891404 osuConcurrentReferenceHashMap$Segment.<init> 8.79% 387198 osuConcurrentReferenceHashMap.<init> 3.35% 147651 java.util.concurrent.locks.ReentrantLock.<init> 2.34% 102804 java.lang.ref.ReferenceQueue.<init> 1.61% 70748 osuConcurrentReferenceHashMap.createReferenceManager 1.53% 67265 osuConcurrentReferenceHashMap$Segment.createReferenceArray 0.78% 34493 java.lang.ref.ReferenceQueue$Lock.<init> 0.76% 33546 osuConcurrentReferenceHashMap$ReferenceManager.<init> 0.36% 15948 osuAssert.isTrue

方向は明確です。続行できます。

数学

そのため、 calculateShift

メソッドに時間の大部分を費やしています。ここにあります：

 protected static int calculateShift(int minimumValue, int maximumValue) { int shift = 0; int value = 1; while (value < minimumValue && value < maximumValue) { value <<= 1; shift++; } return shift; }

新しいものを思い付くのは難しいので、使用に切り替えましょう。

 public ConcurrentReferenceHashMap(/*...*/ int concurrencyLevel, /*...*/) { //... this.shift = calculateShift(concurrencyLevel, MAXIMUM_CONCURRENCY_LEVEL); //... } // ConcurrentReferenceHashMap$Segment public Segment(int initialCapacity) { this.referenceManager = createReferenceManager(); this.initialSize = 1 << calculateShift(initialCapacity, MAXIMUM_SEGMENT_SIZE); this.references = createReferenceArray(this.initialSize); this.resizeThreshold = (int) (this.references.length * getLoadFactor()); }

Segment

コンストラクターの使用に注意してください。

 int roundedUpSegmentCapacity = (int) ((initialCapacity + size - 1L) / size); //... for (int i = 0; i < this.segments.length; i++) { this.segments[i] = new Segment(roundedUpSegmentCapacity); }

roundedUpSegmentCapacity

の値はループを通過するとき定数であるため、 Segment

コンストラクターで実行される式1 << calculateShift(initialCapacity, MAXIMUM_SEGMENT_SIZE)

も常に一定です。したがって、コンストラクターとループの外側で指定された式を使用できます。

式(int) (this.references.length * getLoadFactor())

についても同じステートメントが当てはまります。これは、 references

配列が(int) (this.references.length * getLoadFactor())

変数を使用して作成され、各セグメントの作成時にサイズが一定であるためです。コンストラクタとループの境界から式を引き出します。

配列

createReferenceArray

メソッドを検討してください。

 private Reference<K, V>[] createReferenceArray(int size) { return (Reference<K, V>[]) Array.newInstance(Reference.class, size); }

Array::newInstance

使用Array::newInstance

明らかに冗長です。コンストラクタを使用して配列を作成することを妨げるものは何もありません。

 private Reference<K, V>[] createReferenceArray(int size) { return new Reference[size]; }

コンストラクターのパフォーマンスは、C2レベルでArray::newInstance

を呼び出すよりも劣りませんが、C1モード（ -XX:TieredStopAtLevel=1

プロパティ-XX:TieredStopAtLevel=1

）およびインタープリター（ -Xint

プロパティ）の小さな配列の場合よりも大幅に優れています。

 //C2 length Mode Cnt Score Error Units constructor 10 avgt 50 5,6 ± 0,0 ns/op constructor 100 avgt 50 29,7 ± 0,1 ns/op constructor 1000 avgt 50 242,7 ± 1,3 ns/op newInstance 10 avgt 50 5,5 ± 0,0 ns/op newInstance 100 avgt 50 29,7 ± 0,1 ns/op newInstance 1000 avgt 50 249,3 ± 9,6 ns/op //C1 length Mode Cnt Score Error Units constructor 10 avgt 50 6,8 ± 0,1 ns/op constructor 100 avgt 50 36,3 ± 0,6 ns/op constructor 1000 avgt 50 358,6 ± 6,4 ns/op newInstance 10 avgt 50 91,0 ± 2,4 ns/op newInstance 100 avgt 50 127,2 ± 1,8 ns/op newInstance 1000 avgt 50 322,8 ± 7,2 ns/op //-Xint length Mode Cnt Score Error Units constructor 10 avgt 50 126,3 ± 5,9 ns/op constructor 100 avgt 50 154,7 ± 2,6 ns/op constructor 1000 avgt 50 364,2 ± 6,2 ns/op newInstance 10 avgt 50 251,2 ± 11,3 ns/op newInstance 100 avgt 50 287,5 ± 11,4 ns/op newInstance 1000 avgt 50 486,5 ± 8,5 ns/op

置換はベンチマークに影響を与えませんが、C2がまだ機能していないときに、アプリケーションの起動時にコードを高速化します。このモードの詳細については、記事の最後で説明します。

キーささいなこと

コンストラクターConcurrentReferenceHashMap

戻りましょう

 ConcurrentReferenceHashMap(/*...*/) { Assert.isTrue(initialCapacity >= 0, "Initial capacity must not be negative"); Assert.isTrue(loadFactor > 0f, "Load factor must be positive"); Assert.isTrue(concurrencyLevel > 0, "Concurrency level must be positive"); Assert.notNull(referenceType, "Reference type must not be null"); this.loadFactor = loadFactor; this.shift = calculateShift(concurrencyLevel, MAXIMUM_CONCURRENCY_LEVEL); int size = 1 << this.shift; this.referenceType = referenceType; int roundedUpSegmentCapacity = (int) ((initialCapacity + size - 1L) / size); this.segments = (Segment[]) Array.newInstance(Segment.class, size); for (int i = 0; i < this.segments.length; i++) { this.segments[i] = new Segment(roundedUpSegmentCapacity); } }

Array.newInstance

、 Array.newInstance

をコンストラクターに置き換えると、コンパイルエラーが発生します。しかし、このサイクルは非常に興味深いものです。むしろ、 segments

分野へのアピールです。壊滅的な（場合によっては）パフォーマンスがどのようになるかを確認するには、そのような魅力をNitzan Wakartの記事The volatile read supriseでアドバイスします。

この記事で説明されているケースは、問題のコードと相関しているように思えます。セグメントに焦点を当てる：

 this.segments = (Segment[]) Array.newInstance(Segment.class, size); for (int i = 0; i < this.segments.length; i++) { this.segments[i] = new Segment(roundedUpSegmentCapacity); }

配列を作成した直後に、 ConcurrentReferenceHashMap.segments

フィールドに書き込まれ、ループが相互作用するのはこのフィールドです。 Segmentコンストラクター内では、ボラティリティフィールドreferences

レコードがありreferences

。

 private volatile Reference<K, V>[] references; public Segment(int initialCapacity) { //... this.references = createReferenceArray(this.initialSize); //... }

これは、 segments

フィールドへのアクセスを改善することができないことを意味します。つまり、その内容はサイクルの各ターンで読み出されます。この声明の真実を検証するには？最も簡単な方法は、コードを別のパッケージにコピーし、 Segment.references

フィールドの宣言からvolatile

を削除することSegment.references

。

 protected final class Segment extends ReentrantLock { //  private volatile Reference<K, V>[] references; //  private Reference<K, V>[] references; }

何かが変更されたかどうかを確認します。

 @Benchmark public Object original() { return new tsypanov.map.original.ConcurrentReferenceHashMap(); } @Benchmark public Object nonVolatileSegmentReferences() { return new tsypanov.map.nonvolatile.ConcurrentReferenceHashMap(); }

パフォーマンスが大幅に向上しています（JDK 8）：

 Benchmark Mode Cnt Score Error Units original avgt 100 732,1 ± 15,8 ns/op nonVolatileSegmentReferences avgt 100 610,6 ± 15,4 ns/op

JDK 11では、費やされる時間は短縮されましたが、相対的なギャップはほとんど変わりませんでした。

 Benchmark Mode Cnt Score Error Units original avgt 100 473,8 ± 11,2 ns/op nonVolatileSegmentReferences avgt 100 401,9 ± 15,5 ns/op

もちろん、 volatile

はその場所に戻され、別の方法を探す必要があります。ボトルネックが発見されました-これはフィールドへのアクセスです。もしそうなら、 segments

変数を作成し、配列を埋めてからフィールドに書き込むことができます：

 Segment[] segments = (Segment[]) Array.newInstance(Segment.class, size); for (int i = 0; i < segments.length; i++) { segments[i] = new Segment(roundedUpSegmentCapacity); } this.segments = segments;

その結果、このような単純な改善でも、大幅な増加が達成されました。

Jdk 8

 Benchmark Mode Cnt Score Error Units originalConcurrentReferenceHashMap avgt 100 712,1 ± 7,2 ns/op patchedConcurrentReferenceHashMap avgt 100 496,5 ± 4,6 ns/op

Jdk 11

 Benchmark Mode Cnt Score Error Units originalConcurrentReferenceHashMap avgt 100 536,0 ± 8,4 ns/op patchedConcurrentReferenceHashMap avgt 100 486,4 ± 9,3 ns/op

「Arrays :: newInstance」を「new T []」に置き換えるとどうなりますか

IdeaからSpring Boothアプリケーションを起動するとき、開発者はしばしば「起動最適化を有効にする」フラグを設定します-XX:TieredStopAtLevel=1 -noverify

これにより、VM引数に-XX:TieredStopAtLevel=1 -noverify

が追加されます。指定された引数で測定してみましょう：

 // JDK 8 -XX:TieredStopAtLevel=1 -noverify Benchmark Mode Cnt Score Error Units originalConcurrentReferenceHashMap avgt 100 1920,9 ± 24,2 ns/op patchedConcurrentReferenceHashMap avgt 100 592,0 ± 25,4 ns/op // JDK 11 -XX:TieredStopAtLevel=1 -noverify Benchmark Mode Cnt Score Error Units originalConcurrentReferenceHashMap avgt 100 1838,9 ± 8,0 ns/op patchedConcurrentReferenceHashMap avgt 100 549,7 ± 6,7 ns/op

3倍以上の増加！

これは何のためですか？

特に、これはSpring Data JPAでプロジェクションを返すクエリを高速化するために必要です。

JMCプロファイルは、 ConcurrentReferenceHashMap

作成にフォームのクエリの実行にかかる時間のほぼ5分の1がかかることを示しています

 public interface SimpleEntityRepository extends JpaRepository<SimpleEntity, Long> { List<HasIdAndName> findAllByName(String name); }

HasIdAndName

はビュー投影です

 public interface HasIdAndName { int getId(); String getName(); }

また、 ConcurrentReferenceHashMap

Springコードで数十回ConcurrentReferenceHashMap

れているため、間違いなく不要です。

結論

パフォーマンスを改善することは、一見したほど難しいことではありません
サイクルの近くでの揮発性アクセスは、考えられるボトルネックの1つです
不変式を探し、それらをサイクルから外します

何を読む

Nitzan Wakartによる記事

コード例

変更点：

https://github.com/spring-projects/spring-framework/pull/1873

https://github.com/spring-projects/spring-framework/pull/2051

ConcurrentReferenceHashMapの作成を高速化する

知能

数学

配列

キーささいなこと

「Arrays :: newInstance」を「new T []」に置き換えるとどうなりますか

これは何のためですか？

結論

何を読む

More articles: