Tuning Memory and the Network Stack in Linux: The Story of Migrating Heavily Loaded Servers to a New Distribution




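Until recently, Odnoklassniki used a partially updated OpenSuSE 10.2 as its main Linux distribution. Maintaining it became increasingly difficult, so last year we began actively migrating to CentOS 7. During the preparation phase we worked out all the internal procedures and prepared the configs and configuration policies (we use CFEngine). As a result, moving a server from one distribution to the other usually comes down to installing the OS via kickstart and rolling out the application with our in-house deployment system. That is how it goes in most cases, but not in all of them.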



The biggest problems, however, appeared while migrating the video distribution servers. It took us six months to solve them.



Briefly about the configuration:





A few metrics that characterize video distribution:





Problem 1: a sharp increase in CPU system time






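Right after we put the first server into service, CPU load in system time began to grow rapidly.

At the same time, many migration and ksoftirqd processes showed up in top. The first thing we tried was changing kernel settings. What did not help us: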



During the next load spike, perf top showed 50% of the load in isolate_freepages_block. Unfortunately, the name of the call told us nothing, but the words "free pages" made us wary. The server had about 45 GB of free memory. From experience we already knew that if a server has plenty of free memory and the kernel still complains about a shortage (which can also show up as the OOM killer firing), the problem is most likely fragmentation. We flushed the disk cache (echo 3 > /proc/sys/vm/drop_caches), the server recovered immediately, and that only confirmed the assumption.
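One way to look at fragmentation directly, rather than inferring it from a cache flush, is /proc/buddyinfo, which lists the number of free blocks of each order per zone. The following is a minimal Python sketch, not something the article describes: the function names, the order threshold of 4, and the sample line are illustrative assumptions.

```python
def parse_buddyinfo(text):
    """Return {(node, zone): [free block count for order 0, 1, 2, ...]}."""
    zones = {}
    for line in text.splitlines():
        parts = line.replace(",", " ").split()
        # Expected shape: Node <n> zone <name> <count order 0> <count order 1> ...
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        zones[(int(parts[1]), parts[3])] = [int(c) for c in parts[4:]]
    return zones


def high_order_share(counts, min_order=4):
    """Fraction of free pages that sit in blocks of at least 2**min_order pages."""
    pages = [count << order for order, count in enumerate(counts)]
    total = sum(pages)
    return sum(pages[min_order:]) / total if total else 0.0


# Made-up sample line in the /proc/buddyinfo layout (orders 0 through 10).
SAMPLE = "Node 0, zone   Normal   4000   2000   100   10   1   0   0   0   0   0   0\n"

zones = parse_buddyinfo(SAMPLE)
share = high_order_share(zones[(0, "Normal")])
print(f"free memory in order>=4 blocks: {share:.2%}")
```

On a real host the text would come from reading /proc/buddyinfo. A very small share of free pages in higher-order blocks, despite a lot of free memory overall, is consistent with the fragmentation described above.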



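Memory fragmentation is a common problem (not specific to Linux), and changes to combat it are regularly made in the kernel. One source of fragmentation is the kernel itself, or more precisely the disk cache, which can be neither disabled nor limited in size. This does not mean that the fragmentation in our case was caused by the disk cache, but the exact cause was not that important. What mattered more was the fix: defragmentation. The kernel has such a mechanism, but it clearly could not keep up (memory freeing, that is global reclaim, was triggered instead). Defragmentation starts only when free memory drops below a certain mark (the zone watermark), and in our case that happened too late. The only way to make it start earlier is to raise min_free_kbytes via sysctl. This parameter tells the kernel to keep part of the memory free, and to satisfy that requirement it has to run defragmentation earlier. In our case a value of 1 GB was enough.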
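Because min_free_kbytes is expressed in kilobytes, the 1 GB figure has to be converted before it goes into sysctl. A small sketch of that arithmetic follows. The helper name is hypothetical, and it assumes that "1 GB" means 1 GiB (1024 * 1024 kB).

```python
def min_free_kbytes_line(gib):
    """Build a sysctl.conf line that reserves `gib` GiB via vm.min_free_kbytes."""
    kib = int(gib * 1024 * 1024)  # 1 GiB = 1024 * 1024 KiB
    return f"vm.min_free_kbytes = {kib}"


line = min_free_kbytes_line(1)
print(line)
```

The same value can be applied at runtime with sysctl -w vm.min_free_kbytes=1048576. Whether 1 GB is appropriate depends on the host, as the article only reports that it was enough for these servers.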



Problem 2: falling into swap



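First, it is worth mentioning that we use vm.swappiness = 0, because we are convinced that swap is evil for us. So: the server has about 45 GB of free memory, and yet it periodically goes into swap. How is that possible? To begin with, recall that memory in Linux is not one big monolithic piece.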





At first we blamed fragmentation again, but raising min_free_kbytes further and increasing vfs_cache_pressure did not help. We had to get acquainted with the numastat utility (numastat -m).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java)
                     Node 0       Node 1        Total
Huge                      0            0            0
Heap                   0.20         0.14         0.34
Stack                118.82       137.87       256.70
Private            80200.73    123323.81    203524.55
Total              80319.76    123461.82    203781.58

Per-node system memory usage (in MBs):
                     Node 0       Node 1        Total
MemTotal          131032.75    131072.00    262104.75
MemFree             1228.93       639.84      1868.78
MemUsed           129803.82    130432.16    260235.98
Active             23224.13    121073.42    144297.55
Inactive          101138.88      3753.98    104892.85
Active(anon)        1690.50    120997.86    122688.36
Inactive(anon)     79528.66      3560.95     83089.61
Active(file)       21533.63        75.57     21609.20
Inactive(file)     21610.21       193.03     21803.24
Unevictable               0            0            0
Mlocked                   0            0            0
Dirty                  0.11         0.02         0.13
Writeback                 0            0            0
FilePages         122397.46    124295.47    246692.93
Mapped             78436.03    122947.26    201383.29
AnonPages           1966.62       532.02      2498.64
Shmem              79251.21    123964.70    203215.90
KernelStack            2.44         2.57         5.01
PageTables           158.62       252.29       410.91
NFS_Unstable              0            0            0
Bounce                    0            0            0
WritebackTmp              0            0            0
Slab                1801.95      1932.29      3734.23
SReclaimable        1653.13      1818.79      3471.92
SUnreclaim           148.82       113.49       262.31
AnonHugePages       1856.00       498.00      2354.00
HugePages_Total           0            0            0
HugePages_Free            0            0            0
HugePages_Surp            0            0            0
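To make the per-node picture in this output easier to read, the rows can be turned into per-node shares. The sketch below is only an illustration. The parser and function names are hypothetical, and the three sample rows are copied from the numastat output above.

```python
def parse_numastat_rows(text):
    """Return {row name: [node 0, node 1, ..., total]} for numeric rows only."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        try:
            values = [float(p) for p in parts[1:]]
        except ValueError:
            continue  # header or title line, not a data row
        rows[parts[0]] = values
    return rows


def node_shares(values):
    """Share of each node in the row total (the last column is the total)."""
    *per_node, total = values
    return [v / total if total else 0.0 for v in per_node]


SAMPLE = """Per-node system memory usage (in MBs):
Node 0 Node 1 Total
MemFree 1228.93 639.84 1868.78
Active(anon) 1690.50 120997.86 122688.36
Shmem 79251.21 123964.70 203215.90
"""

rows = parse_numastat_rows(SAMPLE)
for name in ("MemFree", "Active(anon)", "Shmem"):
    shares = ", ".join(f"{s:.0%}" for s in node_shares(rows[name]))
    print(f"{name:14s} {shares}")
```

In this sample, node 1 holds about 61% of Shmem and almost all of the active anonymous memory, while having roughly half as much free memory as node 0.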

NUMA, 1 tmpfs . interleave (numactl --interleave=all, ) .



Problem 3

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .
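The one-queue-per-core pinning mentioned here (RSS queues with their interrupts spread by a script such as set_irq_affinity) boils down to writing a CPU bitmask for each queue's IRQ into /proc/irq/<irq>/smp_affinity. The sketch below only computes such masks. It is not the set_irq_affinity script, the function name is hypothetical, and it assumes 16 queues mapped onto CPUs 0 to 15.

```python
def queue_affinity_plan(num_queues, cpus):
    """Map each RX queue index to a hex CPU mask, one CPU per queue, round-robin."""
    plan = {}
    for queue in range(num_queues):
        cpu = cpus[queue % len(cpus)]
        # Bit N set in the mask means "CPU N may handle this interrupt".
        plan[queue] = format(1 << cpu, "x")
    return plan


plan = queue_affinity_plan(16, list(range(16)))
for queue, mask in plan.items():
    print(f"queue {queue:2d} -> smp_affinity mask {mask}")
```

The IRQ number that belongs to each queue is host-specific (it can be looked up in /proc/interrupts) and is deliberately left out of the sketch.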

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.
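The counters printed by ethtool -S ethN | grep fdir can be reduced to a single miss ratio. A minimal sketch follows, assuming the driver exposes counters named fdir_match and fdir_miss; only fdir_miss is named in the text, so fdir_match is an assumption. The sample numbers are made up for illustration.

```python
def parse_ethtool_stats(text):
    """Parse 'name: value' lines of `ethtool -S` output into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def fdir_miss_ratio(stats):
    """Share of looked-up packets that did not match a Flow Director entry."""
    match = stats.get("fdir_match", 0)
    miss = stats.get("fdir_miss", 0)
    total = match + miss
    return miss / total if total else 0.0


# Made-up sample in the layout printed by `ethtool -S`.
SAMPLE = """NIC statistics:
     rx_packets: 1000
     fdir_match: 250
     fdir_miss: 750
"""

stats = parse_ethtool_stats(SAMPLE)
ratio = fdir_miss_ratio(stats)
print(f"fdir miss ratio: {ratio:.0%}")
```

Comparing two snapshots taken a few seconds apart would give the current rate rather than the total since the driver was loaded.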



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .
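For reference, RPS and RFS are configured through a handful of files described in scaling.txt: rps_cpus and rps_flow_cnt per receive queue, and the global rps_sock_flow_entries. The sketch below generates the values to write. It assumes 16 receive queues, 32 CPUs, every CPU enabled in rps_cpus, and the 32768-entry flow table that scaling.txt suggests. None of these values are confirmed as the article's actual settings, and the function name is hypothetical.

```python
def rps_rfs_settings(dev, num_queues, num_cpus, sock_flow_entries=32768):
    """Return {path: value} for enabling RPS on all CPUs and RFS on every queue."""
    all_cpus = format((1 << num_cpus) - 1, "x")  # bitmask with one bit per CPU
    settings = {"/proc/sys/net/core/rps_sock_flow_entries": str(sock_flow_entries)}
    for queue in range(num_queues):
        base = f"/sys/class/net/{dev}/queues/rx-{queue}"
        settings[f"{base}/rps_cpus"] = all_cpus
        # scaling.txt: rps_flow_cnt = rps_sock_flow_entries / number of queues
        settings[f"{base}/rps_flow_cnt"] = str(sock_flow_entries // num_queues)
    return settings


settings = rps_rfs_settings("eth0", num_queues=16, num_cpus=32)
for path, value in settings.items():
    print(f"echo {value} > {path}")
```

The script only prints the commands. Applying them requires root, and the CPU mask would normally be narrowed to the cores that should take part in packet processing.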



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



Problem 4: broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .






, — interrupt coalescing. , .



Problem 5

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .






, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





[1] https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf
[2] https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD
[3] https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.






Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type






Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.
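The fdir counters printed by ethtool -S ethN | grep fdir can be turned into a miss ratio. This is a rough sketch: it assumes the driver exposes counters named fdir_match and fdir_miss (as the ixgbe driver does), and the sample text is invented for illustration. `fdir_counters` and `miss_ratio` are hypothetical helper names.

```python
def fdir_counters(ethtool_stats_text):
    """Collect 'fdir_*: N' lines from `ethtool -S` output into a dict."""
    counters = {}
    for line in ethtool_stats_text.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip()
        if sep and name.startswith("fdir_"):
            counters[name] = int(value)
    return counters


def miss_ratio(counters):
    """Share of packets that did not match any Flow Director filter."""
    match = counters.get("fdir_match", 0)
    miss = counters.get("fdir_miss", 0)
    total = match + miss
    return miss / total if total else 0.0


# Invented sample in the `ethtool -S` layout (not measured data).
SAMPLE = """\
NIC statistics:
     rx_packets: 1000000
     fdir_match: 2500
     fdir_miss: 997500
     fdir_overflow: 0
"""

counters = fdir_counters(SAMPLE)
print(f"fdir miss ratio: {miss_ratio(counters):.2%}")
```

A ratio close to 100% would suggest that almost all packets bypass ATR and are distributed by RSS instead.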



ethtool -K ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).
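RPS is configured per receive queue by writing a CPU bitmap to the rps_cpus file described in scaling.txt. The following sketch builds that bitmap and the list of sysfs paths to write. The device name, queue count, and CPU list are assumptions for illustration, and `rps_cpus_value` and `rps_settings` are hypothetical helpers.

```python
def rps_cpus_value(cpus):
    """Build the hex CPU bitmap (comma-separated 32-bit groups) for rps_cpus."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    digits = f"{mask:x}"
    # Pad to a multiple of 8 hex digits so it splits into 32-bit groups.
    digits = digits.zfill(-(-len(digits) // 8) * 8)
    return ",".join(digits[i:i + 8] for i in range(0, len(digits), 8))


def rps_settings(dev, num_rx_queues, cpus):
    """Map every rx queue's rps_cpus path to the same CPU bitmap."""
    value = rps_cpus_value(cpus)
    return {
        f"/sys/class/net/{dev}/queues/rx-{q}/rps_cpus": value
        for q in range(num_rx_queues)
    }


# Assumed example: 16 rx queues on eth0, packets steered to CPUs 0..31.
for path, value in rps_settings("eth0", 16, range(32)).items():
    print(f"echo {value} > {path}")
```

The sketch only prints the commands; applying them requires root and should be tested on a single server first.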



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .
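RFS needs two values: the global rps_sock_flow_entries table size and a per-queue rps_flow_cnt. scaling.txt suggests 32768 as a starting point for the global table and dividing it by the number of rx queues for each queue. The sketch below follows that suggestion; the device name and queue count are assumptions, and `rfs_settings` is a hypothetical helper.

```python
def rfs_settings(dev, num_rx_queues, sock_flow_entries=32768):
    """Return {path: value} for the RFS knobs, split evenly across rx queues."""
    if num_rx_queues <= 0:
        raise ValueError("num_rx_queues must be positive")
    per_queue = sock_flow_entries // num_rx_queues
    settings = {"/proc/sys/net/core/rps_sock_flow_entries": sock_flow_entries}
    for q in range(num_rx_queues):
        settings[f"/sys/class/net/{dev}/queues/rx-{q}/rps_flow_cnt"] = per_queue
    return settings


# Assumed example: 16 rx queues on eth0.
for path, value in rfs_settings("eth0", 16).items():
    print(f"echo {value} > {path}")
```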



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



Problem 4 - broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



[Image]



, — interrupt coalescing. , .



Problem 5

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



[Image]



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —
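The trade-off behind rx-usecs can be estimated with simple arithmetic. The sketch below assumes a simplified model in which the NIC raises at most one receive interrupt per rx-usecs window on each queue; real adapters and adaptive modes behave differently, so treat the numbers as upper bounds. Both function names are hypothetical.

```python
def max_interrupts_per_second(rx_usecs):
    """Upper bound on rx interrupts per second per queue for a given rx-usecs."""
    if rx_usecs <= 0:
        raise ValueError("rx_usecs must be positive in this model")
    return 1_000_000 / rx_usecs


def worst_case_added_latency_ms(rx_usecs):
    """Worst-case delay a packet may wait before the interrupt fires."""
    return rx_usecs / 1000


# Compare a few candidate rx-usecs values (assumed, not recommendations).
for usecs in (10, 100, 1000):
    rate = max_interrupts_per_second(usecs)
    delay = worst_case_added_latency_ms(usecs)
    print(f"rx-usecs={usecs}: <= {rate:.0f} irq/s per queue, <= {delay} ms delay")
```

Larger values reduce the interrupt rate and softirq load at the cost of added latency, which is acceptable for video delivery but not for latency-sensitive traffic.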





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )











[1] https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf
[2] https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD
[3] https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl --interleave=all, ) .



Problem 3

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .
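The 16-queue ceiling of RSS can be illustrated with a small model. This is a sketch under assumptions, not the NIC's actual logic: it assumes the common layout described in scaling.txt (a 128-entry indirection table indexed by the low seven bits of the hash) and assumes each table entry is 4 bits wide, which is one way a limit of 16 queues can arise. The names `build_indirection_table` and `rss_queue` are hypothetical.

```python
RETA_SIZE = 128                 # indirection table entries (low 7 bits of the hash)
ENTRY_BITS = 4                  # assumed width of a single table entry
MAX_QUEUES = 1 << ENTRY_BITS    # a 4-bit entry can name at most 16 queues


def build_indirection_table(num_queues):
    """Spread the table entries over the queues in round-robin order."""
    if not 1 <= num_queues <= MAX_QUEUES:
        raise ValueError(f"RSS can address at most {MAX_QUEUES} queues")
    return [i % num_queues for i in range(RETA_SIZE)]


def rss_queue(flow_hash, table):
    """Pick the receive queue for a packet from its flow hash."""
    return table[flow_hash & (RETA_SIZE - 1)]


table = build_indirection_table(16)
print(rss_queue(0x12345678, table))
```

Under this model, asking for more than 16 queues cannot help RSS itself, because the table entries cannot refer to the extra queues.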



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/-U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]
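The mechanism in the quote above can be modelled in a few lines. This is a simplified sketch, not the hardware algorithm: the CRC-based `flow_hash` is a stand-in for the NIC hash, the packet counter is global rather than per transmit ring, and the class name `AtrModel` is hypothetical. It only shows why a flow falls back to RSS until one of its transmitted packets has been sampled.

```python
import zlib

ATR_SAMPLE_RATE = 20            # every 20th transmitted packet is sampled (default in the quote)
HASH_BITS = 15                  # "up to 15-bit hash result"
TABLE_SIZE = 1 << HASH_BITS


def flow_hash(five_tuple):
    """Stand-in hash of a 5-tuple; the real NIC uses a different function."""
    return zlib.crc32(repr(five_tuple).encode()) & (TABLE_SIZE - 1)


def reverse(five_tuple):
    """Swap source and destination so TX and RX of one flow share a key."""
    src_ip, src_port, dst_ip, dst_port, proto = five_tuple
    return (dst_ip, dst_port, src_ip, src_port, proto)


class AtrModel:
    def __init__(self):
        self.table = {}         # hash -> queue learned from transmitted packets
        self.tx_count = 0

    def transmit(self, five_tuple, tx_queue):
        self.tx_count += 1
        if self.tx_count % ATR_SAMPLE_RATE == 0:
            # Remember the TX queue under the key the reply packets will produce.
            self.table[flow_hash(reverse(five_tuple))] = tx_queue

    def receive(self, five_tuple, rss_queue):
        # Without a learned entry, the packet is placed by RSS instead.
        return self.table.get(flow_hash(five_tuple), rss_queue)


model = AtrModel()
outgoing = ("10.0.0.1", 80, "10.0.0.2", 40000, "tcp")
incoming = reverse(outgoing)
for _ in range(19):
    model.transmit(outgoing, tx_queue=5)
before = model.receive(incoming, rss_queue=3)   # still placed by RSS
model.transmit(outgoing, tx_queue=5)            # 20th packet is sampled
after = model.receive(incoming, rss_queue=3)    # now follows the TX queue
print(before, after)
```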



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.
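To judge how often Flow Director misses, the counters printed by ethtool -S can be turned into a ratio. The sketch below is illustrative: only fdir_miss is named in the text, the fdir_match counter name is assumed from the ixgbe driver, the sample values are invented, and the helper names are hypothetical.

```python
def parse_ethtool_stats(text):
    """Parse 'name: value' lines of `ethtool -S` output into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def fdir_miss_ratio(stats):
    """Share of lookups that did not hit a Flow Director filter."""
    match = stats.get("fdir_match", 0)
    miss = stats.get("fdir_miss", 0)
    total = match + miss
    return miss / total if total else 0.0


# Invented numbers, in the format ethtool -S prints.
sample = """\
NIC statistics:
     fdir_match: 900
     fdir_miss: 100
"""

ratio = fdir_miss_ratio(parse_ethtool_stats(sample))
print(f"fdir miss ratio: {ratio:.1%}")
```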



ethtool -K ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).
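The rps_cpus setting mentioned above takes a hexadecimal CPU bitmap, written per receive queue to /sys/class/net/<dev>/queues/rx-<n>/rps_cpus according to scaling.txt. A small helper can build that string from a list of CPU numbers. This is a sketch, and the function name `cpu_mask` is hypothetical; it only formats the value and does not write to sysfs.

```python
def cpu_mask(cpus):
    """Return a hex CPU bitmap in comma-separated 32-bit groups."""
    value = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        value |= 1 << cpu
    groups = []
    while True:
        groups.append(f"{value & 0xFFFFFFFF:08x}")
        value >>= 32
        if value == 0:
            break
    # The most significant group comes first.
    return ",".join(reversed(groups))


print(cpu_mask(range(16)))     # CPUs 0-15
print(cpu_mask(range(8, 40)))  # a mask that spans two 32-bit groups
```

Masks for machines with more than 32 CPUs are split into 32-bit groups, which is why the helper joins the groups with commas.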



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .
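RFS is configured through two values described in scaling.txt: the global rps_sock_flow_entries and the per-queue rps_flow_cnt, where the document suggests rps_flow_cnt = rps_sock_flow_entries / N for a device with N queues, and both are rounded up to a power of two. The sketch below only computes those numbers. The helper names are hypothetical, and the input of 32768 flows is the example value from scaling.txt, not a value taken from the article.

```python
def round_up_pow2(n):
    """Round a positive integer up to the nearest power of two."""
    if n < 1:
        raise ValueError("n must be positive")
    return 1 << (n - 1).bit_length()


def rfs_settings(expected_flows, num_queues):
    """Compute the global and per-queue RFS table sizes."""
    entries = round_up_pow2(expected_flows)
    per_queue = round_up_pow2(max(1, entries // num_queues))
    return {"rps_sock_flow_entries": entries, "rps_flow_cnt": per_queue}


settings = rfs_settings(expected_flows=32768, num_queues=16)
print(settings)
```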



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



Problem 4 - broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .






, — interrupt coalescing. , .



Problem 5

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .






, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —
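The effect of the usecs and frames parameters on the interrupt rate can be estimated with an idealized model: an interrupt is raised once the configured delay has elapsed or the configured number of frames has arrived, whichever comes first. This is a sketch under that assumption. Real drivers, especially with adaptive moderation, behave differently, and the function name is hypothetical.

```python
def estimated_interrupt_rate(pps, usecs, frames):
    """Estimate interrupts per second for one queue.

    pps    - packet rate on the queue (packets per second)
    usecs  - rx-usecs/tx-usecs style delay, 0 means no time limit
    frames - rx-frames/tx-frames style limit, 0 means no frame limit
    """
    if pps <= 0:
        return 0.0
    intervals = []
    if usecs > 0:
        intervals.append(usecs / 1_000_000)
    if frames > 0:
        intervals.append(frames / pps)
    if not intervals:
        return float(pps)  # no coalescing: one interrupt per packet
    # Never more interrupts than packets.
    return min(float(pps), 1.0 / min(intervals))


for usecs in (0, 10, 100):
    rate = estimated_interrupt_rate(pps=1_000_000, usecs=usecs, frames=0)
    print(f"rx-usecs={usecs}: about {rate:,.0f} interrupts/s")
```

In this model, a larger delay trades latency for fewer interrupts, which is the trade-off the section describes.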





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





[1] https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf
[2] https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD
[3] https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl --interleave=all, ) .
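One way to read a numastat table like the one above is to compute how much of each row sits on each NUMA node. The sketch below does that; the function names are hypothetical, and the two sample rows are copied from the output shown in this section. A strongly lopsided row is the kind of imbalance that `numactl --interleave=all` is meant to even out.

```python
def parse_numastat(text: str) -> dict[str, list[float]]:
    """Map each row label to its numeric columns (Node 0, Node 1, ..., Total)."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        try:
            rows[parts[0]] = [float(p) for p in parts[1:]]
        except ValueError:
            continue  # header lines such as "Node 0  Node 1  Total"
    return rows


def node_share(values: list[float], node: int) -> float:
    """Fraction of a row's total that lives on the given NUMA node."""
    return values[node] / values[-1]


SAMPLE = """\
                    Node 0     Node 1      Total
MemFree            1228.93     639.84    1868.78
Shmem             79251.21  123964.70  203215.90
"""

if __name__ == "__main__":
    rows = parse_numastat(SAMPLE)
    print(f"Shmem on node 1: {node_share(rows['Shmem'], 1):.0%}")
```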



Problem 3

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .
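Pinning each queue's interrupt to its own core comes down to writing a CPU bitmask into `/proc/irq/<N>/smp_affinity`. The sketch below only computes the masks and the files they would go to; it does not reproduce the set_irq_affinity script, the function names are made up, and the IRQ numbers are placeholders that would really come from `/proc/interrupts`.

```python
def cpu_mask(cpus) -> str:
    """Hex CPU bitmask in the comma-separated 32-bit groups the kernel uses."""
    bits = 0
    for cpu in cpus:
        bits |= 1 << cpu
    digits = format(bits, "x")
    groups = []
    while digits:
        groups.append(digits[-8:])  # peel off 32 bits (8 hex digits) at a time
        digits = digits[:-8]
    return ",".join(reversed(groups))


def pin_queues(irqs: list[int], cpus: list[int]) -> dict[str, str]:
    """Plan one CPU per queue IRQ, wrapping around if there are more IRQs."""
    return {
        f"/proc/irq/{irq}/smp_affinity": cpu_mask([cpus[i % len(cpus)]])
        for i, irq in enumerate(irqs)
    }


if __name__ == "__main__":
    # IRQ numbers 64..71 are hypothetical placeholders.
    for path, mask in pin_queues(list(range(64, 72)), list(range(8))).items():
        print(f"echo {mask} > {path}")
```

A balancing daemon that rewrites affinities would undo such a static layout, so the two approaches should not be mixed on the same interrupts.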

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type
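For illustration, a perfect-filter (ntuple) rule that steers one TCP destination port to a chosen receive queue can be expressed as an `ethtool -U` call. The helper name, the device name, the port and the queue number below are all assumptions; the sketch only builds the argument list and does not touch the NIC.

```python
import shlex


def ntuple_rule(dev: str, dst_port: int, queue: int) -> list[str]:
    """Build an `ethtool -U` rule sending TCP/IPv4 traffic for a port to a queue."""
    return [
        "ethtool", "-U", dev,
        "flow-type", "tcp4",
        "dst-port", str(dst_port),
        "action", str(queue),
    ]


if __name__ == "__main__":
    # Hypothetical example: port 80 handled by receive queue 2 on eth0.
    print(shlex.join(ntuple_rule("eth0", 80, 2)))
    # Installed rules can be listed afterwards with: ethtool -u eth0
```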



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir
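The behaviour quoted from [1] can be modelled in a few lines. This is a toy sketch under stated assumptions: CRC32 stands in for the NIC's own hash, a single transmit counter replaces whatever counters the driver keeps, and the class and method names are invented. It shows why the first packets of a flow miss the table: nothing is learned until a SYN or every 20th transmitted packet is sampled.

```python
import zlib


class AtrTable:
    """Toy model of a signature-filter table: hash index -> queue."""

    def __init__(self, sample_rate: int = 20, hash_bits: int = 15):
        self.sample_rate = sample_rate
        self.mask = (1 << hash_bits) - 1
        self.sent = 0
        self.table = {}

    def _index(self, proto, a, b) -> int:
        # Sort the endpoints so both directions of a flow hash alike.
        # CRC32 is a stand-in; the NIC uses its own hash function.
        lo, hi = sorted([a, b])
        return zlib.crc32(f"{proto}|{lo}|{hi}".encode()) & self.mask

    def on_transmit(self, proto, src, dst, tx_queue: int, syn: bool = False):
        """Learn the flow's queue on a SYN or on every sample_rate-th packet."""
        self.sent += 1
        if syn or self.sent % self.sample_rate == 0:
            self.table[self._index(proto, src, dst)] = tx_queue

    def on_receive(self, proto, src, dst):
        """Return the learned queue, or None on a miss (fall back to RSS)."""
        return self.table.get(self._index(proto, src, dst))


if __name__ == "__main__":
    server, client = ("10.0.0.1", 80), ("10.0.0.2", 40000)
    atr = AtrTable()
    for _ in range(20):
        atr.on_transmit("tcp", server, client, tx_queue=5)
    print(atr.on_receive("tcp", client, server))
```

Because the table is indexed by a 15-bit hash, unrelated flows can collide and overwrite each other's entries, which is one more source of packets landing on an unexpected queue.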



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.
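A quick way to watch counters such as `fdir_miss` is to parse the output of `ethtool -S ethN` and compute the share of packets that missed the Flow Director table. The sketch below assumes the usual `name: value` layout of that output; the function names are hypothetical and the sample numbers are invented, not measurements from the servers.

```python
def fdir_counters(ethtool_stats: str) -> dict[str, int]:
    """Extract the fdir_* counters from `ethtool -S` style text."""
    counters = {}
    for line in ethtool_stats.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip()
        if sep and name.startswith("fdir"):
            counters[name] = int(value)
    return counters


def miss_ratio(counters: dict[str, int]) -> float:
    """Share of Flow Director lookups that ended in a miss."""
    total = counters["fdir_match"] + counters["fdir_miss"]
    return counters["fdir_miss"] / total if total else 0.0


# Invented sample in the layout ethtool prints.
SAMPLE = """\
NIC statistics:
     rx_packets: 5000
     fdir_match: 900
     fdir_miss: 100
     fdir_overflow: 0
"""

if __name__ == "__main__":
    print(miss_ratio(fdir_counters(SAMPLE)))
```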



ethtool -K ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).
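RPS is configured per receive queue through the `rps_cpus` file described in scaling.txt. The sketch below lists which file would get which mask when every queue is allowed to hand packets to the same set of cores. The function name, the device name and the choice of 16 CPUs are assumptions for illustration; which cores to include is exactly the tuning decision discussed above.

```python
def rps_plan(dev: str, rx_queues: int, cpus: list[int]) -> dict[str, str]:
    """Map each rx queue's rps_cpus file to one shared hex CPU mask."""
    if not cpus or min(cpus) < 0 or max(cpus) > 31:
        # Larger CPU numbers need comma-separated 32-bit groups.
        raise ValueError("this sketch only handles CPU numbers 0..31")
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return {
        f"/sys/class/net/{dev}/queues/rx-{q}/rps_cpus": format(mask, "x")
        for q in range(rx_queues)
    }


if __name__ == "__main__":
    for path, mask in rps_plan("eth0", 16, list(range(16))).items():
        print(f"echo {mask} > {path}")
```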



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .
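RFS has two knobs in scaling.txt: the global `rps_sock_flow_entries` table and a per-queue `rps_flow_cnt`, with the per-queue value usually set to the global size divided by the number of queues. The sketch below computes those values; the function name is made up, and 32768 entries is the example figure from the kernel documentation, not a value confirmed for these servers.

```python
def rfs_plan(dev: str, rx_queues: int, sock_flow_entries: int = 32768) -> dict[str, int]:
    """Return the files RFS reads and the values a simple even split would use."""
    per_queue = sock_flow_entries // rx_queues
    plan = {"/proc/sys/net/core/rps_sock_flow_entries": sock_flow_entries}
    for q in range(rx_queues):
        plan[f"/sys/class/net/{dev}/queues/rx-{q}/rps_flow_cnt"] = per_queue
    return plan


if __name__ == "__main__":
    for path, value in rfs_plan("eth0", 16).items():
        print(f"echo {value} > {path}")
```

RFS only takes effect on queues where RPS is also enabled, since it reuses the same steering path and merely changes how the target CPU is picked.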



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



Problem 4: broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl --interleave=all, ) .
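As a rough illustration of the imbalance, the sketch below computes each node's share of the java process's Private memory, using the two per-node figures from the numastat output quoted in this article. The helper name is hypothetical, and nothing is read from the system.

```python
def node_shares(per_node_mb):
    """Return each NUMA node's fraction of the total, given per-node MB values."""
    total = sum(per_node_mb)
    if total == 0:
        return [0.0 for _ in per_node_mb]
    return [mb / total for mb in per_node_mb]


# "Private" row for PID 7781 (java) from the numastat output above, in MB.
private_mb = [80200.73, 123323.81]
shares = node_shares(private_mb)

if __name__ == "__main__":
    for node, share in enumerate(shares):
        print(f"Node {node}: {share:.1%} of the process's private memory")
```

With interleaving, one would expect these shares to move towards an even split between the nodes.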



Problem 3

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .
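Binding each queue's interrupt to its own core comes down to writing a CPU bitmask into /proc/irq/<n>/smp_affinity. The sketch below only computes those masks, in the comma-separated 32-bit hex format the kernel uses, and does not write anything. The IRQ numbers are hypothetical, and this is our illustration of the idea, not the set_irq_affinity script itself.

```python
def cpu_mask(cpus):
    """Format CPU numbers as comma-separated 32-bit hex groups (kernel mask format)."""
    bits = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        bits |= 1 << cpu
    groups = []
    while True:
        groups.append(f"{bits & 0xFFFFFFFF:08x}")
        bits >>= 32
        if bits == 0:
            break
    groups.reverse()
    # Leading zeros of the most significant group are optional; drop them.
    groups[0] = groups[0].lstrip("0") or "0"
    return ",".join(groups)


def one_queue_per_cpu(irqs, cpus):
    """Pair each queue IRQ with one CPU, wrapping around if IRQs outnumber CPUs."""
    return {irq: cpu_mask([cpus[i % len(cpus)]]) for i, irq in enumerate(irqs)}


if __name__ == "__main__":
    # Hypothetical IRQ numbers for 8 queues, spread over CPUs 0-7.
    for irq, mask in one_queue_per_cpu(range(100, 108), list(range(8))).items():
        print(f"/proc/irq/{irq}/smp_affinity <- {mask}")
```

The same mask format is used by the per-queue rps_cpus files described in scaling.txt.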

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]
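The mechanism in the quote can be sketched as a toy model. Everything here is an assumption made for illustration: CRC32 stands in for the hardware hash, the table is a plain dictionary, and the fallback is a simple modulo over the RSS queues. It is not the adapter's actual algorithm.

```python
import zlib


class ToyAtrTable:
    """Toy model of ATR: learn a queue on transmit, look it up on receive."""

    def __init__(self, rss_queues=16, sample_rate=20, hash_bits=15):
        self.rss_queues = rss_queues
        self.sample_rate = sample_rate
        self.mask = (1 << hash_bits) - 1
        self.table = {}
        self.tx_count = 0

    def _hash(self, flow):
        # CRC32 is a stand-in; the real hardware hash is different.
        return zlib.crc32(repr(flow).encode()) & self.mask

    @staticmethod
    def _reverse(flow):
        src_ip, src_port, dst_ip, dst_port, proto = flow
        return (dst_ip, dst_port, src_ip, src_port, proto)

    def transmit(self, flow, tx_queue):
        """Every sample_rate-th transmitted packet records its queue."""
        self.tx_count += 1
        if self.tx_count % self.sample_rate == 0:
            # Store under the flow as it will appear in the receive direction.
            self.table[self._hash(self._reverse(flow))] = tx_queue

    def receive(self, flow):
        """Return the learned queue, or fall back to an RSS-like choice."""
        key = self._hash(flow)
        if key in self.table:
            return self.table[key]
        return key % self.rss_queues


if __name__ == "__main__":
    atr = ToyAtrTable(sample_rate=2)
    outgoing = ("10.0.0.1", 80, "10.0.0.2", 5000, "tcp")
    incoming = ("10.0.0.2", 5000, "10.0.0.1", 80, "tcp")
    atr.transmit(outgoing, tx_queue=7)  # not sampled yet
    atr.transmit(outgoing, tx_queue=7)  # sampled: queue 7 is recorded
    print("incoming packets go to queue", atr.receive(incoming))
```

The model also shows why purely incoming flows never benefit: with no transmitted packets there is nothing in the table, so every lookup falls back to the RSS-style choice.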



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.
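To turn the counters into a single number, a small parser is enough. The sketch below assumes the usual "name: value" layout of ethtool -S output and a counter named fdir_match alongside fdir_miss. The sample text is made up for illustration and is not output from our servers.

```python
def parse_counters(text):
    """Parse 'name: value' lines, keeping only non-negative integer counters."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def fdir_miss_ratio(counters):
    """Fraction of Flow Director lookups that missed (0.0 if there were none)."""
    match = counters.get("fdir_match", 0)
    miss = counters.get("fdir_miss", 0)
    total = match + miss
    return miss / total if total else 0.0


# Illustrative sample only; the values are invented.
SAMPLE = """\
NIC statistics:
     fdir_match: 750
     fdir_miss: 250
     fdir_overflow: 0
"""

if __name__ == "__main__":
    ratio = fdir_miss_ratio(parse_counters(SAMPLE))
    print(f"Flow Director miss ratio: {ratio:.0%}")
```

In practice one would feed it the output of ethtool -S ethN taken at two points in time and compare the differences, since the counters are cumulative.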



, — ethtool -K ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .
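For reference, scaling.txt describes two settings that RFS needs: the global rps_sock_flow_entries and a per-queue rps_flow_cnt, which it suggests setting to rps_sock_flow_entries / N for a device with N queues, with values rounded up to a power of two. The sketch below only computes the paths and values and does not write them. The device name, the queue count and the 32768 default (the figure scaling.txt mentions for a moderately loaded server) are assumptions, not our production settings.

```python
def round_up_pow2(n):
    """Round a positive integer up to the nearest power of two."""
    if n < 1:
        raise ValueError("value must be positive")
    return 1 << (n - 1).bit_length()


def rfs_settings(dev, rx_queues, sock_flow_entries=32768):
    """Return (path, value) pairs for the global and per-queue RFS settings."""
    if rx_queues < 1:
        raise ValueError("need at least one receive queue")
    entries = round_up_pow2(sock_flow_entries)
    per_queue = round_up_pow2(max(1, entries // rx_queues))
    settings = [("/proc/sys/net/core/rps_sock_flow_entries", entries)]
    for q in range(rx_queues):
        path = f"/sys/class/net/{dev}/queues/rx-{q}/rps_flow_cnt"
        settings.append((path, per_queue))
    return settings


if __name__ == "__main__":
    # Hypothetical device with 16 receive queues.
    for path, value in rfs_settings("eth0", 16):
        print(f"{value:>6} -> {path}")
```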



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



Problem 4 - broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

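On this NUMA machine, the memory of node 1 was almost entirely taken up by tmpfs. Starting the application with the interleave policy (numactl --interleave=all) solved the problem.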
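The imbalance can be read straight off the numastat figures. The sketch below is only an illustration: it copies the MemTotal, MemFree and Shmem rows from the output above and computes what share of each node is shared memory (the Shmem row includes tmpfs). The function names are made up for this example.

```python
# Values copied from the "Per-node system memory usage" output (in MB).
MEM_TOTAL = {0: 131032.75, 1: 131072.00}
MEM_FREE = {0: 1228.93, 1: 639.84}
SHMEM = {0: 79251.21, 1: 123964.70}


def shmem_share(node):
    """Fraction of a node's memory that is shared memory (includes tmpfs)."""
    return SHMEM[node] / MEM_TOTAL[node]


def imbalance(per_node):
    """Difference between the largest and the smallest per-node value."""
    return max(per_node.values()) - min(per_node.values())


if __name__ == "__main__":
    for node in sorted(MEM_TOTAL):
        print(f"node {node}: Shmem is {shmem_share(node):.1%} of MemTotal, "
              f"{MEM_FREE[node]:.0f} MB free")
    print(f"Shmem difference between nodes: {imbalance(SHMEM):.0f} MB")
```

With an interleave policy the pages are spread round-robin over the nodes, so the two shares would be expected to come out close to each other.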



Problem 3 - uneven load across CPU cores

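CPU0 was loaded far more heavily than the other cores (softirq). Network cards have several queues, which can be viewed and changed with ethtool -l/-L: a 1 Gbit Intel card has 8 of them, a 10 Gbit card has 128. Unless the interrupts are distributed, they all land on CPU0. irq_balancer is one way to distribute them; the set_irq_affinity script is another, binding each queue to its own core. The mechanism underneath is RSS (receive side scaling), see https://www.kernel.org/doc/Documentation/networking/scaling.txt. So 8 queues means 8 cores, which falls short when there are 16.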

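Intel's 10 Gbit cards have far more queues, but for RSS that changes little: no more than 16 of them are used.

The limit of 16 comes from RSS itself: the RSS index is 4 bits wide, so it can address at most 16 queues.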
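Why a 4-bit index caps RSS at 16 queues can be shown with a small model. This is a sketch and not driver code: the 128-entry indirection table and the use of the low-order seven bits of the hash follow the general description in scaling.txt, and the function names are invented for the illustration.

```python
RETA_SIZE = 128                      # indirection table entries (per scaling.txt)
INDEX_BITS = 4                       # width of one table entry, as stated above
MAX_RSS_QUEUES = 1 << INDEX_BITS     # 2**4 = 16


def build_indirection_table(requested_queues):
    """Fill the table round-robin, using no more queues than 4 bits can name."""
    if requested_queues < 1:
        raise ValueError("at least one queue is required")
    usable = min(requested_queues, MAX_RSS_QUEUES)
    return [entry % usable for entry in range(RETA_SIZE)]


def queue_for_hash(flow_hash, table):
    """Select the receive queue from the low-order 7 bits of the flow hash."""
    return table[flow_hash & (RETA_SIZE - 1)]


if __name__ == "__main__":
    table = build_indirection_table(128)
    print("distinct queues used:", len(set(table)))
```

However many queues are requested, the table can only ever name queues 0 to 15, which is the behaviour described above.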



The Intel cards offer more than RSS for distributing traffic, notably Flow director.



Flow director

Flow director, according to Intel, works in 2 modes: Signature Filter (also called ATR, Application Targeted Receive) and Perfect filter. Flow director does not handle non-IP packets. Flow director statistics can be checked with: ethtool -S ethN | grep fdir







Perfect filter (ntuple)

Holds up to 8 000 rules. Perfect filter rules are viewed and added with ethtool -u/-U, using the flow-type keyword.



Signature Filter

Holds up to 32 000 entries. With ATR the entries are created automatically, from (outgoing) packets with the SYN flag.



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



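In other words, an entry is written for every packet with SYN and for every 20th packet (the interval is set by the AtrSampleRate parameter). When ATR has no entry for a flow, the packet is distributed by RSS (i.e. over 16 queues). Flow director statistics: ethtool -S ethN | grep fdir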



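In our case the counter that kept growing was fdir_miss. That is, ATR was not matching and the packets went through RSS, onto 16 queues (see https://sourceforge.net/p/e1000/bugs/464/). What remained was RPS.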
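A quick way to judge whether Flow director is matching at all is to compare its counters. The sketch below parses text in the "name: value" layout that ethtool -S prints and computes the share of misses. The counter name fdir_match is an assumption (the text above only names fdir_miss), and the sample numbers are invented purely to exercise the parser.

```python
def parse_ethtool_stats(text):
    """Turn the 'name: value' lines of `ethtool -S` output into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def fdir_miss_ratio(stats):
    """Share of lookups that missed; 0.0 when there were no lookups at all."""
    match = stats.get("fdir_match", 0)   # assumed counter name
    miss = stats.get("fdir_miss", 0)
    total = match + miss
    return miss / total if total else 0.0


# Invented sample, not measurements from the servers described here.
SAMPLE = """NIC statistics:
     rx_packets: 1000
     fdir_match: 250
     fdir_miss: 750
"""

if __name__ == "__main__":
    print(f"miss ratio: {fdir_miss_ratio(parse_ethtool_stats(SAMPLE)):.0%}")
```

A ratio that stays close to 1 under steady traffic would match the situation described above, where packets fall back to RSS.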



Perfect filter mode is enabled with: ethtool -K ethN ntuple on

This turns ATR off and Perfect filter on.



RPS (receive packet steering)

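RPS is a software counterpart of RSS, described in scaling.txt.

RSS on our cards is limited to 16 queues. With RPS, packets received through RSS on 16 queues (i.e. 16 cores) are then handed to other cores for processing (the cores are chosen with rps_cpus, a bitmask set per receive queue).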
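The rps_cpus value is a hexadecimal CPU bitmask written to a per-queue sysfs file. The sketch below only computes the mask string and the file path; it does not write anything. The helper names are invented for this example, and which cores to put in the mask is a decision the example does not make.

```python
def cpus_to_mask(cpus):
    """Return the hex bitmask string for an iterable of CPU indices."""
    value = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        value |= 1 << cpu
    digits = f"{value:x}"
    # Masks wider than 32 bits are written as comma-separated 32-bit groups.
    groups = []
    while digits:
        groups.append(digits[-8:])
        digits = digits[:-8]
    return ",".join(reversed(groups))


def rps_cpus_path(dev, queue):
    """Path of the rps_cpus file for one receive queue of a device."""
    return f"/sys/class/net/{dev}/queues/rx-{queue}/rps_cpus"


if __name__ == "__main__":
    # Example: let receive queue 0 of eth0 hand packets to CPUs 16..31.
    print(rps_cpus_path("eth0", 0), "<-", cpus_to_mask(range(16, 32)))
```

Writing the printed mask into the printed file (as root) is what enables RPS for that queue; a mask of 0 disables it.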



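RPS picks a core from the packet hash alone, without regard to where the application that will read the packet is running. For that case scaling.txt describes RFS.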



RFS (receive flow steering)

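What does it do? RFS steers the packets of a flow to the core on which the application consuming that flow is running, so the data is processed where it is read.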



Figure from [2]: comparison of RPS and RFS.
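scaling.txt configures RFS through two settings: a global rps_sock_flow_entries and a per-queue rps_flow_cnt, with the per-queue value suggested as the global one divided by the number of receive queues. The sketch below just computes those values; the default of 32768 is the figure scaling.txt suggests for a moderately loaded server, not a value taken from this article, and the function name is made up.

```python
def rfs_settings(dev, rx_queues, sock_flow_entries=32768):
    """Map each RFS control file to the value that would be written to it."""
    if rx_queues < 1:
        raise ValueError("at least one receive queue is required")
    per_queue = sock_flow_entries // rx_queues
    settings = {"/proc/sys/net/core/rps_sock_flow_entries": sock_flow_entries}
    for queue in range(rx_queues):
        path = f"/sys/class/net/{dev}/queues/rx-{queue}/rps_flow_cnt"
        settings[path] = per_queue
    return settings


if __name__ == "__main__":
    for path, value in rfs_settings("eth0", 16).items():
        print(f"{value:>6} -> {path}")
```

RFS builds on RPS, so the rps_cpus masks have to be set as well for these values to have any effect.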



Accelerated RFS

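scaling.txt also describes Accelerated RFS. What is it? RFS carried out by the network card. Accelerated RFS needs support from the card and its driver; Mellanox cards have it.

We tried it. The result, however, was a kernel panic.

Firmware on the cards was a factor as well.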



Problem 4 - broken pipe

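Broken pipe errors had occurred on OpenSuSE too, but on CentOS there were noticeably more of them.

Broken pipe is the error a process gets when it writes to a pipe whose reading end has already been closed. For a network connection the pipe is the socket: the client has closed the connection while the server is still sending (see the half-duplex TCP close sequence).

One cause is "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), when packets of one connection arrive in a different order than they were sent. Packets of one flow handled on different cores are prone to this, and with RFS enabled the number of broken pipe errors went down.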



Image



The next thing to look at is interrupt coalescing. More on it below.



Problem 5 - softirq load

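Our servers are not Counter Strike servers, where latency is everything: for video delivery, throughput matters more than latency (within reason). Yet the card raises interrupts very often, and handling them falls to softirq. softirq reached 50% of CPU time.

To save CPU, interrupts should be raised less often. This is what interrupt coalescing is for; it is configured with ethtool -c/-C.

Interrupt coalescing means that the card does not raise an interrupt for every packet, but waits and raises one interrupt for a group of packets.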



Image



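After tuning it, the softirq load dropped noticeably.

The parameters:

[rx|tx]-usecs - how many microseconds to wait after a packet before raising an interrupt
[rx|tx]-frames - how many packets to accumulate before raising an interrupt
*-irq - the same limits, applied while an interrupt is already being serviced
*-[low|high] - the values used at low and high packet rates when adaptive coalescing is on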
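The effect of the time-based parameters is simple arithmetic: if the card waits N microseconds between interrupts, a queue can raise at most 1 000 000 / N interrupts per second (a frame-count threshold, if set, may still fire sooner). The sketch below computes that bound and assembles an ethtool -C argument list; the helper names and the value 100 are illustrative assumptions, not the settings used on the servers described here.

```python
def max_interrupt_rate(usecs):
    """Upper bound on interrupts per second for one queue at a given delay."""
    if usecs <= 0:
        raise ValueError("usecs must be positive")
    return 1_000_000 // usecs


def coalesce_command(dev, **params):
    """Build an `ethtool -C` argument list; underscores become dashes."""
    args = ["ethtool", "-C", dev]
    for name, value in params.items():
        args += [name.replace("_", "-"), str(value)]
    return args


if __name__ == "__main__":
    # Illustrative value only.
    print(" ".join(coalesce_command("eth0", rx_usecs=100)))
    print("at most", max_interrupt_rate(100), "interrupts/s per queue")
```

The trade-off is the one named above: a longer delay means fewer interrupts and less softirq work, at the cost of added latency per packet.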





Summary:

- an OS that is supported until 2024 [3]
- broken pipe: 40 (50)
- all settings are applied through configuration management (CFEngine)










[1] https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf
[2] https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD
[3] https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl --interleave=all, ) .
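To make the per-node picture easier to read, here is a small Python sketch that takes rows in the numastat -m layout (a counter name followed by one value per node and a total) and reports what share of a counter sits on each node. The function names are made up for this example, and the two sample rows are copied from the output above. It only reads text, it does not call numastat or change any memory policy.

```python
from typing import Dict, List, Tuple


def parse_numastat(text: str) -> Dict[str, Tuple[float, ...]]:
    """Collect rows of the form 'Name node0 node1 total' from numastat-style text."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # headers and titles have a different number of tokens
        try:
            rows[parts[0]] = tuple(float(v) for v in parts[1:])
        except ValueError:
            continue  # not a numeric row
    return rows


def node_share(rows: Dict[str, Tuple[float, ...]], name: str) -> List[float]:
    """Fraction of the counter held by each node (last column is the total)."""
    *nodes, total = rows[name]
    if total == 0:
        return [0.0 for _ in nodes]
    return [value / total for value in nodes]


SAMPLE = """\
                        Node 0       Node 1        Total
MemFree                1228.93       639.84      1868.78
Shmem                 79251.21    123964.70    203215.90
"""

if __name__ == "__main__":
    rows = parse_numastat(SAMPLE)
    for counter in ("MemFree", "Shmem"):
        shares = node_share(rows, counter)
        print(counter, [f"{s:.0%}" for s in shares])
```

A strongly uneven split of Shmem between nodes is the kind of imbalance that an interleaved allocation policy is meant to even out.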



Problem 3

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.
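The check with ethtool -S ethN | grep fdir can be turned into a number that is easy to track over time. The Python sketch below assumes the driver exposes counters named fdir_match and fdir_miss in the usual "name: value" layout of ethtool -S, and the sample figures are invented for illustration. It parses text that has already been captured, so it does not need a network card to run.

```python
from typing import Dict


def parse_fdir_stats(output: str) -> Dict[str, int]:
    """Extract counters whose names start with 'fdir' from ethtool -S text."""
    stats = {}
    for line in output.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep and name.startswith("fdir"):
            stats[name] = int(value)
    return stats


def fdir_miss_ratio(stats: Dict[str, int]) -> float:
    """Share of lookups that missed the Flow Director table."""
    match = stats.get("fdir_match", 0)
    miss = stats.get("fdir_miss", 0)
    total = match + miss
    return miss / total if total else 0.0


# Hypothetical capture, not taken from the servers described here.
SAMPLE_OUTPUT = """\
NIC statistics:
     rx_packets: 1000000
     fdir_match: 250
     fdir_miss: 750
"""

if __name__ == "__main__":
    stats = parse_fdir_stats(SAMPLE_OUTPUT)
    print(stats, fdir_miss_ratio(stats))
```

A ratio that stays close to 1 would suggest that most packets are not being steered by Flow Director and fall back to RSS.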



ethtool -K ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).
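RPS is configured per receive queue through the rps_cpus file, which holds a hexadecimal bitmap of the CPUs allowed to process packets from that queue. The Python sketch below shows one way to compute such a bitmap and the sysfs path it belongs to. The device name eth0 and the CPU ranges are placeholders, and the helper names are invented for this example. Nothing is written to sysfs here.

```python
from typing import Iterable


def cpu_mask(cpus: Iterable[int]) -> str:
    """Hex CPU bitmap in the comma-separated 32-bit groups used by sysfs."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    digits = f"{mask:x}"
    groups = []
    while digits:
        groups.append(digits[-8:])
        digits = digits[:-8]
    return ",".join(reversed(groups))


def rps_cpus_path(dev: str, rx_queue: int) -> str:
    """Location of the RPS bitmap for one receive queue."""
    return f"/sys/class/net/{dev}/queues/rx-{rx_queue}/rps_cpus"


if __name__ == "__main__":
    # Hypothetical layout: let CPUs 16..31 help with packets from queue 0.
    print(rps_cpus_path("eth0", 0), cpu_mask(range(16, 32)))
```

Writing the printed mask into the printed path (as root) is what enables RPS for that queue; a mask of 0 leaves it disabled.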



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .
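RFS has two knobs described in scaling.txt: the global rps_sock_flow_entries table and a per-queue rps_flow_cnt. That document suggests sizing each queue as the global value divided by the number of receive queues. The Python sketch below only computes that plan; the device name, the queue count of 16 and the table size of 32768 are assumptions for illustration, not the values used in production here.

```python
from typing import Dict


def rfs_settings(dev: str, n_rx_queues: int,
                 sock_flow_entries: int = 32768) -> Dict[str, int]:
    """Map each RFS control file to the value that would be written to it."""
    if n_rx_queues <= 0:
        raise ValueError("n_rx_queues must be positive")
    per_queue = sock_flow_entries // n_rx_queues
    settings = {"/proc/sys/net/core/rps_sock_flow_entries": sock_flow_entries}
    for queue in range(n_rx_queues):
        path = f"/sys/class/net/{dev}/queues/rx-{queue}/rps_flow_cnt"
        settings[path] = per_queue
    return settings


if __name__ == "__main__":
    for path, value in rfs_settings("eth0", 16).items():
        print(f"echo {value} > {path}")
```

RFS stays inactive until both values are non-zero, and it builds on RPS, so the rps_cpus bitmaps have to be set as well.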



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



Problem 4 - broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

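The machine is NUMA, and node 1 had been filled by tmpfs. Running with an interleaved allocation policy (numactl --interleave=all) spreads the memory across both nodes.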
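To make the imbalance easier to see, here is a minimal sketch that parses a few numastat-style rows and reports what share of each node a metric occupies. The helper names (parse_rows, node_share) are illustrative assumptions, and the embedded sample holds only three rows taken from the output shown earlier; it is not a full numastat parser.

```python
# Three rows copied from the per-node system memory output (values in MB).
SAMPLE = """\
MemTotal 131032.75 131072.00 262104.75
MemFree 1228.93 639.84 1868.78
Shmem 79251.21 123964.70 203215.90
"""


def parse_rows(text):
    """Map metric name -> [node0, node1, ..., total] as floats."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        rows[parts[0]] = [float(value) for value in parts[1:]]
    return rows


def node_share(rows, metric, node):
    """Fraction of the node's MemTotal that `metric` occupies."""
    return rows[metric][node] / rows["MemTotal"][node]


rows = parse_rows(SAMPLE)
for node in (0, 1):
    print(f"node {node}: Shmem takes {node_share(rows, 'Shmem', node):.1%}")
```

With these numbers Shmem (which includes tmpfs pages) covers a much larger part of node 1 than of node 0, which is the kind of skew that interleaving is meant to even out.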



Problem 3 - softirq load piling up on CPU0




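All network processing was landing on CPU0 (as softirq). Network cards expose several queues (their number can be viewed and changed with ethtool -l/-L); on Intel adapters a 1 Gbit card has 8 and a 10 Gbit card has 128. If every queue's interrupt is served by CPU0, that core saturates. The interrupts can be spread by irq_balancer or pinned by hand with the set_irq_affinity script, one queue per core. This spreading of receive work over queues is RSS (receive side scaling), described in https://www.kernel.org/doc/Documentation/networking/scaling.txt.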

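Why only 16 queues, when a 10 Gbit Intel card has more?

The limit of 16 comes from RSS: the RSS queue index is only 4 bits wide, so no more than 16 queues can be addressed.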
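The 4-bit limit can be illustrated with a toy model of RSS queue selection. This is only a sketch under stated assumptions: zlib.crc32 stands in for the hash the hardware really uses (Toeplitz on typical cards), and the 128-slot indirection table size and the function name rss_queue are assumptions made for the example.

```python
import zlib

NUM_QUEUES = 16    # a 4-bit queue index can name at most 2**4 queues
TABLE_SIZE = 128   # assumed size of the indirection table

# Each slot of the indirection table holds a queue number that fits in 4 bits.
indirection = [slot % NUM_QUEUES for slot in range(TABLE_SIZE)]


def rss_queue(src_ip, src_port, dst_ip, dst_port):
    """Pick a receive queue for a flow (toy hash, not the hardware one)."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    flow_hash = zlib.crc32(key)
    return indirection[flow_hash % TABLE_SIZE]


queue = rss_queue("10.0.0.1", 40000, "10.0.0.2", 80)
print(queue)
```

Whatever the hash produces, the value read from the table is a queue number below 16, so extra hardware queues beyond that stay unused by RSS alone.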



Beyond RSS, Intel cards also provide Flow Director.



Flow director

Flow Director on Intel cards works in two modes: Signature Filter (also known as ATR, Application Targeted Receive) and Perfect filter. It applies to IP traffic. Flow Director counters can be checked with: ethtool -S ethN | grep fdir







Perfect filter (ntuple)

Up to 8,000 rules. Perfect filter rules are viewed and set with ethtool -u/-U flow-type



Signature Filter

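Up to 32,000 entries. In ATR mode the card learns flows on its own by sampling outgoing packets: periodically, or whenever a packet has the SYN flag set.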



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



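That is, a flow is recorded when a packet with SYN is sent, or on every 20th packet (the rate is set by AtrSampleRate). If ATR has no entry for an incoming packet, RSS is used (i.e. only 16 queues). Flow Director counters are visible with ethtool -S ethN | grep fdir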
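The learning scheme in the quoted text can be sketched as a small simulation. Everything here is a simplified model under assumptions: the class AtrTable, the function flow_hash and the crc32-based hash are invented for illustration and do not reflect the hardware's actual hash or table layout.

```python
import zlib

SAMPLE_RATE = 20   # sample every 20th transmitted packet
HASH_BITS = 15     # "up to 15-bit" hash, as in the quoted text


def flow_hash(proto, end_a, end_b):
    """Toy direction-independent hash (assumption, not the hardware function)."""
    key = repr((proto, sorted([end_a, end_b]))).encode()
    return zlib.crc32(key) & ((1 << HASH_BITS) - 1)


class AtrTable:
    def __init__(self):
        self.queues = {}   # flow hash -> queue learned on transmit
        self.tx_seen = 0

    def on_transmit(self, proto, src, dst, tx_queue, syn=False):
        """Remember the TX queue for sampled packets and for SYN packets."""
        self.tx_seen += 1
        if syn or self.tx_seen % SAMPLE_RATE == 0:
            self.queues[flow_hash(proto, src, dst)] = tx_queue

    def on_receive(self, proto, src, dst):
        """Return the learned queue, or None for a miss (fall back to RSS)."""
        return self.queues.get(flow_hash(proto, src, dst))


server, client = ("10.0.0.2", 80), ("10.0.0.1", 40000)
atr = AtrTable()
print(atr.on_receive("tcp", client, server))   # nothing learned yet
atr.on_transmit("tcp", server, client, tx_queue=5, syn=True)
print(atr.on_receive("tcp", client, server))   # queue learned from the SYN packet
```

A flow that only receives and never transmits is never learned in this model, which matches the remark in the quote about uni-directional incoming flows.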



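In our case the counters were dominated by fdir_miss, i.e. ATR was not matching and packets fell back to RSS, so only 16 queues were in use. (A related report: https://sourceforge.net/p/e1000/bugs/464/ .) The next option to try was RPS.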



Perfect filter mode is switched on with ethtool -K ethN ntuple on



Note that ATR and Perfect filter are not active at the same time.



RPS (receive packet steering)

RPS is a software counterpart of RSS; it is described in scaling.txt.



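Hardware spreading stops at 16 queues. With RPS enabled on top of RSS, the packets received on those 16 queues are then handed over to other cores (the ones set in each queue's rps_cpus mask).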
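The rps_cpus value is a hexadecimal CPU bitmap. A minimal sketch of how such a mask could be computed is below; the function name cpu_mask, the device name eth0 and the chosen CPU range are assumptions for illustration, and the code only prints the commands instead of writing to sysfs.

```python
def cpu_mask(cpus):
    """Hex CPU bitmap in comma-separated 32-bit groups, highest group first."""
    value = 0
    for cpu in cpus:
        value |= 1 << cpu
    groups = []
    while True:
        groups.append(f"{value & 0xFFFFFFFF:08x}")
        value >>= 32
        if value == 0:
            break
    return ",".join(reversed(groups))


# Hypothetical layout: receive queues 0..15, RPS work sent to CPUs 16..31.
mask = cpu_mask(range(16, 32))
for queue in range(16):
    path = f"/sys/class/net/eth0/queues/rx-{queue}/rps_cpus"
    print(f"echo {mask} > {path}")
```

Printing rather than writing keeps the sketch harmless to run; the per-queue sysfs path is the one documented in scaling.txt.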



Reading further in scaling.txt leads to RFS.



RFS (receive flow steering)

What does it do? RFS steers the packets of a flow to the core where the application that reads them is running.
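RFS is configured through a global flow table size and a per-queue flow count. The sketch below follows the rule of thumb from scaling.txt (per-queue count equals the global value divided by the number of queues); the function name rfs_settings, the device name eth0 and the choice of 32768 entries are assumptions for the example, not values taken from the article.

```python
def rfs_settings(dev, num_queues, sock_flow_entries=32768):
    """Return {path: value} for a basic RFS setup, without touching the system."""
    per_queue = sock_flow_entries // num_queues
    settings = {"/proc/sys/net/core/rps_sock_flow_entries": sock_flow_entries}
    for queue in range(num_queues):
        path = f"/sys/class/net/{dev}/queues/rx-{queue}/rps_flow_cnt"
        settings[path] = per_queue
    return settings


settings = rfs_settings("eth0", 16)
for path, value in settings.items():
    print(f"echo {value} > {path}")
```

Returning a dictionary instead of writing the files makes it easy to review the values, or to hand them to a configuration management tool.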



[Figure from [2]: comparison of RPS and RFS]



Accelerated RFS

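scaling.txt also describes Accelerated RFS. What is it? The same RFS, but done in hardware: with Accelerated RFS the card itself steers packets to the right queue. It is supported by Mellanox cards.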



Trying it out ended in a kernel panic.



Problem 4 - broken pipe

The number of broken pipe errors differed between OpenSuSE and CentOS.



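Broken pipe is the error a process gets when it writes to a pipe whose other end is already closed. For a network connection this means the peer has closed its side (see the half-duplex tcp close sequence) while the server is still sending.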
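The error itself is easy to reproduce locally. The sketch below uses a local socket pair rather than a real TCP connection, which is a simplification chosen so that it runs without a network; the function name write_after_peer_close is an assumption for the example.

```python
import socket


def write_after_peer_close():
    """Return True if writing to a socket whose peer is closed raises EPIPE."""
    writer, reader = socket.socketpair()
    try:
        reader.close()              # the "client" goes away
        try:
            writer.send(b"data")    # the "server" keeps sending
        except BrokenPipeError:
            return True
        return False
    finally:
        writer.close()


print(write_after_peer_close())
```

Python ignores SIGPIPE by default, so the failed write surfaces as a BrokenPipeError exception instead of terminating the process.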



"Packet reordering" (http://en.wikipedia.org/wiki/Out-of-order_delivery) was considered as a cause, in connection with RFS and the broken pipe errors.






The remaining setting to look at was interrupt coalescing.



Problem 5 - softirq load

Latency matters for something like a Counter Strike server, much less for our workload. What did matter was softirq, which reached 50% of CPU.



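The way to save CPU here is to take fewer interrupts, that is interrupt coalescing, viewed and set with ethtool -c/-C

Interrupt coalescing is a trade-off: fewer interrupts and less CPU work, at the price of extra latency.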






With coalescing tuned, the softirq load went down.

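Parameters:

[rx|tx]-usecs: how long to delay an interrupt after a packet is received or sent, in microseconds
[rx|tx]-frames: how many packets to accumulate before raising an interrupt
*-irq: the same limits, applied while an interrupt is already being serviced
*-[low|high]: values used by adaptive coalescing at low and high packet rates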
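The effect of the usecs setting can be estimated with simple arithmetic: if each interrupt is held back for a fixed number of microseconds, a queue cannot raise more than one million divided by that number of interrupts per second. The sketch below assumes this simple model holds for the driver in use; the function name and the sample values (50 and 250 microseconds, 16 queues) are illustrative assumptions, not settings from the article.

```python
def max_interrupt_rate(usecs):
    """Upper bound on interrupts per second for one queue at a given delay."""
    if usecs <= 0:
        raise ValueError("usecs must be positive")
    return 1_000_000 / usecs


for usecs in (50, 250):
    per_queue = max_interrupt_rate(usecs)
    print(f"rx-usecs {usecs}: <= {per_queue:.0f} irq/s per queue, "
          f"<= {per_queue * 16:.0f} irq/s across 16 queues")
```

Raising the delay lowers the interrupt ceiling proportionally, which is where the CPU saving comes from, and adds at most that delay to packet latency.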





Results: [...] until 2024 [3]; broken pipe: 40 ([...] 50); [...] (CFEngine).










[1] https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf
[2] https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD
[3] https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl --interleave=all, ) .
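To make the per-node imbalance in the numastat output easier to see, here is a minimal sketch that computes each NUMA node's share of a few rows. The numbers are copied from the output above; the helper names `node_shares` and `most_loaded_node` are hypothetical and exist only for this illustration.

```python
# A few rows from the numastat output above, in MB: (Node 0, Node 1).
NUMASTAT_MB = {
    "MemFree": (1228.93, 639.84),
    "Shmem": (79251.21, 123964.70),
    "Private": (80200.73, 123323.81),  # from the per-process (java) table
}


def node_shares(per_node):
    """Return each node's fraction of the row total."""
    total = sum(per_node)
    return [value / total for value in per_node]


def most_loaded_node(per_node):
    """Index of the NUMA node holding the largest amount for this row."""
    return max(range(len(per_node)), key=lambda i: per_node[i])


if __name__ == "__main__":
    for row, values in NUMASTAT_MB.items():
        shares = ", ".join(
            f"node{i}={share:.0%}" for i, share in enumerate(node_shares(values))
        )
        print(f"{row:8s} {shares}")
```

With these figures, Node 1 holds the larger share of Shmem while having the least free memory, which is the kind of skew an interleaved allocation policy is meant to even out.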



Problem 3

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .
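One way to check whether NIC queue interrupts are piling up on a single core is to sum the per-CPU counters in `/proc/interrupts` for the interface's queues. The sketch below does this on a sample string. The sample contents (IRQ numbers, counts, the `eth0-TxRx-N` queue names) are hypothetical, and `irq_counts_per_cpu` is an illustrative helper, not an existing tool.

```python
# Hypothetical excerpt in the layout of /proc/interrupts.
SAMPLE = """
           CPU0       CPU1       CPU2       CPU3
  64:    9000000          0          0          0   PCI-MSI-edge      eth0-TxRx-0
  65:    8500000          0          0          0   PCI-MSI-edge      eth0-TxRx-1
  66:          0          0    1200000          0   PCI-MSI-edge      eth0-TxRx-2
 NMI:         10         12          9         11   Non-maskable interrupts
"""


def irq_counts_per_cpu(interrupts_text, name_prefix):
    """Sum interrupt counts per CPU for IRQs whose name starts with name_prefix."""
    lines = interrupts_text.strip().splitlines()
    cpus = lines[0].split()
    totals = [0] * len(cpus)
    for line in lines[1:]:
        fields = line.split()
        # The last field is the IRQ/queue name; skip unrelated rows.
        if not fields or not fields[-1].startswith(name_prefix):
            continue
        for i, count in enumerate(fields[1:1 + len(cpus)]):
            totals[i] += int(count)
    return dict(zip(cpus, totals))


if __name__ == "__main__":
    counts = irq_counts_per_cpu(SAMPLE, "eth0")
    grand_total = sum(counts.values()) or 1
    for cpu, count in counts.items():
        print(f"{cpu}: {count} ({count / grand_total:.0%})")
```

On a live system the same function could be fed the contents of `/proc/interrupts`; a heavily skewed result suggests the queue IRQs need explicit affinity.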

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir
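The Flow Director counters printed by `ethtool -S ethN | grep fdir` can be turned into a miss ratio to judge whether flow matching is working at all. This is a minimal sketch that parses captured text: the sample values are hypothetical, and the counter names other than `fdir_miss` (`fdir_match`, `fdir_overflow`) are assumed from the ixgbe driver and may differ on other hardware.

```python
# Hypothetical output in the "name: value" layout printed by `ethtool -S`.
SAMPLE = """
NIC statistics:
     rx_packets: 1000
     fdir_match: 900
     fdir_miss: 100
     fdir_overflow: 0
"""


def parse_fdir_counters(stats_text):
    """Extract counters whose names start with 'fdir_' from ethtool -S text."""
    counters = {}
    for line in stats_text.splitlines():
        name, sep, value = line.partition(":")
        name, value = name.strip(), value.strip()
        if sep and name.startswith("fdir_") and value.isdigit():
            counters[name] = int(value)
    return counters


def fdir_miss_ratio(counters):
    """Fraction of looked-up packets that missed the Flow Director table."""
    match = counters.get("fdir_match", 0)
    miss = counters.get("fdir_miss", 0)
    looked_up = match + miss
    return miss / looked_up if looked_up else 0.0


if __name__ == "__main__":
    fdir = parse_fdir_counters(SAMPLE)
    print(fdir, f"miss ratio: {fdir_miss_ratio(fdir):.1%}")
```

A miss ratio close to 100% would indicate that packets are falling back to plain RSS distribution.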



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



ethtool -K ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).
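RPS is configured per receive queue by writing a hexadecimal CPU bitmap to `/sys/class/net/<dev>/queues/rx-<n>/rps_cpus`, as described in scaling.txt. The sketch below only builds the mask string and the file path and writes nothing. The function names and the CPU numbering in the demo are hypothetical; which CPUs to include depends on the actual machine topology.

```python
def cpu_mask(cpus):
    """Build a hex CPU bitmap in comma-separated 32-bit groups.

    Bit N set means CPU N is allowed; the most significant group comes first.
    """
    bits = 0
    for cpu in cpus:
        bits |= 1 << cpu
    groups = []
    while True:
        groups.append(f"{bits & 0xFFFFFFFF:08x}")
        bits >>= 32
        if bits == 0:
            break
    return ",".join(reversed(groups))


def rps_cpus_path(dev, queue):
    """Path of the rps_cpus file for one receive queue (see scaling.txt)."""
    return f"/sys/class/net/{dev}/queues/rx-{queue}/rps_cpus"


if __name__ == "__main__":
    # Hypothetical layout: let every RX queue of eth0 steer to CPUs 0-15.
    mask = cpu_mask(range(16))
    for queue in range(4):
        print(f"echo {mask} > {rps_cpus_path('eth0', queue)}")
```

Printing the commands instead of writing the files keeps the sketch safe to run on any machine; applying them requires root.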



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



Problem 4 - broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl --interleave=all, ) .
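One way to read the numastat dump above is to ask how a given kind of memory is split between the two NUMA nodes. The sketch below does this for the Shmem and MemFree rows, using the figures from the dump. The helper name node_shares is hypothetical, and the choice of rows is only an illustration.

```python
def node_shares(per_node_mb):
    """Return each node's fraction of the total for one numastat row."""
    total = sum(per_node_mb)
    return [value / total for value in per_node_mb]


# Per-node values (Node 0, Node 1) copied from the numastat -m output above.
SHMEM_MB = [79251.21, 123964.70]
MEMFREE_MB = [1228.93, 639.84]

shmem_shares = node_shares(SHMEM_MB)
free_shares = node_shares(MEMFREE_MB)

for node, (shmem, free) in enumerate(zip(shmem_shares, free_shares)):
    print(f"node {node}: {shmem:.0%} of Shmem, {free:.0%} of MemFree")
```

On these numbers node 1 holds the larger share of shared memory and the smaller share of free memory, which is the kind of imbalance that interleaving allocations across nodes is meant to even out.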



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]
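The mechanism described in the quotation can be pictured with a toy model: a table indexed by a flow hash, filled from the transmit path on every 20th packet and consulted on receive. This is only a sketch under stated assumptions. The CRC32-based hash, the endpoint sorting that makes both directions of a flow hash alike, and the class name ToyAtr are stand-ins chosen for the example, not the hardware's actual algorithm.

```python
import zlib

SAMPLE_RATE = 20   # from the quotation: every 20th transmitted packet
HASH_BITS = 15     # from the quotation: upper bound on the hash width


def flow_index(proto, endpoint_a, endpoint_b):
    """Direction-independent table index for a flow (stand-in hash)."""
    lo, hi = sorted([endpoint_a, endpoint_b])
    key = f"{proto}|{lo}|{hi}".encode()
    return zlib.crc32(key) & ((1 << HASH_BITS) - 1)


class ToyAtr:
    """Toy model of a signature-filter table learned from transmits."""

    def __init__(self):
        self.table = {}
        self.tx_count = 0

    def on_transmit(self, proto, src, dst, tx_queue):
        self.tx_count += 1
        # Only sampled packets update the table.
        if self.tx_count % SAMPLE_RATE == 0:
            self.table[flow_index(proto, src, dst)] = tx_queue

    def on_receive(self, proto, src, dst):
        # None models a miss, where another mechanism must pick the queue.
        return self.table.get(flow_index(proto, src, dst))
```

In this model a flow that has not yet had a sampled transmit always misses on receive, which matches the quoted remark that uni-directional incoming flows never populate the table.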



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.
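To put a number on the fdir_miss observation, one could extract the Flow director counters from `ethtool -S` output and compute the share of misses. The sketch below assumes the driver reports a fdir_match counter alongside fdir_miss. The sample figures are invented for the example and are not measurements from this article.

```python
def fdir_counters(stats_text):
    """Pick the fdir_* counters out of `ethtool -S` style output."""
    counters = {}
    for line in stats_text.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip()
        if sep and name.startswith("fdir_"):
            counters[name] = int(value)
    return counters


def miss_ratio(counters):
    """Fraction of Flow director lookups that missed."""
    match = counters.get("fdir_match", 0)
    miss = counters.get("fdir_miss", 0)
    total = match + miss
    return miss / total if total else 0.0


# Invented sample; real output lists many more counters.
SAMPLE = """\
     rx_packets: 1000000
     fdir_match: 250
     fdir_miss: 750
"""

ratio = miss_ratio(fdir_counters(SAMPLE))
```

Because the counters are cumulative, comparing two readings taken some time apart gives a more useful picture than a single snapshot.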



, - ethtool -K ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).
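The rps_cpus setting mentioned above takes a hexadecimal CPU bitmask, written per receive queue to /sys/class/net/<dev>/queues/rx-<n>/rps_cpus as described in scaling.txt. The sketch below builds such a mask from a list of CPU numbers, in comma-separated 32-bit groups. The helper name cpu_mask and the 32-CPU split in the example are assumptions for illustration, not the configuration used here.

```python
def cpu_mask(cpus):
    """Format CPU numbers as a hex bitmask in 32-bit comma-separated groups."""
    bits = 0
    for cpu in cpus:
        bits |= 1 << cpu
    digits = f"{bits:x}"
    # Pad to a whole number of 8-digit (32-bit) groups.
    width = -(-len(digits) // 8) * 8
    digits = digits.zfill(width)
    return ",".join(digits[i:i + 8] for i in range(0, width, 8))


# Hypothetical 32-CPU host: CPUs 0-15 take the NIC interrupts,
# so RPS is pointed at the remaining CPUs 16-31.
irq_cpus = cpu_mask(range(0, 16))
rps_cpus = cpu_mask(range(16, 32))
```

The resulting string is what would be written to the rps_cpus file of each receive queue.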



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

Because of NUMA, node 1 was filled almost entirely by tmpfs. The remedy was the interleave policy (numactl --interleave=all).



Problem 3

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, as Intel describes it, has 2 modes: Signature Filter (also known as ATR, Application Targeted Receive) and Perfect filter. Flow director statistics can be viewed with: ethtool -S ethN | grep fdir







Perfect filter (ntuple)

Up to 8 000 filters. Perfect filter rules are listed and configured with ethtool -u/-U flow-type






Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.
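To watch how often flow director fails to place a packet, the `fdir` counters from `ethtool -S ethN` can be extracted and turned into a miss ratio. This is a minimal sketch: the function names are made up for illustration, the sample text is hypothetical rather than real server output, and the counter name `fdir_match` is an assumption about the driver (only `fdir_miss` is named in the text above).

```python
def fdir_counters(ethtool_stats_text):
    """Extract the flow director counters from `ethtool -S` output.

    Every line of the form "name: value" whose name contains "fdir"
    is kept; all other lines are ignored.
    """
    counters = {}
    for line in ethtool_stats_text.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip()
        if not sep or "fdir" not in name:
            continue
        try:
            counters[name] = int(value.strip())
        except ValueError:
            continue  # not a numeric counter
    return counters


def fdir_miss_ratio(counters):
    """Share of lookups that missed; 0.0 if there were no lookups."""
    match = counters.get("fdir_match", 0)  # assumed counter name
    miss = counters.get("fdir_miss", 0)
    total = match + miss
    return miss / total if total else 0.0


# Hypothetical sample, not taken from a real server.
SAMPLE = """NIC statistics:
     rx_packets: 1000
     fdir_match: 250
     fdir_miss: 750
"""

if __name__ == "__main__":
    stats = fdir_counters(SAMPLE)
    print(stats, fdir_miss_ratio(stats))
```

In practice the text would come from `subprocess.run(["ethtool", "-S", dev], capture_output=True, text=True).stdout`. Comparing two samples taken some seconds apart is more telling than the absolute counters.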



ATR is switched off by enabling ntuple: ethtool -K ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).
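RPS is configured per receive queue by writing a CPU bitmap to the `rps_cpus` file of that queue. A small sketch of how the bitmap string can be computed is given below. The helper names and the device name `eth0` are assumptions for illustration; the sysfs path and the comma-separated 32-bit hex format are the ones described in scaling.txt.

```python
def rps_cpu_mask(cpus):
    """Return the hex bitmap string for the given CPU indices.

    The kernel expects 32-bit hex words separated by commas,
    most significant word first.
    """
    bits = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        bits |= 1 << cpu
    if bits == 0:
        return "0"  # empty mask: RPS disabled for the queue
    words = []
    while bits:
        words.append(f"{bits & 0xFFFFFFFF:08x}")
        bits >>= 32
    words.reverse()
    # leading zeros of the first word are optional, drop them
    words[0] = words[0].lstrip("0") or "0"
    return ",".join(words)


def rps_cpus_path(dev, queue):
    """sysfs file that holds the RPS mask of one receive queue."""
    return f"/sys/class/net/{dev}/queues/rx-{queue}/rps_cpus"


if __name__ == "__main__":
    # Hypothetical layout: let CPUs 0..15 process packets of queue 0.
    print(rps_cpus_path("eth0", 0), rps_cpu_mask(range(16)))
```

Writing the value requires root, for example `open(rps_cpus_path("eth0", 0), "w").write(mask)`. Which CPUs belong in the mask depends on the NUMA layout and on where the hardware interrupts already land.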



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .
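RFS builds on RPS and needs two more values: the size of the global socket flow table and a per-queue flow count. The sketch below follows the rule of thumb from scaling.txt (per-queue count = global table size / number of queues). The function name, the device name `eth0` and the default of 32768 entries are illustrative assumptions, not the values used in this migration.

```python
def rfs_settings(dev, n_queues, sock_flow_entries=32768):
    """Return a {path: value} map with the RFS settings for a device.

    rps_sock_flow_entries sizes the global flow table; each receive
    queue gets an equal share of it as rps_flow_cnt.
    """
    if n_queues <= 0:
        raise ValueError("n_queues must be positive")
    per_queue = sock_flow_entries // n_queues
    settings = {"/proc/sys/net/core/rps_sock_flow_entries": sock_flow_entries}
    for queue in range(n_queues):
        path = f"/sys/class/net/{dev}/queues/rx-{queue}/rps_flow_cnt"
        settings[path] = per_queue
    return settings


if __name__ == "__main__":
    # Hypothetical device with 16 receive queues.
    for path, value in rfs_settings("eth0", 16).items():
        print(f"echo {value} > {path}")
```

Both values are 0 by default, which leaves RFS disabled. RPS must be configured on the same queues for RFS to have any effect.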



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



Problem 4 - broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product








The numastat utility (numastat -m) breaks memory usage down by NUMA node.



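In the output below, Shmem (which includes tmpfs) accounts for most of the used memory: about 203,000 MB out of 260,000 MB.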



What does numastat show? For example:

Per-node process memory usage (in MBs) for PID 7781 (java)

                    Node 0      Node 1       Total
Huge                     0           0           0
Heap                  0.20        0.14        0.34
Stack               118.82      137.87      256.70
Private           80200.73   123323.81   203524.55
Total             80319.76   123461.82   203781.58

Per-node system memory usage (in MBs):

                    Node 0      Node 1       Total
MemTotal         131032.75   131072.00   262104.75
MemFree            1228.93      639.84     1868.78
MemUsed          129803.82   130432.16   260235.98
Active            23224.13   121073.42   144297.55
Inactive         101138.88     3753.98   104892.85
Active(anon)       1690.50   120997.86   122688.36
Inactive(anon)    79528.66     3560.95    83089.61
Active(file)      21533.63       75.57    21609.20
Inactive(file)    21610.21      193.03    21803.24
Unevictable              0           0           0
Mlocked                  0           0           0
Dirty                 0.11        0.02        0.13
Writeback                0           0           0
FilePages        122397.46   124295.47   246692.93
Mapped            78436.03   122947.26   201383.29
AnonPages          1966.62      532.02     2498.64
Shmem             79251.21   123964.70   203215.90
KernelStack           2.44        2.57        5.01
PageTables          158.62      252.29      410.91
NFS_Unstable             0           0           0
Bounce                   0           0           0
WritebackTmp             0           0           0
Slab               1801.95     1932.29     3734.23
SReclaimable       1653.13     1818.79     3471.92
SUnreclaim          148.82      113.49      262.31
AnonHugePages      1856.00      498.00     2354.00
HugePages_Total          0           0           0
HugePages_Free           0           0           0
HugePages_Surp           0           0           0

The cause was NUMA: the tmpfs data was crowding a single node. Switching the application to the interleave policy (numactl --interleave=all) fixed this.
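To make the per-node picture easier to read, the rows of a numastat -m report can be turned into per-node shares. The following is only a sketch: the helper names `parse_numastat_rows` and `node_shares` are made up for illustration, and the sample rows are copied from the output above.

```python
def parse_numastat_rows(text):
    """Parse rows such as 'MemFree 1228.93 639.84 1868.78' into
    {name: [node0, node1, ..., total]} (values in MB)."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        try:
            values = [float(p) for p in parts[1:]]
        except ValueError:
            continue  # header or separator line, not a data row
        rows[parts[0]] = values
    return rows


def node_shares(rows, key):
    """Fraction of `key` held by each node (the Total column is the divisor)."""
    *per_node, total = rows[key]
    return [value / total for value in per_node]


# Two rows taken from the numastat -m output shown in the article.
sample = """\
                 Node 0 Node 1 Total
MemFree 1228.93 639.84 1868.78
Shmem 79251.21 123964.70 203215.90
"""
rows = parse_numastat_rows(sample)
print([round(share, 2) for share in node_shares(rows, "Shmem")])
```

A strongly lopsided share for Shmem or MemFree is the kind of signal that suggests looking at the NUMA allocation policy.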



Problem 3




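All of the network processing was landing on CPU0 (in softirq). The number of queues on a network card can be viewed and changed with ethtool -l/-L. 1 Gbit Intel cards have 8 queues, 10 Gbit cards have 128. Interrupts from the queues can be spread across cores either by irq_balancer or by the set_irq_affinity script that ships with the driver sources. This is RSS (receive side scaling), described in https://www.kernel.org/doc/Documentation/networking/scaling.txt.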
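Pinning each queue's interrupt to its own core, which is what a script like set_irq_affinity does, comes down to writing a CPU bitmask into /proc/irq/<irq>/smp_affinity. A minimal sketch of that calculation follows; the IRQ numbers 64 to 71 are hypothetical (the real ones are listed in /proc/interrupts), and the code only prints the commands instead of running them.

```python
def cpu_mask(cpu):
    """Hex bitmask selecting one CPU, in the comma-separated 32-bit
    groups that /proc/irq/<irq>/smp_affinity expects."""
    digits = format(1 << cpu, "x")
    groups = []
    while digits:
        groups.append(digits[-8:])
        digits = digits[:-8]
    return ",".join(reversed(groups))


def spread_irqs(irqs, cpus):
    """Assign queue IRQs to CPUs round-robin; returns {irq: mask}."""
    return {irq: cpu_mask(cpus[i % len(cpus)]) for i, irq in enumerate(irqs)}


# Hypothetical example: 8 queue IRQs (64..71) pinned to cores 0..7.
plan = spread_irqs(irqs=list(range(64, 72)), cpus=list(range(8)))
for irq, mask in plan.items():
    print(f"echo {mask} > /proc/irq/{irq}/smp_affinity")
```

The round-robin assignment is just one possible policy; on a NUMA machine it may make more sense to keep the queues on the node the card is attached to.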

Why only 16? On the Intel 10 Gbit cards, 16 is the limit imposed by RSS itself: the RSS index is 4 bits wide, so it can address at most 16 queues.



Intel, . -, ( , ). , .. , , .



Secondly, there is Flow Director.



Flow director

Flow Director, an Intel feature, has 2 modes: Signature Filter (also known as ATR, Application Targeted Receive) and Perfect filter. Its counters can be inspected with ethtool -S ethN | grep fdir







Perfect filter (ntuple)

Holds up to 8,000 rules. Perfect filter rules are listed and set with ethtool -u/-U, using the flow-type keyword.



Signature Filter

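Holds up to 32,000 entries. In this mode ATR learns flows from outgoing traffic, sampling every packet that carries the SYN flag.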



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



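That is, the sampled packets are every SYN packet plus every 20th packet (by default; the rate is set with AtrSampleRate). When ATR has no entry for a flow, RSS is used (i.e. at most 16 queues). Flow Director statistics: ethtool -S ethN | grep fdir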



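In our case the counter that kept growing was fdir_miss, i.e. ATR was not matching and packets fell back to RSS and its 16 queues. See also https://sourceforge.net/p/e1000/bugs/464/. That left one more option: RPS.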
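The fdir counters printed by ethtool -S ethN | grep fdir can be reduced to a single miss ratio, which makes it easier to see whether Flow Director is matching at all. This is a sketch only: the counter values in `sample` are invented for illustration, and the helper names are not part of any tool.

```python
import re


def fdir_counters(ethtool_stats):
    """Extract the fdir_* counters from `ethtool -S <dev>` output."""
    pattern = re.compile(r"^\s*(fdir_\w+):\s*(\d+)\s*$", re.MULTILINE)
    return {name: int(value) for name, value in pattern.findall(ethtool_stats)}


def miss_ratio(counters):
    """Share of Flow Director lookups that missed (0.0 if there were none)."""
    matched = counters.get("fdir_match", 0)
    missed = counters.get("fdir_miss", 0)
    total = matched + missed
    return missed / total if total else 0.0


# Made-up numbers, in the layout ethtool -S uses.
sample = """\
     rx_packets: 123456
     fdir_match: 1000
     fdir_miss: 9000
"""
counters = fdir_counters(sample)
print(f"{miss_ratio(counters):.0%} of lookups missed")
```

Because the counters are cumulative, comparing two readings taken some seconds apart gives a better picture than a single reading.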



ATR can be switched off with ethtool -K ethN ntuple on. This disables ATR and enables Perfect filter.



RPS (receive packet steering)

RPS is the 2nd mechanism (after RSS) described in scaling.txt; it is essentially a software implementation of RSS.



( irq_balancer).



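RSS alone stops at 16 queues. With RPS on top of RSS, packets from those 16 queues are handed on to other cores (which cores is set per queue through rps_cpus).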



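RPS does not take into account where the application that consumes the data is running, so a packet may be processed on one core and read on another. For this case scaling.txt describes RFS.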
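The rps_cpus setting mentioned above is a hexadecimal CPU bitmask written to /sys/class/net/<dev>/queues/rx-<n>/rps_cpus, as documented in scaling.txt. A small sketch of how such a mask could be built; the device name eth0, the 16 queues and the 32 cores are assumptions chosen for the example, and the commands are only printed.

```python
def cpus_to_mask(cpus):
    """Hex bitmask covering the given CPUs, split into the comma-separated
    32-bit groups the kernel expects."""
    bits = 0
    for cpu in cpus:
        bits |= 1 << cpu
    digits = format(bits, "x")
    groups = []
    while digits:
        groups.append(digits[-8:])
        digits = digits[:-8]
    return ",".join(reversed(groups))


def rps_commands(dev, rx_queues, cpus):
    """Shell commands that let every RX queue of `dev` steer to `cpus`."""
    mask = cpus_to_mask(cpus)
    return [
        f"echo {mask} > /sys/class/net/{dev}/queues/rx-{queue}/rps_cpus"
        for queue in range(rx_queues)
    ]


# Assumed setup: 16 hardware queues, packets spread over cores 0..31.
commands = rps_commands("eth0", rx_queues=16, cpus=range(32))
print("\n".join(commands))
```

Giving every queue the same wide mask is the simplest choice; restricting each queue to cores on its own NUMA node is a common refinement.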



RFS (receive flow steering)

How does it differ? RFS steers each flow to the core on which the application thread consuming that flow is running.



RPS vs. RFS results from [2] (images not preserved).
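RFS is controlled by two values described in scaling.txt: the global net.core.rps_sock_flow_entries and a per-queue rps_flow_cnt, where the per-queue value is usually the global one divided by the number of RX queues. The sketch below computes such a plan; the device name eth0 and the queue count are assumptions, and 32768 is the value scaling.txt suggests for a moderately loaded server.

```python
def rfs_plan(dev, rx_queues, sock_flow_entries=32768):
    """Map each RFS control file to the value it would be set to."""
    per_queue = sock_flow_entries // rx_queues
    plan = {"/proc/sys/net/core/rps_sock_flow_entries": sock_flow_entries}
    for queue in range(rx_queues):
        plan[f"/sys/class/net/{dev}/queues/rx-{queue}/rps_flow_cnt"] = per_queue
    return plan


# Assumed setup: one card, 16 RX queues.
plan = rfs_plan("eth0", rx_queues=16)
for path, value in plan.items():
    print(f"echo {value} > {path}")
```

RFS stays disabled until both values are non-zero, so the two settings need to be applied together.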



Accelerated RFS

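scaling.txt also describes Accelerated RFS. How is it different from RFS? Plain RFS does the steering in software, while Accelerated RFS does it in the network card itself, so it needs hardware support. Mellanox cards offer it.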



Our attempt at using it ended in a kernel panic.



, . , firmware .



Problem 4 - broken pipe

broken pipe errors were seen on OpenSuSE as well as on CentOS.



Broken pipe is the error a process gets when it writes to a pipe, or to a socket, whose other end is already closed. With sockets this can happen during a half-duplex TCP close sequence.



One contributing factor is "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ). With RFS enabled, the number of broken pipe errors went down.



[Image]



The other relevant setting is interrupt coalescing, covered in the next section.



Problem 5

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



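To reduce the CPU load, the number of interrupts can be lowered. This is interrupt coalescing, viewed and configured with ethtool -c/-C



With interrupt coalescing the card waits and raises one interrupt for several packets, at the cost of some added latency.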



[Image]



, softirq , , .

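The parameters:

[rx|tx]-usecs: how many microseconds to wait after a packet before raising an interrupt
[rx|tx]-frames: how many frames to accumulate before raising an interrupt
*-irq: the same limits, applied while an interrupt is already being serviced
*-[low|high]: the values used by adaptive coalescing at low and high packet rates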
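The effect of rx-usecs can be estimated with simple arithmetic: if the card waits at least rx-usecs microseconds between interrupts, a queue cannot raise more than 1,000,000 / rx-usecs interrupts per second. The sketch below uses this simplified model; real drivers may interpret the value differently, and the packet rate in the example is an assumption.

```python
def max_irq_rate(rx_usecs):
    """Upper bound on interrupts per second for one queue, assuming the
    card waits at least `rx_usecs` microseconds between interrupts."""
    if rx_usecs <= 0:
        raise ValueError("rx_usecs must be positive in this model")
    return 1_000_000 / rx_usecs


def packets_per_irq(packets_per_second, rx_usecs):
    """Average number of packets handled per interrupt at a given rate."""
    return packets_per_second * rx_usecs / 1_000_000


# Assumed load: 200,000 packets per second on one queue, rx-usecs = 100.
print(max_irq_rate(100))              # interrupts per second, at most
print(packets_per_irq(200_000, 100))  # packets batched per interrupt
```

Raising rx-usecs lowers the interrupt rate and increases batching, and the same value is also the extra latency a packet can pick up while waiting.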





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





[1] https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf
[2] https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD
[3] https://wiki.centos.org/About/Product




NUMA, 1 tmpfs . interleave (numactl --interleave=all, ) .
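A per-node imbalance like the one in the `numastat -m` output can be checked mechanically. The sketch below assumes the usual line-per-metric layout (name followed by one value per node and a total); the helper names `parse_numastat` and `node_shares` are hypothetical.

```python
def parse_numastat(text):
    """Parse 'Name v0 v1 ... total' rows into {name: [floats]}; skip other lines."""
    table = {}
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 2:
            continue
        try:
            values = [float(t) for t in tokens[1:]]
        except ValueError:
            continue  # headers such as "Node 0 Node 1 Total"
        table[tokens[0]] = values
    return table


def node_shares(table, metric):
    """Return each node's fraction of a metric, ignoring the trailing Total column."""
    per_node = table[metric][:-1]
    total = sum(per_node)
    if total == 0:
        return [0.0 for _ in per_node]
    return [value / total for value in per_node]
```

Applied to the `Shmem` row from the output shown earlier (79251.21 and 123964.70 MB), this gives roughly 39% on node 0 and 61% on node 1, the kind of skew that interleaving is meant to even out.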



Problem 3

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L




). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



[Image]



, — interrupt coalescing. , .



Problem 5

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



[Image]



, softirq , , .

:

[rx|tx]-usecs
[rx|tx]-frames
*-irq
*-[low|high]
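The trade-off behind rx-usecs can be estimated with simple arithmetic. This is a simplified sketch: it assumes a fixed (non-adaptive) rx-usecs, ignores the rx-frames threshold, and the function names are hypothetical.

```python
def max_interrupt_rate(rx_usecs: int) -> float:
    """Upper bound on interrupts per second for one queue at a fixed rx-usecs."""
    if rx_usecs <= 0:
        raise ValueError("rx_usecs must be positive")
    return 1_000_000 / rx_usecs


def worst_case_delay_ms(rx_usecs: int) -> float:
    """Extra latency a packet may wait before its interrupt fires, in ms."""
    if rx_usecs <= 0:
        raise ValueError("rx_usecs must be positive")
    return rx_usecs / 1000
```

Under these assumptions, an rx-usecs of 100 caps one queue at 10 000 interrupts per second and adds at most 0.1 ms of delay. A larger value lowers the softirq load further at the cost of latency.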





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





[1] https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf
[2] https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD
[3] https://wiki.centos.org/About/Product





NUMA, 1 tmpfs . interleave (numactl --interleave=all, ) .



Problem 3

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L




). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product




numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L




). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product




numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L




). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product




numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L




). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product




numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L




). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product




numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L




). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product




numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L




). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



. , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).



.. , , , , , .. — — . , scaling.txt RFS.



RFS (receive flow steering)

? , , , . .. , , , .



[2]:

RPS RFS RPS RFS



Accelerated RFS

scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



, . , kernel panic.



, . , firmware .



4 — broken pipe

broken pipe OpenSuSE, CentOS . , , “” .



Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



— "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



画像



, — interrupt coalescing. , .



5 —

, Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



, CPU, — . interrupt coalescing, ethtool -c/-C



. interrupt coalescing — , .



画像



, softirq , , .

:

[rx|tx]-usecs — [rx|tx]-frames — , *-irq — *-[low|high] —





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.





, , -, . , . .





https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD https://wiki.centos.org/About/Product




numastat -m ).



. tmpfs ( ), 1 .



numastat? , :

Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



3 —

, , , . ( , , ):



CPU0 ( softirq). , , , . ( , ethtool -l/-L




). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

?.. .

16 , Intel ( 10 ), , , 16.



16 — , RSS. RSS- , 4 , , — 16. .



Intel, . -, ( , ). , .. , , .



-, Flow director.



, , , , . , .



Flow director

Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







Perfect filter (ntuple)

8 000 . Perfect filter / ethtool -u/U flow-type



.



Signature Filter

32 000 . , ATR , ( ) __ SYN .



During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

[1]



.., , SYN, 20 ( , AtrSampleRate). ATR, RSS (.. 16 ). . flow director ethtool -S ethN | grep fdir



, fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



, — ethtool -k ethN ntuple on



. ATR, Perfect filter, , , , .



RPS (receive packet steering)

. 2 ( RSS) — scaling.txt.



( irq_balancer).



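With RSS alone, no more than 16 cores take part in receive processing. With RPS on top of RSS, packets that arrive on the 16 hardware queues are handed on to other cores in software. The set of cores a queue may hand packets to is given by that queue's rps_cpus mask.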
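As a rough model of what RPS does with an rps_cpus mask, the Python sketch below decodes a mask into a CPU list and then maps a 32-bit flow hash onto that list. The multiply-and-shift scaling follows the kernel's reciprocal_scale helper; everything else is a simplification. In particular, `flow_hash` uses CRC32 purely as a stand-in (the kernel uses its own hash or the one supplied by the NIC), and all function names are hypothetical.

```python
import zlib


def cpus_from_mask(mask):
    """Decode an rps_cpus-style hex bitmap ('ff00' or '1,00000000') into CPU numbers."""
    value = 0
    for word in mask.split(","):  # 32-bit words, most significant first
        value = (value << 32) | int(word, 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


def flow_hash(src_ip, src_port, dst_ip, dst_port):
    """Stand-in 32-bit hash of the connection 4-tuple (assumption: CRC32, not the kernel hash)."""
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) & 0xFFFFFFFF


def pick_cpu(hash32, cpus):
    """Scale a 32-bit hash onto the CPU list: index = hash * len(cpus) / 2**32."""
    return cpus[(hash32 * len(cpus)) >> 32]


if __name__ == "__main__":
    cpus = cpus_from_mask("ff00")  # CPUs 8..15
    h = flow_hash("10.0.0.1", 40000, "10.0.0.2", 80)
    print(cpus, pick_cpu(h, cpus))
```

Because the CPU is chosen from the hash alone, all packets of one connection go to the same core, but that core is not necessarily the one where the application reads the socket.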



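RPS picks the core from the packet hash alone, so it is not necessarily the core where the application handles that connection. For this case scaling.txt describes RFS.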



RFS (receive flow steering)




[Figure from [2]: packet paths with RPS and with RFS]
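RFS is configured through two table sizes: the global rps_sock_flow_entries and a per-queue rps_flow_cnt. scaling.txt suggests 32768 as a reasonable global value on a moderately loaded server and rps_sock_flow_entries / N per queue for N receive queues, with both values rounded up to a power of two. The Python sketch below only does that arithmetic; the function names are hypothetical and the expected flow count is an input you have to estimate yourself.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n (and at least 1)."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


def rfs_table_sizes(expected_flows, num_rx_queues):
    """Return (rps_sock_flow_entries, rps_flow_cnt per queue) for the given inputs."""
    sock_flow_entries = next_power_of_two(expected_flows)
    # Ceiling division, then round up to a power of two as the kernel does.
    per_queue = next_power_of_two(-(-sock_flow_entries // num_rx_queues))
    return sock_flow_entries, per_queue


if __name__ == "__main__":
    total, per_queue = rfs_table_sizes(expected_flows=32768, num_rx_queues=16)
    print(total, per_queue)
```

For 32768 flows and 16 queues this gives 2048 entries per queue, which matches the worked example in scaling.txt.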



Accelerated RFS

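Besides RFS, scaling.txt also describes Accelerated RFS. It does the same job as RFS, but in hardware, so it needs support from the NIC. We tried it on Mellanox cards.

The attempt ended in a kernel panic.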



, . , firmware .



Problem 4: broken pipe

broken pipe errors occurred on OpenSuSE as well, but on CentOS there were noticeably more of them.



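Broken pipe is the error a process gets when it writes to a pipe whose other end has already been closed. For a socket it means the same thing: the peer has closed the connection (see the half-duplex TCP close sequence) while the server was still writing to it.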



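One cause is "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), where packets of one connection are processed out of order. With RFS enabled, the number of broken pipe errors went down.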






The other factor was interrupt coalescing, which is covered in the next section.



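Problem 5: high softirq load

Latency matters a great deal in a game like Counter Strike, and much less for the traffic these servers carry. Meanwhile, softirq processing was taking about 50% of the CPU.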



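To reduce the CPU load, interrupts can be batched instead of being raised for every packet. This is interrupt coalescing, and it is configured with ethtool -c/-C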



. interrupt coalescing — , .






After tuning interrupt coalescing, the softirq load went down.

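The main parameters:

[rx|tx]-usecs - how long to delay an interrupt after a packet is received or sent, in microseconds
[rx|tx]-frames - how many packets to accumulate before raising an interrupt
*-irq - the same limits, applied while an interrupt is already being serviced
*-[low|high] - the values used by adaptive coalescing at low and high packet rates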
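The effect of rx-usecs can be estimated with simple arithmetic: if a queue waits at least N microseconds between interrupts, it can raise at most 1 000 000 / N interrupts per second, at the cost of up to N microseconds of added latency. The Python sketch below encodes just that bound. The function name is hypothetical, and treating 0 as "no limit" is an assumption, since drivers give small values their own meaning.

```python
def max_interrupt_rate(rx_usecs, num_queues=1):
    """Upper bound on receive interrupts per second for a given rx-usecs setting.

    Returns None when rx_usecs is 0, which this sketch treats as "no coalescing limit".
    """
    if rx_usecs <= 0:
        return None
    per_queue = 1_000_000 // rx_usecs  # at most one interrupt every rx_usecs microseconds
    return per_queue * num_queues


if __name__ == "__main__":
    for usecs in (10, 50, 100, 200):
        print(f"rx-usecs={usecs:3d}: <= {max_interrupt_rate(usecs, num_queues=16)} interrupts/s")
```

This is the trade-off behind the parameter: a larger rx-usecs lowers the interrupt rate and the softirq load, and raises the worst-case delay before a packet is processed.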





:

, 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

.










[1] https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf
[2] https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD
[3] https://wiki.centos.org/About/Product




  2. numastat -m ).



    . tmpfs ( ), 1 .



    numastat? , :

    Per-node process memory usage (in MBs) for PID 7781 (java) Node 0 Node 1 Total Huge 0 0 0 Heap 0.20 0.14 0.34 Stack 118.82 137.87 256.70 Private 80200.73 123323.81 203524.55 Total 80319.76 123461.82 203781.58 Per-node system memory usage (in MBs): Node 0 Node 1 Total MemTotal 131032.75 131072.00 262104.75 MemFree 1228.93 639.84 1868.78 MemUsed 129803.82 130432.16 260235.98 Active 23224.13 121073.42 144297.55 Inactive 101138.88 3753.98 104892.85 Active(anon) 1690.50 120997.86 122688.36 Inactive(anon) 79528.66 3560.95 83089.61 Active(file) 21533.63 75.57 21609.20 Inactive(file) 21610.21 193.03 21803.24 Unevictable 0 0 0 Mlocked 0 0 0 Dirty 0.11 0.02 0.13 Writeback 0 0 0 FilePages 122397.46 124295.47 246692.93 Mapped 78436.03 122947.26 201383.29 AnonPages 1966.62 532.02 2498.64 Shmem 79251.21 123964.70 203215.90 KernelStack 2.44 2.57 5.01 PageTables 158.62 252.29 410.91 NFS_Unstable 0 0 0 Bounce 0 0 0 WritebackTmp 0 0 0 Slab 1801.95 1932.29 3734.23 SReclaimable 1653.13 1818.79 3471.92 SUnreclaim 148.82 113.49 262.31 AnonHugePages 1856.00 498.00 2354.00 HugePages_Total 0 0 0 HugePages_Free 0 0 0 HugePages_Surp 0 0 0

    NUMA, 1 tmpfs . interleave (numactl —interleave=all, ) .



    3 —

    , , , . ( , , ):



    CPU0 ( softirq). , , , . ( , ethtool -l/-L




    ). , 1 Intel 8, 10 — 128. . , , CPU0. irq_balancer. . , , . set_irq_affinity ( ) — RSS (receive side scaling) — https://www.kernel.org/doc/Documentation/networking/scaling.txt. 8 , 8 , 16 , .

    ?.. .

    16 , Intel ( 10 ), , , 16.



    16 — , RSS. RSS- , 4 , , — 16. .



    Intel, . -, ( , ). , .. , , .



    -, Flow director.



    , , , , . , .



    Flow director

    Flow director, Intel, 2 — Signature Filter ( ATR — Application Targeted Receive) Perfect filter. Flow director -IP , . / flow director: ethtool -S ethN|grep fdir







    Perfect filter (ntuple)

    8 000 . Perfect filter / ethtool -u/U flow-type



    .



    Signature Filter

    32 000 . , ATR , ( ) __ SYN .



    During transmission of a packet (every 20 packets by default), a hash is calculated based on the 5-tuple. The (up to) 15-bit hash result is used as an index in a hash lookup table to store the TX queue. When a packet is received, a similar hash is calculated and used to look up an associated Receive Queue. For uni-directional incoming flows, the hash lookup tables will not be initialized and Flow Director will not work. For bidirectional flows, the core handling the interrupt will be the same as the core running the process handling the network flow.

    [1]
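The mechanism in the quote can be modelled in a few lines of Python. This is a simplified sketch, not the hardware algorithm: CRC32 over a direction-independent key stands in for the adapter's signature hash, the protocol field of the 5-tuple is left out for brevity, and the class and function names are made up for illustration. The sample rate of 20 and the 15-bit index come from the quoted text.

```python
import zlib

SAMPLE_RATE = 20     # "every 20 packets by default"
TABLE_BITS = 15      # "(up to) 15-bit hash result"
RSS_QUEUES = 16      # fallback spread when no entry exists


def flow_hash(endpoint_a, endpoint_b):
    """Hash that is the same for both directions of a flow."""
    key = repr(tuple(sorted((endpoint_a, endpoint_b)))).encode()
    return zlib.crc32(key) & ((1 << TABLE_BITS) - 1)


class AtrTable:
    def __init__(self):
        self.table = {}      # hash index -> TX queue that last sampled the flow
        self.tx_count = {}   # packets sent per TX queue

    def transmit(self, local, remote, tx_queue):
        """Count a sent packet; every SAMPLE_RATE-th one records its queue."""
        count = self.tx_count.get(tx_queue, 0) + 1
        self.tx_count[tx_queue] = count
        if count % SAMPLE_RATE == 0:
            self.table[flow_hash(local, remote)] = tx_queue

    def receive(self, remote, local):
        """Return the RX queue: the learned one, or an RSS-like fallback."""
        index = flow_hash(remote, local)
        if index in self.table:
            return self.table[index]
        return index % RSS_QUEUES


atr = AtrTable()
server, client = ("10.0.0.1", 80), ("192.0.2.7", 40000)
for _ in range(SAMPLE_RATE):
    atr.transmit(server, client, tx_queue=5)
print(atr.receive(client, server))
```

A flow that only receives never populates the table, which is the uni-directional case the quote warns about: every lookup falls through to the fallback.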



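    In other words, the table is updated on every SYN packet and on every 20th packet (the interval is set by the AtrSampleRate driver parameter). When ATR has no entry for a flow, RSS is used instead (that is, 16 queues). Flow director counters are available through ethtool -S ethN | grep fdir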



    , fdir_miss. .. ATR, RSS. 16 . ( https://sourceforge.net/p/e1000/bugs/464/ ), , — RPS.



    The relevant switch is ethtool -K ethN ntuple on. It turns ATR off and enables the Perfect filter.



    RPS (receive packet steering)

    . 2 ( RSS) — scaling.txt.



    ( irq_balancer).



    . , , , 16. , , , , . RPS, RSS, 16 (.. ), ( , rps_cpus , ).
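RPS is configured per receive queue by writing a CPU bitmask to that queue's rps_cpus file, as described in scaling.txt. The sketch below builds such a mask in Python; the helper names are illustrative, and the choice of which CPUs to include is left to the caller because the article's own selection did not survive in this text.

```python
def cpu_mask(cpus):
    """Format CPU numbers as a hex bitmask in comma-separated 32-bit groups."""
    value = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        value |= 1 << cpu
    groups = []
    while True:
        groups.append(f"{value & 0xFFFFFFFF:08x}")
        value >>= 32
        if value == 0:
            break
    # The most significant group comes first.
    return ",".join(reversed(groups))


def rps_cpus_path(dev, rx_queue):
    """sysfs file that holds the RPS mask for one receive queue."""
    return f"/sys/class/net/{dev}/queues/rx-{rx_queue}/rps_cpus"


# Example: let every RX queue of a hypothetical eth0 steer packets to CPUs 0-31.
for queue in range(16):
    print(rps_cpus_path("eth0", queue), cpu_mask(range(32)))
```

The script only prints the path and mask pairs; applying them means writing each mask into its file as root.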



    .. , , , , , .. — — . , scaling.txt RFS.



    RFS (receive flow steering)

    ? , , , . .. , , , .
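For RFS, scaling.txt describes two settings: a global flow table size (rps_sock_flow_entries) and a per-queue flow count (rps_flow_cnt), with the per-queue value commonly chosen as the global size divided by the number of receive queues. A minimal Python sketch of that arithmetic follows; the default of 32768 is the example value from scaling.txt, not a figure from this article, and the function name is illustrative.

```python
def rfs_settings(dev, num_rx_queues, sock_flow_entries=32768):
    """Map each RFS-related file to the value that would be written into it."""
    if num_rx_queues < 1:
        raise ValueError("at least one receive queue is required")
    per_queue = sock_flow_entries // num_rx_queues
    settings = {"/proc/sys/net/core/rps_sock_flow_entries": sock_flow_entries}
    for queue in range(num_rx_queues):
        settings[f"/sys/class/net/{dev}/queues/rx-{queue}/rps_flow_cnt"] = per_queue
    return settings


for path, value in rfs_settings("eth0", 16).items():
    print(f"{value:>6}  {path}")
```

With 16 queues this yields 2048 flows per queue. RFS only takes effect on queues where RPS is also enabled.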



    [2]:

    RPS RFS RPS RFS



    Accelerated RFS

    scaling.txt, Accelerated RFS. ? RFS . Accelerated RFS , . Mellanox.



    , . , kernel panic.



    , . , firmware .



    Problem 4: broken pipe

    broken pipe OpenSuSE, CentOS . , , “” .



    Broken pipe — , , pipe. pipe , — . , , , , pipe broken pipe. , , ( half-duplex tcp close sequence ), - . ? ? , .. , , . .



    — "packet reordering" ( http://en.wikipedia.org/wiki/Out-of-order_delivery ), . .. , , , , RFS broken pipe .



    [image]



    , — interrupt coalescing. , .



    Problem 5

    , Counter Strike latency, ( , ) . , ( ) (), softirq . softirq 50% .



    , CPU, — . interrupt coalescing, ethtool -c/-C



    . interrupt coalescing — , .



    [image]



    , softirq , , .

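    The main parameters:

    [rx|tx]-usecs: how long, in microseconds, to hold back an interrupt after a packet arrives
    [rx|tx]-frames: how many packets to accumulate before raising an interrupt
    *-irq: the same limits, applied while an interrupt is already being serviced
    *-[low|high]: thresholds and values used by adaptive coalescing at low and high packet rates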
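The trade-off behind these parameters can be estimated with simple arithmetic. The Python sketch below uses a deliberately rough model that is an assumption of this example rather than a description of any driver: with rx-usecs set to T, a queue raises at most one interrupt every T microseconds, and rx-frames and adaptive modes are ignored.

```python
def max_irq_rate(rx_usecs):
    """Upper bound on interrupts per second for one queue."""
    if rx_usecs <= 0:
        raise ValueError("rx_usecs must be positive in this model")
    return 1_000_000 / rx_usecs


def packets_per_irq(pps, rx_usecs):
    """Average packets handled per interrupt at a steady packet rate."""
    if pps <= 0:
        return 0.0
    irq_rate = min(pps, max_irq_rate(rx_usecs))
    return pps / irq_rate


for usecs in (10, 100, 1000):
    print(usecs, max_irq_rate(usecs), packets_per_irq(500_000, usecs))
```

Raising rx-usecs cuts the interrupt rate and batches more packets into each softirq run, at the cost of up to that many microseconds of added latency per packet.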





    :

    , 2024 [3] broken pipe 40 ( 50 ) ( CFEngine )

    .





    , , -, . , . .





    [1] https://networkbuilders.intel.com/docs/network_builders_RA_packet_processing.pdf
    [2] https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance/ComparingMutiqueueSupportLinuxvsFreeBSD
    [3] https://wiki.centos.org/About/Product



