The kernel is the root of all evil
Nowadays nobody is surprised by the use of epoll() / kqueue() in event pollers. There are quite a few solutions to the C10K problem (libevent / libev / libuv), with varying performance and rather high overhead. This article looks at using DPDK to handle 10 million connections (C10M) and at achieving maximum performance when processing network requests in common application-level solutions. The main features of this task are delegating responsibility for traffic processing from the OS kernel to user space (userspace), precise control over interrupt handling and DMA channels, the use of VFIO, and many other not very clear words. Java Netty, with the Disruptor pattern and off-heap caching, was chosen as the target application environment.
In short, this is a very efficient way of processing traffic, with performance comparable to existing hardware solutions. The overhead of using the facilities provided by the OS kernel itself is too high, and for tasks like this it is the source of most problems. The difficulty lies in support from the drivers of the target network interfaces and in the architecture of the application as a whole.
The article covers in detail installing, configuring, using, debugging, profiling and deploying DPDK to build high-performance solutions.
Why DPDK?
There are also Netmap, OpenOnload and pf_ring.
netmap
The main goal in developing netmap was an easy-to-use solution, so it provides the most common synchronous select() interface, which greatly simplifies porting existing solutions. In terms of flexibility and hardware abstraction, netmap clearly lacks functionality. Nevertheless, it is the most affordable and widespread solution.
pf_ring
pf_ring appeared as a way of "overclocking" pcap; historically, at the time it was developed there were no stable, ready-to-use solutions. It has few obvious advantages over netmap, although its proprietary ZC version supports IOMMU. The product itself has never been known for high performance or quality; it is merely a tool for collecting and analyzing pcap dumps and was not intended for processing traffic in user applications. The main feature of pf_ring ZC is complete independence from existing network interface drivers.
OpenOnload
OpenOnload is a highly specialized, high-performance network stack from SolarFlare.
Others
There are also solutions from Napatech, but as far as I know they only offer a library with their own API, without the extras that SolarFlare provides.
Naturally, I have not reviewed every existing solution, and I could not have come across all of them, but I do not think they differ much from what is described above.
DPDK
Historically, the most common adapters for 10/40GbE have been Intel adapters, served by the e1000, igb, ixgbe and i40e drivers. They are therefore frequent target adapters for high-performance traffic processing tools. This was the case with both Netmap and pf_ring.
DPDK is an open source project from Intel, on which entire businesses (6WIND) have been built and for which manufacturers occasionally provide drivers (Mellanox, for example). Naturally, commercial support for solutions based on it is excellent, and it is offered by quite a large number of vendors (6WIND, Aricent, ALTEN Calsoft Labs, Advantech, Brocade, Radisys, Tieto, Wind River, Lanner, Mobica).
DPDK has the broadest functionality and the best abstraction of existing hardware.
It was never meant to be convenient; it is flexible enough to achieve high, perhaps maximum, performance.
List of supported drivers and cards
- Chelsio cxgbe (Terminator 5)
- Cisco enic (the whole Virtual Interface Card series)
- Emulex oce (OneConnect OCe14000 family)
- Mellanox mlx4 (ConnectX-3, ConnectX-3 Pro)
- QLogic / Broadcom bnx2x (NetXtreme II)
All Intel drivers from the Linux kernel
- e1000 (82540, 82545, 82546)
- e1000e (82571..82574, 82583, ICH8..ICH10, PCH..PCH2)
- igb (82575..82576, 82580, I210, I211, I350, I354, DH89xx)
- ixgbe (82598..82599, X540, X550)
- i40e (X710, XL710)
- fm10k
All of them are ported as Poll Mode Drivers that run in user space (usermode).
Anything else?
Indeed, the following are supported as well
- virtualization based on QEMU, Xen and VMware ESXi
- paravirtualized network interfaces based on buffer copying, though that is evil
- AF_PACKET sockets and PCAP dumps for testing
- network adapters with ring buffers
DPDK architecture
* This is how it works in my head, so reality may differ somewhat.
DPDK itself consists of a set of libraries (the contents of the lib folder):
- librte_acl - access control lists (ACL) for VLANs
- librte_compat - export compatibility of binary interfaces (ABI)
- librte_ether - controlling Ethernet adapters and working with Ethernet frames
- librte_ivshmem - sharing buffers through ivshmem
- librte_kvargs - parsing key-value arguments
- librte_mbuf - message buffer (mbuf) management
- librte_net - a piece of a BSD-style IP stack with ARP / IPv4 / IPv6 / TCP / UDP / SCTP
- librte_power - power and frequency management (cpufreq)
- librte_sched - hierarchical QoS scheduler
- librte_vhost - virtual network adapters
- librte_cfgfile - parsing configuration files
- librte_distributor - a means of distributing packets among existing tasks
- librte_hash - hash functions
- librte_jobstats - measuring task execution time
- librte_lpm - Longest Prefix Match functions, used for lookups in the forwarding table
- librte_mempool - manager of in-memory object pools
- librte_pipeline - the packet framework pipeline
- librte_reorder - reordering packets in message buffers
- librte_table - lookup table implementations
- librte_cmdline - parsing command-line arguments
- librte_eal - platform-dependent environment (Environment Abstraction Layer)
- librte_ip_frag - IP packet fragmentation
- librte_kni - API for interacting with KNI
- librte_malloc - easy to guess
- librte_meter - QoS metering
- librte_port - port implementations for network packets
- librte_ring - lock-free ring FIFO queues
- librte_timer - timers and counters
UIO drivers for network interfaces on Linux (lib/librte_eal/linuxapp):
- igb_uio - Ethernet network adapters
- xen_dom0 - clear from the name
and on BSD
- nic_uio
There are also the aforementioned Poll Mode Drivers (PMD), which run in user space (userspace): e1000, e1000e, igb, ixgbe, i40e, fm10k and others.
The Kernel Network Interface (KNI) is a special driver that makes it possible to interact with the kernel networking API, issue ioctl calls to the ports of interfaces that work with DPDK, and manage them with the usual utilities (ethtool, ifconfig, tcpdump).
As you can see, compared with netmap and the other solutions, DPDK works far closer to the hardware.
Target system requirements and fine tuning
The main recommendations from the official documentation are translated and supplemented here.
Configuring the XEN and VMware hypervisors for use with DPDK is not covered.
General
If you are running DPDK on the Intel Communications Chipset 89xx, separate instructions apply.
Building requires coreutils, gcc, the kernel headers and the glibc headers.
clang is supported, and Intel's icc appears to be supported as well.
Running the helper scripts requires Python 2.6 / 2.7.
The Linux kernel must be built with UIO support and with process address space monitoring; these are the kernel parameters:
CONFIG_UIO
CONFIG_UIO_PDRV
CONFIG_UIO_PDRV_GENIRQ
CONFIG_UIO_PCI_GENERIC
and
CONFIG_PROC_PAGE_MONITOR
Note that grsecurity considers the PROC_PAGE_MONITOR parameter too informative: it helps exploit kernel vulnerabilities and bypass ASLR.
HPET
An HPET timer is needed to organize high-precision periodic interrupts.
You can check whether it is available with
grep hpet /proc/timer_list
It is enabled in the BIOS under
Advanced -> PCH-IO Configuration -> High Precision Timer
and the kernel must be built with CONFIG_HPET and CONFIG_HPET_MMAP enabled.
By default, HPET support is disabled in DPDK itself, so it has to be enabled by manually setting the CONFIG_RTE_LIBEAL_USE_HPET flag in the config/common_linuxapp file.
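Flipping such a flag by hand is easy to get wrong, so here is a minimal sketch of doing it with sed. The helper name enable_flag is hypothetical, and the sketch assumes the config file stores flags as NAME=n / NAME=y lines; it runs against a scratch file rather than the real config/common_linuxapp.

```shell
#!/bin/sh
# Hypothetical helper: set FLAG=y in a config file made of NAME=value lines.
# Usage: enable_flag FILE FLAG
enable_flag() {
    file=$1
    flag=$2
    tmp="${file}.tmp.$$"
    # Rewrite "FLAG=<anything>" to "FLAG=y"; every other line passes through.
    sed "s/^${flag}=.*/${flag}=y/" "$file" > "$tmp" && mv "$tmp" "$file"
}

# Demonstration on a scratch file (assumed format, not the real config).
demo_cfg=$(mktemp)
printf 'CONFIG_RTE_LIBEAL_USE_HPET=n\nCONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC=n\n' > "$demo_cfg"
enable_flag "$demo_cfg" CONFIG_RTE_LIBEAL_USE_HPET
cat "$demo_cfg"
rm -f "$demo_cfg"
```

For a real build, the same call would presumably be pointed at config/common_linuxapp before running make config.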
In some cases HPET is preferable, in others TSC.
A high-performance solution needs both, since they serve different purposes and compensate for each other's shortcomings. TSC is usually the default. The HPET timer is initialized and checked for availability by calling rte_eal_hpet_init(int make_default) from <rte_cycles.h>. Strangely, it is missing from the API documentation.
Core isolation
To take load off the system scheduler, it is common practice to isolate logical processor cores for the needs of high-performance applications. This is especially true for dual-processor systems.
If the application runs on the even-numbered cores 2, 4, 6, 8, 10, you can add a kernel parameter to your favorite bootloader
isolcpus=2,4,6,8,10
For the widespread grub, this goes into the GRUB_CMDLINE_LINUX_DEFAULT parameter in the /etc/default/grub config.
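The core list can also be generated rather than typed. This is a small sketch; the function name isolcpus_arg and its three arguments (first core, step, last core) are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build an isolcpus= boot argument for every STEP-th
# core from FIRST to LAST inclusive.
isolcpus_arg() {
    first=$1
    step=$2
    last=$3
    list=""
    cpu=$first
    while [ "$cpu" -le "$last" ]; do
        # Append with a comma only when the list is already non-empty.
        list="${list:+$list,}$cpu"
        cpu=$((cpu + step))
    done
    printf 'isolcpus=%s\n' "$list"
}

isolcpus_arg 2 2 10   # prints isolcpus=2,4,6,8,10
```

After a reboot, /proc/cmdline shows whether the parameter actually reached the kernel.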
巚倧ããŒãž
ãããã¯ãŒã¯ãããã¡ã«ã¡ã¢ãªãå²ãåœãŠãã«ã¯ã倧ããªããŒãžãå¿ èŠã§ãã ä»®æ³ã¡ã¢ãªã¢ãã¬ã¹ãTLBã«å€æããããã«å¿ èŠãªåŒã³åºããå°ãªãããã倧ããªããŒãžã匷調衚瀺ãããšããã©ãŒãã³ã¹ã«ãã©ã¹ã®å¹æããããŸãã 確ãã«ãæçåãé¿ããããã«ãã«ãŒãã«ãããŒãããããã»ã¹ã§éç«ã£ãŠããã¯ãã§ãã
To do this, add the kernel parameter
hugepages=1024
This allocates 1024 pages of 2 MB each.
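The arithmetic behind the hugepages value is a plain ceiling division of the memory you want by the page size. A minimal sketch follows; hugepages_needed is a hypothetical helper and both arguments are in megabytes.

```shell
#!/bin/sh
# Hypothetical helper: number of huge pages needed to cover MB megabytes
# with pages of PAGE_MB megabytes each (rounded up).
hugepages_needed() {
    mb=$1
    page_mb=$2
    echo $(( (mb + page_mb - 1) / page_mb ))
}

hugepages_needed 2048 2      # prints 1024 (2 GB in 2 MB pages)
hugepages_needed 4096 1024   # prints 4    (4 GB in 1 GB pages)
```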
To allocate four pages of one gigabyte each:
default_hugepagesz=1G hugepagesz=1G hugepages=4
This requires appropriate support, though: the pdpe1gb processor flag in /proc/cpuinfo.
grep pdpe1gb /proc/cpuinfo | uniq
For 64-bit applications, 1 GB pages are recommended.
To see how the pages are distributed across the nodes of a NUMA system, you can use the following command
cat /sys/devices/system/node/node*/meminfo | fgrep Huge
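The raw output is easier to read when condensed to one line per node. The sketch below assumes the usual per-node meminfo layout ("Node 0 HugePages_Total: 512"); summarize_hugepages is a hypothetical name, and the sample input is invented for the demonstration.

```shell
#!/bin/sh
# Hypothetical helper: read per-node meminfo text on stdin and print one
# "nodeN total=X free=Y" line per NUMA node.
summarize_hugepages() {
    awk '$3 == "HugePages_Total:" { total[$2] = $4 }
         $3 == "HugePages_Free:"  { free[$2]  = $4 }
         END {
             for (n in total)
                 printf "node%s total=%d free=%d\n", n, total[n], free[n]
         }' | sort
}

# Demonstration with invented sample data in the assumed format.
printf 'Node 0 HugePages_Total:   512\nNode 0 HugePages_Free:    500\nNode 1 HugePages_Total:   512\nNode 1 HugePages_Free:    512\n' \
    | summarize_hugepages
```

On a real machine the input would come from cat /sys/devices/system/node/node*/meminfo piped into summarize_hugepages.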
For details on managing the policies for allocating and freeing large pages on NUMA systems, see the official documentation.
To support large pages, the kernel must be built with the CONFIG_HUGETLBFS parameter.
Memory regions allocated for large pages are managed by the Transparent Hugepage mechanism, which performs its optimization in a separate kernel thread, khugepaged. To support it, the kernel must be built with the CONFIG_TRANSPARENT_HUGEPAGE parameter and either the CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS or the CONFIG_TRANSPARENT_HUGEPAGE_MADVISE policy.
This mechanism remains relevant even when large pages are allocated at OS boot, because for various reasons it may still be impossible to allocate contiguous memory regions for 2 MB pages.
There is an epic write-up on NUMA and memory by an Intel follower.
There is also a short article from Red Hat on using large pages.
Once the pages are configured and allocated, they need to be mounted; to do that, add the appropriate mount point to /etc/fstab
nodev /mnt/huge hugetlbfs defaults 0 0
For 1 GB pages, the page size must be given as an additional parameter
nodev /mnt/huge hugetlbfs pagesize=1GB 0 0
In my personal experience, the biggest problems with setting up and using DPDK arise precisely with large pages. Managing large pages deserves special attention.
By the way, on Power8 the huge page sizes are 16 MB and 16 GB, which seems a bit much to me.
Power management
DPDK already has tools for controlling processor frequency, so that the standard policies do not get in the way.
To use them, SpeedStep and the C3 and C6 states must be enabled.
In the BIOS, the path to the settings may look like this
Advanced -> Processor Configuration -> Enhanced Intel SpeedStep Tech
Advanced -> Processor Configuration -> Processor C3
Advanced -> Processor Configuration -> Processor C6
The l3fwd-power application provides an example of an L3 switch that uses the power management functions.
Access rights
Obviously, running an application with root privileges is very unsafe.
It is advisable to use ACLs to grant access to a separate user group.
setfacl -su::rwx,g::rwx,o:---,g:dpdk:rw- /dev/hpet
setfacl -su::rwx,g::rwx,o:---,g:dpdk:rwx /mnt/huge
setfacl -su::rwx,g::rwx,o:---,g:dpdk:rw- /dev/uio0
setfacl -su::rwx,g::rwx,o:---,g:dpdk:rw- /sys/class/uio/uio0/device/config
setfacl -su::rwx,g::rwx,o:---,g:dpdk:rwx /sys/class/uio/uio0/device/resource*
This gives the dpdk user group full access to the resources in use and to the uio0 device.
Firmware
For 40GbE network adapters, processing small packets is quite a difficult task, and Intel introduces additional optimizations from one firmware release to the next. Support for the FLV3E firmware series is implemented in DPDK 2.2-rc2, but so far the most optimal version is 4.2.6. You can contact the vendor's support or Intel directly for an update, or update it yourself.
Extended tags, read request size and read descriptors of PCIe devices
The PCIe bus parameters extended_tag and max_read_request_size have a significant effect on the processing speed of small packets (around 100 bytes) on 40GbE adapters. In some BIOS versions they can be set manually: for 100-byte packets, 128 bytes and "1" respectively.
The values can be set in the config/common_linuxapp config when building DPDK, using the following parameters:
CONFIG_RTE_PCI_CONFIG
CONFIG_RTE_PCI_EXTENDED_TAG
CONFIG_RTE_PCI_MAX_READ_REQUEST_SIZE
Alternatively, use the setpci and lspci commands.
Note the difference between the MAX_REQUEST and MAX_PAYLOAD parameters of PCIe devices: the config contains only MAX_REQUEST.
For the i40e driver, it makes sense to reduce the size of the read descriptors to 16 bytes. To do this, set the parameter CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC in config/common_linuxapp or config/common_bsdapp.
You can also specify the minimum interval between handling write interrupts, CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL, depending on your priorities: maximum throughput or per-packet latency.
The Mellanox mlx4 driver has similar parameters as well
CONFIG_RTE_LIBRTE_MLX4_SGE_WR_N
CONFIG_RTE_LIBRTE_MLX4_MAX_INLINE
CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE
CONFIG_RTE_LIBRTE_MLX4_SOFT_COUNTERS
which probably affect performance in some way.
All the other network adapter parameters relate to debug modes, which allow very fine-grained profiling and debugging of the target application; more on that later.
IOMMU for Intel VT-d
The kernel must be built with the parameters
CONFIG_IOMMU_SUPPORT
CONFIG_IOMMU_API
CONFIG_INTEL_IOMMU
For the igb_uio driver, the following boot option must be set
iommu=pt
This results in correct translation of DMA addresses (DMA remapping). IOMMU support for the target network adapter is switched off in the hypervisor. IOMMU itself is rather wasteful for high-performance network interfaces. DPDK implements a one-to-one mapping, so full IOMMU support is not required, although this is yet another security hole.
If the INTEL_IOMMU_DEFAULT_ON flag was not set when the kernel was built, the following boot parameter must be used as well
intel_iommu=on
This guarantees correct initialization of the Intel IOMMU.
Note that using UIO (uio_pci_generic, igb_uio) is optional on kernels that support VFIO (vfio-pci), which implements the functions for interacting with the target network interfaces.
igb_uio is needed when the target network adapters lack support for some interrupts and/or virtual functions; otherwise uio_pci_generic can safely be used.
While the igb_uio driver requires the iommu=pt parameter, the vfio-pci driver works correctly with both iommu=pt and iommu=on.
VFIO itself behaves rather strangely because of the way IOMMU groups work: some devices require all of their ports to be bound to VFIO, others only some of them, and others need nothing bound at all.
If a device sits behind a PCI-to-PCI bridge, the bridge driver ends up in the same IOMMU group as the target adapter. The bridge driver must therefore be unloaded so that VFIO can pick up the devices behind the bridge.
You can check where the existing devices are and which drivers they use with the script
./tools/dpdk_nic_bind.py --status
You can also explicitly bind a driver to a specific network device
./tools/dpdk_nic_bind.py --bind=uio_pci_generic 04:00.1
./tools/dpdk_nic_bind.py --bind=uio_pci_generic eth1
Handy, though.
Installation
Get the sources and build them as described below.
DPDK itself ships with a set of sample applications that can be used to check that the system is configured correctly.
As mentioned above, DPDK is configured by setting parameters in the config/common_linuxapp and config/common_bsdapp files. The default values of platform-specific parameters are stored in the config/defconfig_* files.
First the configuration template is applied, which creates the build folder with all the innards and targets:
make config T=x86_64-native-linuxapp-gcc
The following target environments are available in DPDK 2.2 (on my machine)
arm-armv7a-linuxapp-gcc
arm64-armv8a-linuxapp-gcc
arm64-thunderx-linuxapp-gcc
arm64-xgene1-linuxapp-gcc
i686-native-linuxapp-gcc
i686-native-linuxapp-icc
ppc_64-power8-linuxapp-gcc
tile-tilegx-linuxapp-gcc
x86_64-ivshmem-linuxapp-gcc
x86_64-ivshmem-linuxapp-icc
x86_64-native-bsdapp-clang
x86_64-native-bsdapp-gcc
x86_64-native-linuxapp-clang
x86_64-native-linuxapp-gcc
x86_64-native-linuxapp-icc
x86_x32-native-linuxapp-gcc
ivshmem is a QEMU mechanism that lets several guest virtual machines share a memory region without copying, through a common specialized device. Communication between guest operating systems normally requires copying into shared memory, but this is not the case for DPDK. Ivshmem itself is implemented quite simply.
The purpose of the remaining configuration templates should be obvious; otherwise, why are you reading this at all?
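Besides the configuration template, there are other optional parameters
- EXTRA_CPPFLAGS - preprocessor flags
- EXTRA_CFLAGS - compiler flags
- EXTRA_LDFLAGS - linker flags
- EXTRA_LDLIBS - libraries to link
- RTE_KERNELDIR - path to the kernel sources
- CROSS - toolchain prefix for cross-compilation
- V=1 - verbose output
- D=1 - dependency debugging
- O - build directory, `build` by default
- DESTDIR - installation directory, `/usr/local` by default
Next comes the good old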
make
The list of make targets is fairly ordinary
all build clean install uninstall examples examples_clean
To work, the UIO modules must be loaded
sudo modprobe uio_pci_generic
or
sudo modprobe uio
sudo insmod kmod/igb_uio.ko
If VFIO is used
sudo modprobe vfio-pci
If KNI is used
insmod kmod/rte_kni.ko
Building and running the samples
DPDK uses two environment variables to build the examples:
- RTE_SDK - path to the folder where DPDK is installed
- RTE_TARGET - name of the configuration template used for the build
They are used in the corresponding Makefiles.
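Forgetting one of the two variables is a common way for an example build to fail, so a guard like the following can be run first. It is only a sketch: check_rte_env is a hypothetical helper, and the paths in the demonstration are placeholders, not required locations.

```shell
#!/bin/sh
# Hypothetical guard: succeed only when both RTE_SDK and RTE_TARGET are set.
check_rte_env() {
    if [ -z "${RTE_SDK:-}" ]; then
        echo "RTE_SDK is not set" >&2
        return 1
    fi
    if [ -z "${RTE_TARGET:-}" ]; then
        echo "RTE_TARGET is not set" >&2
        return 1
    fi
    echo "RTE_SDK=$RTE_SDK RTE_TARGET=$RTE_TARGET"
}

# Demonstration with placeholder values.
RTE_SDK="$HOME/src/dpdk"
RTE_TARGET=x86_64-native-linuxapp-gcc
export RTE_SDK RTE_TARGET
check_rte_env || echo "set both variables before running make"
```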
EAL already provides a number of command-line options for configuring the application:
- -c <mask> - hexadecimal mask of the logical cores on which the application will run
- -n <number> - number of memory channels per processor
- -b <domain:bus:identifier.function>,... - blacklist of PCI devices
- --use-device <domain:bus:identifier.function>,... - whitelist of PCI devices; cannot be used together with the blacklist
- --socket-mem MB - amount of memory allocated from large pages per processor socket
- -m MB - amount of memory allocated from large pages; the physical location of the processor is ignored
- -r <number> - number of memory ranks
- -v - version
- --huge-dir - folder where the large pages are mounted
- --file-prefix - prefix of the files stored in the large-page file system
- --proc-type - type of process instance; used together with --file-prefix to run the application as several processes
- --xen-dom0 - run in Xen domain0 without large page support
- --vmware-tsc-map - use the TSC counter provided by VMware instead of RDTSC
- --base-virtaddr - base virtual address
- --vfio-intr - type of interrupts used by VFIO
To find out how the cores are numbered in the system, you can use the lstopo command from the hwloc package.
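Once the core numbers are known, the hexadecimal mask for -c is just one bit set per core. A minimal sketch of the conversion, where coremask is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: turn a list of logical core numbers into the
# hexadecimal mask expected by the EAL -c option (bit N set for core N).
coremask() {
    mask=0
    for core in "$@"; do
        mask=$((mask | (1 << core)))
    done
    printf '%x\n' "$mask"
}

coremask 0 1 2 3       # prints f
coremask 2 4 6 8 10    # prints 554
```

The second call matches the isolated even-numbered cores from the isolcpus example, so an application pinned to them would be started with -c 554.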
It is advisable to use all the memory allocated as large pages; this is the default behavior when the -m and --socket-mem options are not used. Allocating less contiguous memory than is available in large pages may lead to EAL initialization errors and, in some cases, to undefined behavior.
To allocate 1 GB of memory
- on socket zero, specify --socket-mem=1024
- on the first socket, --socket-mem=0,1024
- on socket zero and the second socket, --socket-mem=1024,0,1024
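The value is positional: the Nth comma-separated number is the amount in MB for socket N-1. A small sketch that builds the option from per-socket amounts follows; socket_mem_arg is a hypothetical helper, not part of DPDK.

```shell
#!/bin/sh
# Hypothetical helper: build a --socket-mem option from per-socket amounts
# in MB, given in socket order (socket 0 first).
socket_mem_arg() {
    list=""
    for mb in "$@"; do
        list="${list:+$list,}$mb"
    done
    printf '%s\n' "--socket-mem=$list"
}

socket_mem_arg 1024          # prints --socket-mem=1024
socket_mem_arg 0 1024        # prints --socket-mem=0,1024
socket_mem_arg 1024 0 1024   # prints --socket-mem=1024,0,1024
```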
To build and run Hello World
export RTE_SDK=~/src/dpdk
cd ${RTE_SDK}/examples/helloworld
make
./build/helloworld -cf -n 2
The application will thus run on four cores, taking into account that two memory modules are installed.
And we get five Hello Worlds from different cores.
The chicken, the egg and the pterodactyl problem
I chose Java as the target platform because of the relatively high performance of the virtual machine and the possibility of introducing additional memory management mechanisms. The question of how to divide responsibilities, that is where to allocate memory, where to manage threads, how to schedule tasks and what is special about the DPDK mechanisms, is rather complicated and ambiguous. I had to dig quite deeply into the sources of DPDK, Netty and OpenJDK itself. As a result, special versions of netty components with very deep DPDK integration were developed.
To be continued.