For a long time, iFunny used Redshift as the database for events that occur in backend services and mobile applications. It was chosen because, at the time of implementation, there were by and large no alternatives comparable in cost and convenience.
However, everything changed after the public release of ClickHouse. We studied it for a long time, compared costs, sketched out an approximate architecture, and finally decided this summer to see how useful it would be for us. This article describes the problem Redshift helped us solve and how we moved that solution to ClickHouse.
The problem
iFunny needed a service similar to Yandex.Metrica, but exclusively for internal consumption. Let me explain why.
External clients write events. These can be mobile applications, websites, or internal backend services. It is very hard to explain to these clients that the event reception service is currently unavailable and that they should try again in 15 minutes or in an hour. There are many clients, they want to send events all the time, and they cannot wait at all.
In contrast, there are internal services and users who are quite tolerant in this respect: they can work correctly even when the analytics service is unavailable. Most product metrics and A/B test results only make sense to look at once a day, or even less often. The read requirements are therefore quite low. In the event of an accident or an update, we can afford to be unavailable or unreadable for several hours or even days (in especially neglected cases).
As for the numbers: we need to receive about 5 billion events (300 GB of compressed data) per day, keep the data for three months in a "hot" form accessible to SQL queries, and keep it for two years or more in a "cold" form that can be turned back into "hot" within a few days.
Basically, the data is a collection of events ordered by time. There are about 300 event types, each with its own set of properties. There is also data from third-party sources that has to be synchronized with the analytics database: for example, a collection of application installs from MongoDB or from the external AppsFlyer service.
It turns out that the database needs about 40 TB of disk, and "cold" storage needs about 250 TB more.
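The storage figures can be sanity-checked with a little arithmetic. The sketch below assumes 90 days for "three months", 730 days for "two years", and decimal terabytes; these roundings are our assumptions for illustration, not numbers from the original sizing.

```python
# Back-of-the-envelope storage estimate from the figures in the text.
# Assumptions (not from the article): 3 months = 90 days,
# 2 years = 730 days, 1 TB = 1000 GB.
GB_PER_DAY = 300   # compressed data received per day
HOT_DAYS = 90
COLD_DAYS = 730


def storage_tb(days: int, gb_per_day: int = GB_PER_DAY) -> float:
    """Return the compressed volume accumulated over `days`, in TB."""
    return days * gb_per_day / 1000


hot_tb = storage_tb(HOT_DAYS)    # raw compressed volume for the hot window
cold_tb = storage_tb(COLD_DAYS)  # raw compressed volume for the cold window

if __name__ == "__main__":
    print(f"hot: {hot_tb} TB, cold: {cold_tb} TB")
```

Both results come out below the roughly 40 TB and 250 TB quoted, which would be consistent with leaving headroom on top of the raw compressed volume, although the text does not break the figures down.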
The Redshift solution
So, there are mobile clients and backend services from which events have to be received. An HTTP service accepts the data, performs minimal validation, collects the events on local disk into files grouped by minute, immediately compresses them, and sends them to an S3 bucket. The availability of this service depends on the availability of the application servers and of AWS S3. The applications store no state, so they are easy to balance, scale, and replace. S3 is a relatively simple file storage service with a good reputation and good availability, so it can be relied on.
次ã«ãäœããã®æ¹æ³ã§ããŒã¿ãRedshiftã«é ä¿¡ããå¿ èŠããããŸãã ããã§ã¯ãã¹ãŠãéåžžã«ç°¡åã§ããRedshiftã«ã¯çµã¿èŸŒã¿ã®S3ã€ã³ããŒã¿ãŒãããããããããŒã¿ã®ããŒãã«æšå¥šãããæ¹æ³ã§ãã ãããã£ãŠã10åããšã«ãRedshiftã«æ¥ç¶ãã
s3://events-bucket/main/year=2018/month=10/day=14/10_3*
ãã¬ãã£ãã¯ã¹ã䜿çšããŠããŒã¿ãããŠã³ããŒãããããã«èŠæ±ããã¹ã¯ãªãããéå§ãããŸã
s3://events-bucket/main/year=2018/month=10/day=14/10_3*
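The example prefix appears to encode the hour and the tens digit of the minute, so that 10_3* would match files written from 10:30 to 10:39. Assuming that layout, which is inferred from the single example and not stated anywhere in the text, a loader could derive the prefix for each 10-minute window roughly like this. The function name is hypothetical.

```python
from datetime import datetime

# Bucket and path layout are copied from the example prefix in the text.
# Reading "10_3*" as "<hour>_<tens digit of minute>" is an assumption.
# Zero-padding of single-digit values cannot be inferred from the example,
# so none is applied here.
BASE = "s3://events-bucket/main"


def window_prefix(ts: datetime) -> str:
    """Return the S3 prefix covering the 10-minute window that contains ts."""
    return (
        f"{BASE}/year={ts.year}/month={ts.month}/day={ts.day}/"
        f"{ts.hour}_{ts.minute // 10}"
    )


if __name__ == "__main__":
    print(window_prefix(datetime(2018, 10, 14, 10, 34)))
```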
To track the status of load tasks we use Apache Airflow: it lets us retry an operation when errors occur and keeps a clear execution history, which matters when there are many such tasks. If something goes wrong, we can repeat the load for specific time intervals, or load "cold" data from S3 from a year ago.
In the same Airflow, in the same way and on a schedule, scripts run that connect to the database and either perform periodic loads from external repositories or build aggregations over events in the form
INSERT INTO ... SELECT ...
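Redshift's availability guarantees are weak. Once a week, for up to 30 minutes (the time window is specified in the settings), AWS may stop the cluster for an update or other scheduled work. If one node fails, the cluster also becomes unavailable until the host is restored. This usually takes about 15 minutes and happens roughly once every six months. In the current system this is not a problem: it was designed from the start on the assumption that the database would periodically be unavailable.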
For Redshift we used 4 ds2.8xlarge instances (36 CPU, 16 TB HDD), which gave us 64 TB of disk space in total.
The last point is backup. The backup schedule can be specified in the cluster settings, and it works fine.
Motivation for moving to ClickHouse
Of course, if there had been no problems, nobody would have thought about migrating to ClickHouse. But there were.
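If you compare the ClickHouse storage scheme with the MergeTree engine to that of Redshift, you can see that their ideology is very similar. Both databases are columnar, work well with a large number of columns, and compress data on disk very well (in Redshift you can also configure the compression type for each individual column). Even the data is stored the same way: it is sorted by primary key, which makes it possible to read only specific blocks and avoid keeping individual indexes in memory, and that matters when working with large volumes of data.
The essential difference, as always, is in the details.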
Daily tables
In Redshift, data on disk is sorted and actually deleted only when you run
VACUUM <tablename>
The vacuum process then works on all the data in that table. If you store all three months of data in one table, this takes an indecent amount of time, and it has to be run at least daily, because old data is deleted and new data is added. We had to create a separate table for each day and combine them through a view, which meant not only the trouble of rotating and maintaining that view but also slower queries. Judging by explain, every query scanned all the tables. And although scanning one table takes less than a second, with 90 of them any query takes at least a minute. That is not very convenient.
Duplicates
The next problem is duplicates. One way or another, when you send data over a network there are two options: lose data or receive duplicates. We could not afford to lose messages, so we simply accepted that some fraction of events would be duplicated. Duplicates for a day can be removed by creating a new table, inserting into it the data from the old table with the rows that have duplicate ids removed, dropping the old table, and renaming the new one. Since there was a view on top of the daily tables, we had to remember to drop it while the tables were being renamed. We also had to watch for locks; otherwise, if some query was locking the view or one of the tables, the process could drag on for a long time.
Monitoring and maintenance
No single request in Redshift takes less than a couple of seconds. Even if you just want to add a user or view the list of active queries, you have to wait tens of seconds. Of course, you can put up with that, and for this class of databases it is acceptable, but in the end it adds up to a lot of lost time.
Cost
By our calculations, deploying ClickHouse on AWS instances with exactly the same resources comes out at exactly half the price. Of course, that is how it should be: with Redshift you get a ready-made database that any PostgreSQL client can connect to right after a few clicks in the AWS console, and AWS does the rest for you. But is it worth it? We already have the infrastructure, we know how to do backups, monitoring, and configuration, and we already do this for a bunch of internal services. Why not take on ClickHouse support as well?
移è¡ããã»ã¹
æåã«ã1å°ã®ãã·ã³ããå°ããªClickHouseãã€ã³ã¹ããŒã«ããçµã¿èŸŒã¿ããŒã«ã䜿çšããŠãS3ããããŒã¿ãããŠã³ããŒãããããšãå®æçã«éå§ããŸããã ãããã£ãŠãClickHouseã®é床ãšæ©èœã«é¢ããæ³å®ããã¹ãããããšãã§ããŸããã
ããŒã¿ã®å°ããªã³ããŒã§æ°é±éãã¹ãããåŸãRedshiftãClickhouseã«çœ®ãæããã«ã¯ãããã€ãã®åé¡ã解決ããå¿ èŠãããããšãæããã«ãªããŸããã
- What types of instances and disks should we deploy on?
- Should we use replication?
- How do we install, configure, and run it?
- How do we monitor it?
- What will the schema be?
- How do we deliver data from S3?
- How do we rewrite all the queries from standard SQL to non-standard SQL?
Instance and disk types. For the number of processors, disks, and memory, we decided to build on the current Redshift installation. There were several options, including i3 instances with local NVMe disks, but we settled on r5.4xlarge with storage in the form of an 8 TB ST1 EBS volume on each instance. By our estimates, this should give performance comparable to Redshift at half the cost. At the same time, using EBS disks gives us simple backup and recovery through disk snapshots, almost the same as in Redshift.
Replication. Since we started from what we already had in Redshift, we decided not to use replication. It also means we do not have to dive into ZooKeeper right away, which is not yet part of our infrastructure, though it is great that replication can now be added on demand.
Installation. This is the easiest part. A fairly small Ansible role installs the ready-made RPM packages and applies the same configuration on every host.
Monitoring. We monitor all services with Prometheus together with Telegraf and Grafana, so we simply placed Telegraf agents on the ClickHouse hosts and put together a Grafana dashboard showing the current server load by processor, memory, and disk. Through a Grafana plugin, we added the cluster's currently active queries, the status of imports from S3, and other useful things to the same dashboard. It turned out even better and more informative (and much faster) than the dashboard the AWS console provides.
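Schema. One of our biggest mistakes in Redshift was to put only the main event fields into separate columns and to pile the rarely used fields into one big column, properties. On the one hand, this gave us the flexibility to change fields in the early stages, when we did not yet know exactly which events we would collect and with which properties, and the properties changed 5 times a day. On the other hand, queries against the big properties column took more and more time. In ClickHouse we decided to do it right from the start, so we collected all the possible columns and assigned the optimal type to each. The result is a table with about 200 columns.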
次ã®ã¿ã¹ã¯ã¯ãã¹ãã¬ãŒãžãšããŒãã£ã·ã§ã³åã«é©åãªãšã³ãžã³ãéžæããããšã§ããã
圌ãã¯åã³ããŒãã£ã·ã§ã³åã«ã€ããŠã¯èããŸããã§ããããRedshiftã§è¡ã£ãã®ãšåãããšãããŸãã-æ¯æ¥ã®ããŒãã£ã·ã§ã³ã§ãããçŸåšã¯ãã¹ãŠã®ããŒãã£ã·ã§ã³ã1ã€ã®ããŒãã«ã«ãªã£ãŠããŸãã
ãªã¯ãšã¹ããå€§å¹ ã«é«éåããã¡ã³ããã³ã¹ãç°¡çŽ åããŸãã ã¹ãã¬ãŒãžãšã³ãžã³ã¯ãåã«OPTIMIZE ... FINALãå®è¡ããããšã§ãç¹å®ã®ããŒãã£ã·ã§ã³ããéè€ãåé€ã§ãããããReplacingMergeTreeã«ãã£ãŠååŸãããŸããã ããã«ããšã©ãŒãäºæ ãçºçããå Žåãæ¯æ¥ã®ããŒãã£ã·ã§ã³ã¹ããŒã ã«ããã1ãæã®ããŒã¿ã§ã¯ãªã1æ¥ã®ããŒã¿ã®ã¿ãåŠçã§ããŸãã
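As a rough illustration of this setup, here is a minimal sketch of a table definition and of the statement that deduplicates a single day. Only the engine, the daily partitioning, and OPTIMIZE ... FINAL come from the text; the table name, the three columns, and the sorting key are hypothetical (the real table has about 200 columns).

```python
from datetime import date

# Hypothetical table and column names. ReplacingMergeTree collapses rows
# that share the same ORDER BY key when parts are merged, so the event id
# has to be part of that key for duplicates to be removed.
CREATE_EVENTS = """
CREATE TABLE events
(
    id FixedString(16),
    event_time DateTime,
    event_type String
)
ENGINE = ReplacingMergeTree()
PARTITION BY toYYYYMMDD(event_time)
ORDER BY (event_type, event_time, id)
""".strip()


def optimize_day(day: date, table: str = "events") -> str:
    """Build the statement that forces a final merge of one daily partition."""
    return f"OPTIMIZE TABLE {table} PARTITION {day:%Y%m%d} FINAL"


if __name__ == "__main__":
    print(CREATE_EVENTS)
    print(optimize_day(date(2018, 10, 14)))
```

Compared with the rebuild-and-swap approach needed in Redshift, the deduplication step shrinks to a single statement scoped to one partition.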
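Delivering data from S3 to ClickHouse. This was one of the longest processes. Loading with the built-in ClickHouse tools simply did not work, because the data in S3 is JSON and each field has to be extracted by its own jsonpath, as we did in Redshift. Sometimes a transformation is needed as well: for example, a UUID such as
DD96C92F-3F4D-44C6-BCD3-E25EB26389E9
is converted to bytes and stored in a FixedString(16) column.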
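The UUID transformation is easy to show with Python's standard library. This is a minimal sketch; the function name is ours, and the byte order (big-endian, as produced by uuid.UUID.bytes) is an assumption, since the text does not say which order the importer uses.

```python
import uuid


def uuid_to_fixed16(text: str) -> bytes:
    """Convert a textual UUID into 16 raw bytes, as FixedString(16) expects."""
    # .bytes yields the 128-bit value in big-endian order (an assumption here).
    return uuid.UUID(text).bytes


raw = uuid_to_fixed16("DD96C92F-3F4D-44C6-BCD3-E25EB26389E9")

if __name__ == "__main__":
    print(len(raw), raw.hex())
```

Storing 16 bytes in place of the 36-character text form more than halves the size of the column before compression.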
We wanted a dedicated service similar to the COPY command in Redshift. Nothing ready-made was available, so I had to write it. How it works could fill a separate article, but in short, it is an HTTP service deployed on every host with ClickHouse, and you can address any of them. The request parameters specify the S3 prefix to take files from, the list of jsonpaths for converting JSON into a set of columns, and the set of transformations for each column. The server that receives the request starts scanning the files on S3 and distributes the parsing work to the other hosts. It was also important to us that rows that could not be imported were added, together with the error, to a separate ClickHouse table. This helps a great deal in investigating problems and bugs in the event reception service and in the clients that generate the events. By placing the importer directly on the database hosts, we made use of resources that are mostly idle, since complex queries do not arrive there around the clock. Of course, if the query load grows, the importer services can always be moved to separate hosts.
There were no major problems with importing data from external sources. In the scripts we already had, we simply changed the destination from Redshift to ClickHouse.
There was an option to connect MongoDB as a dictionary instead of making daily copies. Unfortunately, it did not fit, because a dictionary has to fit in memory, and the size of most of our MongoDB collections does not allow that. But dictionaries proved useful anyway: with them it is very convenient to connect GeoIP databases from MaxMind and use them in queries. For this we use the ip_trie layout and the CSV files provided by the service. For example, the configuration of the geoip_asn_blocks_ipv4 dictionary looks like this:
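The parsing step of such an importer might look roughly like the sketch below. It is not the actual service: dotted paths stand in for real jsonpath expressions to keep the example dependency-free, and the column spec format, function names, and sample events are all hypothetical. What it does illustrate is the idea of keeping rejected rows together with their errors.

```python
import json
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical column spec: column name -> (dotted path, transform).
# Real jsonpath is richer; dotted paths keep the sketch self-contained.
Spec = Dict[str, Tuple[str, Callable[[Any], Any]]]

SPEC: Spec = {"user_id": ("user.id", int), "name": ("event", str)}


def extract(doc: Any, path: str) -> Any:
    """Follow a dotted path such as 'user.id' through nested dicts."""
    for key in path.split("."):
        doc = doc[key]
    return doc


def parse_lines(lines: List[str], spec: Spec):
    """Split raw JSON lines into importable rows and (line, error) rejects."""
    rows, rejects = [], []
    for line in lines:
        try:
            doc = json.loads(line)
            rows.append(
                {col: fn(extract(doc, path)) for col, (path, fn) in spec.items()}
            )
        except (ValueError, KeyError, TypeError) as exc:
            # Rejected lines are kept with the error text so they can be
            # written to a separate table for later investigation.
            rejects.append((line, f"{type(exc).__name__}: {exc}"))
    return rows, rejects


if __name__ == "__main__":
    sample = ['{"event": "open", "user": {"id": "7"}}', '{"event": "open"}', "oops"]
    print(parse_lines(sample, SPEC))
```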
<dictionaries>
  <dictionary>
    <name>geoip_asn_blocks_ipv4</name>
    <source>
      <file>
        <path>GeoLite2-ASN-Blocks-IPv4.csv</path>
        <format>CSVWithNames</format>
      </file>
    </source>
    <lifetime>300</lifetime>
    <layout>
      <ip_trie />
    </layout>
    <structure>
      <key>
        <attribute>
          <name>prefix</name>
          <type>String</type>
        </attribute>
      </key>
      <attribute>
        <name>autonomous_system_number</name>
        <type>UInt32</type>
        <null_value>0</null_value>
      </attribute>
      <attribute>
        <name>autonomous_system_organization</name>
        <type>String</type>
        <null_value>?</null_value>
      </attribute>
    </structure>
  </dictionary>
</dictionaries>
It is enough to put this configuration in
/etc/clickhouse-server/geoip_asn_blocks_ipv4_dictionary.xml
after which you can query the dictionary to get the name of the provider by IP address:
SELECT dictGetString('geoip_asn_blocks_ipv4', 'autonomous_system_organization', tuple(IPv4StringToNum('192.168.1.1')));
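To make the lookup semantics concrete, here is a small model of what such a query resolves, assuming that ip_trie returns the value of the most specific prefix containing the address. The sketch uses a linear scan in place of a trie, and the sample networks and organization names are made up (reserved documentation ranges, not MaxMind data).

```python
import ipaddress
from typing import Dict, Optional

# Made-up sample rows in the shape of the dictionary: prefix -> organization.
# The networks are reserved documentation ranges, not real GeoIP data.
BLOCKS: Dict[str, str] = {
    "203.0.113.0/24": "Example ISP",
    "203.0.113.128/25": "Example ISP, branch",
}


def lookup(ip: str, blocks: Dict[str, str] = BLOCKS) -> Optional[str]:
    """Return the value of the most specific prefix containing ip, if any."""
    addr = ipaddress.ip_address(ip)
    best = None  # (prefix length, organization) of the best match so far
    for prefix, org in blocks.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, org)
    return best[1] if best else None


if __name__ == "__main__":
    for ip in ("203.0.113.5", "203.0.113.200", "198.51.100.1"):
        print(ip, lookup(ip))
```

An address outside every block returns None here; in the dictionary above, the null_value of the attribute plays that role.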
Changing the data schema. As mentioned above, we decided not to use replication for now, since we can afford to be unavailable during accidents or planned work, and a copy of the data already sits on S3 and can be transferred to ClickHouse in a reasonable time. With no replication there was no need to deploy ZooKeeper, and without ZooKeeper the ON CLUSTER clause cannot be used in DDL queries. We solved this with a small Python script that connects to each ClickHouse host (there are only eight of them so far) and executes the specified SQL query.
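A script of this kind could look like the sketch below. It is not the script we actually use: the host names are hypothetical, and sending the statement through ClickHouse's HTTP interface on its default port 8123, without authentication, is an assumption made to keep the example free of third-party packages.

```python
from typing import Callable, Dict, Iterable
from urllib import request

# Hypothetical host names; 8123 is ClickHouse's default HTTP port.
HOSTS = [f"clickhouse-{i}.internal" for i in range(1, 9)]


def send_http(host: str, sql: str, timeout: float = 30.0) -> str:
    """POST one statement to a host's ClickHouse HTTP interface."""
    req = request.Request(f"http://{host}:8123/", data=sql.encode("utf-8"))
    with request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def run_everywhere(
    sql: str,
    hosts: Iterable[str] = HOSTS,
    send: Callable[[str, str], str] = send_http,
) -> Dict[str, str]:
    """Run sql on every host in turn and collect the responses."""
    results = {}
    for host in hosts:
        # An exception raised here stops the loop, so a failed DDL
        # statement is not silently applied to only part of the cluster.
        results[host] = send(host, sql)
    return results
```

The send parameter exists so the loop can be exercised without a cluster, for example run_everywhere("SELECT 1", hosts=["a", "b"], send=lambda host, sql: "ok").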
Incomplete SQL support in ClickHouse. Translating queries from Redshift syntax to ClickHouse syntax went on in parallel with the development of the importer and was handled mainly by the analyst team. Oddly enough, the problem was not even JOIN but window functions. It took several days to understand how to implement them with arrays and lambda functions. Fortunately, this question is often covered in talks about ClickHouse, of which there are a great many, for example events.yandex.ru/lib/talks/5420. By this point the data was already being written to two places at once, Redshift and the new ClickHouse, so we compared the results as we translated each query. Comparing speed was problematic, because we had removed the one big properties column and most queries now touched only the columns they needed, which naturally made them much faster. Queries that did not involve the properties column ran at the same speed or slightly faster.
As a result, we ended up with the following scheme:
Results
In the end, we got the following benefits:
- One table instead of 90
- Service queries execute in milliseconds
- The cost has been halved
- Duplicate events are easy to remove
There are also drawbacks that we are prepared for:
- In the event of an accident, we have to repair the cluster ourselves
- Schema changes have to be made on each host separately
- Upgrades to new versions have to be done ourselves
We cannot compare query speed head-on, because the data schema has changed significantly. Many queries became faster simply because they read less data from disk. Strictly speaking, this change should have been made back in Redshift, but we decided to combine it with the migration to ClickHouse.
The whole migration, including preparation, took about three months. It ran from the beginning of July to the end of September and required two people. On September 27 we turned Redshift off, and since then we have been working on ClickHouse alone. That is already a little over two months. It is a short period, but so far we have not run into data loss or a critical bug that would bring down the whole cluster. Upgrades to new versions still lie ahead of us!