Three years ago, Alexander described how Badoo built a scalable, near-real-time event processing system. Since then the system has evolved, volumes have grown, scaling and fault-tolerance problems had to be solved, and at some point a radical measure became necessary: changing the technology stack.
From this transcript you will learn how Badoo replaced the Spark + Hadoop bundle with ClickHouse, cut hardware by a factor of three while the load grew sixfold, why and how to collect statistics in a project, and what to do with that data afterwards.
About the speaker: Alexander Krasheninnikov (alexkrash) is Head of Data Engineering at Badoo. He works on BI infrastructure and on scaling it for the workload, and leads the teams that build the data processing infrastructure. He loves everything distributed: Hadoop, Spark, ClickHouse. He is convinced that cool distributed systems can be built from open-source components.
Collecting statistics
Without data we are blind and cannot manage the project. That is why we need statistics: to monitor the project's viability. As engineers we should strive to improve our products, and if you want to improve something, measure it. That is my motto at work. First and foremost, our goal is business value, and statistics answer business questions. Technical metrics are technical metrics, but the business cares about its own indicators too, and those must be taken into account.
The statistics life cycle
I define the life cycle of statistics in four phases, and we will go through each of them separately.
Define phase: formalization
In the application we collect several kinds of metrics. First come business metrics: if you run a photo service, for example, you want to know how many photos are uploaded per day, per hour, per second. Next come "semi-technical" metrics: the responsiveness of the mobile application or site, API performance, how quickly users interact with the site, application installs, UX. The third important kind is user behavior tracking, the territory of systems such as Google Analytics and Yandex.Metrica. We have our own cool tracking system, and we invest heavily in it.
Many people are involved in working with statistics: developers and business analysts. It is important that everyone speaks the same language, so we need to agree on terms.
You can agree verbally, but it is much better when this is done formally, as an explicit event structure.
When developers formalize the structure of business events, the analyst understands that "registrations" means not just a total count but also country, gender, and other parameters. All of this information is formalized and available to everyone in the company. An event has a typed structure and a formal description. For example, we store these descriptions in Protocol Buffers format.
Description of the Registration event:
    enum Gender {
        FEMALE = 1;
        MALE = 2;
    }

    message Registration {
        required int32 user_id = 1;
        required Gender user_gender = 2;
        required int32 time = 3;
        required int32 country_id = 4;
    }
The registration event carries information about the user, their gender, the time of the event, and the user's country of registration. This information is available to analysts, so the business understands what we actually collect.
Why do we need a formal description?
A formal description is common ground for developers, analysts, and the product department. This information also runs through the description of the application's business logic. For example, we have an internal system for describing business processes, and it contains a screen for each new feature.
The product requirements document has a section stating that when the user interacts with the application in a given way, we must send an event with exactly these parameters. Later we can validate how well the feature works and that we measured it correctly. The formal description also tells us how to store this data in a database, whether NoSQL, SQL, or something else. We have a data schema, and that is great.
Some analytics systems offered as a service hold only 10 to 15 events in their storage. We have more than 1000, and the number keeps growing. Living without a single registry would be impossible.
Define phase: summary
We decided that statistics matter and described our subject domain. That is a good start.
Collect phase: gathering data
We decided to build the system so that when a business event occurs (a registration, a message sent, and so on), a separate statistics event is sent at the same moment the information is saved.
In code, the statistics event is sent together with the business event.
It is processed completely independently of the data stores the application runs on, because the data flows through a separate processing pipeline.
Description via EDL:
    enum Gender {
        FEMALE = 1;
        MALE = 2;
    }

    message Registration {
        required int32 user_id = 1;
        required Gender user_gender = 2;
        required int32 time = 3;
        required int32 country_id = 4;
    }
We have a description of the registration event. An API is generated from it automatically and is available to developers from code, so sending statistics takes four lines.
EDL-based API:
    \EDL\Event\Registration::create()
        ->setUserId(100500)
        ->setGender(Gender::MALE)
        ->setTime(time())
        ->send();
Event delivery
This is an external system. We do it this way because we have all sorts of great services exposing APIs for, say, working with photo data, and each of them stores its data in cool newfangled databases such as Aerospike or CockroachDB.
When we need to build a report, we do not have to go around begging, "guys, how many of this do you have, and how many of that?" All the data is sent through a separate flow. The processing pipeline is an external system: we detach all the data from the application context and the business-logic storage and send it on to a separate pipeline.
The Collect phase assumes that application servers are available. In our case that is PHP.
Transport
This is the subsystem that carries what we produced in the application context into the other pipeline. The transport is chosen purely from your requirements, depending on the situation in your project.
A transport has several characteristics, and the first is its delivery guarantee. Whether you need at-least-once or exactly-once delivery depends on how important the data is. For a billing system, for example, statistics that show more transactions than actually took place are unacceptable: this is money, so it must not happen.
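To make the difference between the guarantees concrete, here is a minimal Python sketch of how a consumer could tolerate at-least-once delivery by deduplicating on an event identifier. The `event_id` field and the `DedupingConsumer` class are assumptions made for illustration; the talk does not say that Badoo's events carry such an id or that deduplication is done this way.

```python
from dataclasses import dataclass, field


@dataclass
class DedupingConsumer:
    """Counts each event once even if the transport redelivers it."""

    seen_ids: set = field(default_factory=set)
    total: int = 0

    def handle(self, event_id: str, amount: int) -> bool:
        # A redelivered event carries the same (hypothetical) id; skip it.
        if event_id in self.seen_ids:
            return False
        self.seen_ids.add(event_id)
        self.total += amount
        return True


consumer = DedupingConsumer()
# "tx-2" arrives twice, as an at-least-once transport may deliver it.
for event_id, amount in [("tx-1", 10), ("tx-2", 5), ("tx-2", 5)]:
    consumer.handle(event_id, amount)
```

Without the `seen_ids` check the duplicate would inflate the total, which is exactly the failure a billing report cannot afford.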
The second parameter is bindings for programming languages. We have to interact with the transport somehow, so it is chosen to fit the language the project is written in.
The third parameter is scalability. Since we are talking about millions of events per second, it pays to keep future scalability in mind.
There are many transport options: an RDBMS, Flume, Kafka, or LSD. We use LSD; this is our own special way.
Live Streaming Daemon
LSD has nothing to do with prohibited substances. It is a lively, very fast streaming daemon that does not provide any agent for writing to it. It is tunable and integrates with other systems such as HDFS and Kafka, so the data being sent can be rerouted. LSD makes no network call on INSERT, and you can control the network topology in it.
Most importantly, it is open source from Badoo, so there is no reason not to trust this software.
If it were a perfect daemon, every conference would be discussing LSD instead of Kafka, but every LSD has its fly in the ointment. It has limitations that we can live with: LSD does not support replication, and its delivery guarantee is at-least-once. It is also not the most suitable transport for monetary transactions, but money should in any case be handled only through "acid" databases, the ones that support ACID.
Collect phase: summary
To recap the previous episodes: we obtained a formal description of the data, generated from it a convenient API for developers to send events, and worked out how to move this data from the application context into a separate pipeline. Not bad already, and we move on to the next phase.
Process phase: data processing
We have collected data from registrations, uploaded photos, votes; what do we do with all of it? From this data we want charts with a long history, plus the raw data. Everyone understands charts: you do not have to be a developer to see from a curve that the company's revenue is growing. We use raw data for online reporting and ad hoc analysis. In more complex cases, analysts want to run analytical queries against it. We need both capabilities.
Charts
Charts come in many shapes.
Or, for example, a chart with history that shows data for 10 years.
Charts can also look like this.
This is the result of an A/B test, and it looks surprisingly like the Chrysler Building in New York.
There are two ways to draw a chart: a query over raw data, or time series. Both approaches have drawbacks and advantages, which we will not go into. We use a hybrid approach: we keep a short tail of raw data for operational reporting and time series for long-term storage. The second is computed from the first.
How we grew to 1.8 million events per second
It is a long story: millions of RPS do not happen in a day. Badoo is a company with a ten-year history, and you could say the data processing system grew up with the company.
At first we had nothing. We started collecting data and got to 5,000 events per second. One MySQL host and nothing else! Any relational DBMS copes with this task, and it is comfortable to work with: it is transactional, you put data in, run queries against it, and everything works fine. We lived like that for a while.
At some point functional sharding arrived: registration data goes here, photo data goes there. That took us to 200,000 events per second, and we began using various combined approaches: storing not raw data but aggregates, though still inside a relational database. We stored counters, but the catch is that you can no longer run a DISTINCT query over such data: the algebraic model of counters does not allow DISTINCT to be computed.
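The limitation of counters can be seen on a toy example. The Python sketch below uses made-up dates and user ids (all values are hypothetical) to show why per-day unique counters cannot be combined into a multi-day DISTINCT.

```python
# Hypothetical per-day data: which users were active on each day.
active_users = {
    "2019-01-01": {1, 2, 3},
    "2019-01-02": {2, 3, 4},
}

# What a counter-based store keeps: one number per day.
daily_unique_counters = {day: len(users) for day, users in active_users.items()}

# Adding the counters double-counts users 2 and 3.
sum_of_counters = sum(daily_unique_counters.values())

# The true two-day DISTINCT needs the underlying ids.
true_distinct = len(set().union(*active_users.values()))
```

Once only the numbers 3 and 3 are stored, nothing can tell whether the two-day answer is 3, 4, 5, or 6; the information needed for DISTINCT is gone.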
Badoo's motto is "Unstoppable force". We did not stop and kept growing. When we crossed the threshold of 200,000 events per second, we decided to create the formal description discussed above. Before that there was chaos; now we had a structured registry of events. We started scaling the system by plugging in Hadoop, and all the data landed in Hive tables.
Hadoop is a huge software suite with a file system. For distributed computing, Hadoop says: "Put your data here, and I will let you run analytical queries over it." So we did: we set up a regular computation of all the charts, and it worked well. But charts are valuable when they update quickly; watching a chart refresh once a day is not much fun. If we roll out something to production that causes a fatal error, we want to see the chart drop immediately, not a day later. So after a while the whole system started to degrade. Still, at this stage we realized that we could stick with the chosen technology stack.
Java was new to us, but we liked it, and we understood what could be done differently.
Between 400,000 and 800,000 events per second, we replaced Hadoop in its pure form, with Hive as the executor of analytical queries, by Spark Streaming, and wrote a generic map/reduce with incremental computation of metrics. Three years ago I talked about how we did it. Back then it seemed that Spark would live forever, but life decided otherwise: we hit the limits of Hadoop. Perhaps under different conditions we would have stayed with Hadoop.
Another problem was that, besides the chart computations, Hadoop was also running incredible four-story SQL queries launched by analysts, so the charts were not updated quickly. Making operational data processing real-time, fast, and cool is in fact rather tricky work.
Badoo is served by two data centers on opposite sides of the Atlantic, in Europe and in North America. To build unified reporting, data has to be sent from America to Europe; all the statistics are kept in the European data center because it has more computing power. The round trip between the data centers is about 200 ms, and the network is rather delicate: a request to the other DC is not the same as a trip to the next rack.
Once we formalized events and developers and product managers got involved, everyone liked it, and the number of events grew explosively. It was time to buy more hardware for the cluster, which we really did not want to do.
When we passed the peak of 800,000 events per second, we learned that Yandex had released ClickHouse as open source and decided to try it. We collected a cartload of bumps while trying to make it work, and when everything finally did, we threw a small party for the first million events. The talk could probably have ended right there with ClickHouse.
Just take ClickHouse and live with it.
But that would be boring, so we continue with data processing.
ClickHouse
ClickHouse has been the hype of the past two years and needs no introduction: at HighLoad++ 2018 alone I remember about five talks on it, plus workshops and meetups.
The tool is designed to solve exactly the tasks we had set ourselves. It offers real-time updates plus the features we had come to know from Hadoop: replication and sharding. There was no reason not to try ClickHouse, because we understood that our Hadoop implementation had already hit rock bottom. The tool is cool and the documentation is fire; I have contributed to it myself, I really like all of it, everything is great. But we had a number of questions to answer.
How do we move the whole event flow into ClickHouse? How do we combine data from two data centers? Just because we came to the admins and said "Guys, let's install ClickHouse", they will not make the network twice as thick or the latency half as long. No, the network is still as thin and small as a first paycheck.
How do we store the results? With Hadoop we knew how to draw charts, but how do we do it on magical ClickHouse? A magic wand is not included. How do we deliver the results to the time series storage?
As my lecturer at the institute used to say, let us consider three data schemes: strategic, logical, and physical.
Strategic storage scheme
We have two data centers. We learned that ClickHouse knows nothing about DCs, so we simply stood up a cluster in each DC. Now data no longer travels across the transatlantic cable: everything that happens in a DC is stored locally in its cluster. When we want to query the combined data, for example to find out how many registrations there were in both DCs, ClickHouse gives us that ability. Low latency and availability for queries: a masterpiece!
Physical storage scheme
Questions again: how does our data map onto the ClickHouse relational model, and what do we do so as not to lose replication and sharding? The ClickHouse documentation covers all of this in detail, and if you have more than one server you will come across that article. So we will not go deeper into what is already in the manual: replication, sharding, and queries over all the data on the shards.
Logical storage scheme
The logical scheme is the most interesting one. We process heterogeneous events in a single pipeline. That means a stream of heterogeneous events: registrations, votes, photo uploads, technical metrics, user behavior tracking. All of these events have completely different attributes. If I viewed a screen on a mobile phone, I need the screen ID; if I voted for someone, I need to know whether the vote was for or against. All these events have different attributes and different charts are drawn from them, yet all of them must be processed in a single pipeline. How do we fit that into the ClickHouse model?
Approach No. 1: a table per event. We extrapolated this first approach from our experience with MySQL: we created a table in ClickHouse for each event. It sounds logical enough, but we ran into a number of difficulties.
We place no restrictions on how an event's structure may change by the time today's build is released. Any developer can make such a patch, and the schema is mutable in every direction. The only required fields are the event timestamp and what the event was. Everything else changes on the fly, and the tables have to be altered accordingly. ClickHouse can run ALTER on a cluster, but it is a delicate procedure that is hard to automate so that it works smoothly. So this is a minus.
We have more than a thousand different events, which gives a high INSERT rate per machine: we are constantly writing data into a thousand tables. For ClickHouse this is an anti-pattern. If Pepsi's slogan is "Live in big gulps", ClickHouse's is "Live in big batches". If you do not, replication chokes and ClickHouse refuses to accept new inserts; an unpleasant scheme.
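The "big batches" advice can be sketched as a small in-memory buffer that groups rows before handing them to whatever performs the actual INSERT. The `BatchBuffer` class, its `max_rows` parameter, and the `inserts` list standing in for real insert calls are all hypothetical and only illustrate the idea; they are not part of ClickHouse or of Badoo's code.

```python
from typing import Callable, List


class BatchBuffer:
    """Collects rows and passes them on in batches instead of one by one."""

    def __init__(self, flush: Callable[[List[dict]], None], max_rows: int):
        self._flush = flush
        self._max_rows = max_rows
        self._rows: List[dict] = []

    def add(self, row: dict) -> None:
        self._rows.append(row)
        if len(self._rows) >= self._max_rows:
            self.flush()

    def flush(self) -> None:
        # Hand over the current batch and start a fresh one.
        if self._rows:
            self._flush(self._rows)
            self._rows = []


inserts: List[List[dict]] = []  # stands in for real INSERT calls
buffer = BatchBuffer(inserts.append, max_rows=3)
for i in range(7):
    buffer.add({"event": "registration", "user_id": i})
buffer.flush()  # push the incomplete tail
```

Seven rows result in three insert calls rather than seven. A production buffer would usually also flush on a timer so that rare events do not wait indefinitely.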
Approach No. 2: a wide table. Like the Siberian men in the joke who slipped a rail under the chainsaw, we tried another data model. We create a table with a thousand columns, where each event has columns reserved for its data. We get a huge sparse table; fortunately it never left the development environment, because from the very first inserts it was clear that the scheme was absolutely bad.
But I still wanted to use such a cool software product; it only needed a little more work to become exactly what we needed.
Approach No. 3: a generic table. We have one huge table in which we store data in arrays, since ClickHouse supports non-scalar data types. That is, we create one column holding the attribute names and a separate column holding an array of the attribute values.
ClickHouse does its job very well here. If we only needed to insert data, we could probably squeeze another 10x out of the current installation.
The fly in the ointment is that storing arrays of strings is also an anti-pattern for ClickHouse. It is bad because string arrays take more disk space, compress worse than simple columns, and are harder to process. For our task, however, we turn a blind eye to that, because the advantages outweigh it.
How do we SELECT from such a table? Suppose the task is to count registrations grouped by gender. First we find the position in one array that corresponds to the gender attribute, then we use this index into the other column to fetch the value.
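The two-step lookup described above can be mimicked in a few lines of Python. The row layout, the column names `names` and `values`, and the sample data are assumptions for illustration. In ClickHouse itself the same idea would presumably be written with the `indexOf` array function, along the lines of `values[indexOf(names, 'gender')]`, but the talk does not show the exact expression.

```python
from collections import Counter

# Hypothetical rows of the generic table: parallel arrays of names and values.
rows = [
    {"event": "registration", "names": ["user_id", "gender"], "values": ["1", "MALE"]},
    {"event": "registration", "names": ["gender", "user_id"], "values": ["FEMALE", "2"]},
    {"event": "registration", "names": ["user_id", "gender"], "values": ["3", "MALE"]},
    {"event": "vote", "names": ["user_id", "vote"], "values": ["1", "yes"]},
]


def attribute(row: dict, name: str):
    """Return the value stored at the same position as `name`, or None."""
    if name not in row["names"]:
        return None
    return row["values"][row["names"].index(name)]


# Count registrations grouped by gender.
registrations_by_gender = Counter(
    attribute(row, "gender") for row in rows if row["event"] == "registration"
)
```

Because the position is looked up per row, the attributes may appear in any order, which is what lets a single table hold events with different structures.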
How to draw charts from this data
Since all events are described and have a strict structure, for each event type we generate a four-story SQL query, execute it, and save the result into another table.
The problem is that to draw two neighboring points on a chart we have to scan the whole table. Example: we look at registrations for a day. The events span from the topmost row to the second-to-last one. We scanned once; fine. Five minutes later we want to draw a new point on the chart, so we again scan a data range that overlaps the previous scan, and so on for every event. It sounds logical, but it does not look healthy.
Moreover, when we take a row, we need to compute results for several aggregations. For example, there is the fact that some person registered in Scandinavia and was male, and we need summary statistics: how many registrations in total, how many by men, how many of those men were from Norway. In analytical database terms this is called ROLLUP, CUBE, and GROUPING SETS: turning one row into several.
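"Turning one row into several" can be illustrated with a ROLLUP-style expansion in Python: each fact contributes to the grand total, to its gender subtotal, and to its gender-plus-country cell. The `rollup_keys` helper and the sample facts are hypothetical; this is a sketch of the concept, not of how ClickHouse or Badoo's pipeline implements it.

```python
from collections import Counter


def rollup_keys(row: dict, dimensions: list) -> list:
    """ROLLUP-style expansion: one key per prefix of `dimensions`."""
    values = [row[d] for d in dimensions]
    return [tuple(values[:n]) for n in range(len(values) + 1)]


# Hypothetical registration facts.
facts = [
    {"gender": "MALE", "country": "NO"},
    {"gender": "MALE", "country": "SE"},
    {"gender": "FEMALE", "country": "NO"},
]

counts = Counter()
for fact in facts:
    # Each input row increments several aggregates at once.
    for key in rollup_keys(fact, ["gender", "country"]):
        counts[key] += 1
```

Here `counts[()]` is the total number of registrations, `counts[("MALE",)]` the number of men, and `counts[("MALE", "NO")]` the number of men from Norway. CUBE would expand every subset of the dimensions instead of only the prefixes.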
How to cure it
Fortunately, ClickHouse has a tool for exactly this problem: the serialized state of aggregate functions. It means that you can scan a piece of data once and save the result. This is a killer feature. Three years ago we did exactly this on Spark and Hadoop, and it is great that, in parallel with us, the best minds at Yandex implemented the equivalent in ClickHouse.
A slow query
Suppose we have a slow query: count the unique users for today and yesterday.
SELECT uniq(user_id) FROM table WHERE dt IN (today(), yesterday())
Physically, we can run a SELECT that produces the state for yesterday, obtain its binary representation, and save it somewhere.
SELECT uniq(user_id), 'xxx' AS ts, uniqState(user_id) AS state FROM table WHERE dt = yesterday()
For today we only change the condition so that it selects today:
'yyy' AS ts
and
WHERE dt = today()
We label these two states with the timestamps 'xxx' and 'yyy'.
SELECT uniqMerge(state) FROM ageagate_table WHERE ts IN ('xxx', 'yyy')
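The idea behind saved aggregate states can be sketched in Python with exact sets standing in for ClickHouse's internal state. This is a deliberate simplification: ClickHouse's `uniq` is an approximate function with its own binary state format, whereas the `uniq_state` and `uniq_merge` helpers below are hypothetical and use `pickle` purely to show that a state is an opaque blob that can be stored and merged later.

```python
import pickle


def uniq_state(user_ids) -> bytes:
    """Serialize a partial aggregate: the set of ids seen in one slice."""
    return pickle.dumps(set(user_ids))


def uniq_merge(states) -> int:
    """Merge saved states and return the distinct count across all slices."""
    merged = set()
    for state in states:
        merged |= pickle.loads(state)
    return len(merged)


# Hypothetical slices, each scanned once and stored under its label.
saved = {
    "xxx": uniq_state([1, 2, 3]),     # yesterday
    "yyy": uniq_state([2, 3, 4, 5]),  # today
}
total_unique = uniq_merge(saved[ts] for ts in ("xxx", "yyy"))
```

Each slice of raw data is read only once. Later questions about any combination of slices are answered from the stored states, and users present in both days are still counted a single time.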
:
- , ;
- ;
- .
, - . . , , , , ClickHouse, : «, ! , !»
, , . , . . â SQL-, . , , .
, - time series. : , , , time series.
time series : , , timestamp . , , . . , , , â , . , , ClickHouse -, , .
, , ClickHouse:
â « », â .
time series 2 , 20 20-80 . . ClickHouse GraphiteMergeTree , time series, .
8 ClickHouse , 6 - , 2 : 2 â , . 1.8 . , 500 . , 1,8 , 500 ! .
Hadoop
2 . . 3 , CPU â 4 . , .
Process
, , , . , , ClickHouse 3 000 . , , , overkill.
, , . ClickHouse, . , , , . , 8 3â4 . â .
Present â
, ? time series, time series , , , .
Drop Detect â SQL : SQL- , , .
Anomaly Detection â . , , 2% , â 40, , , , .
â , , - , Anomaly Detection.
Anomaly Detection
, time series . : , , . time series . , , . , drop detection â , .
UI.
. - , â . -, .
Present
, , . , : 1000 â alarm, 0 â alarm. .
Anomaly Detection , . Anomaly Detection Exasol , ClickHouse. Anomaly Detection 2 , .
, , 4 .
, , , . , , . , .
HighLoad++ , HighLoad++ - . , , :)
, PHP Russia , , . , , , 1,8 /, , 1 .