This scheme looks very simple, but keep in mind that there are many borderline cases. When they arise, we rely on expert judgment: does the incoming task need Big Data technologies, or can a solution built on "classic" RDBMS technologies handle it?
In this article we focus mainly on the technologies we use and the solutions we build with them.
First, a few words on what sparked our interest in the technology. By the time the Big Data initiative started, the bank already had several solutions for working with data:
- Data Warehouse (DWH, the enterprise data warehouse)
- Operational Data Store (ODS, the operational data store)
What made us look toward Big Data?
IT was expected to deliver a universal solution that would let the bank analyze all the data available to it as efficiently as possible, in order to create digital products and improve the customer experience.
At the time, the DWH and ODS had limitations that kept us from developing them into universal tools for analyzing all data:
- The strict data quality requirements of the DWH significantly reduce how current its data is (data becomes available for analysis only the next day).
- The ODS holds no historical data (by definition).
- Because the ODS and DWH run on relational DBMSs, they can work only with structured data. The data model has to be defined at the moment data is written to the DWH/ODS (schema on write), which adds development cost.
- No horizontal scaling of the solution; vertical scaling is limited.
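Once we recognized these limitations, we decided to look at Big Data technologies. It was already clear that competence in this area would provide a competitive advantage in the future, so we needed to build up in-house expertise. Since we had no practical competence in this area at the time, we effectively had two options:
- either assemble a team from the market (externally);
- or find enthusiasts through internal transfers, without actually increasing headcount.
We chose the second option because it seemed the more conservative one to us.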
Next, we came to understand that Big Data is just a tool, and that there can be many ways to solve a particular problem with it. The problem we were solving implied the following requirements:
- We need to be able to analyze data in a wide variety of forms and formats together.
- We need to be able to solve a broad range of analytical tasks, from flat deterministic reports to exotic kinds of visualization and predictive analytics.
- We need to find a compromise between large data volumes and the need to analyze them online.
- We need (ideally) an unlimitedly scalable solution that can serve requests from a large number of employees.
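After studying the literature, reading forums, and reviewing the available information, we found that a solution meeting these requirements already exists as an established architectural pattern called a "Data Lake". By deciding to implement a Data Lake, we aimed to obtain a self-sufficient ecosystem, "DWH + ODS + Data Lake", capable of solving any data-related task, whether management reporting, operational integration, or predictive analytics.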
Our version of the Data Lake implements a typical lambda architecture, in which incoming data is split into two layers:
- The "speed" layer, which mainly processes streaming data. Data volumes are small and transformations are minimal, but it achieves minimal latency between an event occurring and its appearance in the analytical system. We use Spark Streaming for data processing and HBase for storing the results.
- The "batch" layer, in which data is processed in batches that can contain several million records at once (for example, the balances of all accounts as of the close of the business day). This can take some time, but it lets us process fairly large volumes of data (throughput). We store batch-layer data in HDFS and access it with Hive or Spark, depending on the task.
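To make the split between the two layers more concrete, here is a minimal pure-Python sketch of how a query could combine them. It is only an illustration under stated assumptions: the `Transaction` type, the account names, and the in-memory views are hypothetical stand-ins, not our Spark Streaming, HBase, or HDFS code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """Hypothetical event: a signed amount posted to an account."""
    account: str
    amount: int  # minor currency units, to avoid float rounding
    day: int     # business-day number


def build_batch_view(history, closed_day):
    """Batch layer: recompute balances from all events up to the last closed day."""
    view = defaultdict(int)
    for tx in history:
        if tx.day <= closed_day:
            view[tx.account] += tx.amount
    return dict(view)


class SpeedView:
    """Speed layer: incrementally updated deltas for events after the last batch run."""

    def __init__(self):
        self.deltas = defaultdict(int)

    def on_event(self, tx):
        self.deltas[tx.account] += tx.amount


def query_balance(account, batch_view, speed_view):
    """Serving step: merge the precomputed batch result with the fresh delta."""
    return batch_view.get(account, 0) + speed_view.deltas.get(account, 0)


history = [
    Transaction("acc-1", 10_000, day=1),
    Transaction("acc-2", 5_000, day=1),
    Transaction("acc-1", -2_500, day=2),
]
batch_view = build_batch_view(history, closed_day=2)

speed_view = SpeedView()
speed_view.on_event(Transaction("acc-1", -1_000, day=3))  # arrives intraday

print(query_balance("acc-1", batch_view, speed_view))  # 6500
print(query_balance("acc-2", batch_view, speed_view))  # 5000
```

The batch view is expensive to build but covers the full history, while the speed view is cheap to update and covers only what has happened since the last batch run. That is the throughput-versus-latency trade-off described above.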
Spark deserves a separate mention. We use it widely for data processing, and for us its most significant advantages are:
- It can be used as an ETL tool.
- It is faster than standard MapReduce jobs.
- Code is quicker to write than with Hive/MapReduce, because it is less verbose, thanks in part to DataFrames and the SparkSQL library.
- It is more flexible and supports more complex processing pipelines than the MapReduce paradigm.
- Python and JVM languages are supported.
- It has a built-in machine learning library.
We try to keep data in the Data Lake in its original, raw form, implementing the "schema on read" approach. For process control, we use Oozie as the task scheduler.
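The "schema on read" idea mentioned above can be sketched in a few lines of plain Python. This is an illustration only: the sample records, field names, and the `read_with_schema` helper are hypothetical and do not come from our pipelines.

```python
import json

# Raw events are stored exactly as they arrived (hypothetical sample data).
RAW_LINES = [
    '{"client_id": "42", "amount": "150.50", "channel": "atm"}',
    '{"client_id": "43", "amount": "20", "comment": "no channel field"}',
]

# Each consumer declares, at read time, the fields it needs and how to parse them.
PAYMENTS_SCHEMA = {"client_id": int, "amount": float}


def read_with_schema(lines, schema):
    """Parse raw JSON lines and project them onto the reader's schema."""
    for line in lines:
        raw = json.loads(line)
        yield {field: cast(raw[field]) for field, cast in schema.items()}


rows = list(read_with_schema(RAW_LINES, PAYMENTS_SCHEMA))
print(rows)
```

Nothing about the structure is fixed when the data lands. A different consumer could read the same raw lines with a different schema, which is what removes the upfront modeling cost of schema on write.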
We store structured input data in the AVRO format. This gives us several advantages:
- The data schema can change over the life cycle without disrupting the applications that read these files.
- The data schema is stored together with the data, so it does not need to be described separately.
- It is natively supported by many frameworks.
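The first advantage relies on schema resolution: Avro reconciles the schema a file was written with and the schema a reader expects, skipping fields the reader does not know and filling fields missing from the data with the reader's defaults. The sketch below emulates that rule in plain Python. It is a simplified illustration and does not use the Avro library; the field names and the `resolve` helper are hypothetical.

```python
# Simplified emulation of Avro-style schema resolution; not the avro library.
WRITER_V1_RECORD = {"account": "acc-1", "balance": 7500}
WRITER_V2_RECORD = {"account": "acc-2", "balance": 5000, "channel": "atm"}

READER_V1_FIELDS = [
    {"name": "account"},
    {"name": "balance"},
]
READER_V2_FIELDS = [
    {"name": "account"},
    {"name": "balance"},
    {"name": "channel", "default": "unknown"},  # added later, with a default
]


def resolve(record, reader_fields):
    """Project a written record onto the reader's schema."""
    resolved = {}
    for field in reader_fields:
        name = field["name"]
        if name in record:
            resolved[name] = record[name]
        elif "default" in field:
            resolved[name] = field["default"]  # old data, new reader
        else:
            raise ValueError(f"field {name!r} is missing and has no default")
    return resolved  # writer fields unknown to the reader are dropped


old_data_new_reader = resolve(WRITER_V1_RECORD, READER_V2_FIELDS)
new_data_old_reader = resolve(WRITER_V2_RECORD, READER_V1_FIELDS)
print(old_data_new_reader)
print(new_data_old_reader)
```

Both directions work: a newer reader can consume old files, and an older reader can consume new files, as long as added fields carry defaults.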
For data marts that users will work with through BI tools, we plan to use the Parquet or ORC formats. In most cases this speeds up data selection thanks to columnar storage.
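Why columnar storage helps typical BI queries can be shown with a toy in-memory example. This is a sketch only: the table and its values are made up, and real Parquet or ORC files add encoding, compression, and on-disk layout that this example does not model.

```python
# Row-oriented layout: each record is stored whole (hypothetical sample table).
rows = [
    {"account": "acc-1", "branch": "north", "balance": 7_500},
    {"account": "acc-2", "branch": "south", "balance": 5_000},
    {"account": "acc-3", "branch": "north", "balance": 1_200},
]

# Column-oriented layout: one contiguous list of values per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def total_from_rows(table):
    """Has to walk every full record even though only one field is needed."""
    return sum(record["balance"] for record in table)


def total_from_columns(table):
    """Touches only the 'balance' column; other columns are never read."""
    return sum(table["balance"])


print(total_from_rows(rows))
print(total_from_columns(columns))
```

Both functions return the same total, but an aggregate over one column in a columnar file needs to read only that column's data, which is where most of the speed-up for report-style queries comes from.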
As the Hadoop distribution, we considered Cloudera and Hortonworks. We chose Hortonworks because its distribution contains no proprietary components. In addition, Hortonworks offers Spark 2 out of the box, whereas Cloudera offers only 1.6.
Among the analytical applications that use Data Lake data, two are worth noting.
The first is Jupyter Hub with Python and pre-installed machine learning libraries, which our data scientists use for predictive analytics and model building.
For the second role, we are currently evaluating a Self-Service BI application with which users will be able to prepare most standard retrospective reports on their own: tables, charts, pie charts, histograms, and so on. The idea is that IT's role will be limited to adding data to the Data Lake and granting the application and users access to that data, and that is all. Users will be able to do the rest themselves, which, among other things, should shorten the total time needed to find answers to their questions.
In conclusion, here is what we have achieved so far:
- We brought the Batch Layer branch to Prod and are loading data that is used both for retrospective analysis (that is, analysts use the data to try to answer the question "how did we get here?") and for predictive analysis, such as forecasting cash withdrawals at ATMs and optimizing the cash collection service.
- We set up Jupyter Hub and gave users the ability to analyze data with modern tools (scikit-learn, XGBoost, Vowpal Wabbit).
- We are actively developing the Speed Layer branch and preparing it for Prod, implementing a Real-Time Decision Making class system on the Data Lake.
- We compiled a product backlog whose implementation will let us raise the maturity of the solution as quickly as possible. Planned items:
- Disaster resilience. The solution is currently deployed in a single data center, so we effectively do not guarantee service continuity and could irrecoverably lose the accumulated data if something happens to the data center (the probability is small, but it exists). We ran into a problem: the built-in HDFS tooling cannot guarantee that data is stored across different data centers. An improvement addressing this exists, but its fate is still unclear, so we plan to implement our own solution.
- Metadata enrichment (Atlas), metadata-based data management/governance, and metadata-based role access.
- Exploring alternatives to the chosen architectural components. The first candidates are Airflow as an alternative to Oozie, and a more advanced CDC as an alternative to Sqoop for unloading data from relational DBMSs.
- Implementing a CI/CD pipeline. With all the variety of technologies and tools in use, we want any code change to be deployable to production as quickly and automatically as possible while guaranteeing delivery quality.
We still have many plans for using Big Data at Raiffeisenbank, and we will certainly write about them.
Thank you for your attention!