I've wanted to talk about MapReduce for a long time. Wherever you look, it is wrapped in such abstruse language that it's downright frightening, yet in practice it is a very simple and useful approach for many purposes. And implementing it yourself is not that hard.
I should say right away that this is for those who have not yet worked out what MapReduce is. For those who have, there is nothing useful here.
Let's start with how the idea of MapReduce came to me personally (I didn't know it was called that, and of course it reached me much later than it reached Google).
First I'll describe how it was born (the approach was wrong), and then how to do it properly.
How to count all the words on Wikipedia (the wrong approach)
It was born the way it probably is everywhere: from counting word frequencies when ordinary memory is not enough (counting the frequency of every word on Wikipedia). Strictly speaking it should be "number of occurrences" rather than "frequency", but for simplicity I'll stick with "frequency".
In the simplest case we can create a hash (dict, map, hash, associative array, array() in PHP) and count the words in it:
$dict['word1'] += 1
But what do you do when the memory for the hash runs out and you have counted only a hundredth of all the words?
I solved this by counting part of the words until memory ran out and then saving the hash to disk, straight into a file, one entry per line:
aardvark | 5
aachen | 2
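The spill-to-disk idea can be sketched roughly as follows. This is only an illustration, not the original script: the function name `count_in_chunks`, the `max_distinct` limit (a stand-in for "memory ran out") and the chunk file names are all assumptions made for the example.

```python
import os
import tempfile
from collections import Counter

def count_in_chunks(words, max_distinct, out_dir):
    """Count words, writing the partial hash to a new file whenever it
    reaches max_distinct distinct words (our stand-in for a full memory)."""
    paths = []
    counts = Counter()

    def spill():
        # One "word | count" entry per line, in the order the hash holds them.
        path = os.path.join(out_dir, "chunk%d.txt" % len(paths))
        with open(path, "w", encoding="utf-8") as f:
            for word, count in counts.items():
                f.write("%s | %d\n" % (word, count))
        paths.append(path)
        counts.clear()

    for word in words:
        counts[word] += 1
        if len(counts) >= max_distinct:
            spill()
    if counts:
        spill()
    return paths

with tempfile.TemporaryDirectory() as tmp:
    chunk_paths = count_in_chunks("b a b c a b".split(), 2, tmp)
    chunks = []
    for p in chunk_paths:
        with open(p, encoding="utf-8") as f:
            chunks.append(f.read())
```

Note that the same word ends up in several chunk files with partial counts, which is exactly what makes the next step hard.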
A problem came up: how do you merge these files? After all, each of them takes up all of RAM.
At first the idea was to take only the 1,000,000 most popular words from each file and merge those: that fits in RAM and at least counts the top of the list (the most popular words). It worked, of course, but it turned out that the millions of lower-ranked words were lost, and there were far more of them.
Then came the idea of sorting the files.
We take the 20 sorted files and read the first 1000 lines from each; these will be roughly the same words (the files are sorted). We sum them up and build a new hash, which contains only words starting with "aaa..." and the like, and save it to a new file. Then we read the next 1000 lines from every file. There, almost every file has words like "aab...".
This produces a new, much smaller set of files. Words still repeat in it, though. We sort again, read 1000 lines at a time, and sum. The result is an almost correct file (some words may still fall outside the 1000-line window). We repeat this a few times... and in the end get a file with very few errors (but errors there are).
Tedious and slow, but no better option came to mind.
The weak point of the wrong approach
This approach had one weak point: merging the original 20 files. How can it be done better?
The problem comes from the fact that some words are missing from some files, or end up in different 1000-line blocks. In other words, if I could take from all 20 files not the first 1000 lines but just one line each, and for the same word, I could merge all 20 files in a single pass.
How do you do that? In general, this is the last step of the MergeSort algorithm: merging sorted lists. If you already know it, skip ahead.
We take the first line of each of the 20 files and look for the smallest first element (word). Since the files are sorted, it is the smallest element overall. Suppose it is the word "aardvark". From those 20 lines we take only the ones that belong to "aardvark". In the files we took it from, and only in those, we read the second line. Then we again look for the minimum among the 20 current lines, and so on by analogy until we reach the end of every file.
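The merge described above might look like the following minimal sketch. In-memory lists of (word, count) pairs stand in for the sorted files, and the name `merge_sorted_counts` is made up for the example; it also assumes each file lists a given word at most once.

```python
def merge_sorted_counts(files):
    """Merge several sorted lists of (word, count) pairs in one pass,
    summing the counts of equal words. Each list stands in for a sorted file."""
    iterators = [iter(f) for f in files]
    # The "current line" of every file; None once a file is exhausted.
    current = [next(it, None) for it in iterators]
    merged = []
    while any(line is not None for line in current):
        # The smallest word among the current lines is the smallest overall.
        word = min(line[0] for line in current if line is not None)
        total = 0
        for i, line in enumerate(current):
            # Take the word only from files whose current line holds it,
            # and advance only those files.
            if line is not None and line[0] == word:
                total += line[1]
                current[i] = next(iterators[i], None)
        merged.append((word, total))
    return merged

file_a = [("aachen", 2), ("aardvark", 5), ("zebra", 1)]
file_b = [("aardvark", 3), ("bee", 4)]
file_c = [("aachen", 1), ("zebra", 2)]
result = merge_sorted_counts([file_a, file_b, file_c])
```

Only one line per file is held at any moment, so the number of files, not their size, determines the memory needed.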
MapReduce in its simplest form
In effect, this is what Google invented some ten years before us and named MapReduce.
Reinventing the bicycle continues to this day.
So, we have the line
"foo bar baz bar"
and we need to get this as output:
{ foo: 1, bar: 2, baz: 1 }
As the first step, we take the line, split it into words, and produce arrays like these (or "tuples"):
[ 'foo', 1 ]
[ 'bar', 1 ]
[ 'baz', 1 ]
[ 'bar', 1 ]
(From here on I'll omit brackets and quotes where the meaning is clear.)
We take them and sort them:
bar, 1
bar, 1
baz, 1
foo, 1
bar appears twice in a row, so we combine the pairs like this:
bar, (1,1)
baz, (1)
foo, (1)
(1,1) is a kind of nested array; technically it is
["bar", [1,1]]
Then we simply add up the second elements of the arrays. We get:
bar, 2
baz, 1
foo, 1
Exactly what we wanted.
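The four steps above (split, sort, group, sum) can be reproduced in a few lines. This is a sketch for the toy input only; it uses `itertools.groupby` for the grouping step, which is my choice for the illustration rather than anything the steps prescribe.

```python
from itertools import groupby
from operator import itemgetter

line = "foo bar baz bar"

# Step 1: one [word, 1] pair per word.
pairs = [[word, 1] for word in line.split()]

# Step 2: sort so that equal words end up next to each other.
pairs.sort()

# Step 3: combine neighbours with the same word: ["bar", [1, 1]], ...
grouped = [[word, [count for _, count in group]]
           for word, group in groupby(pairs, key=itemgetter(0))]

# Step 4: add up the nested ones.
totals = [[word, sum(ones)] for word, ones in grouped]
```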
The main question is what all this was for... in other words, what did we actually do here?
Back to the past
Imagine that we have a computer that can hold only 2 lines and can perform only one operation on a line per minute. (Stop laughing: only after you have counted all the words on Wikipedia at least once do you earn the right to laugh at memory limits. It still won't fit, no matter how many gigabytes you have.)
From our "foo bar baz bar" we can make two files this way:
file1.txt
[ 'bar', 1 ]
[ 'foo', 1 ]
file2.txt
[ 'bar', 1 ]
[ 'baz', 1 ]
There are two lines in memory at a time, so all is well: we fit within the memory limit.
Now, using the MergeSort step, we can merge these files line by line:
bar, (1,1)
baz, (1)
foo, (1)
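A line-by-line merge of the two files could be sketched like this. Plain lists stand in for file1.txt and file2.txt, and the generator name `merge_two` is hypothetical; the point is that only the current line of each input is kept.

```python
def merge_two(lines_a, lines_b):
    """Merge two sorted streams of (word, 1) pairs, keeping only the
    current line of each stream, and yield (word, [ones]) groups."""
    it_a, it_b = iter(lines_a), iter(lines_b)
    a, b = next(it_a, None), next(it_b, None)
    while a is not None or b is not None:
        # Pick the smaller current word; an exhausted stream never wins.
        if b is None or (a is not None and a[0] <= b[0]):
            word = a[0]
        else:
            word = b[0]
        ones = []
        # Consume every line carrying this word, from either stream.
        while a is not None and a[0] == word:
            ones.append(a[1])
            a = next(it_a, None)
        while b is not None and b[0] == word:
            ones.append(b[1])
            b = next(it_b, None)
        yield word, ones

file1 = [("bar", 1), ("foo", 1)]
file2 = [("bar", 1), ("baz", 1)]
merged = list(merge_two(file1, file2))
```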
At any moment only two lines are held in memory, one from each of the two files, and no more is needed.
In fact, what we have just done is already MapReduce.
The step that turns words into arrays with ones, (word, 1), is called "Map".
The step that sums up (1,1) is the "Reduce" step.
The remaining steps are carried out by the algorithm itself (sorting and merging via MergeSort).
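Map, Reduce? What is that?
These steps don't have to consist of emitting ones in the case of "Map" or adding things up in the case of "Reduce". They are simply functions that accept something and return something, depending on the goal.
In this case, "Map" is a function you write that takes a single word and returns (word, 1).
And "Reduce" is a function you write that takes the array (word, (1,1)) and returns (word, 2).
Put simply, in Python: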
words = ["foo", "bar", "baz"]
def map1(word):
    # One word in, one [word, 1] pair out.
    return [word, 1]
arr = ["foo", [1, 1]]
def reduce1(arr):
    # [word, [1, 1, ...]] in, [word, total] out.
    return [arr[0], sum(arr[1])]
Or in PHP:
$words = array("foo", "bar", "baz");
function map1($word) {
    return array($word, 1);
}
$arr = array("foo", array(1, 1));
function reduce1($arr) {
    return array($arr[0], array_sum($arr[1]));
}
So we have got around the memory limit, but how do we get around the speed limit?
Imagine that we have two such computers. We give each of them the original line and tell the first one (more precisely, MapReduce tells it) to count only the words in odd positions, and the second one to count only the words in even positions.
The first produces:
"foo bar baz bar":
foo, 1
baz, 1
The second produces:
"foo bar baz bar":
bar, 1
bar, 1
We (more precisely, MapReduce) take the results from both, sort them, and then run them through MergeSort as above:
bar, (1,1)
baz, (1)
foo, (1)
Exactly the same result as when a single computer did the counting!
Now we (MapReduce) hand this out to the two computers again: the first gets only the odd lines, the second gets the even lines, and we ask each computer to do the Reduce step (add up the second numbers).
Since these lines are clearly independent of one another, the result is again what we need.
The main thing is that the two computers work in parallel, and therefore twice as fast as a single computer (not counting the time lost transferring data from one computer to the other).
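To make the two-computer scheme concrete, here is a small simulation. It is a sketch under loud assumptions: two threads from `concurrent.futures` play the role of the two computers, and the helper names `map_positions` and `reduce_rows` are invented for the example. A real setup would ship data between machines instead.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby
from operator import itemgetter

def map_positions(line, start):
    """Map step for one 'computer': emit (word, 1) for every second word."""
    return [(word, 1) for word in line.split()[start::2]]

def reduce_rows(rows):
    """Reduce step for one 'computer': add up the ones of each row."""
    return [(word, sum(ones)) for word, ones in rows]

line = "foo bar baz bar"
with ThreadPoolExecutor(max_workers=2) as pool:
    # Computer 1 takes the odd positions (1st, 3rd), computer 2 the even ones.
    mapped = list(pool.map(map_positions, [line, line], [0, 1]))
    # The framework's part: sort and group everything the mappers produced.
    pairs = sorted(mapped[0] + mapped[1])
    rows = [(word, [n for _, n in group])
            for word, group in groupby(pairs, key=itemgetter(0))]
    # Odd rows go to computer 1, even rows to computer 2.
    reduced = list(pool.map(reduce_rows, [rows[0::2], rows[1::2]]))

counts = sorted(reduced[0] + reduced[1])
```

Because each row is reduced independently, it does not matter which worker receives which row.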
A preliminary conclusion
Phew! So, MapReduce is needed for computing something that either has to be done faster, or doesn't fit in memory (or both).
A more interesting example: sorting by popularity (cascades)
Suppose we want to count the words on Wikipedia and at the same time build a list in descending order of popularity, from the most popular word to the least popular.
Clearly all the words on Wikipedia won't fit in memory, and for the reverse sort this giant array won't fit in memory either. We need a cascade of MapReduces: the result of the first MapReduce is fed to the input of the second.
To be honest, I don't know whether "cascade" is the right word as applied to MapReduce. I use it for myself because it explains what happens better than any other word (the result of one waterfall of words falls into a MapReduce and immediately flows on into a second MapReduce).
Well, we already know how to count words:
"foo bar baz foo"
The Map step we wrote produces:
foo, 1
bar, 1
baz, 1
foo, 1
Next, MapReduce (itself, not you as the programmer) combines them into:
bar, (1)
baz, (1)
foo, (1,1)
And the Reduce step we wrote produces:
bar, 1
baz, 1
foo, 2
Now imagine that we have counted all of Wikipedia and this array contains billions of words. It cannot be sorted in memory.
Let's take another MapReduce. This time Map will do the following trick:
[word1, 15] -> map() returns -> [-15, word1]
[word2, 15] -> map() returns -> [-15, word2]
[word3, 120] -> map() returns -> [-120, word3]
[word4, 1] -> map() returns -> [-1, word4]
What is this for?
Before handing them to Reduce, MapReduce sorts all these arrays by their first element (which is the negative number). MapReduce can sort even when the whole volume of data doesn't fit in memory, and that is the beauty of it. For all the words on Wikipedia you simply cannot run
arsort($words)
but MapReduce can.
Why is there a minus sign in front of the numbers?
Because MapReduce always sorts in ascending order, and we need descending order. So how do you sort numbers in descending order using only an ascending sort? Multiply by minus one before sorting, and by minus one again afterwards.
Positive numbers in ascending order: 1, 15, 120
Negative numbers in ascending order:
-120, -15, -1
(exactly what we need, only with minus signs, which we remove simply by multiplying by -1)
This is what arrives at the input of Reduce:
-120, (word3)
-15, (word1, word2) <-- grouped by MergeSort!
-1, (word4)
Lovely, but two words have a "frequency" of 15, and MergeSort has grouped them together. We'll fix that.
Now all our Reduce has to do is multiply the first number by -1 and then emit one array for the first line, two arrays for the second, and one array for the third.
In fact, depending on which MapReduce implementation you use, you may not be able to emit two arrays in the Reduce step, because exactly one output array is required. In that case, just do it in your own program after the Reduce step.
We get:
120, word3
15, word1
15, word2
1, word4
Beautiful! Just what we need.
Once again, keep the main point in mind: the example has four lines, while Wikipedia has billions of words that don't fit in memory.
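The second stage of the cascade can be sketched on the four-line example as follows. The functions `map2` and `reduce2` and the placeholder words are assumptions for illustration; here the "framework" part is just an in-memory `sorted` plus `groupby`, whereas a real MapReduce would do that sort on disk.

```python
from itertools import groupby
from operator import itemgetter

def map2(pair):
    """Second-stage Map: [word, count] -> [-count, word]."""
    word, count = pair
    return [-count, word]

def reduce2(row):
    """Second-stage Reduce: [-count, [words]] -> one [count, word] per word."""
    negated, words = row
    return [[-negated, word] for word in words]

# Output of the first MapReduce (the word counts); the words are placeholders.
counted = [["word1", 15], ["word2", 15], ["word3", 120], ["word4", 1]]

# The framework's part: ascending sort by the first element, then grouping.
mapped = sorted(map2(pair) for pair in counted)
rows = [[key, [word for _, word in group]]
        for key, group in groupby(mapped, key=itemgetter(0))]

# Our Reduce flips the sign back and splits grouped words into separate rows.
ranking = [item for row in rows for item in reduce2(row)]
```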
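How do you build a simple MapReduce to play with?
In PHP: the simplest example.
In Python: the simplest example (see below regarding the Python version).
In the code I have indicated what should go where in order to build a more or less full-fledged MapReduce, with files and MergeSort. However, this is, so to speak, a reference implementation for playing around and understanding how MapReduce works. It is still MapReduce; it is just that, in terms of memory, this particular implementation is no better than an ordinary hash.
I chose PHP, although it is not the most sensible choice for this purpose, because almost any programmer can read PHP and it is easy to translate from it into whatever language you need.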
Hints and cheats
Yes, I recommend storing the JSON representation of the arrays (json_encode), one per line. That avoids problems with spaces in words, Unicode, numbers, and data types. Like this:
["foo", 1]
["bar", 1]
["foo", 1]
Hint: in Python the last step of MergeSort is already implemented. It is
heapq.merge(*iterables)
So, to merge 10 files containing JSON representations, this is enough:
import heapq
import json
# files is a list of paths to the sorted JSON-lines files
items = list(map(json.loads, open(filename)) for filename in files)
for item in heapq.merge(*items):
    pass  # .... reduce(item) ....
In PHP you will have to fiddle with implementing MergeSort yourself, I suspect for about 50 lines. Unless, of course, someone suggests a better option in the comments.
In Python, yield and __iter__ let you do very interesting things with MapReduce! For example:
x = MapReduce()
for word in "foo bar bar".split():
    x.send((word, 1))
for word, ones in x:
    print(word, sum(ones))
class MapReduce is something you will have to write yourself (in its simplest working form I managed to keep it within 24 lines, and it can probably be done in fewer; a simplified iter_group is analogous to the group_tuples_by_first_element function from the PHP example).
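For readers who want a starting point, one possible shape of such a class is sketched below. It is not the author's implementation: it keeps everything in an in-memory list, so it shows only the send/__iter__ interface and the grouping, not the handling of data larger than memory. The private attribute `_pairs` is an invented name.

```python
from itertools import groupby
from operator import itemgetter

class MapReduce:
    """Collects (key, value) pairs and iterates over them grouped by key."""

    def __init__(self):
        self._pairs = []

    def send(self, pair):
        # A fuller version would spill sorted chunks to disk here.
        self._pairs.append(pair)

    def __iter__(self):
        # Sort so that equal keys are adjacent, then yield each key once,
        # together with a lazy iterator over its values.
        self._pairs.sort(key=itemgetter(0))
        for key, group in groupby(self._pairs, key=itemgetter(0)):
            yield key, (value for _, value in group)

x = MapReduce()
for word in "foo bar bar".split():
    x.send((word, 1))
totals = [(word, sum(ones)) for word, ones in x]
```

Each group's values must be consumed before moving on to the next key, which the `sum(ones)` call above does.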
泚æ-ãã®æ¹æ³ã¯MapReduceã«ãšã£ãŠå€å
žçã§ã¯ãªããå€ãã®ãã·ã³ã§äžŠååããããšã¯å°é£ã§ãïŒãã ãããã®æ¹æ³ã§ã¯ã䜿çšå¯èœãªã¡ã¢ãªä»¥äžã®ããŒã¿ããªã¥ãŒã ã§äœæ¥ããããšã¯éåžžã«ç°¡åã§ãïŒã map1ãšreduce1ãé¢æ°ã§ãã
map_reduce(source_data, map1, reduce1)
ã¡ãœãã
map_reduce(source_data, map1, reduce1)
ãããæ£ç¢ºã§ãã
The MapReduce implementation in Hadoop is the most popular solution. (I haven't tried it myself; I only know that it is the most popular.)
Afterword
So there it is. I hope my account of MapReduce without the abstruse jargon proves useful.
MapReduce is very useful for large-scale computations. Almost any SQL query over several tables can be broken down, without much difficulty, into MapReduce + join_iterator (more on that later).
If I have the energy, in the next post I'll describe how to use MapReduce for tasks more interesting than counting ordinary words: for example, counting links on the internet, word frequencies in the context of other words, people by city, products across the price lists of hundreds of companies, and so on.
Oh yes, and a note for everyone: MapReduce is patented by Google, but the patent is used for defensive purposes; Hadoop, for instance, has been officially permitted to use the method. So handle it with care.
Part 2: more advanced examples
Yoi Haji,
as always, from Habr,
2010
(someday I will learn to explain things briefly....)