Downloading the data
We assume that current versions of Elasticsearch and Logstash are already installed. The examples below use Elasticsearch 1.1.0 and Logstash 1.4.0. OpenGeoDB is a German website that provides geodata for Germany in SQL and CSV formats. Since we want to store the data in Elasticsearch, neither SQL nor raw CSV is directly suitable. However, Logstash can transform the CSV data and index it into Elasticsearch. The file we are going to index contains a list of German cities, including all of their postal codes. The data can be downloaded from the OpenGeoDB website (download file) under a public domain license.
Data format
Looking at the data, we can see that it consists of the following columns:
- loc_id: unique identifier (within the database)
- ags: state administrative code
- ascii: short name in uppercase
- name: entity name
- lat: latitude (coordinate)
- lon: longitude (coordinate)
- amt: not used
- plz: postal code (comma-separated if there is more than one)
- vorwahl: telephone area code
- einwohner: population
- flaeche: area
- kz: car license plate series
- typ: administrative unit type
- level: integer that determines the position of the locality in the hierarchy
- of: identifier of the area that this area belongs to
- invalid: marks the entry as invalid
We are interested in the name, latitude, longitude, area, population, and license plate fields. We will come back to them shortly...
Indexing the entries in Elasticsearch
Using the Logstash csv filter
The next step is to get the data into Elasticsearch. First, configure Logstash by copying the configuration below into a file named opengeodb.conf. Note that we use the "csv" (comma-separated values) filter even though the separator is a tab rather than a comma.
input {
  stdin {}
}

filter {
  # Step 1, possible dropping
  if [message] =~ /^#/ {
    drop {}
  }

  # Step 2, splitting
  csv {
    # careful... there is a "tab" embedded in the next line:
    # if you cannot copy paste it, press ctrl+V and then the tab key to create the control sequence
    # or maybe just tab, depending on your editor
    separator => '	'
    quote_char => '|' # arbitrary, default one is included in the data and does not work
    columns => [ 'id', 'ags', 'name_uc', 'name', 'lat', 'lon', 'official_description', 'zip', 'phone_area_code', 'population', 'area', 'plate', 'type', 'level', 'of', 'invalid' ]
  }

  # Step 3, possible dropping
  if [level] != '6' {
    drop {}
  }

  # Step 4, zip code splitting
  if [zip] =~ /,/ {
    mutate {
      split => [ "zip", "," ]
    }
  }

  # Step 5, lat/lon love
  if [lat] and [lon] {
    # move into own location object for additional geo_point type in ES
    # copy field, then merge to create array for bettermap
    mutate {
      rename => [ "lat", "[location][lat]", "lon", "[location][lon]" ]
      add_field => { "lonlat" => [ "%{[location][lon]}", "%{[location][lat]}" ] }
    }
  }

  # Step 6, explicit conversion
  mutate {
    convert => [ "population", "integer" ]
    convert => [ "area", "integer" ]
    convert => [ "[location][lat]", "float" ]
    convert => [ "[location][lon]", "float" ]
    convert => [ "[lonlat]", "float" ]
  }
}

output {
  elasticsearch {
    host => 'localhost'
    index => 'opengeodb'
    index_type => "locality"
    flush_size => 1000
    protocol => 'http'
  }
}
Before we go on, here is how to run Logstash to index the data. Elasticsearch must already be running.
cat DE.tab | logstash-1.4.0/bin/logstash -f opengeodb.conf
A lot happens at this stage (indexing can take a minute or more, depending on your hardware). The first thing to notice is that the Logstash configuration does not use the "file" input. That input behaves like "tail -f" on UNIX systems: it expects new data to be appended to the file. Our file has a fixed size, so it makes more sense to read all of the data through the "stdin" input.
The filter section consists of six steps. Let's look at each of them in detail and explain what it does.
Step 1 - Ignoring comments
The first step removes comments. They can be identified by a hash sign at the beginning of the line. This is needed because the first line of the file is a comment that contains the column names, and there is no need to index it.
Step 2 - Parsing the CSV
The second step does all the hard work of parsing the CSV. The "separator" has to be redefined as a tab. The default "quote_char" is a double quote, which occurs in the data values, so it has to be replaced with another character. The "columns" property defines the names that are later used as field names.
Note: if you copy the configuration from this page, the tab may be copied as several spaces, in which case you need to replace the separator character yourself. If the script does not work correctly, check this first.
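To make steps 1 and 2 more concrete, here is a minimal Ruby sketch of what they do to a single input line. It is not how Logstash implements the csv filter; the `parse_line` helper and the sample rows are hypothetical, and only the column names come from opengeodb.conf.

```ruby
# Column names taken from the csv filter in opengeodb.conf.
COLUMNS = %w[id ags name_uc name lat lon official_description zip
             phone_area_code population area plate type level of invalid].freeze

# Hypothetical helper: returns nil for comment lines (step 1), otherwise
# a Hash of column name => raw string value (step 2).
def parse_line(line)
  return nil if line.start_with?('#')

  # The -1 limit keeps trailing empty fields, so columns stay aligned.
  values = line.chomp.split("\t", -1)
  COLUMNS.zip(values).to_h
end

# Hypothetical sample rows, not copied from DE.tab.
header = "#loc_id\tags\tascii\tname"
row = ['1', '00000', 'SAMPLETOWN', 'Sampletown', '50.0', '8.0', '',
       '11111,22222', '0123', '12345', '42', 'ST', 'Stadt', '6', '0', '0'].join("\t")

p parse_line(header)        # comment line, so nothing to index
p parse_line(row)['name']
```

Every value is still a string at this point, which is why the later steps split and convert individual fields.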
Step 3 - Skipping unneeded entries
We only need the records that describe cities (those whose "level" field is 6). All other entries are simply dropped.
Step 4 - Handling postal codes
The fourth step takes care of the postal codes. If a record has more than one postal code (large cities, for example), they are all stored in a single field, separated by commas. To store them as an array rather than as one long string, we use the "mutate" filter to split the value of this field. With the data held as an array, you could for example treat the entries as numbers and run range searches on them.
Step 5 - Geo data structure
The fifth step stores the geo data in a more convenient format. When the DE.tab file is read, separate lat and lon fields are created. These fields only make sense when they are stored together, so this step writes both of them into two data structures. The first looks like the Elasticsearch geo_point type and results in the structure {"location": {"lat": x, "lon": y}}. The second is a plain array that contains longitude and latitude (in that order!). The Kibana bettermap component can use it to display the coordinates.
Step 6 - Explicit casting of fields
The last filter step explicitly assigns data types to certain fields, so that Elasticsearch can later run numeric operations on them.
The output section uses features that are only available in Logstash 1.4 and later, so make sure you are running at least that version. Earlier versions required you to specify the separate "elasticsearch_http" output explicitly. Going forward there will be only one "elasticsearch" output; setting protocol => http makes it connect to Elasticsearch over HTTP on port 9200.
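Steps 3 to 6 of the filter section can be sketched in plain Ruby as follows. This only illustrates the resulting data shape under simplifying assumptions (lenient `to_i`/`to_f` conversion, string input fields); the `transform` helper and the sample record are hypothetical and not part of Logstash.

```ruby
# Hypothetical helper mirroring steps 3 to 6 on a Hash of raw string fields.
# Returns nil when the record would be dropped.
def transform(record)
  # Step 3: keep only level 6 entries (cities).
  return nil unless record['level'] == '6'

  event = record.dup

  # Step 4: "11111,22222" becomes ["11111", "22222"]; a single code stays a string.
  zip = event['zip'].to_s
  event['zip'] = zip.split(',') if zip.include?(',')

  # Step 5: nest lat/lon under "location" and add a [lon, lat] array.
  lat = event['lat'].to_s
  lon = event['lon'].to_s
  unless lat.empty? || lon.empty?
    event.delete('lat')
    event.delete('lon')
    # Step 6 (coordinates): store them as floats.
    event['location'] = { 'lat' => lat.to_f, 'lon' => lon.to_f }
    event['lonlat'] = [lon.to_f, lat.to_f]
  end

  # Step 6 (numbers): store population and area as integers.
  %w[population area].each do |field|
    event[field] = event[field].to_i if event.key?(field)
  end

  event
end

# Hypothetical sample record.
sample = { 'name' => 'Sampletown', 'level' => '6', 'zip' => '11111,22222',
           'lat' => '50.5', 'lon' => '8.25', 'population' => '12345', 'area' => '42' }
p transform(sample)
p transform(sample.merge('level' => '5')) # dropped by step 3
```

Note the longitude-first order of `lonlat`, which matches the order described in step 5.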
Visualizing the data with Kibana and Elasticsearch
Now that the data is indexed, we can use Kibana to analyze it further. With the bettermap widget and a simple search query such as population:[10000 TO *], we can display all the larger cities in Germany. This can be used, for example, to:
- Find the most populous cities
- Find the cities that share a license plate series (for example, GT is used in Gütersloh and the surrounding area)
- Use a script aggregation to find the areas with the highest or lowest population density per square kilometer. You could also precompute these values in Logstash.
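As an illustration of the last point, the density calculation itself is a simple division. The Ruby sketch below assumes that the area field is given in square kilometers, which the data description does not state explicitly; the `population_density` helper and the sample records are hypothetical.

```ruby
# Hypothetical helper: inhabitants per square kilometer, or nil when the
# area is missing or zero (assumes the area is given in square kilometers).
def population_density(population, area_km2)
  return nil if area_km2.nil? || area_km2.to_f <= 0

  population.to_f / area_km2.to_f
end

# Hypothetical records, not taken from OpenGeoDB.
cities = [
  { 'name' => 'Densetown', 'population' => 100_000, 'area' => 50 },
  { 'name' => 'Sparsefield', 'population' => 2_000, 'area' => 80 },
  { 'name' => 'Unknown', 'population' => 500, 'area' => 0 }
]

ranked = cities
  .map { |c| c.merge('density' => population_density(c['population'], c['area'])) }
  .reject { |c| c['density'].nil? }
  .sort_by { |c| -c['density'] }

ranked.each { |c| puts format('%-12s %.1f', c['name'], c['density']) }
```

Precomputing the value at index time trades a little extra storage for not having to run a script on every aggregation.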
All of this is nice, but it does not help us improve existing applications. We need to go deeper.
Configuring autocompletion
Let's digress for a moment and see what else we can do with this data. Many web applications ask the user to enter a city, a postal code, and similar data.
A good example is a checkout process. Not every shop has the user's data in advance. The shop may be one where people order only once, or it may allow ordering without registration. In such cases it can pay off to help users get through the checkout faster. At the same time, this "free" service helps prevent orders from being lost or cancelled because ordering was too cumbersome.
Elasticsearch has a very fast prefix search feature called the "completion suggester". It has one drawback: the data has to be enriched a little before it is indexed. Fortunately, that is exactly what Logstash is for. To understand this example better, you should read the introduction to the completion suggester.
Suggestions
Suppose we want to help users type in the name of the city they live in. We also want to offer a list of postal codes, so that it is easier to find the one that matches the selected city. You could also do it the other way around: let the user enter the postal code first and then fill in the city information automatically.
We need to make a few changes to the Logstash configuration to get this working. Let's start with a simple addition inside the filter. Add the following configuration to opengeodb.conf right after step 5 and before step 6.
# Step 5 and a half
# create a prefix completion field data structure
# input can be any of the zips or the name field
# weight is the population, so big cities are preferred when the city name is entered
mutate {
  add_field => [ "[suggest][input]", "%{name}" ]
  add_field => [ "[suggest][output]", "%{name}" ]
  add_field => [ "[suggest][payload][name]", "%{name}" ]
  add_field => [ "[suggest][weight]", "%{population}" ]
}

# add all the zips to the input as well
mutate {
  merge => [ "[suggest][input]", zip ]
  convert => [ "[suggest][weight]", "integer" ]
}

# ruby filter to put an array into the event
ruby {
  code => 'event["[suggest][payload][data]"] = event["zip"]'
}
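The structure this configuration is meant to produce can be sketched in plain Ruby. The `build_suggest` helper and the sample event are hypothetical, and the exact payload Logstash writes may differ in detail (in the sample response further down, the payload list also begins with the city name).

```ruby
# Hypothetical helper: builds the "suggest" field for one event.
# Inputs are the city name plus all postal codes; weight is the population.
def build_suggest(event)
  zips = Array(event['zip']) # accepts a single string, an array, or nil
  {
    'input'   => [event['name']] + zips,
    'output'  => event['name'],
    'payload' => { 'name' => event['name'], 'data' => zips },
    'weight'  => event['population'].to_i
  }
end

# Hypothetical sample event.
event = { 'name' => 'Sampletown', 'zip' => %w[11111 22222], 'population' => '12345' }
p build_suggest(event)
```

Because both the name and every postal code are listed as inputs, typing either a city prefix or a postal code prefix can lead to the same entry.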
The next time the data is indexed, Logstash will write it in a structure that is compatible with the completion suggester. However, Elasticsearch also needs a field mapping so that the suggest feature is actually set up for this field. You therefore have to specify a template explicitly in the output > elasticsearch section of the Logstash configuration.
# change the output to this in order to include an index template
output {
  elasticsearch {
    host => 'localhost'
    index => 'opengeodb'
    index_type => "locality"
    flush_size => 1000
    protocol => 'http'
    template_name => 'opengeodb'
    template => '/path/to/opengeodb-template.json'
  }
}
This template is very similar to the default Logstash template, but it adds the suggest and geo_point fields.
{
  "template" : "opengeodb",
  "settings" : {
    "index.refresh_interval" : "5s"
  },
  "mappings" : {
    "_default_" : {
      "_all" : {"enabled" : true},
      "dynamic_templates" : [ {
        "string_fields" : {
          "match" : "*",
          "match_mapping_type" : "string",
          "mapping" : {
            "type" : "string", "index" : "analyzed", "omit_norms" : true,
            "fields" : {
              "raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" : 256}
            }
          }
        }
      } ],
      "properties" : {
        "@version": { "type": "string", "index": "not_analyzed" },
        "location" : { "type" : "geo_point" },
        "suggest" : { "type": "completion", "payloads" : true, "analyzer" : "whitespace" }
      }
    }
  }
}
Now delete the old data (including the index) and start indexing again:
curl -X DELETE localhost:9200/opengeodb
cat DE.tab | logstash-1.4.0/bin/logstash -f opengeodb.conf
Now we can send a request to the suggester:
curl -X GET 'localhost:9200/opengeodb/_suggest?pretty' -d '{
  "places" : {
    "text" : "B",
    "completion" : {
      "field" : "suggest"
    }
  }
}'
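In an application, the same request body would be built from whatever the user has typed so far. The Ruby sketch below shows one way to do that; `suggest_body` and `fetch_suggestions` are hypothetical helpers, and sending the body with POST instead of GET is an assumption about the endpoint rather than something shown in the curl call above.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper: request body for the "places" suggestion used above.
def suggest_body(prefix, field = 'suggest')
  { 'places' => { 'text' => prefix, 'completion' => { 'field' => field } } }
end

# Hypothetical helper: sends the body to the _suggest endpoint of the index.
# Assumes Elasticsearch listens on localhost:9200 and accepts POST here.
def fetch_suggestions(prefix, base = 'http://localhost:9200/opengeodb')
  uri = URI("#{base}/_suggest")
  response = Net::HTTP.post(uri, JSON.generate(suggest_body(prefix)),
                            'Content-Type' => 'application/json')
  JSON.parse(response.body)
end

# Only the body is printed here; fetch_suggestions needs a running cluster.
puts JSON.generate(suggest_body('B'))
```

Keeping the body construction separate from the HTTP call makes it easy to check the request without a running Elasticsearch instance.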
The result looks like this:
{ "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "places" : [ { "text" : "B", "offset" : 0, "length" : 1, "options" : [ { "text" : "Berlin", "score" : 3431675.0, "payload" : {"data":["Berlin","10115","10117","10119","10178","10179","10243","10245","10247","10249","10315","10317","10318","10319","10365","10367","10369","10405","10407","10409","10435","10437","10439","10551","10553","10555","10557","10559","10585","10587","10589","10623","10625","10627","10629","10707","10709","10711","10713","10715","10717","10719","10777","10779","10781","10783","10785","10787","10789","10823","10825","10827","10829","10961","10963","10965","10967","10969","10997","10999","12043","12045","12047","12049","12051","12053","12055","12057","12059","12099","12101","12103","12105","12107","12109","12157","12159","12161","12163","12165","12167","12169","12203","12205","12207","12209","12247","12249","12277","12279","12305","12307","12309","12347","12349","12351","12353","12355","12357","12359","12435","12437","12439","12459","12487","12489","12524","12526","12527","12529","12555","12557","12559","12587","12589","12619","12621","12623","12627","12629","12679","12681","12683","12685","12687","12689","13051","13053","13055","13057","13059","13086","13088","13089","13125","13127","13129","13156","13158","13159","13187","13189","13347","13349","13351","13353","13355","13357","13359","13403","13405","13407","13409","13435","13437","13439","13442","13465","13467","13469","13503","13505","13507","13509","13581","13583","13585","13587","13589","13591","13593","13595","13597","13599","13627","13629","14050","14052","14053","14055","14057","14059","14089","14109","14129","14163","14165","14167","14169","14193","14195","14197","14199"]} }, { "text" : "Bremen", "score" : 545932.0, "payload" : 
{"data":["Bremen","28195","28203","28205","28207","28209","28211","28213","28215","28217","28219","28237","28239","28307","28309","28325","28327","28329","28355","28357","28359","28717","28719","28755","28757","28759","28777","28779","28197","28199","28201","28259","28277","28279"]} }, { "text" : "Bochum", "score" : 388179.0, "payload" : {"data":["Bochum","44787","44789","44791","44793","44795","44797","44799","44801","44803","44805","44807","44809","44866","44867","44869","44879","44892","44894"]} }, { "text" : "Bielefeld", "score" : 328012.0, "payload" : {"data":["Bielefeld","33602","33604","33605","33607","33609","33611","33613","33615","33617","33619","33647","33649","33659","33689","33699","33719","33729","33739"]} }, { "text" : "Bonn", "score" : 311938.0, "payload" : {"data":["Bonn","53111","53113","53115","53117","53119","53121","53123","53125","53127","53129","53173","53175","53177","53179","53225","53227","53229"]} } ] } ] }
As you may have noticed, using the city's population as the weight makes sense: large cities are ranked ahead of small ones. The returned result contains the name of the city and all of its postal codes, which can be used to fill in a form automatically (especially when only one postal code is found).
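How a client might turn such a response into form values can be sketched as follows. The `extract_places` and `prefill` helpers are hypothetical, and the sample response is an invented, shortened one in the same shape as the result above. The sketch removes the city name from the payload list because the sample result lists it there as the first element.

```ruby
require 'json'

# Hypothetical helper: city name and postal codes for each returned option.
def extract_places(response, suggestion = 'places')
  options = response.fetch(suggestion, []).flat_map { |entry| entry.fetch('options', []) }
  options.map do |opt|
    name = opt['text']
    data = Array(opt.dig('payload', 'data'))
    { 'city' => name, 'zips' => data - [name] }
  end
end

# Hypothetical helper: values to prefill; the postal code is only set
# when it is unambiguous.
def prefill(place)
  form = { 'city' => place['city'] }
  form['zip'] = place['zips'].first if place['zips'].size == 1
  form
end

# Invented, shortened response in the shape shown above.
raw = '{"places":[{"text":"S","options":[
  {"text":"Sampletown","score":12345.0,"payload":{"data":["Sampletown","11111"]}},
  {"text":"Samplecity","score":999.0,"payload":{"data":["Samplecity","22222","33333"]}}]}]}'

extract_places(JSON.parse(raw)).each { |place| p prefill(place) }
```

For a city with several postal codes, the list in `zips` could instead be offered to the user as a dropdown.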
That's all for today! Keep in mind, though, that this approach is not limited to public datasets. Somewhere deep inside your company, someone has almost certainly already collected useful data that is just waiting to be enriched and used in your application. Ask your colleagues. Every company has datasets like this.
From the translator
This is my first translation, so I thank in advance everyone who helps me improve it and points out my mistakes. Thank you.