A little while ago the following task came up.
- An information system (IS) has a database (DB) of actors, musicians, DJs, as well as politicians, scientists, and other celebrities.
- The IS was filled in by amateurs and enthusiasts, many of whom found it hard to render a first or last name correctly in Russian. And sometimes a name sounds quite different in Russian from the way it is written in the original language.
- Names were entered into the database in a fairly free form. The word order of a name often varied (especially for Asian names), and nicknames and pseudonyms might or might not be included.
- There were also simple mistakes: substituted, omitted, or extra letters. Sometimes, for whatever reason, some of the letters were typed in the Latin alphabet.
- Fortunately, there were no random garbage characters.
- The names were stored in a MyISAM MySQL table in UTF-8 encoding.
- The total number of names in the database was about 138.5 thousand.
That was the "given". And here is what was "required":
Scan the database automatically, identify pairs of names that are similar by some criterion, and pass those pairs on for manual analysis to decide whether they belong to the same person or to different people. The search also had to be somewhat paranoid: unlike a "Perhaps you meant..." hint, it was supposed to flag too many names rather than too few, on the principle "better to overdo it than to miss something". So far more pairs had to be analyzed for similarity. And all of this had to run, if not in real time, then in a genuinely limited time: not a day or a week but, in the worst case, a few hours.
In short: almost 150 thousand names, nicknames, and pseudonyms, not split into separate fields (last name, first name, middle name, and so on), with arbitrary word order, typos, errors, and variant spellings. On top of that, the names come from many nations: Russian, English, Jewish, Polish, French, Korean, Chinese, Japanese, Indian, Native American, and so on. Among them we need to find every pair that strongly resembles one and the same name.
At first the task seemed very hard. Being new to fuzzy search, I found it difficult even to imagine formal criteria for judging whether two names are similar or different.
Gradually, though, a picture of the approaches and algorithms used for this purpose began to form. First of all, these are the algorithms that are especially successful, or simply work well, at finding misspelled words and names:
- phonetic coding (in particular Pyotr Kankovsky's "Russian Metaphone" and its modifications [1]);
- similarity measures based on the Levenshtein edit distance [2];
- extraction of common forms (common substrings) [3];
- the Jaro-Winkler algorithm [4];
- N-gram analysis [5].
Common-sense reasoning led to the following conclusions.
Phonetic coding does not fit the way the data got into this database. Phonetic search finds names that were misspelled by ear, not names that were mistranslated by amateur enthusiasts. Let me explain with an example.
Suppose Captain 3rd Rank Jeannette Devereaux [6] comes to our local passport office and, as regulations require, introduces herself clearly and loudly. The passport clerk may still write her surname down by ear as "Diver", "Devir", or even "Divir", and may additionally get the doubled consonant in her first name wrong. All of these "auditory hallucinations" are tracked perfectly by fuzzy search algorithms based on phonetic coding.
But a would-be "translator" who does not know the finer points of French (yet is familiar with English), copying the data from her military ID, can easily write "Zhennett DeVerox". A particularly gifted one may even work the call sign into the name.
Or another case: even many English-speaking "translators" do not know that the name "Ralph Fiennes" is pronounced "Rafe Fiennes", not "Ralph Fiennes".
So in our case the misspelled names are written not "as heard" but "as seen", and phonetic coding is therefore unsuitable.
Applying a phonetic algorithm to names that were perceived by eye rather than by ear can even make the phonetic indexes of two names differ more than their spellings do.
So should we use edit distance?
It would be a good idea, were it not for the fact that these metrics are designed specifically for comparing words, not phrases with arbitrary word order. Swap the first and last name, or insert a nickname between them, and they produce very unsatisfactory results.
In that case the name would have to be split into individual words, and the words compared with each other. Even if each word were first reduced to its phonetic index, this would make the algorithm considerably more complicated (and slower). The fact that the number of words in a name varies complicates things further. And once omitted and reordered words are taken into account, the complexity of the process becomes hard even to imagine.
Other methods
The drawbacks of the common-forms metric are that it is unclear how to index for it, especially for short names, and that computing the relevance has a very high time complexity. Even a single extraction of the longest common substring costs O(nm). And because the search must then continue recursively over the remaining common substrings, computing the relevance function is a very expensive operation, and it loses to N-gram comparison.
Again, when words change places, the relevance drops dramatically.
The Jaro-Winkler distance is much faster to compute. Unfortunately, it is not robust to word reordering either.
So, to avoid splitting names into words and recombining them, I willy-nilly had to settle on N-gram analysis. Specifically, on trigrams, because short Korean names do not lend themselves to extracting longer N-grams.
Stage 1. The head-on attack.
To get a feel for the time complexity of trigram comparison, I first wrote the simplest possible version of the script, which compares all names in the database pairwise.
The script was written and tested at home, on an 8-year-old computer with a single-core Intel Pentium D 925 on an ASUS P5B-E motherboard and an ordinary, well-worn Seagate hard drive of about the same age.
The script worked, as they say, head-on. It read the database, split each previously normalized name into unique trigrams, and counted how many of them occur in each preceding name (that is, in the names already checked). If twice the number of matches, divided by the sum of the unique-trigram counts of the two names, exceeded a given threshold (for example, 0.75 or 0.8), the pair was written to a file.
Names were normalized as follows:
- Latin letters and some digits were replaced with similar-looking Russian letters;
- all lowercase letters were replaced with uppercase ones;
- the letter "Ё" was replaced with "Е", and "Ъ" with "Ь";
- all other characters (punctuation, non-printable characters, and so on) and sequences of them were replaced with two spaces;
- the name was framed with double spaces.
The result of this normalization is a simplified name, free of punctuation and transliterated characters, in which every word is framed by double spaces and thereby separated from the other words.
Using double spaces between words and at the beginning and end of the name makes the trigram analysis completely insensitive to changes in word order.
From the normalized name, all unique trigrams are extracted: combinations of three consecutive characters (spaces included).
The relevance coefficient of two names was then computed as twice the number of common trigrams divided by the sum of the unique-trigram counts of the two names.
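
The normalization and the relevance coefficient described above can be sketched as follows. This is a minimal illustration, not the author's script: to stay ASCII-only it keeps just upper-case Latin letters instead of mapping everything to Cyrillic, and the function names `normalize_name`, `unique_trigrams`, and `relevance` are made up for the example.

```php
<?php
// Simplified normalization (assumption: Latin letters only, for illustration).
function normalize_name(string $name): string
{
    // Upper-case, then turn every run of non-letters into a double space.
    $s = preg_replace('/[^A-Z]+/', '  ', strtoupper($name));
    // Frame the whole name with double spaces.
    return '  ' . trim($s) . '  ';
}

// Set of unique trigrams of a string, stored as array keys.
function unique_trigrams(string $s): array
{
    $set = [];
    for ($k = 0; $k + 3 <= strlen($s); $k++) {
        $set[substr($s, $k, 3)] = true;
    }
    return $set;
}

// Relevance: 2 * |common trigrams| / (|A| + |B|), over unique trigrams.
function relevance(string $a, string $b): float
{
    $ta = unique_trigrams(normalize_name($a));
    $tb = unique_trigrams(normalize_name($b));
    $common = count(array_intersect_key($ta, $tb));
    return 2 * $common / (count($ta) + count($tb));
}

printf("%.4f\n", relevance('John Smith', 'Smith, John'));
printf("%.4f\n", relevance('John Smith', 'Jon Smith'));
```

With double spaces around every word, "John Smith" and "Smith, John" produce exactly the same set of 13 trigrams, so their relevance is 1. "John Smith" and "Jon Smith" share 10 trigrams out of 13 and 12, which gives 20/25 = 0.8.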
The requirement that trigrams be unique is there for a reason. On the one hand, if we count all trigrams of a name without regard to uniqueness, names such as "John Smith" and "John John Smith" can be told apart with confidence, whereas with unique trigrams only, these names are considered identical.
On the other hand, first, there is hardly ever a need to tell such names apart. Second, computing the relevance over all trigrams, not just the unique ones, complicates the procedure considerably: instead of doubling the number of common trigrams, we would have to add up the occurrences of the first name's trigrams in the second name and of the second name's trigrams in the first, which roughly doubles the time spent computing the relevance. And third, some optimizations of the relevance computation, described later, could no longer be applied.
So the head-on algorithm was written and tested on the first few thousand names. The result was depressing. As expected, the time needed to search the whole array of names for matches showed a pronounced quadratic dependence on its size. The forecast for processing the entire array of 130 to 140 thousand names was about 2 years. Way too much!
But at least there was now a baseline estimate of the algorithm's performance.
Stage 2. Looking for ways to optimize.
First I had to work out which parts of the algorithm needed optimizing and where the notorious "bottlenecks" were that slowed down the whole system. There were several directions for optimization.
- The initial selection from the database should not fetch the whole array of names. To reduce the amount of all subsequent work, it should fetch only the names that are at least slightly similar to the name being checked. Timing the various sections of the code showed that the database queries were the "bottleneck". Both the execution time of each query and the number of candidate names returned by each selection had to be reduced.
- Substring comparison and copying on strings in the multi-byte UTF-8 encoding are several times slower than the same operations on strings in a single-byte encoding.
- When the share of common trigrams is small, there is no need to compute the relevance coefficient exactly. The computation can be aborted as soon as it becomes clear that the required threshold cannot be exceeded even if all the remaining trigrams turn out to be common.
- Other details.
Stage 3. Implementation.
The first and simplest thing I tried was to switch the normalized names from UTF-8 to the Windows-1251 encoding. But it turned out that MySQL is very jealous about storing strings in different encodings within one database, even in different tables: errors that could not be tracked down kept occurring while working with the database. So the normalized names are stored in UTF-8 in an additional, attached table and are transcoded to Windows-1251 when needed.
Storing the normalized names saves a lot of time: normalization has to be performed only once per record.
The number of unique trigrams of each normalized name is stored in the same attached table. This value makes it possible to cut the relevance computation short when the names are obviously too different. It also means that the number of unique trigrams of a name no longer has to be computed every time.
And, most importantly, the processed names are indexed by trigram. That is, an additional table lists all unique trigrams present in the normalized names. To avoid storing them in the "slow" UTF-8 encoding, the trigrams themselves are stored as integer hashes: the three low-order bytes of the number are the numeric values of the corresponding characters in the Windows-1251 encoding.
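
A sketch of such an integer trigram hash is shown below. It assumes the name is already in a single-byte encoding such as Windows-1251, so that every character is one byte. The byte order (first character in the most significant of the three bytes) and the function names are assumptions made for illustration, since the text only says that the three low-order bytes hold the character values.

```php
<?php
// Pack a 3-byte trigram into the three low-order bytes of an integer.
// Assumption: first character goes into the most significant of the 3 bytes.
function trigram_code(string $tri): int
{
    return (ord($tri[0]) << 16) | (ord($tri[1]) << 8) | ord($tri[2]);
}

// Integer codes of all unique trigrams of a single-byte-encoded name.
function trigram_codes(string $name): array
{
    $codes = [];
    for ($k = 0; $k + 3 <= strlen($name); $k++) {
        $code = trigram_code(substr($name, $k, 3));
        $codes[$code] = $code; // using the code as key keeps the set unique
    }
    return $codes;
}

printf("0x%06X\n", trigram_code('ABC'));
```

An integer column of this kind can be indexed and compared without any character-set handling, which is the point of the optimization.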
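At the same time, however, I took an approach that soon proved completely ineffective. For each trigram of the name being checked, a separate selection of candidate names was made, and the selections were then merged into a single array. Each name in that array carried a counter: the number of selections in which it had occurred. The assumption was that this would pay off handsomely when computing the relevance, because the candidate names would not have to be split into trigrams: the counter alone was enough to compute the relevance right away.
Unfortunately, the multiple selections from the MySQL database (to a greater extent) and the building of the merged array of names (to a lesser extent) produced overhead many times larger than the gain from the simplified relevance computation.
Still, the optimizations described above cut the running time considerably.
Using the trigram index for the initial selection reduced the forecast from 2 years to under 3 months. Using numeric values instead of UTF-8 substrings shortened the overall processing time by a further 10 to 12%. But 2.5 months of continuous operation still seemed too long. The initial selection of similar names remained the "bottleneck".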
Stage 4. Further improvements.
Going back to the database once per trigram is a bad idea. Not only do the many selections take a long time, but the merged array is also very large and takes considerable time to process, despite the common-trigram counter attached to each element.
I had to give up the counter idea and instead fetch the whole initial set of candidate names in a single query, using the SQL WHERE ... IN construct. This cut the forecast to one month.
Next, the relevance computation itself was changed. From the lower bound on relevance and the numbers of unique trigrams in the two names, the maximum allowed number of "misses" is computed, that is, the number of trigrams of the candidate name that do not occur in the name being checked. Then every trigram of the candidate name is checked for presence in the name being checked. As soon as the number of "misses" exceeds the allowed maximum, the comparison stops. This shaved a few more percent off the forecast, but the result was still far from anything practical.
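
The early-exit comparison can be sketched like this. The miss budget follows from the relevance formula: with n and m unique trigrams in the checked and candidate names, the relevance is 2(m - misses)/(n + m), so it can reach the threshold R only while misses <= m - R(n + m)/2. The function name `bounded_relevance` and the representation of trigram sets as array keys are assumptions for this example.

```php
<?php
// $checked and $candidate are sets of unique trigrams (trigram => true).
// Returns the relevance, or null as soon as it cannot reach $minRelevance.
function bounded_relevance(array $checked, array $candidate, float $minRelevance): ?float
{
    $n = count($checked);
    $m = count($candidate);
    // 2 * (m - misses) / (n + m) >= R  <=>  misses <= m - R * (n + m) / 2
    $missesLeft = (int) floor($m - $minRelevance * ($n + $m) / 2);
    if ($missesLeft < 0) {
        return null; // the candidate is too short to ever reach the threshold
    }
    $common = 0;
    foreach ($candidate as $trigram => $unused) {
        if (isset($checked[$trigram])) {
            $common++;
        } elseif (--$missesLeft < 0) {
            return null; // too many misses: stop comparing
        }
    }
    return 2 * $common / ($n + $m);
}

$a = array_fill_keys(['a', 'b', 'c', 'd'], true);
$b = array_fill_keys(['a', 'b', 'c', 'e'], true);
var_dump(bounded_relevance($a, $b, 0.75));
```

The saving comes from dissimilar candidates: most of them exhaust the miss budget after a handful of trigrams, long before the whole name has been scanned.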
Stage 5. The solution.
The task seemed unsolvable. On the one hand, the number of names in the initial selection had to be cut drastically. On the other hand, the cut must not be arbitrary, so that names close to the lower bound of acceptable relevance are not discarded by mistake.
The approach was found simply by looking at the pairs of similar names that had already been found. Almost all such pairs shared a common substring 6 to 7 characters long (including the adjacent double spaces). Simple reasoning (omitted here, as it is off topic) shows that for the relevance to exceed 0.75, the normalized names must share a common substring of at least 5 characters.
So we move from indexing the preliminary search by trigrams to indexing it by pentagrams (5-character substrings). To make the hash of a pentagram fit into an integer variable, which has 32 bits in the 32-bit version of PHP, each character is represented by 5 bits, since there are 31 letters (all except "Ё" and "Ъ") plus the space.
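
The 5-bit packing can be illustrated with a rolling hash over the normalized name. This sketch uses the Latin alphabet (space = 0, 'A'..'Z' = 1..26) purely for readability, whereas the article's alphabet is the 31 Cyrillic letters. The function names are hypothetical.

```php
<?php
// 5-bit code per character: space = 0, 'A'..'Z' = 1..26.
// (Assumption: Latin letters for illustration only.)
function char_code(string $ch): int
{
    return $ch === ' ' ? 0 : ord($ch) - ord('A') + 1;
}

// Integer codes of all unique 5-character substrings of a normalized name.
function pentagram_codes(string $name): array
{
    $codes = [];
    $code = 0;
    for ($j = 0; $j < strlen($name); $j++) {
        // Shift in the next character and keep only the last 5 of them (25 bits).
        $code = (($code << 5) | char_code($name[$j])) & 0x1FFFFFF;
        if ($j >= 4) {
            $codes[$code] = $code;
        }
    }
    return $codes;
}

print_r(array_values(pentagram_codes('  AB  ')));
```

Five characters at 5 bits each take 25 bits, so every code fits comfortably into a signed 32-bit integer, and each new pentagram costs one shift, one OR, and one mask.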
One more small addition.
As described above, the relevance is computed by checking the unique trigrams of one name against the trigrams of the second name, and the computation stops once the maximum allowed number of "misses" is exceeded.
If the first name has far fewer unique trigrams than the second, the computed maximum number of "misses" can be negative. This means that the two names can never be similar at the given relevance threshold, and there is no need to check them at all.
But this trick does not work in the opposite case, when the first name has far more unique trigrams than the second. So bounds on the number of unique trigrams of the normalized name were added to the query.
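
The bounds follow from the best case in which every trigram of the shorter name is common: the relevance is then at most 2 min(n, m) / (n + m), and requiring this to be at least R gives n R / (2 - R) <= m <= n (2 - R) / R. A small sketch (the function name `trigram_count_bounds` is made up for the example):

```php
<?php
// Range of candidate unique-trigram counts that can still reach
// $minRelevance against a name with $n unique trigrams.
function trigram_count_bounds(int $n, float $minRelevance): array
{
    // Best case: relevance <= 2 * min(n, m) / (n + m); solve for m.
    $ratio = $minRelevance / (2 - $minRelevance);
    return [(int) ceil($n * $ratio), (int) floor($n / $ratio)];
}

list($nmin, $nmax) = trigram_count_bounds(13, 0.8);
echo "$nmin..$nmax\n";
```

For a name with 13 unique trigrams and a threshold of 0.8 this yields the range 9..19: a candidate with 8 trigrams can reach at most 16/21, about 0.76, and one with 20 trigrams at most 26/33, about 0.79.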
This is a somewhat debatable improvement. With a relevance threshold of 0.75 it brings no speedup; on the contrary, the more complex selection costs about 3% in performance. With the threshold set to 0.8, however, it already gives a 3% gain.
Nevertheless, the technique was not abandoned. The loss is not that large, and it makes sense to run a preliminary search for similar pairs with high relevance first, and only after that preliminary cleanup to set the threshold to 0.75.
Results.
Moving the initial selection from common trigrams to pentagrams sped the script up dramatically. Finding the similar pairs with a relevance of 0.8 among 138.5 thousand names took about 5 hours instead of a month. That is, a single search over the whole database for names similar to a given one takes about 0.26 seconds (taking into account the quadratic dependence of the processing time). That is still rather a lot, but keep in mind that this was a test run not on a server with a powerful processor and a high-performance disk subsystem but on an 8-year-old home computer in a rather tired state.
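
One way to arrive at the 0.26-second figure is shown below. It rests on an assumption about what "taking into account the quadratic dependence" means: during the run each name is compared only with the names before it, on average with half of the database, so a search over the full database should cost about twice the average per-name time.

```php
<?php
$names = 138500;           // names in the database
$totalSeconds = 5 * 3600;  // the whole run, about 5 hours

// Assumption: a full-database search costs twice the average per-name
// time of the run, because the run compares each name with half of
// the database on average.
$perFullSearch = 2 * $totalSeconds / $names;

printf("%.2f s\n", $perFullSearch);
```

Under this reading the arithmetic gives 36000 / 138500, about 0.26 seconds per search.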
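In principle, the initial search can also be run on hexagrams (indexed substrings of 6 characters). This gave a further speedup of several times. But there were well-founded fears that many pairs of short Asian names (and not only those) would be excluded. And there was no particular point in it anyway: the subsequent manual analysis of the resulting pairs (there were more than 11 thousand of them) would take months in any case.
But where excessive paranoia is not needed and, on the contrary, the goal is to find the smallest number of best-matching names, hexagram indexing is very well suited, for example for hints such as "Perhaps you meant...".
In principle, for long names one could even index by 7-character substrings, and, to fit them into a 32-bit variable, run a preliminary phonetic coding that reduces the alphabet to 15 letters plus the space. But no such task was on the table at the time.
Unfortunately, right after the script was written, the site was blocked for copyright infringement. It quickly moved to another address, but so many problems had piled up that there was no time left for searching for duplicate names.
Algorithm fragments:
DB table structure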

    CREATE TABLE IF NOT EXISTS `prs_persons` (
      `id` int(10) unsigned NOT NULL,
      `name_ru` varchar(255) collate utf8_unicode_ci default NULL, -- name as entered
      PRIMARY KEY (`id`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

    CREATE TABLE IF NOT EXISTS `prs_normal` (
      `id` int(10) unsigned NOT NULL,
      `name` varchar(255) collate utf8_unicode_ci default NULL, -- normalized name
      `num` int(2) unsigned NOT NULL,                           -- number of unique trigrams
      PRIMARY KEY (`id`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

    CREATE TABLE IF NOT EXISTS `prs_5gramms` (
      `code` int(10) unsigned NOT NULL, -- pentagram hash
      `id` int(10) unsigned NOT NULL,   -- id of the name containing it
      KEY `code` (`code`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

Constants, auxiliary variables, and preliminary actions:
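
    $opt_debug_show_sql = 0;
    $opt_debug_mode = 0;
    $opt_table_prefix = "prs";

    define('MIN_RELEVANT', 0.8);   // relevance threshold
    define('SITE_PREFIX', "…");

    // Ratio of unique-trigram counts beyond which two names
    // cannot reach MIN_RELEVANT even in the best case.
    $len_ratio = MIN_RELEVANT / (2 - MIN_RELEVANT);

    // Characters accepted in a name and, position by position,
    // their upper-case Windows-1251 replacements.
    $suitablesin = iconv("Windows-1251", "UTF-8", "AaBbCcEegHKkMmnOoPprTuXxYy03468");
    $suitablesout = "";

    // 5-bit codes: the space and the 31 letters (all except Ё and Ъ).
    $charcode = array(
        ' ' => 0,
        'А' => 1,  'Б' => 2,  'В' => 3,  'Г' => 4,  'Д' => 5,  'Е' => 6,  'Ж' => 7,  'З' => 8,
        'И' => 9,  'Й' => 10, 'К' => 11, 'Л' => 12, 'М' => 13, 'Н' => 14, 'О' => 15, 'П' => 16,
        'Р' => 17, 'С' => 18, 'Т' => 19, 'У' => 20, 'Ф' => 21, 'Х' => 22, 'Ц' => 23, 'Ч' => 24,
        'Ш' => 25, 'Щ' => 26, 'Ы' => 27, 'Ь' => 28, 'Э' => 29, 'Ю' => 30, 'Я' => 31);

    // The run is long: lift the execution time limit.
    set_time_limit(0);

    // Report a MySQL error and stop.
    function message_die($errno, $error, $file, $line)
    {
        if ($errno) {
            print "<p><b>Error " . $errno . " " . $file . "(" . $line . "):</b> " . $error;
            die();
        }
    }

    $fout = fopen("…", "w");
    // The main loop over the names of the database goes here.
    fclose($fout);
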
Body of the main comparison loop:

    // Input data:
    $id = …;
    $name_ru = …;   // ID and name (UTF-8) of the person being checked

    // Normalize the name.
    $rusn = "  ";
    $ischar = FALSE;
    for ($j = 0; $j < mb_strlen($name_ru, "UTF-8"); $j++) {
        $char = mb_substr($name_ru, $j, 1, "UTF-8");
        if (($pos = mb_strpos($suitablesin, $char, 0, "UTF-8")) === FALSE) {
            if ($ischar) {   // a word has just ended: add the separator
                $rusn .= "  ";
                $ischar = FALSE;
            }
        } else {
            // Take the Windows-1251 replacement character.
            $rusn .= $suitablesout[$pos];
            $ischar = TRUE;
        }
    }
    if ($ischar) $rusn .= "  ";          // closing double space
    if (strlen($rusn) < 5) continue;     // too short to contain a pentagram
    $norm = iconv("Windows-1251", "UTF-8", $rusn);   // UTF-8 copy for the database

    // Unique pentagrams, 5 bits per character. The two leading spaces
    // have code 0, so the hash can start from the third character.
    $subgramms = array();
    $code = ($charcode[$rusn[2]] << 5) | $charcode[$rusn[3]];
    for ($j = 4; $j < strlen($rusn); $j++) {
        $code = (($code << 5) | $charcode[$rusn[$j]]) & 0x1FFFFFF;
        $subgramms[$code] = $code;   // the key keeps the set unique
    }

    // Unique trigrams.
    $trigramms = array();
    for ($k = 0; $k < strlen($rusn) - 2; $k++)
        $trigramms[$trigramm = substr($rusn, $k, 3)] = $trigramm;
    $n = count($trigramms);

    // Bounds on the candidate's number of unique trigrams:
    $nmin = ceil($n * $len_ratio);
    $nmax = floor($n / $len_ratio);

    // Select the candidates sharing at least one pentagram.
    $similars = fquery("SELECT n.id AS id, n.name AS name, n.num AS num
                        FROM ^@5gramms AS g, ^@normal AS n
                        WHERE g.code IN (^N) AND n.id = g.id AND n.num >= ^N AND n.num <= ^N",
                       $subgramms, $nmin, $nmax);

    // Store the normalized name.
    fquery("INSERT INTO ^@normal (id, name, num) VALUES (^N, ^S, ^N)", $id, $norm, $n);

    // Store its pentagrams.
    foreach ($subgramms as $key => $code)
        fquery("INSERT INTO ^@5gramms (code, id) VALUES (^N, ^N)", $code, $id);
    unset($subgramms);

    // Compare with every candidate:
    for ($i = 0; $i < @mysql_num_rows($similars); $i++) {
        $similar = @mysql_fetch_assoc($similars)
            OR message_die(@mysql_errno(), @mysql_error(), __FILE__, __LINE__);
        $name = iconv("UTF-8", "Windows-1251", $similar['name']);
        $simid = $similar['id'];
        $m = $similar['num'];   // unique trigrams of the candidate
        $nm = 0;                // common trigrams found so far
        $miss = floor($m - MIN_RELEVANT * ($n + $m) / 2);   // allowed "misses"
        for ($k = 0; ($k < strlen($name) - 2) && ($miss >= 0); $k++) {
            if (strpos($name, $trigramm = substr($name, $k, 3), 0) == $k) {
                // First occurrence of this trigram (i.e. a unique one):
                if (isset($trigramms[$trigramm])) $nm++;   // hit
                else $miss--;                              // miss
            }
        }
        if ($miss >= 0)
            // The threshold is reached: write the pair to the file.
            fwrite($fout, SITE_PREFIX . $id . "\t" . $rusn . "\t" . SITE_PREFIX . $simid . "\t" . $name . "\t" . ($nm + $nm) / ($m + $n) . "\r\n");
    }
    @mysql_free_result($similars) OR message_die(@mysql_errno(), @mysql_error(), __FILE__, __LINE__);
    unset($trigramms);

    // After the main loop:
    fclose($fout);

Function for formatted MySQL queries [7]

    // Syntax: fquery($query_template_text, $argument's_1_value, $argument's_2_value, ...)
    // Special characters for a query template:
    // ^@TableName - indicates that the combination ^@ is to be replaced with table prefix
    // ^N - numeric parameter(s) (is not to be quoted), separated by comma if is array
    // ^S - string parameter(s) (is to be quoted), separated by comma if is array
    // ^0 - "NULL" or "NOT NULL"
    // (c) Grigoryev Andrey aka GrAnd aka Pochemuk
    // Thanks to Kamnev Artjom (Kamnium), Mesilov Maxim (Severus) for idea
    // http://life.screenshots.ru
    // When query successed returns Recordset for SELECT or True for others.
    // When error occurs returns False.
    function fquery()
    {
        global $opt_debug_mode;
        global $opt_debug_show_sql;
        global $opt_table_prefix; // getting prefix from the site's options

        if (is_array(func_get_arg(0))) $args = func_get_arg(0);
        else $args = func_get_args();
        $qtext = $args[0]; // the first argument is always query template text
        if (empty($qtext)) return false; // Hmm, nothing to do!

        // replacing with table prefixes
        $qtext = str_replace("^@", ($opt_table_prefix == "") ? "" : ($opt_table_prefix . '_'), $qtext);

        $i = 0;
        $curArg = 1;
        $query = "";
        while ($i < strlen($qtext)) // strlen is always up-to-date, even if special chars are replaced
        {
            if ($qtext[$i] == '^')
            {
                if ($curArg >= count($args)) return false; // too many parameters in the query template!
                $i++;
                switch ($qtext[$i])
                {
                    case 'N':
                        if (is_null($args[$curArg])) { $query .= "NULL"; break; }
                        if (is_array($args[$curArg])) $query .= implode(", ", $args[$curArg]);
                        else $query .= $args[$curArg];
                        break;
                    case 'S':
                        if (is_null($args[$curArg])) { $query .= "NULL"; break; }
                        if (is_array($args[$curArg])) $query .= "'" . implode("', '", $args[$curArg]) . "'";
                        else $query .= "'" . $args[$curArg] . "'";
                        break;
                    case '0':
                        if (is_null($args[$curArg])) return false; // incorrect parameter, nulls are not allowed!
                        $args[$curArg] = strtoupper($args[$curArg]);
                        if (($args[$curArg] != "NULL") && ($args[$curArg] != "NOT NULL"))
                            return false; // incorrect parameter, "NULL" or "NOT NULL" only!
                        $query .= $args[$curArg];
                        break;
                    default:
                        $query .= $qtext[$i];
                }
                $i++;
                $curArg++;
            }
            else $query .= $qtext[$i++];
        }

        if ($opt_debug_show_sql == 1)
            print('<P><CODE>Query string: "' . $query . '"</CODE></P>' . "\r\n");

        $ResultData = mysql_query($query);
        if (mysql_errno() <> 0)
        {
            if ($opt_debug_mode == 1)
            {
                print('<P><CODE>MySQL error: #' . mysql_errno() . ': ' . mysql_error() . ' ');
                print('Query string: ' . $query . '</CODE></P>');
            }
        }
        return($ResultData);
    } // End of fquery

Results of a test run on the first 1000 records:
Output of the run
ADDRESS-1 | NAME-1 | ADDRESS-2 | NAME-2 | Relevance |
---|---|---|---|---|
/人/ 784 | ããŒã¿ãŒã»ãžã§ã€ãœã³ | /人/ 389 | ããŒã¿ãŒã»ãžã£ã¯ãœã³ | 0.8125 |
/人/ 1216 | ãã£ãŒã«ãºã»ãã³ã¹ | /人/ 664 | ãã£ãŒã«ãºããã¹ | 0.8 |
/人/ 1662 | ã¹ãã¥ã¢ãŒãFãŠã£ã«ãœã³ | /人/ 1251 | ã¹ãã¥ã¢ãŒããŠã£ã«ãœã³ | 0.914285714286 |
/人/ 1798 | ãã€ã±ã«ã»ãã³ | /人/ 583 | ãã€ã±ã«ã»ã¬ãã³ | 0.846153846154 |
/人/ 2062 | ãã€ã±ã«ã»ããŒã³ | /人/ 265 | ãã€ã±ã«ãã³ | 0.8 |
/人/ 2557 | ãžãŒãã»ãã€ãã¹ | /人/ 963 | ãžã³ã»ãã€ãã¹ | 0.8 |
/人/ 3093 | JJãžã§ã³ãœã³ | /人/ 911 | ãã³ãžã§ã³ãœã³ | 0.818181818182 |
/人/ 3262 | ãã ãšãã¬ãã | /人/ 586 | ãã ãšãã¬ããã¹ã³ãã | 0.84848484848485 |
/人/ 3329 | ãããŒãã»ãªãã㣠| /人/ 3099 | ãããŒãã»ããã㣠| 0.827586206897 |
/人/ 3585 | ãã¬ãŒã·ãŒã±ã€ãªãªã«ã | /人/ 2810 | ãã¬ãŒã·ãŒãŠã«ã | 0.857142857143 |
/人/ 3598 | ã¢ã¬ããµã³ããŒã»ã«ã«ã®ã³ | /人/ 2852 | ã¢ã¬ããµã³ããŒã»ã«ãªã£ã®ã³ | 0.85 |
/人/ 3966 | ã»ã«ã²ã€ã»ãããã | /人/ 2991 | ã»ã«ã²ã€ã»ããããML | 0.888888888889 |
/人/ 3994 | ã»ã«ã²ã€ã»ãããã | /人/ 3966 | ã»ã«ã²ã€ã»ãããã | 0.8125 |
/人/ 4049 | ãªãã£ãŒãã»ã«ã€ã¹ | /人/ 2063 | ãªãã£ãŒãã»Jã»ã«ã€ã¹ | 0.882352941176 |
/人/ 4293 | ãžã§ãªãŒã«ããã | /人/ 2006 | ãžã§ãªãŒã»ãªãŒ | 0.88 |
/人/ 4377 | ãžã§ãŒã³ã»ãã¥ãŒã¶ã㯠| /人/ 3774 | ãžã§ã³ã»ã¯ãµã㯠| 0.827586206897 |
/人/ 4396 | ãã£ãŒã³ã»ãã¯ããŒã¢ãã | /人/ 2614 | ãã£ã©ã³ã»ãã¯ããŒã¢ãã | 0.833333333333 |
/人/ 4608 | ã·ã§ãŒã³ã»ãžã§ã³ã¹ãã³ | /人/ 2036 | JJãžã§ã³ã¹ãã³ | 0.8 |
/人/ 4981 | ã¯ãªã¹ããã¡ãŒã¡ã€ | /人/ 3233 | ã¯ãªã¹ããã¡ãŒã»ããŒã¬ã€ | 0.8 |
/人/ 5019 | ãžã§ãŒã³ã»ã¢ã¬ããµã³ã㌠| /人/ 381 | ãžã§ã€ãœã³ã»ã¢ã¬ããµã³ã㌠| 0.842105263158 |
/人/ 5551 | ã«ã«ãã¹ã»ã¢ã³ãã¬ã¹ã»ãŽã¡ã¹ | /人/ 1311 | ã«ã«ãã¹ã»ãŽã¡ã¹ | 0.810810810811 |
/人/ 5781 | ã¢ã¬ãã¯ã¹ã»ãã€ããŒã¬ãŒ | /人/ 4288 | ã¢ã¬ãã¯ã¹ããã«ã¬ãŒ | 0.8 |
/人/ 5839 | ãžã§ãŒã€ã»ãã©ãã«ã¿ | /人/ 935 | ãžã§ã³ã»ãã©ãã«ã¿ | 0.8125 |
/人/ 5917 | ãžã§ãŒã»ãžã§ã³ã¹ãã³ | /人/ 2036 | JJãžã§ã³ã¹ãã³ | 0.833333333333 |
/人/ 5917 | ãžã§ãŒã»ãžã§ã³ã¹ãã³ | /人/ 4608 | ã·ã§ãŒã³ã»ãžã§ã³ã¹ãã³ | 0.8 |
/人/ 6112 | ããŒãã¹ã»ã©ã€ã¢ã³ | /人/ 4869 | ããŒãã¹ã»ãžã§ã€ã»ã©ã€ã¢ã³ | 0.823529411765 |
/人/ 6416 | ãã©ã€ã¢ã³ãžã§ãŒãž | /人/ 3942 | ãžã§ãŒãžã»ãã©ã€ã¢ã³ã | 0.84848484848485 |
/人/ 6520 | ãžã§ã³ã»ã«ãŒã㌠| /人/ 5207 | ãžã§ã³ã»ã«ã㌠| 0.8 |
/人/ 6834 | ãžã§ã³ã»Jã»ã¢ã³ããŒãœã³ | /人/ 5049 | ãžã§ãŒã»ã¢ã³ããŒãœã³ | 0.838709677419 |
/人/ 6836 | ãã€ã±ã«ã»ãšã¹ãã³ | /人/ 5056 | ãã€ã±ã«ã»ãŠã§ã¹ãã³ | 0.827586206897 |
/人/ 6837 | ããŽã£ããã»ããã³ | /人/ 5884 | ããŽã£ããã»ããã³ | 0.827586206897 |
/人/ 7261 | ããªãŒã»ã°ã¬ã€ | /人/ 1695 | ããªãŒã»ã¬ã€ | 0.8 |
/人/ 7361 | ã¢ã©ã³ãããã | /人/ 3087 | ããŽã£ããã»ã¢ã©ã³ã»ããã·ã¥ | 0.838709677419 |
/人/ 7447 | ããŽã£ããã»ãšã¢ãŒ | /人/ 2277 | TEYER DAVID | 0.814814814815 |
/人/ 7497 | ã¢ã¬ããµã³ããŒã»ã«ã©ã ãã | /人/ 3857 | ã¢ã¬ããµã³ããŒã»ã«ã«ãã | 0.8 |
/人/ 7499 | ãã³ã©ã¹ã»ã©ã¡ãªãŒ | /人/ 4424 | ãã³ã©ã¹ã»ãªãŒ | 0.827586206897 |
/人/ 7534 | ãªãã£ãŒãã»ãªãã | /人/ 3547 | ãªãã£ãŒãã»ãã« | 0.857142857143 |
/人/ 7547 | ã¹ãŽã§ãã©ãŒãã»ã¹ã¿ãªã³ã | /人/ 1985 | ã¹ãŽã§ãã©ãŒãã»ã¹ãŽã§ãã£ã³ãŽã¡ | 0.8 |
/人/ 7677 | ãžã§ã€ã¹ã»ã¢ã¬ããµã³ã㌠| /人/ 381 | ãžã§ã€ãœã³ã»ã¢ã¬ããµã³ã㌠| 0.842105263158 |
/人/ 7677 | ãžã§ã€ã¹ã»ã¢ã¬ããµã³ã㌠| /人/ 5019 | ãžã§ãŒã³ã»ã¢ã¬ããµã³ã㌠| 0.833333333333 |
/人/ 8000 | ã°ã¬ãŽãªãŒã»ã¹ãã¹ | /人/ 7628 | ã°ã¬ãŽãªãŒPã¹ãã¹ | 0.909090909091 |
/人/ 8137 | ãã£ã¹ããŒã»ã¯ãªã¹ãã³ã»ã³ | /人/ 128 | ãžã§ã¹ããŒã»ã¯ãªã¹ãã³ã»ã³ | 0.8 |
/人/ 8186 | ã·ã§ãŒã³ã»ã³ã¹ã® | /人/ 6235 | ããããã | 0.814814814815 |
/人/ 8219 | ãã©ã³ãã³ãžã§ãŒã ãºãªã«ãœã³ | /人/ 797 | ãžã§ãŒã ãºã»ãªã«ãœã³ | 0.810810810811 |
/人/ 8442 | ã¬ã³ããŒã»ãšã«ããœã³ | /人/ 7033 | ã°ã³ããŒã»ãžã§ã³ãœã³ | 0.8 |
/人/ 8458 | ãžã§ã³ã»ã¢ã¬ããµã³ã㌠| /人/ 381 | ãžã§ã€ãœã³ã»ã¢ã¬ããµã³ã㌠| 0.810810810811 |
/人/ 8458 | ãžã§ã³ã»ã¢ã¬ããµã³ã㌠| /人/ 5019 | ãžã§ãŒã³ã»ã¢ã¬ããµã³ã㌠| 0.8 |
/人/ 8614 | ããããã»ããŒãã³ | /人/ 4945 | ããŽã£ããã»ãã€ãã³ | 0.8 |
/人/ 8874 | ãã³ã©ã¹ã»ã«ãŒã° | /人/ 1667 | ãã³ã©ã¹ã㊠| 0.827586206897 |
/人/ 8987 | ããããã»ãã¹ | /人/ 4870 | ããããã»ã¯ãã¹ | 0.814814814815 |
/人/ 9132 | ãããŒãã»ãã³ã° | /人/ 7683 | ãããŒãã»ãã³ãŽ | 0.827586206897 |
/人/ 9202 | ãããŒãã»ã¡ã³ãã« | /人/ 3410 | ãããŒãã»ãã³ãã« | 0.8125 |
/人/ 9229 | ã¢ã·ã¥ãªãŒããŒã¬ã³ã¹ | /人/ 2534 | ããŒã¬ã³ã¹ã»ã¢ã·ã¥ãªãŒ | 1 |
/人/ 9303 | ãžã§ã³ã»ãšã€ã©ãŒã | /人/ 8703 | ãžã§ã³ã»ãšãŠã©ãŒã | 0.827586206897 |
/人/ 9308 | SEAN ROBERS | /人/ 6552 | SEAN O ROBERS | 0.903225806452 |
/人/ 9347 | ã¹ãã£ãŒãã³ã»ã»ã«ãžã¯ | /人/ 2911 | ã¹ãã£ãŒãã³ã»ãµãŒãžã㯠| 0.8 |
/人/ 9432 | ããªãŒã·ã§ãã³ | /人/ 2240 | ã¢ãªãŒã»ã·ã§ãã³ | 0.8 |
/人/ 9583 | ãžã¥ãªãŒã»ããªã¹ | /人/ 904 | ãžã¥ãªã¢ã¹ã»ããªã¹ | 0.838709677419 |
/人/ 9788 | ã¢ã³ãããŒã¹ã¿ãŒ | /人/ 8308 | ã¢ã³ãããŒã»ã¹ã¿ãŒã¯ | 0.8 |
/人/ 9835 | ãã€ã±ã«ã»ãŠã»ã¯ãã³ãº | /人/ 4727 | ãã€ã±ã«ãºã»ã¢ããã»ã¯ãã³ãº | 0.864864864865 |
/人/ 9893 | ã¹ãã£ãŒãã»ãã«ãã£ãŒã | /人/ 6457 | ã¹ãã£ãŒãã»ããŒãã£ã³ | 0.827586206897 |
References and notes:
1. Phonetic coding.
- Phonetic algorithms. Habrahabr
- "Surname", or Russian MetaPhone (a description of the Russian Metaphone)
2. Levenshtein distance. Wikipedia
3. Computing word similarity by extracting common forms is implemented in PHP as the similar_text function. The PHP documentation states that the function is based on Oliver's algorithm [1993]. However, this algorithm, originally titled the "Ratcliff/Obershelp pattern recognition algorithm", was published by John W. Ratcliff in Dr. Dobb's Journal in 1988 ("Pattern Matching: The Gestalt Approach"). Ian Oliver merely used it in his book "Programming Classics: Implementing the World's Best Algorithms".
4. The source code of the Jaro-Winkler algorithm can be found, for example, here: Jaro-Winkler Distance
5. Trigram index, or "search with typos". Habrahabr
6. Jeannette "Angel" Devereaux is a character of the cult media franchise "Wing Commander", which includes computer space simulators, strategy games and games of other genres, literary works, a film, and so on. I must confess that, not understanding how French surnames sound, I myself long believed that the name was pronounced differently. It is an illustration of a mistranslation made "by eye" rather than "by ear".
7. The basis of the formatted SQL query function is honestly borrowed from Kamnev Artem and Mesilov Maxim, from their article "SQL Injection: Fighting for Pleasure". Unfortunately, their site has not been loading lately, but copies of the article "SQL Injection: Fighting for Pleasure" can still be found (alas, without the source code). I removed some debatable features and added the ability to pass arrays as arguments instead.