Using fuzzy search algorithms in PHP

Inspired by the topic on phonetic algorithms for fuzzy search, I decided to implement something similar to Google's "Did you mean: ..." in PHP.



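To correct typos in a word, we need:

Levenshtein distance (or Damerau-Levenshtein distance; the difference is small): levenshtein()

Metaphone: metaphone()

Oliver's algorithm: similar_text()

A dictionary of Russian words (including case forms, tenses, and so on).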
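To get a feel for the three built-in functions listed above, they can be tried in isolation. This is only a small illustration with English sample strings of my own choosing; it is not part of the original code.

```php
<?php
// Edit distance: minimum number of single-character insertions,
// deletions, and substitutions needed to turn one string into the other.
$distance = levenshtein('kitten', 'sitting');

// Phonetic key: a short code built mostly from the consonants of the word.
$phoneticKey = metaphone('Thompson');

// Oliver's algorithm: number of matching characters; the optional third
// argument receives the similarity as a percentage.
$matching = similar_text('World', 'Word', $percent);

printf("levenshtein: %d\n", $distance);            // levenshtein: 3
printf("metaphone: %s\n", $phoneticKey);
printf("similar_text: %d (%.1f%%)\n", $matching, $percent); // similar_text: 4 (88.9%)
```

Note that for levenshtein() a smaller value means a closer match, while for similar_text() a larger value does.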



A function that transliterates words:



function translitIt($str)
{
    // Map every Cyrillic letter to its Latin transliteration.
    $tr = array(
        "А" => "A", "Б" => "B", "В" => "V", "Г" => "G", "Д" => "D", "Е" => "E", "Ж" => "J", "З" => "Z",
        "И" => "I", "Й" => "Y", "К" => "K", "Л" => "L", "М" => "M", "Н" => "N", "О" => "O", "П" => "P",
        "Р" => "R", "С" => "S", "Т" => "T", "У" => "U", "Ф" => "F", "Х" => "H", "Ц" => "TS", "Ч" => "CH",
        "Ш" => "SH", "Щ" => "SCH", "Ъ" => "", "Ы" => "YI", "Ь" => "", "Э" => "E", "Ю" => "YU", "Я" => "YA",
        "а" => "a", "б" => "b", "в" => "v", "г" => "g", "д" => "d", "е" => "e", "ж" => "j", "з" => "z",
        "и" => "i", "й" => "y", "к" => "k", "л" => "l", "м" => "m", "н" => "n", "о" => "o", "п" => "p",
        "р" => "r", "с" => "s", "т" => "t", "у" => "u", "ф" => "f", "х" => "h", "ц" => "ts", "ч" => "ch",
        "ш" => "sh", "щ" => "sch", "ъ" => "y", "ы" => "yi", "ь" => "'", "э" => "e", "ю" => "yu", "я" => "ya"
    );
    // strtr() replaces all keys in a single pass, longest match first.
    return strtr($str, $tr);
}










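First, we fetch the entire dictionary from the database and store it in an array of pairs, where the key is the Russian word and the value is its transliteration.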







$query = "SELECT ru_words FROM word_list";
if ($stmt = $this->conn->prepare($query)) {
    $stmt->execute();
    $stmt->bind_result($ru_word);
    while ($stmt->fetch()) {
        // Key: Russian word; value: its transliteration.
        $word_translit[$ru_word] = translitIt($ru_word);
    }
}
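To make the shape of that array concrete without a database, here is a minimal in-memory sketch. The word list and the tiny transliteration table are stand-ins I made up for illustration; the real code reads the words from the word_list table and uses the full translitIt() mapping.

```php
<?php
// Hypothetical stand-in for the rows returned by the database query.
$ruWords = ['молоко', 'корова', 'дом'];

// Deliberately tiny table that covers only the letters used above.
$map = [
    'м' => 'm', 'о' => 'o', 'л' => 'l', 'к' => 'k',
    'р' => 'r', 'в' => 'v', 'а' => 'a', 'д' => 'd',
];

$wordTranslit = [];
foreach ($ruWords as $ruWord) {
    // Key: original Russian word; value: Latin transliteration.
    $wordTranslit[$ruWord] = strtr($ruWord, $map);
}

print_r($wordTranslit);
```

Keeping the Russian word as the key means that a dictionary lookup is a plain isset() check, and that the original spelling is still available once a transliterated match has been found.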




Next, we check whether the entered word exists in the dictionary. If it does not, we transliterate it.







if (isset($word_translit[$myWord])) {
    // The word is in the dictionary, so it is spelled correctly.
    $correct[] = $myWord;
} else {
    // Unknown word: transliterate it. The matching steps below continue inside this branch.
    $myWord = $this->translitIt($myWord);




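Then we start a loop that picks from the array the words whose metaphone lies within a Levenshtein distance of less than half the length of the entered word's metaphone (roughly speaking, up to half of the consonants may be misspelled). Among the words selected this way we check the Levenshtein distance again, this time on the whole word rather than on its metaphone, and write the matching words to an array.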







foreach ($word_translit as $n => $k) {
    // First pass: compare the phonetic keys.
    if (levenshtein(metaphone($myWord), metaphone($k)) < mb_strlen(metaphone($myWord)) / 2) {
        // Second pass: compare the whole transliterated words.
        if (levenshtein($myWord, $k) < mb_strlen($myWord) / 2) {
            $possibleWord[$n] = $k;
        }
    }
}
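The same two-pass filter can be wrapped in a function and tried on a few words. This is a sketch rather than the code from my class: the function name findCandidates() and the sample dictionary are made up, and it uses strlen() instead of mb_strlen() on the assumption that transliterated words and metaphone keys are plain ASCII.

```php
<?php
/**
 * Keep dictionary entries that are close to $word both phonetically
 * and letter by letter. $dictionary maps original word => transliteration.
 */
function findCandidates(string $word, array $dictionary): array
{
    $wordKey = metaphone($word);
    $candidates = [];
    foreach ($dictionary as $original => $translit) {
        // Pass 1: phonetic keys must differ by less than half the key length.
        if (levenshtein($wordKey, metaphone($translit)) >= strlen($wordKey) / 2) {
            continue;
        }
        // Pass 2: whole words must differ by less than half the word length.
        if (levenshtein($word, $translit) < strlen($word) / 2) {
            $candidates[$original] = $translit;
        }
    }
    return $candidates;
}

// Hypothetical sample data; "maloko" is a misspelling of "moloko".
$dictionary = ['молоко' => 'moloko', 'корова' => 'korova', 'дом' => 'dom'];
print_r(findCandidates('maloko', $dictionary));
```

With this sample data only the entry for "moloko" survives both passes; the other two words are too far from the input already at the whole-word level.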




Now we define the variables: the Levenshtein distances start at a deliberately large number, and the similar_text values at a deliberately small one.







$similarity = 0;
$meta_similarity = 0;
$min_levenshtein = 1000;
$meta_min_levenshtein = 1000;




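These are needed to determine the maximum "similarity" between our word and the words in the array, as well as the minimum Levenshtein distance. First, we find the minimum Levenshtein distance: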







foreach ($possibleWord as $n) {
    $min_levenshtein = min($min_levenshtein, levenshtein($n, $myWord));
}




In the same way, we look for the maximum "similarity" among the words that have the minimum Levenshtein distance.





foreach ($possibleWord as $n) {
    // Compare the current word ($n), not a leftover variable from an earlier loop.
    if (levenshtein($n, $myWord) == $min_levenshtein) {
        $similarity = max($similarity, similar_text($n, $myWord));
    }
}




Next, we run a loop that selects all the words that have both the minimum Levenshtein distance and the maximum "similarity".







foreach ($possibleWord as $n => $k) {
    if (levenshtein($k, $myWord) <= $min_levenshtein) {
        if (similar_text($k, $myWord) >= $similarity) {
            $result[$n] = $k;
        }
    }
}
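The three loops above call levenshtein() and similar_text() several times for the same word. As a possible alternative, the same selection (smallest distance first, then largest similarity) can be done in one pass that computes each value once per candidate. The function name selectClosest() and the sample data are made up for this sketch.

```php
<?php
/**
 * From $candidates (original => transliteration) keep the entries with the
 * smallest Levenshtein distance to $word and, among those, the highest
 * similar_text() score. Each value is computed once per candidate.
 */
function selectClosest(string $word, array $candidates): array
{
    $best = [];
    $minDistance = PHP_INT_MAX;
    $maxScore = -1;
    foreach ($candidates as $original => $translit) {
        $distance = levenshtein($translit, $word);
        if ($distance > $minDistance) {
            continue; // farther away than what we already have
        }
        $score = similar_text($translit, $word);
        if ($distance < $minDistance || $score > $maxScore) {
            // Strictly better candidate: discard everything collected so far.
            $best = [];
            $minDistance = $distance;
            $maxScore = $score;
        }
        if ($score === $maxScore) {
            $best[$original] = $translit;
        }
    }
    return $best;
}

// Hypothetical candidates for the misspelled input "maloko".
$candidates = ['молоко' => 'moloko', 'молот' => 'molot', 'мало' => 'malo'];
print_r(selectClosest('maloko', $candidates));
```

Here "moloko" is the only candidate at distance 1, so it is the only entry returned. Words that tie on both distance and similarity are all kept, as in the loops above.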




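After that, we determine the maximum "similarity" and the minimum Levenshtein distance between the metaphone of our word and the metaphones of the words in the array.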







foreach ($result as $n) {
    $meta_min_levenshtein = min($meta_min_levenshtein, levenshtein(metaphone($n), metaphone($myWord)));
}
foreach ($result as $n) {
    // The minimum was computed on metaphones, so the comparison must use metaphones too.
    if (levenshtein(metaphone($n), metaphone($myWord)) == $meta_min_levenshtein) {
        $meta_similarity = max($meta_similarity, similar_text(metaphone($n), metaphone($myWord)));
    }
}




And we get the final array, which ideally should contain a single word.







foreach ($result as $n => $k) {
    if (levenshtein(metaphone($k), metaphone($myWord)) <= $meta_min_levenshtein) {
        if (similar_text(metaphone($k), metaphone($myWord)) >= $meta_similarity) {
            $meta_result[$n] = $k;
        }
    }
}




And we return the correct word, which is stored as the key.







return key($meta_result);
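The whole-word pass and the metaphone pass differ only in what is being compared, so both can be expressed with one helper that takes the comparison key as a callable. This is a sketch under that assumption; refine() and suggest() are names I made up, and the candidates are sample data.

```php
<?php
/**
 * Keep the candidates with the minimum Levenshtein distance and, among them,
 * the maximum similar_text() score, comparing $key(candidate) with $key($word).
 */
function refine(string $word, array $candidates, callable $key): array
{
    if ($candidates === []) {
        return [];
    }
    $target = $key($word);
    $minDistance = min(array_map(fn($t) => levenshtein($key($t), $target), $candidates));
    $closest = array_filter($candidates, fn($t) => levenshtein($key($t), $target) === $minDistance);
    $maxScore = max(array_map(fn($t) => similar_text($key($t), $target), $closest));
    return array_filter($closest, fn($t) => similar_text($key($t), $target) === $maxScore);
}

/** Return the original spelling of the best match, or null if there is none. */
function suggest(string $word, array $candidates): ?string
{
    $byWord = refine($word, $candidates, fn($s) => $s);  // whole-word pass
    $byMetaphone = refine($word, $byWord, 'metaphone');  // phonetic pass
    return key($byMetaphone);                            // first remaining key
}

// Hypothetical candidates: original word => transliteration.
$candidates = ['молоко' => 'moloko', 'молот' => 'molot'];
var_dump(suggest('maloko', $candidates));
```

As in the original code, if several words survive both passes, only the first key is returned.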




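Pros:


The accuracy of word detection is quite high, considering that I used a dictionary of 100,000 words that contains only base forms, and that the list includes words that are very rarely used (more precisely, words I had never heard before). This certainly hurts the results.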



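Results:




The problem of words that have the same Levenshtein distance and "similarity" values, both for the plain words and for their metaphones, can most likely be solved only by adding word usage frequencies.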
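A frequency-based tie-break could look roughly like this. It is only an idea sketch: the function mostFrequent() is hypothetical, and the usage counts are invented for the example, since my dictionary does not contain frequency data.

```php
<?php
/**
 * Break a tie between equally good suggestions by picking the most
 * frequently used word. $tied maps word => transliteration, $frequency
 * maps word => usage count; words without a count are treated as 0.
 */
function mostFrequent(array $tied, array $frequency): ?string
{
    $bestWord = null;
    $bestCount = -1;
    foreach (array_keys($tied) as $word) {
        $count = $frequency[$word] ?? 0;
        if ($count > $bestCount) {
            $bestWord = (string) $word;
            $bestCount = $count;
        }
    }
    return $bestWord;
}

// Invented counts, for illustration only.
$frequency = ['кот' => 5200, 'код' => 3100];
$tied = ['кот' => 'kot', 'код' => 'kod', 'кок' => 'kok'];

echo mostFrequent($tied, $frequency), "\n"; // кот
```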



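Cons:


Slow performance:



Tested on: C2D E6550 (2.33 GHz), 4 GB RAM (DDR2-800).

I think this can be partly solved by pulling from the database only the words whose length differs from the entered word by 1-2 characters.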
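The length filter is safe because the Levenshtein distance between two strings is never smaller than the difference of their lengths. Below is an in-memory sketch of the idea; filterByLength() and the sample dictionary are made up, and the lengths are measured on the transliterations, which can be longer than the Russian originals. In the database the same condition would go into the WHERE clause of the query, which I have not tried yet.

```php
<?php
/**
 * Keep only the entries whose transliteration length differs from the
 * length of $word by at most $maxDiff characters.
 */
function filterByLength(string $word, array $dictionary, int $maxDiff = 2): array
{
    $length = strlen($word);
    return array_filter(
        $dictionary,
        fn($translit) => abs(strlen($translit) - $length) <= $maxDiff
    );
}

// Hypothetical sample data: original word => transliteration.
$dictionary = [
    'молоко' => 'moloko',
    'дом' => 'dom',
    'корова' => 'korova',
    'электричество' => 'elektrichestvo',
];

print_r(filterByLength('maloko', $dictionary));
```

For the six-letter input only "moloko" and "korova" remain, so the expensive levenshtein() and metaphone() calls run on a much smaller set.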



I would be glad to hear from the Habr community about more sensible ways of using phonetic algorithms, or ideas for improving this method.



See also:


Here you can download all the code as a single class.

And here is the Russian vocabulary I used for testing.

Thanks to Karroplan for providing an excellent database of 4,588,867 words and word forms.


