Introduction
Over the past few years, voice interfaces have come to surround us more and more. What we once saw only in films about the distant future has turned out to be quite real. Engines for speech synthesis (Text-To-Speech, TTS) and speech recognition (Automatic Speech Recognition, ASR) are already built into mobile phones. Moreover, quite accessible APIs for embedding ASR and TTS into applications have appeared.
Now anyone can write a program with a voice interface (provided they are willing to pay for the engine). This review is devoted to using existing engines (Nuance, for example), not to building them. It also gives the general background that any programmer meeting voice interfaces for the first time will need. The article may also be useful to project managers who are trying to assess whether integrating voice technology into their products is feasible.
So, let's begin...
But first, as a warm-up, a joke:
A Russian lesson in a Georgian school.
The teacher says: "Children, remember: the words 'salt', 'beans', and 'noodles' are written with a soft sign, while the words 'fork', 'bun', and 'plate' are written without one. Children, memorize this, because it is impossible to understand!"
This joke used to seem merely funny to me. Now it seems true to life. Why? I will try to explain...
1. Phonemes
Speaking of speech (already a pun), we first have to deal with the concept of a phoneme. Put simply, a phoneme is a distinct sound that a person can pronounce and recognize. That definition is clearly insufficient, because a person can produce a great many sounds, while the set of phonemes in a language is limited. We would like a stricter definition, so we have to turn to linguists. Alas, linguists themselves cannot agree on what a phoneme is (and do not really need to), but they do have several approaches. One ties phonemes to meaning. The English Wiki, for example, calls a phoneme "the smallest contrastive linguistic unit which may bring about a change of meaning". Another ties them to perception. Our compatriot N. Trubetskoy wrote: "Phonological units that, from the standpoint of a given language, cannot be broken down into shorter successive phonological units are called phonemes." Both definitions contain clarifications that matter to us. On the one hand, changing a phoneme can (but need not) change the meaning of a word. Thus "code" and "cat" are perceived as two different words. On the other hand, you can pronounce "museum" with a slightly different vowel and the meaning will not change; at most, your interlocutor will be able to classify your accent. The indivisibility of a phoneme is also important. But, as Trubetskoy rightly noted, it can depend on the language: where a speaker of one language hears a single sound, a speaker of another may hear two sounds in succession. What we would like, however, are phonetic invariants that suit all languages, not just one.
2. The phonetic alphabet
To pin the definitions down somehow, the International Phonetic Alphabet (IPA) was created in 1888. Its merit is that it does not depend on any particular language. In other words, it is designed for a "superman" who can pronounce and recognize the sounds of almost every existing (and even dead) language. The IPA has been revised gradually up to our own time (2005). Because it was created mostly in the pre-computer era, the philologists drew the symbols for sounds however they pleased. They did orient themselves on the Latin alphabet, but only very, very loosely. As a result, IPA symbols are now available in Unicode, but typing them on a keyboard is not easy. Here the reader may ask: why would ordinary people need IPA, and where can one even see a word written in it? My answer is that ordinary people do not need to know IPA. Yet it is very easy to find: it appears in many Wiki articles about place names, surnames, and proper names. If you know IPA, you can always check the correct pronunciation of a name in an unfamiliar language. For example, do you want to say "Paris" like a Frenchman? Here you go: [paʁi].
3. Phonetic transcription
An attentive Wiki user may notice that the strange phonetic symbols sometimes sit inside square brackets, [mɐˈskva], and sometimes inside slashes, /ˈlʌndən/. What is the difference? Square brackets hold the so-called "narrow" transcription, which Russian literature calls phonetic. Slashes hold the "broad", or phonemic, transcription. The practical meaning is this: a phonetic transcription gives a very precise pronunciation, in a sense an ideal one, independent of the speaker's accent. In other words, with a phonetic transcription you can say "a Cockney pronounces this word like this". A phonemic transcription allows for variation, so the Australian and Canadian English pronunciations of the same // entry may differ. In truth, even a narrow transcription is not unambiguous; it is still quite far from a wav file. Male, female, and children's voices pronounce the same phoneme differently. The overall speaking rate, the loudness, and the base pitch of the voice are not taken into account either. It is precisely these differences that make speech generation and recognition non-trivial tasks. From here on, unless stated otherwise, I will always use IPA in narrow transcription. At the same time, I will try to keep direct use of IPA to a minimum.
4. Languages
Every living natural language has its own set of phonemes. More precisely, this is a property of speech, because, generally speaking, one can know a language without being able to pronounce its words (this is how deaf-mute people learn languages). The phonetic makeup of languages differs, just as their alphabets do. Accordingly, the phonetic complexity of languages differs too. It has two components: first, the difficulty of converting graphemes into phonemes (remember that the English write "Manchester" and read "Liverpool"), and second, the difficulty of pronouncing the sounds (phonemes) themselves. How many phonemes does a language usually contain? A few dozen. From childhood we were taught that Russian pronunciation is as simple as three kopecks and that, unlike in European languages, everything is read the way it is written. Of course we were deceived! If you read words exactly as they are written, you will be understood, though not always correctly, and you will certainly not be taken for a Russian. On top of that there is stress, a terrifying thing for a European: instead of sitting at the beginning of the word (as in English) or at the end (as in French), in Russian it wanders across the whole word as it pleases, changing the meaning along the way. "dórogi" and "dorógi" are two different words, and different parts of speech as well. How many phonemes does Russian have? Nuance counts 54. For comparison, English has only 45 phonemes and French 34. No wonder aristocrats a few centuries ago considered French an easy language to learn! Russian is of course not the hardest language in Europe, but it is one of the hard ones (and note that I have said nothing about grammar yet).
5. X-SAMPA and LH+
People wanted to type phonetic transcriptions on a keyboard long ago, so even before Unicode became widespread, notations were developed that get by with characters from the ASCII table alone. The two most common are X-SAMPA, created by Professor John Wells, and LH+, the internal format of Lernout & Hauspie, whose technology was later bought by Nuance Communications. There is a significant difference between X-SAMPA and LH+. Formally, X-SAMPA is just a notation that, following fixed rules, writes the same IPA phonemes using ASCII only. LH+ is another matter. In a sense, LH+ is an analogue of broad (phonemic) transcription: in practice, the same LH+ symbol can denote different IPA phonemes in different languages. On the one hand this is good, because the notation is shorter and there is no need to encode every conceivable IPA symbol. On the other hand it creates ambiguity, and every conversion to IPA requires a correspondence table in front of you. The saddest part, however, is that a string written in LH+ can be pronounced correctly only by a "voice" for that specific language.
6. Voices
No, not the voices that programmers who wrote bad code in the past often hear in their heads. Rather, the ones that owners of navigators and other mobile devices so often hunt for on trackers and file-sharing sites. These voices even have names. The words "Milena" and "Katerina" say a lot to an experienced user of voice interfaces. What are they? Roughly speaking, they are data sets prepared by various companies (such as Nuance) that allow a computer to turn phonemes into sound. Voices are female and male, and they cost a lot of money. Depending on the platform and the vendor, you may have to pay 2 to 5 thousand dollars per voice. So if you want an interface in at least the five most common European languages, the bill can run to tens of thousands. This is about a software interface, of course. So, voices are language-specific. This is where their binding to phonetic transcription comes from. It is not easy to grasp at first, but the joke at the beginning of the article is the plain truth: people with one native language usually cannot pronounce phonemes of another language that are absent from their own. Worse, this applies not only to individual phonemes but also to particular combinations of them. So if no word in your language ends in a soft "l", you will not (at first) be able to pronounce one.
It is the same with voices. A voice is designed to pronounce only the phonemes of its own language, and moreover of a particular dialect. That is, voices for Canadian French and for French French not only sound different but also have different sets of pronounceable phonemes. Incidentally, this suits the makers of ASR and TTS engines, since each language can be sold separately. On the other hand, one can understand them: creating a voice is laborious and expensive. That is probably exactly why the market of open-source solutions is still small for most languages.
It might seem that nothing prevents anyone from creating a "universal" voice that can pronounce every IPA phoneme, which would solve the problem of multilingual interfaces. Yet for some reason nobody does it. Most likely it is impossible. That is, such a voice would speak, but every native speaker would complain that the pronunciation is not "natural" enough. It would sound like Russian in the mouth of an Englishman with little practice, or like English in the mouth of a Frenchman. So if you need multilingual support, get ready to pay up.
7. TTS API example
To give the reader an idea of what working with TTS looks like at a low level (in C++), here is an example of speech synthesis based on the Nuance engine. It is an incomplete example, of course: it cannot be compiled, let alone run, but it conveys the idea of the process. All functions except TTS_Speak() are needed as scaffolding.
TTS_Initialize() - initializes the engine
TTS_Cleanup() - deinitializes it
TTS_SelectLanguage() - selects the language and configures the synthesis parameters
TTS_Speak() - actually generates the sound samples
TTS_Callback() - called when the next portion of audio data is ready for playback, as well as on other events
TTS and its scaffolding
static const NUAN_TCHAR * _dataPathList[] = {
    __TEXT("\\lang\\"),
    __TEXT("\\tts\\"),
};

static VPLATFORM_RESOURCES _stResources = {
    VPLATFORM_CURRENT_VERSION,
    sizeof(_dataPathList)/sizeof(_dataPathList[0]),
    (NUAN_TCHAR **)&_dataPathList[0],
};

static VAUTO_INSTALL _stInstall = {VAUTO_CURRENT_VERSION};
static VAUTO_HSPEECH _hSpeech = {NULL, 0};
static VAUTO_HINSTANCE _hTtsInst = {NULL, 0};
static WaveOut * _waveOut = NULL;
static WaveOutBuf * _curBuffer = NULL;
static int _volume = 100;
static int _speechRate = 0; // use default speech rate

// Forward declarations: both functions are used before they are defined.
static NUAN_ERROR TTS_Callback (VAUTO_HINSTANCE hTtsInst,
                                VAUTO_OUTDEV_HINSTANCE hOutDevInst,
                                VAUTO_CALLBACKMSG * pcbMessage,
                                VAUTO_USERDATA UserData);
void TTS_Cleanup(void);

static const TCHAR * _szLangTLW = NULL;
static VAUTO_PARAMID _paramID[] = {
    VAUTO_PARAM_SPEECHRATE,
    VAUTO_PARAM_VOLUME
};

// Query the sampling frequency of the current voice and convert it to Hz.
static NUAN_ERROR _TTS_GetFrequency(VAUTO_HINSTANCE hTtsInst, short *pFreq)
{
    NUAN_ERROR Error = NUAN_OK;
    VAUTO_PARAM TtsParam;

    /*-- get frequency used by current voicefont --*/
    TtsParam.eID = VAUTO_PARAM_FREQUENCY;
    if (NUAN_OK != (Error = vauto_ttsGetParamList (hTtsInst, &TtsParam, 1)) ) {
        ErrorV(_T("vauto_ttsGetParamList rc=0x%1!x!\n"), Error);
        return Error;
    }
    switch(TtsParam.uValue.usValue) {
        case VAUTO_FREQ_8KHZ:  *pFreq = 8000;  break;
        case VAUTO_FREQ_11KHZ: *pFreq = 11025; break;
        case VAUTO_FREQ_16KHZ: *pFreq = 16000; break;
        case VAUTO_FREQ_22KHZ: *pFreq = 22050; break;
        default: break;
    }
    return NUAN_OK;
}

int TTS_SelectLanguage(int langId)
{
    NUAN_ERROR nrc;
    VAUTO_LANGUAGE arrLanguages[16];
    VAUTO_VOICEINFO arrVoices[4];
    VAUTO_SPEECHDBINFO arrSpeechDB[4];
    NUAN_U16 nLanguageCount, nVoiceCount, nSpeechDBCount;

    nLanguageCount = sizeof(arrLanguages)/sizeof(arrLanguages[0]);
    nVoiceCount    = sizeof(arrVoices)   /sizeof(arrVoices[0]);
    nSpeechDBCount = sizeof(arrSpeechDB) /sizeof(arrSpeechDB[0]);
    int nVoice = 0, nSpeechDB = 0;

    nrc = vauto_ttsGetLanguageList( _hSpeech, &arrLanguages[0], &nLanguageCount);
    if(nrc != NUAN_OK){
        TTS_ErrorV(_T("vauto_ttsGetLanguageList rc=0x%1!x!\n"), nrc);
        return 0;
    }
    if(nLanguageCount == 0 || nLanguageCount<=langId){
        TTS_Error(_T("vauto_ttsGetLanguageList: No proper languages found.\n"));
        return 0;
    }
    _szLangTLW = arrLanguages[langId].szLanguageTLW;
    NUAN_TCHAR* szLanguage = arrLanguages[langId].szLanguage;

    // NOTE: this excerpt never fills arrVoices and arrSpeechDB; the calls
    // that enumerate voices and speech databases are omitted here.
    nVoice = 0;    // select first voice
    NUAN_TCHAR* szVoiceName = arrVoices[nVoice].szVoiceName;
    nSpeechDB = 0; // select first speech DB

    {
        VAUTO_PARAM stTtsParam[7];
        int cnt = 0;

        // language
        stTtsParam[cnt].eID = VAUTO_PARAM_LANGUAGE;
        _tcscpy(stTtsParam[cnt].uValue.szStringValue, szLanguage);
        cnt++;

        // voice
        stTtsParam[cnt].eID = VAUTO_PARAM_VOICE;
        _tcscpy(stTtsParam[cnt].uValue.szStringValue, szVoiceName);
        cnt++;

        // speechbase parameter - frequency
        stTtsParam[cnt].eID = VAUTO_PARAM_FREQUENCY;
        stTtsParam[cnt].uValue.usValue = arrSpeechDB[nSpeechDB].u16Freq;
        cnt++;

        // speechbase parameter - reduction type
        stTtsParam[cnt].eID = VAUTO_PARAM_VOICE_MODEL;
        _tcscpy(stTtsParam[cnt].uValue.szStringValue, arrSpeechDB[nSpeechDB].szVoiceModel);
        cnt++;

        if (_speechRate) {
            // Speech rate
            stTtsParam[cnt].eID = VAUTO_PARAM_SPEECHRATE;
            stTtsParam[cnt].uValue.usValue = _speechRate;
            cnt++;
        }
        if (_volume) {
            // Speech volume
            stTtsParam[cnt].eID = VAUTO_PARAM_VOLUME;
            stTtsParam[cnt].uValue.usValue = _volume;
            cnt++;
        }

        nrc = vauto_ttsSetParamList(_hTtsInst, &stTtsParam[0], cnt);
        if(nrc != NUAN_OK){
            ErrorV(_T("vauto_ttsSetParamList rc=0x%1!x!\n"), nrc);
            return 0;
        }
    }
    return 1;
}

int TTS_Initialize(int defLanguageId)
{
    NUAN_ERROR nrc;

    nrc = vplatform_GetInterfaces(&_stInstall, &_stResources);
    if(nrc != NUAN_OK){
        Error(_T("vplatform_GetInterfaces rc=%1!d!\n"), nrc);
        return 0;
    }
    nrc = vauto_ttsInitialize(&_stInstall, &_hSpeech);
    if(nrc != NUAN_OK){
        Error(_T("vauto_ttsInitialize rc=0x%1!x!\n"), nrc);
        TTS_Cleanup();
        return 0;
    }
    nrc = vauto_ttsOpen(_hSpeech, _stInstall.hHeap, _stInstall.hLog, &_hTtsInst, NULL);
    if(nrc != NUAN_OK){
        ErrorV(_T("vauto_ttsOpen rc=0x%1!x!\n"), nrc);
        TTS_Cleanup();
        return 0;
    }

    // Ok, time to select language
    if(!TTS_SelectLanguage(defLanguageId)){
        TTS_Cleanup();
        return 0;
    }

    // init Wave out device
    {
        short freq;
        // Store the return code so that the error message reports it.
        nrc = _TTS_GetFrequency(_hTtsInst, &freq);
        if (nrc != NUAN_OK) {
            TTS_ErrorV(_T("_TTS_GetFrequency rc=0x%1!x!\n"), nrc);
            TTS_Cleanup();
            return 0;
        }
        _waveOut = WaveOut_Open(freq, 1, 4);
        if (_waveOut == NULL){
            TTS_Cleanup();
            return 0;
        }
    }

    // init TTS output
    {
        VAUTO_OUTDEVINFO stOutDevInfo;
        stOutDevInfo.hOutDevInstance = _waveOut;
        stOutDevInfo.pfOutNotify = TTS_Callback; // Notify using callback!
        nrc = vauto_ttsSetOutDevice(_hTtsInst, &stOutDevInfo);
        if(nrc != NUAN_OK){
            ErrorV(_T("vauto_ttsSetOutDevice rc=0x%1!x!\n"), nrc);
            TTS_Cleanup();
            return 0;
        }
    }

    // OK TTS engine initialized
    return 1;
}

void TTS_Cleanup(void)
{
    if(_hTtsInst.pHandleData){
        vauto_ttsStop(_hTtsInst);
        vauto_ttsClose(_hTtsInst);
    }
    if(_hSpeech.pHandleData){
        vauto_ttsUnInitialize(_hSpeech);
    }
    if(_waveOut){
        WaveOut_Close(_waveOut);
        _waveOut = NULL;
    }
    vplatform_ReleaseInterfaces(&_stInstall);
    memset(&_stInstall, 0, sizeof(_stInstall));
    _stInstall.fmtVersion = VAUTO_CURRENT_VERSION;
}

// Returns 1 on success, 2 if the user stopped playback, 0 on error.
int TTS_Speak(const TCHAR * const message, int length)
{
    VAUTO_INTEXT stText;
    stText.eTextFormat = VAUTO_NORM_TEXT;
    stText.szInText = (void*) message;
    stText.ulTextLength = length * sizeof(NUAN_TCHAR);
    TraceV(_T("TTS_Speak: %1\n"), message);

    NUAN_ERROR rc = vauto_ttsProcessText2Speech(_hTtsInst, &stText);
    if (rc == NUAN_OK) {
        return 1;
    }
    if (rc == NUAN_E_TTS_USERSTOP) {
        return 2;
    }
    ErrorV(_T("vauto_ttsProcessText2Speech rc=0x%1!x!\n"), rc);
    return 0;
}

static NUAN_ERROR TTS_Callback (VAUTO_HINSTANCE hTtsInst,
                                VAUTO_OUTDEV_HINSTANCE hOutDevInst,
                                VAUTO_CALLBACKMSG * pcbMessage,
                                VAUTO_USERDATA UserData)
{
    VAUTO_OUTDATA * outData;

    switch(pcbMessage->eMessage){
    case VAUTO_MSG_BEGINPROCESS:
        WaveOut_Start(_waveOut);
        break;
    case VAUTO_MSG_ENDPROCESS:
        break;
    case VAUTO_MSG_STOP:
        break;
    case VAUTO_MSG_OUTBUFREQ:
        // The engine asks for a buffer to write the next PCM chunk into.
        outData = (VAUTO_OUTDATA *)pcbMessage->pParam;
        memset(outData, 0, sizeof(VAUTO_OUTDATA));
        {
            WaveOutBuf * buf = WaveOut_GetBuffer(_waveOut);
            if(buf){
                VAUTO_OUTDATA * outData = (VAUTO_OUTDATA *)pcbMessage->pParam;
                outData->eAudioFormat = VAUTO_16LINEAR;
                outData->pOutPcmBuf = WaveOutBuf_Data(buf);
                outData->ulPcmBufLen = WaveOutBuf_Size(buf);
                _curBuffer = buf;
                break;
            }
            TTS_Trace(_T("VAUTO_MSG_OUTBUFREQ: processing was stopped\n"));
        }
        return NUAN_E_TTS_USERSTOP;
    case VAUTO_MSG_OUTBUFDONE:
        // The engine has filled the buffer; queue it for playback.
        outData = (VAUTO_OUTDATA *)pcbMessage->pParam;
        WaveOutBuf_SetSize(_curBuffer, outData->ulPcmBufLen);
        WaveOut_PutBuffer(_waveOut, _curBuffer);
        _curBuffer = NULL;
        break;
    default:
        break;
    }
    return NUAN_OK;
}
As the reader may notice, the code is rather cumbersome, and a seemingly simple feature requires a lot of setup. Alas, this is the flip side of the engine's flexibility. The APIs of other engines, for other languages, can of course be much simpler and more compact.
8. Phonemes again
Looking at the API, the reader may ask: if TTS (Text-To-Speech) can convert text to speech directly, why do we need phonemes at all? It can, but there is one "but". Words familiar to the engine are converted to speech well. With "unfamiliar" words, such as place names and proper names, things are much worse. This is especially noticeable in multinational countries such as Russia. The names of cities and villages across one sixth of the Earth's land were given by different peoples, in different languages, at different times. Having to write them in Russian letters played a cruel joke on the national languages: the phonemes of Tatar, Nenets, Abkhaz, Kazakh, Yakut, and Buryat were squeezed into the Procrustean bed of Russian, which has many phonemes but still not enough to convey all the languages of the peoples of the former Union. It gets worse: even where the written form still bears at least some resemblance to the original, a TTS engine reading a name like "Kuchuk-Kainardzhi" produces nothing but laughter.
It would be naive, though, to think that this is only a problem of Russian. Similar difficulties exist in countries with more homogeneous populations. In French, the letters p, b, d, t, s at the end of a word are usually not pronounced. With place names, however, local traditions come into force. So the final "s" in "Paris" is indeed silent, while in "Vallauris" it is the other way round. The difference is that Paris lies in the north of France, while Vallauris is in Provence in the south, where the pronunciation rules are somewhat different. That is why it is desirable to have phonetic transcriptions of words. They usually come with the maps. There is no unity of format, though: NavTeq traditionally uses X-SAMPA transcription, and TomTom uses LH+. It is fine if your TTS engine accepts both, but what if it does not? Then you have to resort to contortions, for example converting one transcription into the other, which is not trivial in itself. When there is no phonetic information at all, the engine has its own ways of obtaining it. In the Nuance engine these are "Data Driven Grapheme To Phoneme" (DDG2P) and the "Common Linguistic Component" (CLC). Using them, however, is already a last resort.
9. Special sequences
Nuance can not only speak plain text or a phonetic transcription but also switch between the two dynamically. This is done with an escape sequence of the form: <ESC> / +
In general, escape sequences can set many parameters. The general form is:
<ESC>\<param>=<value>\
For example:
\x1b\rate=110\ - sets the speech rate
\x1b\vol=5\ - sets the volume
\x1b\audio="beep.wav"\ - inserts data from a wav file into the audio stream
In the same way you can make the engine spell a word out letter by letter, insert pauses, change the voice (for example, from male to female), and much more. Not every sequence is useful, of course, but on the whole it is a very handy feature.
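Building such strings by hand quickly becomes error-prone because of the doubled backslashes in C++ literals, so a tiny helper may be worth having. The sketch below only assembles the general form shown above; the function names ttsParam and withRateAndVolume are my own, and I am assuming that the control sequences may be placed directly in front of the text to be spoken.

```cpp
#include <string>

// Build one control sequence of the general form <ESC>\<param>=<value>\ .
// Hypothetical helper: it only formats the string, it does not check
// whether the engine actually knows the parameter.
std::string ttsParam(const std::string& name, const std::string& value) {
    return std::string("\x1b\\") + name + "=" + value + "\\";
}

// Prefix a message with speech-rate and volume settings.
// Assumption: sequences are placed immediately before the text they affect.
std::string withRateAndVolume(const std::string& text, int rate, int volume) {
    return ttsParam("rate", std::to_string(rate)) +
           ttsParam("vol", std::to_string(volume)) +
           text;
}
```

The resulting string would then be passed to the engine as ordinary input text, for example through TTS_Speak() from the earlier listing.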
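10. Dictionaries
Sometimes a certain set of words (abbreviations, acronyms, proper names, and so on) has to be pronounced in a particular way, yet you would rather not replace the text with a phonetic transcription every single time (and that is not always possible anyway). This is where dictionaries help. What is a dictionary in Nuance terms? It is a file containing a set of pairs: <text> <transcription>. The file is compiled and then loaded by the engine. When speaking, the engine checks whether a word or phrase is present in the dictionary and, if so, replaces it with its phonetic transcription. Here, for example, is a dictionary containing the names of streets and squares in the Vatican.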
[Header]
Name=Vatican
Language=ITI
Content=EDCT_CONTENT_BROAD_NARROWS
Representation=EDCT_REPR_SZZ_STRING
[Data]
"Largo del Colonnato" // 'lar.go_del_ko.lo.'n:a.to
"Piazza del Governatorato" // 'pja.t&s:a_del_go.ver.na.to.'ra.to
"Piazza della Stazione" // 'pja.t&s:a_de.l:a_sta.'t&s:jo.ne
"Piazza di Santa Marta" // 'pja.t&s:a_di_'san.ta_'mar.ta
"Piazza San Pietro" // 'pja.t&s:a_'sam_'pjE.tro
"Piazzetta Châteauneuf Du Pape" // pja.'t&s:et:a_Sa.to.'nef_du_'pap
"Salita ai Giardini" // sa.'li.ta_aj_d&Zar.'di.ni
"Stradone dei Giardini" // stra.'do.ne_dej_d&Zar.'di.ni
"Via dei Pellegrini" // 'vi.a_dej_pe.l:e.'gri.ni
"Via del Fondamento" // 'vi.a_del_fon.da.'men.to
"Via del Governatorato" // 'vi.a_del_go.ver.na.to.'ra.to
"Via della Posta" // 'vi.a_de.l:a_'pOs.ta
"Via della Stazione Vaticana" // 'vi.a_de.l:a_sta.'t&s:jo.ne_va.ti.'ka.na
"Via della Tipografia" // 'vi.a_de.l:a_ti.po.gra.'fi.a
"Via di Porta Angelica" // 'vi.a_di_'pOr.ta_an.'d&ZE.li.ka
"Via Tunica" // 'vi.a_'tu.ni.ka
"Viale Centro del Bosco" // vi.'a.le_'t&SEn.tro_del_'bOs.ko
"Viale del Giardino Quadrato" // vi.'a.le_del_d&Zar.'di.no_kwa.'dra.to
"Viale Vaticano" // vi.'a.le_va.ti.'ka.no
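The lookup the engine performs can be imitated in a few lines. This is only a sketch of the idea, assuming data lines of the form "text" // transcription as in the listing above; the real engine works with a compiled binary dictionary, and the names Dictionary, parseDictionary, and transcriptionOrText are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

using Dictionary = std::map<std::string, std::string>;

// Parse lines of the form:  "Some text" // transcription
// Lines without a quoted text and a "//" separator (headers etc.) are skipped.
Dictionary parseDictionary(const std::string& data) {
    Dictionary dict;
    std::istringstream in(data);
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t open = line.find('"');
        if (open == std::string::npos) continue;
        const std::size_t close = line.find('"', open + 1);
        if (close == std::string::npos) continue;
        const std::size_t sep = line.find("//", close + 1);
        if (sep == std::string::npos) continue;
        const std::size_t start = line.find_first_not_of(" \t", sep + 2);
        if (start == std::string::npos) continue;  // empty transcription
        const std::size_t end = line.find_last_not_of(" \t\r");
        dict[line.substr(open + 1, close - open - 1)] =
            line.substr(start, end - start + 1);
    }
    return dict;
}

// Return the transcription if the text is in the dictionary,
// otherwise return the text itself unchanged.
std::string transcriptionOrText(const Dictionary& dict, const std::string& text) {
    const auto it = dict.find(text);
    return it != dict.end() ? it->second : text;
}
```

A text that is not in the dictionary simply falls through to the engine's normal grapheme-to-phoneme conversion, which is the behaviour described above.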
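11. Recognition
Speech recognition is an even harder task than synthesis. While synthesizers worked after a fashion back in the good old days, decent recognition has become available only now. There are several reasons. The first closely resembles the problem of an ordinary living person confronted with an unfamiliar language; the second, the problem of running into a text from an unfamiliar subject area.
When we perceive sound vibrations that resemble a voice, we try to split them into phonemes, picking out familiar sounds that ought to form words. If the language is familiar, this comes easily; if not, we most likely cannot even split the speech into phonemes "correctly" (recall the story about "Alla, I'm at the bar!"). What we hear is not at all what the speaker said. This happens because over the years our brain has been "trained" on a particular set of phonemes and has become used to perceiving only them. On meeting an unfamiliar sound, it tries to pick the phoneme of the native language that is closest to what was heard. In a sense, this resembles the vector quantization used in speech codecs such as CELP. There is no guarantee that such an approximation will succeed. That is why the phonemes that are "native" to us are the "convenient" ones.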
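The analogy with vector quantization can be shown in a few lines: given a "codebook" of known phonemes, an unfamiliar sound is replaced by the nearest entry. The sketch below is a toy illustration and not how a real codec or recognizer works; describing a sound by two numbers (say, two formant frequencies in Hz) is my own simplifying assumption, and any values fed into it are illustrative rather than measured.

```cpp
#include <array>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// One codebook entry: a phoneme label and its "typical" feature vector.
// Assumption: two features per sound (e.g. two formant frequencies in Hz).
struct Codeword {
    std::string phoneme;
    std::array<double, 2> center;
};

// Return the index of the codeword nearest to x (squared Euclidean distance).
// Precondition: codebook is not empty (otherwise 0 is returned).
std::size_t quantize(const std::vector<Codeword>& codebook,
                     const std::array<double, 2>& x) {
    std::size_t best = 0;
    double bestDist = std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < codebook.size(); ++i) {
        double dist = 0.0;
        for (std::size_t k = 0; k < x.size(); ++k) {
            const double diff = x[k] - codebook[i].center[k];
            dist += diff * diff;
        }
        if (dist < bestDist) {
            bestDist = dist;
            best = i;
        }
    }
    return best;
}
```

The point of the analogy is that quantize() always returns something, however far the input is from every entry; a listener likewise always hears one of the native phonemes, even when none of them is a good match.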
Remember how, back in the USSR, as schoolchildren meeting a foreigner, we tried to "transliterate" our names, saying:
"My name is Bob" (rather than Boris).
The teachers scolded us: why are you mangling your name? Do you think it makes it easier for him to understand? Say it in Russian!
Alas, here they deceived us, or were mistaken themselves... If you can pronounce your name the English/German/Chinese way, a native speaker really will find it easier to perceive. The Chinese understood this long ago and take special "European" names for dealing with Western partners. In machine recognition, a specific language is described by a so-called acoustic model. Before recognizing anything, you must load the acoustic model of a specific language, thereby telling the program which phonemes to expect at its input.
The second problem is no less complicated. Let us go back to the analogy with a living person. Listening to an interlocutor, we subconsciously build a model in our head of what they will say next; in other words, we create a context for the conversation. If a word that does not fit the context is suddenly inserted into the story (say, "involute" when the talk is about football), it can cause cognitive dissonance in the listener. Roughly speaking, a computer experiences this dissonance all the time, because it never knows what to expect from a person. It is easier for a human: they can ask the interlocutor again. What should the computer do? To solve this problem and give the computer the right context, grammars are used.
12. Grammars
Grammars (usually given in BNF form) are exactly what gives the computer (more precisely, the ASR engine) an idea of what to expect from the user at a particular moment. Usually they are several alternatives combined with "or", but more complex grammars are possible as well. Here is an example of a grammar for selecting a Kazan metro station:
#BNF+EM V1.0;
!grammar test;
!start <metro_KAZAN_stations>;
<metro_KAZAN_stations>:
"Ametyevo" !id(0) !pronounce("^.'m%je.t%jjI.vo-") |
"Aviastroitelnaya" !id(1) !pronounce("^v%jI'astro-'it%jIl%jno-j^") |
"Gorki" !id(2) !pronounce("'gor.k%jI") |
"Kozya Sloboda" !id(3) !pronounce("'ko.z%jj^_slo-.b^.'da") |
"Kremlyovskaya" !id(4) !pronounce("kr%jIm.'l%jof.sko-.j^") |
"Ploshchad Gabdully Tukaya" !id(5) !pronounce("'plo.S%jIt%j_go-.bdu.'li0_'tu.ko-.j^") |
"Prospekt Pobedy" !id(6) !pronounce("pr^.'sp%jekt_p^.'b%je.di0") |
"Severny Vokzal" !id(7) !pronounce("'s%je.v%jIr.ni0j_v^g.'zal") |
"Sukonnaya Sloboda" !id(8) !pronounce("'su.ko-.no-.j^_slo-.b^.'da") |
"Yashlek" !id(9) !pronounce("ja.'Sl%jek");
As you can see, each line is one alternative consisting of the actual text, an integer id, and a phonetic transcription. The transcription is generally optional, but recognition is more accurate with it.
How large can a grammar be? Quite large. In our experiments, say, 37,000 alternatives were recognized at an acceptable level. Things are much worse with complex, branching grammars: recognition time grows, quality drops, and the dependence on grammar length is nonlinear. So my advice is to avoid complex grammars. At least for now.
Grammars (and contexts) can be static or dynamic. You have already seen an example of a static grammar: it is compiled in advance and stored in the engine's internal binary representation. Sometimes, however, the context changes as the user interacts with the program. A typical navigation example is choosing a city by its first letters. The set of possible recognition results changes with every letter entered, so the recognition context has to be rebuilt constantly. Dynamic contexts serve this purpose. Roughly speaking, the programmer compiles grammars on the fly and loads them into the engine while the program is running. On mobile devices, of course, processing is not very fast, so to keep the user interface from freezing you have to limit yourself to small grammars (about 100 words).
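The "first letters of a city" case can be sketched as a function that regenerates the grammar text for every new prefix. This is only the text-generation half; compiling the text and loading it into the engine is engine-specific and is left out. The function name buildCityGrammar is hypothetical, the header lines imitate the static example above as I read it (so treat the exact syntax as an assumption), and the default limit of 100 comes from the rule of thumb just mentioned.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Build BNF text with one alternative per city whose name starts with
// `prefix`, keeping at most `limit` alternatives. The id of an alternative
// is the index of the city in the input list. Returns an empty string when
// nothing matches. Matching is case-sensitive.
std::string buildCityGrammar(const std::vector<std::string>& cities,
                             const std::string& prefix,
                             std::size_t limit = 100) {
    std::string alternatives;
    std::size_t count = 0;
    for (std::size_t i = 0; i < cities.size() && count < limit; ++i) {
        // compare() yields non-zero when the city does not start with prefix
        // (including the case where the city name is shorter than the prefix).
        if (cities[i].compare(0, prefix.size(), prefix) != 0) continue;
        if (count > 0) alternatives += " |\n";
        alternatives += "\"" + cities[i] + "\" !id(" + std::to_string(i) + ")";
        ++count;
    }
    if (count == 0) return "";
    return "#BNF+EM V1.0;\n!grammar cities;\n!start <cities>;\n<cities>:\n" +
           alternatives + ";\n";
}
```

Each keystroke would call this with the longer prefix, so the grammar shrinks as the user types; the limit protects the user interface in the early steps, when the prefix is still short and matches too many names.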
13. ASR API example
Recognizing speech is not as straightforward as synthesizing it. If the user simply stays silent in front of the microphone, we will end up recognizing ambient noise. If they say something like "ehhhhhh", recognition will most likely fail as well. In the best case, ASR usually returns a set of variants (also called hypotheses), each with a certain weight (confidence). If the grammar is large, there can be quite a few variants. In that case it makes sense to speak the hypotheses one after another (for example, the first five in descending order of confidence) and ask the user to choose one of them. Ideally, with a short grammar ("yes" | "no"), a single variant with high confidence is returned.
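Picking "the first five in descending order of confidence" is ordinary post-processing that does not depend on the engine. Here is a minimal sketch; the Hypothesis structure and the function topHypotheses are my own simplification of what an engine returns, and the confidence scale and threshold are assumptions that have to be tuned for a real engine.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Simplified recognition hypothesis (hypothetical structure).
struct Hypothesis {
    std::string text;
    unsigned confidence;  // higher is better; the scale is engine-specific
};

// Drop hypotheses below minConfidence, sort the rest by descending
// confidence, and keep at most n of them.
std::vector<Hypothesis> topHypotheses(std::vector<Hypothesis> all,
                                      std::size_t n,
                                      unsigned minConfidence) {
    all.erase(std::remove_if(all.begin(), all.end(),
                             [minConfidence](const Hypothesis& h) {
                                 return h.confidence < minConfidence;
                             }),
              all.end());
    // stable_sort keeps the engine's order for equal confidences.
    std::stable_sort(all.begin(), all.end(),
                     [](const Hypothesis& a, const Hypothesis& b) {
                         return a.confidence > b.confidence;
                     });
    if (all.size() > n) all.resize(n);
    return all;
}
```

If the returned list has exactly one element, the application can act on it directly; otherwise it can read the candidates out through TTS and let the user choose.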
次ã®äŸã«ã¯ã次ã®é¢æ°ãå«ãŸããŠããŸãã
ConstructRecognizerïŒïŒ-ãèªèããäœæãããã®ãã©ã¡ãŒã¿ãŒãæ§æããŸã
DestroyRecognizerïŒïŒ-ãèªèããç Žæ£ããŸã
ASR_InitializeïŒïŒ-ASRãšã³ãžã³ãåæåããŸã
ASR_UnInitializeïŒïŒ-ASRãšã³ãžã³ã®åæåã解é€ããŸã
evt_HandleEvent-ãèªèãã¹ã¬ããã«ãã£ãŠçæãããã€ãã³ããåŠçããŸã
ProcessResultïŒïŒ-èªèçµæãåºåããŸã
ASRãšãã®ãã€ã³ãã£ã³ã°
typedef struct RECOG_OBJECTS_S { void *pHeapInst; // Pointer to the heap. const char *acmod; // path to acmod data const char *ddg2p; // path to ddg2p data const char *clc; // path to clc data const char *dct; // path to dct data const char *dynctx; // path to empty dyn ctx data LH_COMPONENT hCompBase; // Handle to the base component. LH_COMPONENT hCompAsr; // Handle to the ASR component. LH_COMPONENT hCompPron; // Handle to the pron component (dyn ctx) LH_OBJECT hAcMod; // Handle to the AcMod object. LH_OBJECT hRec; // Handle to the SingleThreadedRec Object LH_OBJECT hLex; // Handle to lexicon object (dyn ctx) LH_OBJECT hDdg2p; // Handle to ddg2p object (dyn ctx) LH_OBJECT hClc; // Handle to the CLC (DDG2P backup) LH_OBJECT hDct; // Handle to dictionary object (dyn ctx) LH_OBJECT hCache; // Handle to cache object (dyn ctx) LH_OBJECT hCtx[5]; // Handle to the Context object. LH_OBJECT hResults[5]; // Handle to the Best results object. ASRResult *results[5]; // recognition results temporary storage LH_OBJECT hUswCtx; // Handle to the UserWord Context object. LH_OBJECT hUswResult; // Handle to the UserWord Result object. unsigned long sampleFreq; // Sampling frequency. 
unsigned long frameShiftSamples; // Size of one frame in samples int requestCancel; // boolean indicating user wants to cancel recognition // used to generate transcriptions for dyn ctx LH_BNF_TERMINAL *pTerminals; unsigned int terminals_count; unsigned int *terminals_transtype; // array with same size as pTerminals; each value indicates the type of transcription in pTerminal: user-provided, from_ddg2p, from_dct, from_clc SLOT_TERMINAL_LIST *pSlots; unsigned int slots_count; // reco options int isNumber; // set to 1 when doing number recognition const char * UswFile; // path to file where userword should be recorded char * staticCtxID; } RECOG_OBJECTS; // store ASR objects static RECOG_OBJECTS recogObjects; static int ConstructRecognizer(RECOG_OBJECTS *pRecogObjects, const char *szAcModFN, const char * ddg2p, const char * clc, const char * dct, const char * dynctx) { LH_ERROR lhErr = LH_OK; PH_ERROR phErr = PH_OK; ST_ERROR stErr = ST_OK; LH_ISTREAM_INTERFACE IStreamInterface; void *pIStreamAcMod = NULL; LH_ACMOD_INFO *pAcModInfo; LH_AUDIOCHAINEVENT_INTERFACE EventInterface; /* close old objects */ if(!lh_ObjIsNull(pRecogObjects->hAcMod)){ DestroyRecognizer(pRecogObjects); } pRecogObjects->sampleFreq = 0; pRecogObjects->requestCancel = 0; pRecogObjects->pTerminals = NULL; pRecogObjects->terminals_count = 0; pRecogObjects->pSlots = NULL; pRecogObjects->slots_count = 0; pRecogObjects->staticCtxID = NULL; pRecogObjects->acmod = szAcModFN; pRecogObjects->ddg2p = ddg2p; pRecogObjects->clc = clc; pRecogObjects->dct = dct; pRecogObjects->dynctx = dynctx; EventInterface.pfevent = evt_HandleEvent; EventInterface.pfadvance = evt_Advance; // Create the input stream for the acoustic model. stErr = st_CreateStreamReaderFromFile(szAcModFN, &IStreamInterface, &pIStreamAcMod); if (ST_OK != stErr) goto error; // Create the AcMod object. 
lhErr = lh_CreateAcMod(pRecogObjects->hCompAsr, &IStreamInterface, pIStreamAcMod, NULL, &(pRecogObjects->hAcMod)); if (LH_OK != lhErr) goto error; // Retrieve some information from the AcMod object. lhErr = lh_AcModBorrowInfo(pRecogObjects->hAcMod, &pAcModInfo); if (LH_OK != lhErr) goto error; pRecogObjects->sampleFreq = pAcModInfo->sampleFrequency; pRecogObjects->frameShiftSamples = pAcModInfo->frameShift * pRecogObjects->sampleFreq/1000; // Create a SingleThreadRec object lhErr = lh_CreateSingleThreadRec(pRecogObjects->hCompAsr, &EventInterface, pRecogObjects, 3000, pRecogObjects->sampleFreq, pRecogObjects->hAcMod, &pRecogObjects->hRec); if (LH_OK != lhErr) goto error; // cretae DDG2P & lexicon for dyn ctx if (pRecogObjects->ddg2p) { int rc = InitDDG2P(pRecogObjects); if (rc<0) goto error; } else if (pRecogObjects->clc) { int rc = InitCLCandDCT(pRecogObjects); if (rc<0) goto error; } else { // TODO: what now? } // Return without errors. return 0; error: // Print an error message if the error comes from the private heap or stream component. // Errors from the VoCon3200 component have been printed by the callback. 
if (PH_OK != phErr) { printf("Error from the private heap component, error code = %d.\n", phErr); } if (ST_OK != stErr) { printf("Error from the stream component, error code = %d.\n", stErr); } return -1; } static int DestroyRecognizer(RECOG_OBJECTS *pRecogObjects) { unsigned int curCtx; if (!lh_ObjIsNull(pRecogObjects->hUswResult)){ lh_ObjClose(&pRecogObjects->hUswResult); pRecogObjects->hUswResult = lh_GetNullObj(); } if (!lh_ObjIsNull(pRecogObjects->hUswCtx)){ lh_ObjClose(&pRecogObjects->hUswCtx); pRecogObjects->hUswCtx = lh_GetNullObj(); } if (!lh_ObjIsNull(pRecogObjects->hDct)){ lh_ObjClose(&pRecogObjects->hDct); pRecogObjects->hDct = lh_GetNullObj(); } if (!lh_ObjIsNull(pRecogObjects->hCache)){ lh_ObjClose(&pRecogObjects->hCache); pRecogObjects->hCache = lh_GetNullObj(); } if (!lh_ObjIsNull(pRecogObjects->hClc)){ lh_ObjClose(&pRecogObjects->hClc); pRecogObjects->hClc = lh_GetNullObj(); } if (!lh_ObjIsNull(pRecogObjects->hLex)){ lh_LexClearG2P(pRecogObjects->hLex); lh_ObjClose(&pRecogObjects->hLex); pRecogObjects->hLex = lh_GetNullObj(); } if (!lh_ObjIsNull(pRecogObjects->hDdg2p)){ lh_DDG2PClearDct (pRecogObjects->hDdg2p); lh_ObjClose(&pRecogObjects->hDdg2p); pRecogObjects->hDdg2p = lh_GetNullObj(); } for(curCtx=0; curCtx<sizeof(recogObjects.hCtx)/sizeof(recogObjects.hCtx[0]); curCtx++){ if (!lh_ObjIsNull(pRecogObjects->hCtx[curCtx])){ lh_RecRemoveCtx(pRecogObjects->hRec, pRecogObjects->hCtx[curCtx]); lh_ObjClose(&pRecogObjects->hCtx[curCtx]); pRecogObjects->hCtx[curCtx] = lh_GetNullObj(); } if (!lh_ObjIsNull(pRecogObjects->hResults[curCtx])){ lh_ObjClose(&pRecogObjects->hResults[curCtx]); pRecogObjects->hResults[curCtx] = lh_GetNullObj(); } } if (!lh_ObjIsNull(pRecogObjects->hRec)){ lh_ObjClose(&pRecogObjects->hRec); pRecogObjects->hRec = lh_GetNullObj(); } if (!lh_ObjIsNull(pRecogObjects->hAcMod)){ lh_ObjClose(&pRecogObjects->hAcMod); pRecogObjects->hAcMod = lh_GetNullObj(); } return 0; } int ASR_Initialize(const char * acmod, const char * ddg2p, const char 
* clc, const char * dct, const char * dynctx) { int rc = 0; size_t curCtx; LH_HEAP_INTERFACE HeapInterface; // Initialization of all handles. recogObjects.pHeapInst = NULL; recogObjects.hCompBase = lh_GetNullComponent(); recogObjects.hCompAsr = lh_GetNullComponent(); recogObjects.hCompPron = lh_GetNullComponent(); recogObjects.hAcMod = lh_GetNullObj(); for(curCtx=0; curCtx<sizeof(recogObjects.hCtx)/sizeof(recogObjects.hCtx[0]); curCtx++){ recogObjects.hCtx[curCtx] = lh_GetNullObj(); recogObjects.hResults[curCtx] = lh_GetNullObj(); } recogObjects.hRec = lh_GetNullObj(); recogObjects.hLex = lh_GetNullObj(); recogObjects.hDdg2p = lh_GetNullObj(); recogObjects.hClc = lh_GetNullObj(); recogObjects.hCache = lh_GetNullObj(); recogObjects.hDct = lh_GetNullObj(); recogObjects.hUswCtx = lh_GetNullObj(); recogObjects.hUswResult = lh_GetNullObj(); recogObjects.sampleFreq = 0; recogObjects.requestCancel = 0; recogObjects.pTerminals = NULL; recogObjects.terminals_count= 0; recogObjects.pSlots = NULL; recogObjects.slots_count = 0; recogObjects.staticCtxID = NULL; // Construct all components and objects needed for recognition. // Connect the audiochain objects. if (acmod) { // initialize components // Create a base and an ASR component. (+pron for dyn ctx) if(LH_OK != lh_InitBase(&HeapInterface, recogObjects.pHeapInst, LhErrorCallBack, NULL, &recogObjects.hCompBase)) goto error; if(LH_OK != lh_InitAsr(recogObjects.hCompBase, &HeapInterface, recogObjects.pHeapInst, &recogObjects.hCompAsr)) goto error; if(LH_OK != lh_InitPron(recogObjects.hCompBase, &HeapInterface, recogObjects.pHeapInst, &recogObjects.hCompPron)) goto error; rc = ConstructRecognizer(&recogObjects, acmod, ddg2p, clc, dct, dynctx); if (rc<0) goto error; } return rc; error: // An error occured. Close the engine. CloseOnError(&recogObjects); return -1; } int ASR_UnInitialize(void) { int rc; // Disconnects the audiochain objects. // Closes all objects and components of the vocon recognizer. 
rc = DestroyRecognizer(&recogObjects); // Close the PRON component. lh_ComponentTerminate(&recogObjects.hCompPron); // Close the ASR and Base component. lh_ComponentTerminate(&recogObjects.hCompAsr); lh_ComponentTerminate(&recogObjects.hCompBase); return 0; } int evt_HandleEvent(void *pEvtInst, unsigned long type, LH_TIME timeMs) { RECOG_OBJECTS *pRecogObjects = (RECOG_OBJECTS*)pEvtInst; if ( type & LH_AUDIOCHAIN_EVENT_BOS ){ // ask upper level for beep printf ("Receiving event LH_AUDIOCHAIN_EVENT_BOS at time %d ms.\n", timeMs); } if ( type & LH_AUDIOCHAIN_EVENT_TS_FX ) { printf ("Receiving event LH_AUDIOCHAIN_EVENT_TS_FX at time %d ms.\n", timeMs); } if ( type & LH_AUDIOCHAIN_EVENT_TS_REC ) { printf ("Receiving event LH_AUDIOCHAIN_EVENT_TS_REC at time %d ms.\n", timeMs); } if ( type & LH_AUDIOCHAIN_EVENT_FX_ABNORMCOND ) { LH_ERROR lhErr = LH_OK; LH_FX_ABNORMCOND abnormCondition; printf ("Receiving event LH_AUDIOCHAIN_EVENT_FX_ABNORMCOND at time %d ms.\n", timeMs); // Find out what the exact abnormal condition is. lhErr = lh_FxGetAbnormCondition(pRecogObjects->hRec, &abnormCondition); if (LH_OK != lhErr) goto error; switch (abnormCondition) { case LH_FX_BADSNR: printf ("Abnormal condition: LH_FX_BADSNR.\n"); break; case LH_FX_OVERLOAD: printf ("Abnormal condition: LH_FX_OVERLOAD.\n"); break; case LH_FX_TOOQUIET: printf ("Abnormal condition: LH_FX_TOOQUIET.\n"); break; case LH_FX_NOSIGNAL: printf ("Abnormal condition: LH_FX_NOSIGNAL.\n"); break; case LH_FX_POORMIC: printf ("Abnormal condition: LH_FX_POORMIC.\n"); break; case LH_FX_NOLEADINGSILENCE: printf ("Abnormal condition: LH_FX_NOLEADINGSILENCE.\n"); break; } } // LH_AUDIOCHAIN_EVENT_FX_TIMER // It usually is used to get the signal level and SNR at regular intervals. 
    if ( type & LH_AUDIOCHAIN_EVENT_FX_TIMER ) {
        LH_ERROR lhErr = LH_OK;
        LH_FX_SIGNAL_LEVELS SignalLevels;

        printf ("Receiving event LH_AUDIOCHAIN_EVENT_FX_TIMER at time %d ms.\n", timeMs);
        lhErr = lh_FxGetSignalLevels(pRecogObjects->hRec, &SignalLevels);
        if (LH_OK != lhErr) goto error;
        printf ("Signal level: %ddB, SNR: %ddB at time %dms.\n", SignalLevels.energy, SignalLevels.SNR, SignalLevels.timeMs);
    }

    // LH_AUDIOCHAIN_EVENT_RESULT
    if ( type & LH_AUDIOCHAIN_EVENT_RESULT ){
        LH_ERROR lhErr = LH_OK;
        LH_OBJECT hNBestRes = lh_GetNullObj();
        LH_OBJECT hCtx = lh_GetNullObj();

        printf ("Receiving event LH_AUDIOCHAIN_EVENT_RESULT at time %d ms.\n", timeMs);

        // Get the NBest result object and process it.
        lhErr = lh_RecCreateResult (pRecogObjects->hRec, &hNBestRes);
        if (LH_OK == lhErr) {
            if (LH_OK == lh_ResultBorrowSourceCtx(hNBestRes, &hCtx)){
                int i;
                int _ready = 0;
                for(i=0; i<sizeof(pRecogObjects->hCtx)/sizeof(pRecogObjects->hCtx[0]); i++){
                    if(!lh_ObjIsNull(pRecogObjects->hCtx[i])){
                        if(hCtx.pObj == pRecogObjects->hCtx[i].pObj){
                            if(!lh_ObjIsNull(pRecogObjects->hResults[i])){
                                lh_ObjClose(&pRecogObjects->hResults[i]);
                            }
                            pRecogObjects->hResults[i] = hNBestRes;
                            hNBestRes = lh_GetNullObj();
                            _ready = 1;
                            break;
                        }
                    } else {
                        break;
                    }
                }
                if (_ready) {
                    for (i=0; i<sizeof(pRecogObjects->hCtx)/sizeof(pRecogObjects->hCtx[0]); i++) {
                        if(!lh_ObjIsNull(pRecogObjects->hCtx[i])){
                            if(lh_ObjIsNull(pRecogObjects->hResults[i])){
                                _ready = 0;
                            }
                        }
                    }
                }
                ASSERT(lh_ObjIsNull(hNBestRes));
                if (_ready) {
                    ProcessResult (pRecogObjects);
                    for(i=0; i<sizeof(pRecogObjects->hResults)/sizeof(pRecogObjects->hResults[0]); i++){
                        if(!lh_ObjIsNull(pRecogObjects->hResults[i])){
                            lh_ObjClose(&pRecogObjects->hResults[i]);
                        }
                    }
                }
            }
            // Close the NBest result object.
        }
    }
    return 0;

error:
    return -1;
}

static int ProcessResult (RECOG_OBJECTS *pRecogObjects) {
    LH_ERROR lhErr = LH_OK;
    size_t curCtx, i, k, count=0;
    size_t nbrHypothesis;
    ASRResult *r = NULL;
    long lid;

    // get total hyp count
    for(curCtx=0; curCtx<sizeof(pRecogObjects->hCtx)/sizeof(pRecogObjects->hCtx[0]); curCtx++){
        if(!lh_ObjIsNull(pRecogObjects->hResults[curCtx])){
            if(LH_OK == lh_NBestResultGetNbrHypotheses (pRecogObjects->hResults[curCtx], &nbrHypothesis)){
                count += nbrHypothesis;
            }
        }
    }

    // traces (size_t values are cast to match the %lu conversions)
    printf ("\n");
    printf (" __________RESULT %3lu items max_______________\n", (unsigned long)count);
    printf ("|        |        |\n");
    printf ("| result | confi- | result string [start rule]\n");
    printf ("| number | dence  |\n");
    printf ("|________|________|___________________________\n");
    printf ("|        |        |\n");

    if (count>0) {
        r = ASRResult_New(count);

        // Get & print out the result information for each hypothesis.
        count = 0;
        curCtx = sizeof(pRecogObjects->hCtx)/sizeof(pRecogObjects->hCtx[0]);
        for(; curCtx>0; curCtx--){
            LH_OBJECT hNBestRes = pRecogObjects->hResults[curCtx-1];
            if(!lh_ObjIsNull(hNBestRes)){
                LH_HYPOTHESIS *pHypothesis;
                if(LH_OK == lh_NBestResultGetNbrHypotheses (hNBestRes, &nbrHypothesis)){
                    for (i = 0; i < nbrHypothesis; i++) {
                        char *szResultWords;

                        // Retrieve information on the recognition result.
                        if (LH_OK == lh_NBestResultFetchHypothesis (hNBestRes, i, &pHypothesis)){
                            // Get the result string.
                            if (LH_OK == lh_NBestResultFetchWords (hNBestRes, i, &szResultWords)){
                                printf ("| %6lu | %6lu | '%s' [%s]\n", (unsigned long)i, pHypothesis->conf, szResultWords, pHypothesis->szStartRule);
                                // Return the fetched data to the engine.
                                lh_NBestResultReturnWords (hNBestRes, szResultWords);
                            }
                            lh_NBestResultReturnHypothesis (hNBestRes, pHypothesis);
                        }
                    }
                }
            }
        }
    }

    // traces
    printf ("|________|________|___________________________\n");
    printf ("\n");
    return 0;
}
As with TTS, the code is clearly quite large, and the preparatory steps take up most of it. And this is still not fully working code: I threw out a lot of non-essential parts before publishing. For those who have read this far, all of this shows once again that speech I/O technology has a rather high "entry threshold".
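The readiness check buried in evt_HandleEvent is easier to follow in isolation. The following is only a simplified sketch of that idea, not engine code: the names CtxSlot, store_result, all_results_ready and clear_results are hypothetical stand-ins for the engine handles, and the logic is reduced to "keep each context's result until every active context has one, then process them together and clear".

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CTX 4

/* Hypothetical stand-in for one recognition context and its pending result. */
typedef struct {
    int active;      /* non-zero if this context is in use */
    int has_result;  /* non-zero once a result has arrived for it */
    int result_id;   /* placeholder for the real result handle */
} CtxSlot;

/* Store a result for the context at index ctx, replacing any older one.
   Returns 0 on success, -1 if the context is out of range or inactive. */
int store_result(CtxSlot slots[], size_t n, size_t ctx, int result_id)
{
    assert(slots != NULL);
    if (ctx >= n || !slots[ctx].active)
        return -1;
    slots[ctx].has_result = 1;
    slots[ctx].result_id = result_id;
    return 0;
}

/* Returns 1 when at least one context is active and every active
   context already holds a result; returns 0 otherwise. */
int all_results_ready(const CtxSlot slots[], size_t n)
{
    size_t i;
    int any_active = 0;

    assert(slots != NULL);
    for (i = 0; i < n; i++) {
        if (!slots[i].active)
            continue;
        any_active = 1;
        if (!slots[i].has_result)
            return 0;
    }
    return any_active;
}

/* Drop all stored results so the next utterance starts clean. */
void clear_results(CtxSlot slots[], size_t n)
{
    size_t i;

    assert(slots != NULL);
    for (i = 0; i < n; i++)
        slots[i].has_result = 0;
}
```

A caller would invoke store_result each time a result event arrives, and only when all_results_ready returns 1 would it print the hypotheses and then call clear_results.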
14. Streaming recognition (dictation)
The latest word in current technology is streaming recognition, that is, dictation. It is already available on recent Android and iOS smartphones, including in the form of an API. Here the programmer does not need to define a recognition context by writing grammars: speech goes in, recognized words come out. Unfortunately, details of how this method works are not yet available. Recognition does not happen on the device itself but on a server, which receives the audio and returns the result. Still, I would like to believe that within a few years the technology will become available on the client side.
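As an illustration only, the client side of such a service can be pictured as cutting the audio into chunks and shipping them to the server, with no grammar or context prepared beforehand. The sketch below assumes a hypothetical send callback (SendChunkFn) and a stub connection (StubConn, stub_send) that merely counts what it receives. None of these names come from the Android, iOS or Nuance APIs, and a real service would of course also return text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback that ships one chunk of 16-bit PCM audio to a
   remote recognizer. Returns 0 on success. Not a real API. */
typedef int (*SendChunkFn)(void *conn, const short *pcm, size_t nSamples);

/* Split the utterance into chunks of at most chunkSamples samples and
   send them in order. Returns the number of chunks sent, or -1 if a
   send fails. */
long stream_audio(const short *pcm, size_t nSamples, size_t chunkSamples,
                  SendChunkFn send, void *conn)
{
    size_t offset = 0;
    long sent = 0;

    assert(pcm != NULL && send != NULL && chunkSamples > 0);
    while (offset < nSamples) {
        size_t n = nSamples - offset;
        if (n > chunkSamples)
            n = chunkSamples;
        if (send(conn, pcm + offset, n) != 0)
            return -1;
        offset += n;
        sent++;
    }
    return sent;
}

/* Stub "server" connection used only for illustration: it counts the
   chunks and samples it receives instead of recognizing anything. */
typedef struct {
    size_t chunks;
    size_t samples;
} StubConn;

int stub_send(void *conn, const short *pcm, size_t nSamples)
{
    StubConn *c = (StubConn *)conn;

    (void)pcm;  /* the stub ignores the audio content */
    c->chunks++;
    c->samples += nSamples;
    return 0;
}
```

With 1000 samples and 320-sample chunks, stream_audio calls the callback four times, the last call carrying the remaining 40 samples.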
Conclusion
That is probably all I wanted to say about ASR and TTS technologies. I hope it turned out not too boring and reasonably informative.