The C source code for word2vec is available, and Google provides a description of it. R can use external libraries written in C, C++, and Fortran. Incidentally, the fastest R libraries are written precisely in C and C++. There is also an R wrapper, tmcn.word2vec, which is still under development. Its author, Jian Li (his website is in Chinese), has made something like a demo version for Chinese (it also works with English; I have not tried it with Russian yet). The problems with this version are as follows:
- First, all parameters are hardcoded in the C code.
- Second, the author wrote only one function for working with a trained model; it evaluates word similarity and prints the 20 candidates with the highest values.
- Third, I could not build the package for x64 Windows. On win32 the package installs without problems.
Having assessed all this "wealth", I decided to write my own version of an R interface to word2vec. To be honest, I do not know C well and have only written simple programs in it, so I decided to take Jian Li's source code as a base, since it is known to compile on Windows. If something does not work, it can always be compared with the original.
Preparation
To compile C code for R on Windows, you additionally need to install Rtools. This toolkit includes the gcc compiler, which runs under Cygwin. After installing Rtools, check the PATH variable. It should contain something like this:
D:\Rtools\bin;D:\Rtools\gcc-4.6.3\bin;D:\R\bin
On OS X, Rtools is not needed. You need an installed compiler, whose presence can be checked with the gcc --version command. If there is none, install Xcode and then, through Xcode, the Command Line Tools.
To call C libraries from R, you need to know the following:
- When a function is called, all values are passed as pointers, and you have to take care to specify their types explicitly. The most reliable approach is to pass parameters as char and convert them to the required type on the C side.
- The called function does not return a value; it must be of type void.
- The C code needs the #include <R.h> directive and, if there is complex math, #include <Rmath.h> as well.
- If you need to print something to the R console, it is recommended to use Rprintf() instead of printf(). That said, printf() works too.
I decided to start with something very simple, such as Hello, World!, but with a value passed in. RStudio, which I normally use, lets you write C and C++ code and highlights everything correctly. After writing the code and saving it as hello.c, I opened the command line, changed to the target directory, and ran the compiler with the following command:
> R --arch x64 CMD SHLIB hello.c
On win32, the architecture switch is not needed:
> R CMD SHLIB hello.c
As a result, two files appeared in the directory: hello.o (which can safely be deleted) and the hello.dll library. (On OS X you get a file with the .so extension instead of a dll.) The resulting hello function is called from R with the following code:
dyn.load("hello.dll")
hellof <- function(n) {
  .C("hello", as.integer(n))
}
hellof(5)
The test showed that everything works correctly, so for experiments with word2vec all that remained was to prepare the data. I decided to take it from Kaggle, from the "Bag of Words Meets Bags of Popcorn" task. It has training, test, and unlabeled samples, which together contain 100,000 movie reviews from IMDB. After downloading these files, I removed HTML tags, special characters, digits, punctuation, and stop words from them, and tokenized them. I omit the processing details here; I have already written about them.
Word2vec accepts training data as a text file consisting of a single line of words separated by spaces (I found this out by analyzing the word2vec usage examples in the official documentation). I joined the datasets into a single line and saved it to a text file.
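As a minimal sketch of that last step, assuming the cleaned reviews are already held as a list of character vectors of tokens (the `reviews` object and the output path below are made up for illustration), the single-line training file could be produced like this:

```r
# Hypothetical input: one character vector of tokens per cleaned review.
reviews <- list(
  c("great", "movie", "loved", "acting"),
  c("terrible", "plot", "bad", "acting")
)

# Join the tokens of each review with spaces, then join all reviews
# into one long space-separated string.
one_line <- paste(vapply(reviews, paste, character(1), collapse = " "),
                  collapse = " ")

# Write the string as a single line of text (temporary path for the example).
train_path <- file.path(tempdir(), "train_data.txt")
writeLines(one_line, train_path)
```

On a corpus of realistic size it may be preferable to append to the file review by review rather than build one huge string in memory.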
Model
In Jian Li's variant, the model consists of two files, word2vec.h and word2vec.c. The first contains the main code, which matches the original word2vec.c. The second is a wrapper for calling the TrainModel() function. The first thing I decided to do was to bring all the model parameters out into the R code. I had to edit the R script and the wrapper in word2vec.c, which resulted in the following construct:
dyn.load("word2vec.dll")

word2vec <- function(train_file, output_file, binary, cbow, num_threads,
                     num_features, window, min_count, sample) {
  # ... ...
  OUT <- .C("CWrapper_word2vec",
            train_file = as.character(train_file),
            output_file = as.character(output_file),
            binary = as.character(binary),
            # ...
  )
  # ... OUT ...
}

word2vec("train_data.txt", "model.bin",
         binary = 1,          # output format, 1-binary, 0-txt
         cbow = 0,            # skip-gram (0) or continuous bag of words (1)
         num_threads = 1,     # num of workers
         num_features = 300,  # word vector dimensionality
         window = 10,         # context / window size
         min_count = 40,      # minimum word count
         sample = 1e-3        # downsampling of frequent words
)
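A few words about the parameters:
binary - model output format.
cbow - which algorithm to use for training, skip-gram or continuous bag of words (cbow). Skip-gram is slower but gives better results on rare words.
num_threads - number of processor threads involved in building the model.
num_features - dimensionality of the word space (that is, of the vector for each word); tens to hundreds are recommended.
window - how many words of context the training algorithm should take into account.
min_count - limits the size of the dictionary to meaningful words. Words that occur in the text fewer times than the specified number are ignored. The recommended value is from 10 to 100.
sample - frequency threshold for words in the text; words that occur more often than this are randomly downsampled. Values from .00001 to .01 are recommended.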
I compiled it with the following command, using the switches recommended in the makefile:
> R --arch x64 CMD SHLIB -lm -pthread -O3 -march=native -Wall -funroll-loops -Wno-unused-result word2vec.c
The compiler issued a few warnings, but nothing serious. The library loaded into R without problems via dyn.load("word2vec.dll"), and I ran the function of the same name. I think only the pthread switch is really useful. You can do without the rest (some of them are already set in the Rtools configuration).
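Results:
In total, my file turned out to contain 11.5 million words and the dictionary 19,133 words; building the model took 6 minutes on a computer with an Intel Core i7. To check whether the options work, I changed num_threads from 1 to 6. There was no need even to look at the resource monitor: the model build time dropped to a minute and a half. In other words, this thing can process 11 million words in a matter of minutes.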
Similarity evaluation
In distance, I changed practically nothing; I only pulled out a parameter for the number of returned values. Then I compiled the library, loaded it into R, and ran a quick check on the words "bad" and "good":
Word: bad  Position in vocabulary: 15
    Word       CosDist
1   terrible   0.5778409
2   horrible   0.5541780
3   lousy      0.5527389
4   terrible   0.5206609
5   laughable  0.4910716
6   atrocious  0.4841466
7   horrible   0.4808238
8   good       0.4805901
9   bad        0.4726501
10  horrible   0.4579800

Word: good  Position in vocabulary: 6
    Word       CosDist
1   decent     0.5678578
2   nice       0.5364762
3   wonderful  0.5197815
4   bad        0.4805902
5   excellent  0.4554003
6   good       0.4365533
7   okay       0.4361723
8   really     0.4153538
9   like       0.4061105
10  fine       0.4004776
Again, everything worked. Interestingly, if you count in words, the distance from bad to good is greater than from good to bad. As the saying goes, "from love to hate..." is closer than the other way round. The algorithm computes similarity as the cosine of the angle between the vectors, using the following formula (image from the wiki):
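For reference, the standard definition of cosine similarity between two word vectors can be written as follows. The symbols $A$, $B$ and the dimension $n$ are notation assumed here for illustration; the computation is the same one the R function `similarity` further down performs.

```latex
\text{similarity} = \cos\theta
  = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
  = \frac{\sum_{i=1}^{n} A_i B_i}
         {\sqrt{\sum_{i=1}^{n} A_i^{2}} \; \sqrt{\sum_{i=1}^{n} B_i^{2}}}
```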
So, with a trained model you can compute distances without C and, instead of similarity, evaluate difference, for example. To do this, you need to build the model in text format (binary = 0), load it into R with read.table(), and write a certain amount of code. The code, without exception handling:
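similarity <- function(word1, word2, model) {
  # Column 1 holds the word; columns 2..(ncol - 1) are used as the vector
  size <- ncol(model) - 1
  vec1 <- model[model$word == word1, 2:size]
  vec2 <- model[model$word == word2, 2:size]
  # Cosine of the angle between the two vectors
  sim <- sum(vec1 * vec2)
  sim <- sim / (sqrt(sum(vec1^2)) * sqrt(sum(vec2^2)))
  return(sim)
}

difference <- function(string, model) {
  # Split the query into words (tokenize() is not defined in this snippet)
  words <- tokenize(string)
  num_words <- length(words)
  diff_mx <- matrix(rep(0, num_words^2), nrow = num_words, ncol = num_words)
  # Pairwise similarities; the diagonal stays at zero
  for (i in 1:num_words) {
    for (j in 1:num_words) {
      sim <- similarity(words[i], words[j], model)
      if (i != j) {
        diff_mx[i, j] <- sim
      }
    }
  }
  # The word least similar to all the others
  return(words[which.min(rowSums(diff_mx))])
}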
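Here a square matrix is built whose size is the number of words in the query by the number of words in the query. Then similarity is computed for every pair of non-identical words. Next, the values are summed by row and the row with the minimum sum is found. The row number corresponds to the position of the "odd" word in the query. The work can be sped up by computing only half of the matrix. A few examples: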
> difference("squirrel deer human dog cat", model)
[1] "human"
> difference("bad red good nice terrible", model)
[1] "red"
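The half-matrix speed-up mentioned above could look roughly like the sketch below. It is not the code from the article: the toy `model` data frame, the helper `cosine`, and the name `difference_half` are assumptions made for illustration, the query is split with strsplit() rather than the tokenizer used above, and every query word is assumed to be present in the model.

```r
# Toy stand-in for a model loaded with read.table(): first column "word",
# remaining columns the vector components (values are made up).
model <- data.frame(
  word = c("cat", "dog", "fox", "car"),
  V1   = c(0.9, 0.8, 0.85, 0.1),
  V2   = c(0.1, 0.2, 0.15, 0.9),
  stringsAsFactors = FALSE
)

cosine <- function(v1, v2) {
  sum(v1 * v2) / (sqrt(sum(v1^2)) * sqrt(sum(v2^2)))
}

difference_half <- function(string, model) {
  words <- strsplit(string, " ", fixed = TRUE)[[1]]
  # Numeric matrix with one row per query word, in query order.
  vecs <- as.matrix(model[match(words, model$word), -1])
  n <- length(words)
  sim_mx <- matrix(0, nrow = n, ncol = n)
  # Compute only the upper triangle (i < j) and mirror each value,
  # since cosine similarity is symmetric.
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      sim_mx[i, j] <- cosine(vecs[i, ], vecs[j, ])
      sim_mx[j, i] <- sim_mx[i, j]
    }
  }
  # The word with the smallest total similarity to the others is the odd one.
  words[which.min(rowSums(sim_mx))]
}

difference_half("cat dog fox car", model)
```

Extracting the vectors once into a matrix also avoids repeating the data frame lookup for every pair, which is likely to matter more than the halved loop for long queries.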
Analogies
Searching for analogies lets you solve problems such as "a man is to a woman as a king is to whom?". The dedicated word-analogy function exists only in the original Google code, so I had to tinker with it. I wrote a wrapper to call the function from R, removed the infinite loop from the code, and replaced the standard input/output streams with parameter passing. Then I compiled it into a library and ran several experiments. I had no luck with the queen; apparently 11 million words are not enough for that (the word2vec authors recommend about a billion). Some successful examples:
> analogy("model300.bin", "man woman king", 3)
   Word        CosDist
1  throne      0.4466286
2  lear        0.4268206
3  princess    0.4251665
> analogy("model300.bin", "man woman husband", 3)
   Word        CosDist
1  wife        0.6323696
2  unfaithful  0.5626401
3  married     0.5268299
> analogy("model300.bin", "man woman boy", 3)
   Word        CosDist
1  girl        0.6313665
2  mother      0.4309490
3  teenage     0.4272232
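To make the idea concrete, here is a minimal sketch of the usual vector-arithmetic approach to analogies, written in plain R on a tiny hand-made matrix. It is not the wrapper described above: the function name `analogy_sketch`, its arguments, and the toy vectors are all assumptions for illustration.

```r
# Toy word vectors (one row per word); the values are made up.
vectors <- rbind(
  man   = c(1.0, 0.0, 0.1),
  woman = c(1.0, 1.0, 0.1),
  king  = c(0.2, 0.0, 1.0),
  queen = c(0.2, 1.0, 1.0),
  apple = c(0.9, 0.1, 0.0)
)

analogy_sketch <- function(w1, w2, w3, vectors, n = 1) {
  # Scale every row to unit length so that dot products equal cosines.
  unit <- vectors / sqrt(rowSums(vectors^2))
  # "w1 is to w2 as w3 is to ?"  ->  target = w2 - w1 + w3
  target <- unit[w2, ] - unit[w1, ] + unit[w3, ]
  target <- target / sqrt(sum(target^2))
  # Cosine between the target and every word in the vocabulary.
  sims <- drop(unit %*% target)
  # The query words themselves are not valid answers.
  sims <- sims[!(names(sims) %in% c(w1, w2, w3))]
  head(sort(sims, decreasing = TRUE), n)
}

analogy_sketch("man", "woman", "king", vectors)
```

On these toy vectors the top candidate is "queen"; on a real model the quality of the answer depends heavily on the size of the training corpus, as noted above.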
Clustering
After reading the documentation, I found that word2vec has built-in K-Means clustering. To use it, it is enough to "pull out" one more parameter into R: classes. This is the number of clusters; if it is greater than zero, word2vec produces a text file in the format word - cluster number. 300 clusters turned out to be too few to get anything sensible. A heuristic from the developers is the dictionary size divided by 5. Accordingly, I chose 3000. Here are a few successful clusters (successful in the sense that I understand why these words ended up close together):
       word         id
335    humor        2952
489    serious      2952
872    clever       2952
1035   humor        2952
1796   references   2952
1916   satire       2952
2061   slapstick    2952
2367   quirky       2952
2810   crude        2952
2953   irony        2952
3125   outrageous   2952
3296   farce        2952
3594   broad        2952
4870   [garbled]    2952
4979   edgy         2952

       word         id
1025   cat          241
3242   mouse        241
11189  [garbled]    241

       word         id
1089   army         322
1127   army         322
1556   mission      322
1558   soldiers     322
3254   [garbled]    322
3323   combat       322
3902   command      322
3975   unit         322
4270   colonel      322
4277   commander    322
7821   platoon      322
7853   marines      322
8691   navy         322
9762   [garbled]    322
10391  gi           322
12452  corps        322
15839  infantry     322
16697  [garbled]    322
With the help of clustering, sentiment analysis is easy to do. For this you need to build a "bag of clusters": a matrix of size number of reviews by the maximum number of clusters. Each cell of such a matrix should hold the number of words from the review that fall into the given cluster. I have not tried it, but I see no problems here. They say that the accuracy on the IMDB reviews comes out the same as, or slightly lower than, doing it via a "bag of words".
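A rough sketch of how such a "bag of clusters" matrix might be built is shown below. Everything in it is hypothetical: the `clusters` table stands in for the word - cluster number file, `reviews` for the tokenized reviews, and cluster ids are assumed to start at 1 (if the file numbers clusters from 0, add 1 to the ids first).

```r
# Hypothetical word -> cluster table, as it might look after reading
# the clustering output file into a data frame.
clusters <- data.frame(
  word = c("funny", "witty", "army", "navy"),
  id   = c(1, 1, 2, 2),
  stringsAsFactors = FALSE
)

# Hypothetical tokenized reviews.
reviews <- list(
  c("funny", "witty", "funny", "unknown"),
  c("army", "navy", "funny")
)

bag_of_clusters <- function(reviews, clusters) {
  n_clusters <- max(clusters$id)
  counts <- vapply(reviews, function(tokens) {
    # Cluster id of every token; words missing from the table give NA.
    ids <- clusters$id[match(tokens, clusters$word)]
    # Count how many tokens fall into each cluster 1..n_clusters.
    tabulate(ids[!is.na(ids)], nbins = n_clusters)
  }, integer(n_clusters))
  # vapply() returns clusters x reviews; transpose to reviews x clusters.
  t(counts)
}

boc <- bag_of_clusters(reviews, clusters)
boc
```

The resulting matrix can then be fed to any classifier in place of a bag-of-words matrix.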
Phrases
Word2vec can work with phrases, or rather with stable combinations of words. For this, the original code contains the word2phrase procedure. Its job is to find frequently occurring combinations of words and replace the space between them with an underscore. The file obtained after the first pass contains two-word combinations. If you send it through word2phrase again, three- and four-word combinations appear. The result can then be used to train word2vec.
By analogy with word2vec, I made this procedure callable from R:
word2phrase("train_data.txt", "train_phrase.txt", min_count=5, threshold=100)
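The min_count parameter excludes phrases that occur fewer times than the specified value. The threshold value controls the sensitivity of the algorithm: the larger the value, the fewer phrases are found. After the second pass I ended up with about 6,000 combinations. To look at the phrases themselves, I first built a model in text format, pulled the word column out of it, and filtered it by the underscore. Here is a sample: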
[5887] "works_perfectly"         "four_year_old"     "multi_million_dollar"
[5890] "fresh_faced"             "return_living_dead" "seemed_forced"
[5893] "freddie_prinze_jr"       "re_lucky"          "puerto_rico"
[5896] "every_sentence"          "living_hell"       "went_straight"
[5899] "supporting_cast_include" "action_set_pieces" "space_shuttle"
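The filtering step itself is short. A sketch, assuming the word column of the text-format model has already been extracted into a character vector (the `words` vector here is a made-up stand-in):

```r
# Hypothetical vocabulary column taken from a text-format model.
words <- c("movie", "works_perfectly", "puerto_rico", "good", "space_shuttle")

# Keep only the entries that contain an underscore, i.e. the phrases
# that word2phrase glued together.
phrases <- words[grepl("_", words, fixed = TRUE)]
phrases
```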
I picked a few phrases to try with distance():
> distance("p_model300_2.bin", "crouching_tiger_hidden_dragon", 10)
Word: crouching_tiger_hidden_dragon  Position in vocabulary: 15492
    Word                  CosDist
1   tsui_hark             0.6041993
2   ang_lee               0.5996884
3   martial_arts_films    0.5541546
4   kung_fu_hustle        0.5381692
5   blockbuster           0.5305687
6   kill_bill             0.5279162
7   grindhouse            0.5242150
8   [garbled]             0.5224440
9   budget                0.5141657
10  john_woo              0.5046486
> distance("p_model300_2.bin", "academy_award_winning", 10)
Word: academy_award_winning  Position in vocabulary: 15780
    Word                  CosDist
1   nominated             0.4570983
2   ever_produced         0.4558123
3   francis_ford_coppola  0.4547777
4   producer_director     0.4545878
5   set_standard          0.4512480
6   [garbled]             0.4503479
7   won_academy_award     0.4477891
8   michael_mann          0.4464636
9   huge_budget           0.4424854
10  directorial_debut     0.4406852
That concludes the experiments. One important caveat: because word2vec works with memory directly, R may start to behave unstably afterwards and the session may crash. Sometimes the cause is the output of diagnostic messages from the OS, which R cannot handle correctly. If there are no errors in the code, restarting the interpreter or RStudio helps.
The R code, the C sources, and the compiled x64 Windows DLLs are in my repository.
UPD:
As a result of a discussion with ServPonomarev and a subsequent analysis of the word2vec code, it turned out that the algorithm trains on lines of 1000 words, along which it moves a window of plus/minus 5 words. When an EOL character is encountered, the algorithm converts it into a special word with number zero in the dictionary, stops moving the window, and continues on the new line. The representation in the model of words separated by an EOL differs from the representation of the same words separated by a space. Conclusion: if the source text is a collection of documents, or of phrases or paragraphs separated by line feeds, do not throw this extra information away; leave the EOL characters in the training set. Unfortunately, it is very hard to illustrate this with an example.