Data
The data in this set was collected from 30 volunteers carrying a Samsung Galaxy S II attached to the belt. The signals from the gyroscope and the accelerometer were specially processed and converted into 561 features. This involved multi-stage filtering, the Fourier transform, and several standard statistical transforms: 17 functions in total, from the mathematical expectation to the angle between vectors. I will skip the processing details; they can be found in the repository. The training sample contains 7352 cases and the test sample 2947. Both samples are labeled with one of six activities (walking, walking_up, walking_down, sitting, standing, laying).
I chose Random Forest as the base algorithm for the experiments. The choice came from two things: the prospect of hand-picking features from 561 variables was a little frightening, and RF has a built-in mechanism for estimating variable importance, which I wanted to try. I also decided to try a support vector machine (SVM). I know it is a classic, but I had never needed it before, and it was interesting to compare the quality of the models produced by the two algorithms.
After downloading and unpacking the archive with the data, I realized it would take some tinkering. The parts of the set, the feature names, and the activity labels were all in different files. I used read.table() to load the files into R. Automatic conversion of strings to factors had to be disabled, because it turned out that some variable names were duplicated. In addition, the names contained invalid characters. The problem was solved by the following construct built on sapply(), which in R often replaces a standard for loop:
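editNames <- function(x) {
  y <- var_names[x, 2]
  y <- sub("BodyBody", "Body", y)             # fix the doubled "Body" in some names
  y <- gsub("-", "", y)                       # drop hyphens
  y <- gsub(",", "_", y)                      # commas become underscores
  y <- paste0("v", var_names[x, 1], "_", y)   # index prefix makes every name unique
  return(y)
}
new_names <- sapply(1:nrow(var_names), editNames)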
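Since sub(), gsub() and paste0() are vectorized, the same cleanup can also be written without sapply(). The following is only a sketch on three made-up rows that imitate the feature-name file; the helper name clean_names and the extra removal of parentheses are my assumptions, not part of the code above.

```r
# Toy stand-in for the feature-name table (index, raw name); rows are illustrative
var_names <- read.table(
  text = "1 tBodyAcc-mean()-X
2 fBodyBodyGyroMag-std()
3 fBodyAcc-bandsEnergy()-1,8",
  stringsAsFactors = FALSE   # keep names as plain strings, not factors
)

# Hypothetical helper: works on whole vectors at once
clean_names <- function(idx, raw) {
  y <- sub("BodyBody", "Body", raw)   # fix the doubled "Body"
  y <- gsub("[()-]", "", y)           # assumption: also strip parentheses
  y <- gsub(",", "_", y)              # commas become underscores
  paste0("v", idx, "_", y)            # index prefix keeps names unique
}

new_names <- clean_names(var_names[, 1], var_names[, 2])
print(new_names)
```

The index prefix is what guarantees uniqueness even when two cleaned names would otherwise collide.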
I glued all parts of the set together with rbind() and cbind(). The first joins rows of the same structure, the second joins columns of the same length.
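As an illustration of how the two functions fit together, here is a minimal sketch on made-up pieces. The object names (x_train, y_train and so on) and the values are hypothetical stand-ins for the real files.

```r
# Toy pieces standing in for the separate files; names and sizes are illustrative
x_train <- data.frame(v1 = c(0.1, 0.2), v2 = c(-0.3, 0.4))
x_test  <- data.frame(v1 = 0.5, v2 = -0.6)
y_train <- data.frame(Activity = c("WALKING", "SITTING"))
y_test  <- data.frame(Activity = "LAYING")

# cbind(): put the label column next to the features (same number of rows)
train_data <- cbind(y_train, x_train)
test_data  <- cbind(y_test, x_test)

# rbind(): stack data frames that share the same columns
all_data <- rbind(train_data, test_data)
print(dim(all_data))
```

Putting Activity first matches the later code, which drops the label with train_data[, -1].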
With the training and test samples ready, the question arose whether the data needed preprocessing. I computed the range of values with range() in a loop. All features turned out to lie within [-1, 1], so neither normalization nor scaling was needed. Next I checked for features with a strongly skewed distribution, using the skewness() function from the e1071 package:
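A range check of this kind can also be written without an explicit loop. A minimal sketch on a made-up data frame (toy and its columns are illustrative, not the real data):

```r
# Toy data: first column is the label, the rest are numeric features
toy <- data.frame(
  Activity = factor(c("A", "B", "A", "B")),
  v1 = c(-1.0, 0.2, 0.5, 1.0),
  v2 = c(-0.4, 0.0, 0.3, 0.9)
)

# sapply() over the feature columns returns a 2 x p matrix:
# row 1 holds the minima, row 2 the maxima
ranges <- sapply(toy[, -1], range)

# TRUE when every feature already lies within [-1, 1]
in_unit <- all(ranges[1, ] >= -1 & ranges[2, ] <= 1)
print(in_unit)
```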
library(e1071)  # provides skewness()
SkewValues <- apply(train_data[, -1], 2, skewness)
head(SkewValues[order(abs(SkewValues), decreasing = TRUE)], 3)
apply() belongs to the same family as sapply() and is used when something has to be done row by row or column by column. train_data[, -1] is the data set without the dependent variable Activity, and 2 means the values are computed per column. This code prints the three worst variables:
v389_fBodyAccJerkbandsEnergy57_64    14.70005
v479_fBodyGyrobandsEnergy33_40       12.33718
v60_tGravityAcciqrX                  12.18477
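The closer these values are to zero, the less skewed the distribution, and here the skew is frankly rather large. For such cases caret has an implementation of the BoxCox transformation. I had read that Random Forest is not sensitive to this kind of thing, so I decided to leave the features as they were and at the same time see how SVM would cope.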
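To make the skewness numbers easier to read, here is a hand-rolled sketch of the plain moment-based estimator: the third central moment divided by the second central moment raised to the power 1.5. The function name skew and the two toy vectors are mine; e1071's skewness() offers several estimator variants, so its values can differ somewhat from this one.

```r
# Moment-based sample skewness (illustrative helper, not e1071's code)
skew <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

sym   <- c(-2, -1, 0, 1, 2)   # symmetric around its mean
right <- c(0, 0, 0, 0, 10)    # one large value stretches the right tail

print(skew(sym))
print(skew(right))
```

A symmetric sample gives zero, while a single large outlier on the right pushes the value well above zero, which is the pattern the three variables above show in an extreme form.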
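For the model quality criterion, the choice is between Accuracy and the Kappa agreement statistic. If the cases are unevenly distributed across classes, Kappa should be used: it is the same accuracy, but adjusted for the probability of "drawing" a given class at random. I checked the distribution by running summary() on the Activity column: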
WALKING  WALKING_UP  WALKING_DOWN  SITTING  STANDING  LAYING
   1226        1073           986     1286      1374    1407
The cases are distributed almost evenly, except perhaps for walking_down (some of the volunteers apparently did not enjoy going down the stairs), which means Accuracy can be used.
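The balance can be checked in numbers by turning the counts above into class shares. This is just a sketch that retypes the printed counts by hand instead of reading them from train_data.

```r
# Class counts copied from the summary() output above
counts <- c(WALKING = 1226, WALKING_UP = 1073, WALKING_DOWN = 986,
            SITTING = 1286, STANDING = 1374, LAYING = 1407)

share <- counts / sum(counts)   # fraction of training cases per class
print(round(share, 3))
print(names(which.min(share)))  # the least represented class
```

With six classes a perfectly even split would give each class a share of about 0.167.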
Training
I trained the RF model on the full feature set, using the following construct in R:
library(caret)
fitControl <- trainControl(method = "cv", number = 5)   # 5-fold cross-validation
set.seed(123)
forest_full <- train(Activity ~ ., data = train_data,
                     method = "rf", do.trace = 10, ntree = 100,
                     trControl = fitControl)
This sets up k-fold cross-validation with k = 5. By default three models are trained with different values of mtry (the number of features randomly drawn from the full set and considered as candidates at each split of a tree), and the best model is then chosen by accuracy. The number of trees is the same for all models: ntree = 100.
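What k-fold cross-validation does with the rows can be sketched in a few lines of base R. This is not caret's internal code, only an illustration of the idea on a made-up data set of 20 rows.

```r
set.seed(123)
n <- 20   # number of rows in a made-up data set
k <- 5    # number of folds, as in trainControl(number = 5)

# Assign every row to one of k folds at random, keeping fold sizes equal
fold <- sample(rep(seq_len(k), length.out = n))

held_out_count <- integer(n)
for (i in seq_len(k)) {
  test_rows  <- which(fold == i)   # rows evaluated in round i
  train_rows <- which(fold != i)   # rows a model would be fitted on in round i
  held_out_count[test_rows] <- held_out_count[test_rows] + 1L
}

print(table(fold))
print(all(held_out_count == 1L))   # every row is held out exactly once
```

The cross-validated accuracy is then the average of the k per-round accuracies, and it is this number that is compared across the candidate mtry values.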
To judge the quality of the model on the test sample I used the confusionMatrix(x, y) function from caret, where x is the vector of predicted values and y is the vector of values from the test sample. Here is part of its output:
              Reference
Prediction     WALKING  WALKING_UP  WALKING_DOWN  SITTING  STANDING  LAYING
  WALKING          482          38            17        0         0       0
  WALKING_UP         7         426            37        0         0       0
  WALKING_DOWN       7           7           366        0         0       0
  SITTING            0           0             0      433        51       0
  STANDING           0           0             0       58       481       0
  LAYING             0           0             0        0         0     537

Overall Statistics
  Accuracy : 0.9247
  95% CI   : (0.9145, 0.9339)
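Both Accuracy and Kappa can be recomputed by hand from this matrix. The sketch below retypes the counts and uses the usual definition of Cohen's kappa, (observed agreement - chance agreement) / (1 - chance agreement); the variable names are mine.

```r
classes <- c("WALKING", "WALKING_UP", "WALKING_DOWN", "SITTING", "STANDING", "LAYING")

# Rows are predictions, columns are reference labels (counts copied from the output above)
cm <- matrix(c(482,  38,  17,   0,   0,   0,
                 7, 426,  37,   0,   0,   0,
                 7,   7, 366,   0,   0,   0,
                 0,   0,   0, 433,  51,   0,
                 0,   0,   0,  58, 481,   0,
                 0,   0,   0,   0,   0, 537),
             nrow = 6, byrow = TRUE,
             dimnames = list(Prediction = classes, Reference = classes))

n <- sum(cm)
accuracy <- sum(diag(cm)) / n                      # share of correct predictions
expected <- sum(rowSums(cm) * colSums(cm)) / n^2   # agreement expected by chance
kappa_hat <- (accuracy - expected) / (1 - expected)

print(round(accuracy, 4))
print(round(kappa_hat, 4))
```

Because the classes are close to balanced, Kappa stays near Accuracy here, which supports using Accuracy as the criterion.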
On a laptop with an Intel Core i5, training on the full feature set took about 18 minutes. On OS X it can be made several times faster by using several processor cores through the doMC package, but doMC is not available for Windows.
caret supports several SVM implementations. I chose svmRadial (SVM with a radial basis function kernel): it is the one used most often with caret and is a general-purpose tool when nothing special is known about the data. To train an SVM model it is enough to change the method parameter of train() to svmRadial and remove the do.trace and ntree parameters. The algorithm gave the following result: accuracy on the test sample 0.952. Training with 5-fold cross-validation took a little over 7 minutes. I left myself a note: do not reach for Random Forest right away.
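The change described above amounts to a call like the one below. To keep the sketch self-contained it uses the built-in iris data as a stand-in for train_data, with Species playing the role of Activity; it assumes the caret and kernlab packages are installed.

```r
library(caret)   # train(), trainControl(), confusionMatrix(); svmRadial relies on kernlab

fitControl <- trainControl(method = "cv", number = 5)
set.seed(123)

# iris stands in for train_data here; no do.trace or ntree for an SVM
svm_fit <- train(Species ~ ., data = iris,
                 method = "svmRadial", trControl = fitControl)

# Predictions on the same rows, only to show the call shape;
# the article evaluates on a separate test sample instead
pred <- predict(svm_fit, newdata = iris)
print(confusionMatrix(pred, iris$Species)$overall["Accuracy"])
```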
Variable importance
The results of RF's built-in variable importance estimate can be obtained with the varImp() function from the caret package. A construct of the form plot(varImp(model), 20) shows the relative importance of the top 20 features.
"Acc" in a name means that the variable was obtained by processing the accelerometer signal, and "Gyro" means the gyroscope signal. A close look at the plot shows that there is no gyroscope data among the most important variables, which I personally find surprising and cannot explain. ("Body" and "Gravity" are the two components of the accelerometer signal; t and f are the time and frequency domains of the signal.)
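Feeding RF the features selected by its own importance is a pointless exercise: it has already selected and used them. With SVM, however, it may work. I started with the 10% most important variables and kept adding 10% at a time while monitoring accuracy; once the maximum was found, I reduced the step first to 5%, then to 2.5%, and finally to a single variable. In the end the maximum accuracy was reached at about 490 features and came to 0.9545, which is a quarter of a percent better than on the full feature set (a few additional correctly classified cases).

This work can be automated, because caret includes an implementation of RFE (Recursive Feature Elimination), which recursively removes and adds variables one at a time while monitoring model accuracy. There are two problems with it. First, RFE is very slow (it has Random Forest inside); for a data set with a similar number of features and cases the process would take about a day. Second, the accuracy that RFE evaluates is measured on the training sample, which is not at all the same as accuracy on the test sample.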
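The coarse-to-fine search over the number of top variables can be sketched as follows. The function score_top_k is a hypothetical stand-in: in the real experiment it would train an SVM on the k most important variables and return the test accuracy, whereas here it is a smooth toy curve with its peak placed at 490, so the numbers it produces are not results. The step sizes approximate 10%, 5% and 2.5% of 561 variables.

```r
# Hypothetical stand-in for "train SVM on the top-k variables, return accuracy"
score_top_k <- function(k) 0.95 - ((k - 490) / 2000)^2

p  <- 561      # total number of variables
lo <- 1
hi <- p
best_k <- NA

for (step in c(56, 28, 14, 1)) {            # about 10%, 5%, 2.5%, then one variable
  ks  <- unique(c(seq(lo, hi, by = step), hi))
  acc <- sapply(ks, score_top_k)
  best_k <- ks[which.max(acc)]
  lo <- max(1, best_k - step)               # narrow the window around the current best
  hi <- min(p, best_k + step)
}

print(best_k)
```

The narrowing only finds the true optimum when accuracy has a single peak as a function of k; a real accuracy curve is noisy, so the result should be treated as approximate.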
The code that extracts from varImp the names of a given number of features, ordered by decreasing importance, looks like this:
imp <- varImp(model)[[1]]
vars <- rownames(imp)[order(imp$Overall, decreasing = TRUE)][1:56]
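How the ordered names are then used to cut down a data set can be shown on toy objects. Everything below (imp, train_toy, top_n and the values) is made up for illustration; only the ordering expression mirrors the line above.

```r
# Made-up importance table with the same shape as varImp(model)[[1]]
imp <- data.frame(Overall = c(12.5, 100, 47.1, 3.2),
                  row.names = c("v1_a", "v2_b", "v3_c", "v4_d"))

# Made-up training data with matching column names
train_toy <- data.frame(Activity = factor(c("A", "B")),
                        v1_a = c(0.1, 0.2), v2_b = c(0.3, 0.4),
                        v3_c = c(0.5, 0.6), v4_d = c(0.7, 0.8))

top_n <- 2
vars <- rownames(imp)[order(imp$Overall, decreasing = TRUE)][seq_len(top_n)]

# Keep the label plus the top_n most important features
train_top <- train_toy[, c("Activity", vars)]
print(names(train_top))
```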
Feature filtering
To clear my conscience, I decided to try one more way of selecting features. I chose filtering based on the information gain ratio (IGR; the term is also encountered simply as information gain), a quantity closely related to the Kullback-Leibler divergence. IGR is a measure of the difference between the probability distributions of two random variables.
To compute IGR I used the information.gain() function from the FSelector package. The package requires a JRE. Incidentally, it also contains other tools that select features based on entropy and correlation. IGR values are the inverse of the "distance" between distributions and are normalized to [0, 1], so the closer to one the better. After computing IGR I ordered the list of variables by decreasing IGR and looked at the top 20.
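The idea behind an entropy-based filter can be sketched in base R for an already discretized feature: the information gain is the entropy of the class labels minus their entropy after splitting by the feature. This is my own illustration in bits, not FSelector's implementation; the helpers entropy and info_gain and the toy vectors are hypothetical.

```r
# Shannon entropy of a label vector, in bits
entropy <- function(x) {
  p <- table(x) / length(x)
  p <- p[p > 0]
  -sum(p * log2(p))
}

# Information gain of a discrete feature x with respect to labels y
info_gain <- function(x, y) {
  parts <- split(y, x)   # labels grouped by feature value
  h_cond <- sum(sapply(parts, function(part) length(part) / length(y) * entropy(part)))
  entropy(y) - h_cond
}

y      <- c("A", "A", "B", "B")
x_good <- c("lo", "lo", "hi", "hi")   # separates the classes perfectly
x_bad  <- c("lo", "hi", "lo", "hi")   # says nothing about the class

print(info_gain(x_good, y))
print(info_gain(x_bad, y))
```

A feature that separates the classes perfectly recovers the full entropy of the labels, while an uninformative one gains nothing; ranking features by this value is the filtering step.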
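IGR gives a completely different set of "important" features: only five of them coincided with the ones RF considered important. Again there are no gyroscope features at the top, but there are many features with the X component. The maximum IGR value is 0.897 and the minimum is 0. Having obtained the ordered list of features, I treated it the same way as the importance list. I tested it on SVM and RF, and it did not raise accuracy to any noticeable degree.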
I think feature selection does not always work in problems like this one, for at least two reasons. The first has to do with how the features were constructed. The researchers who prepared the data set tried to "squeeze" all the information out of the sensor signals, and probably did so deliberately. Some features carry more information and some less (only one variable turned out to have an IGR of zero). This is clearly visible when feature values are plotted for different levels of IGR. For definiteness I picked the 10th and the 551st. For the feature with high IGR the points are visually well separable; for the one with low IGR they look like a jumble of colors, yet they evidently still carry some share of useful information.
The second reason is that the dependent variable is a factor with more than two levels (six here). Reaching maximum accuracy in one class degrades performance in another. This can be seen in the confusion matrices for two different feature sets with the same accuracy:
Accuracy: 0.9243, 561 variables

              Reference
Prediction     WALKING  WALKING_UP  WALKING_DOWN  SITTING  STANDING  LAYING
  WALKING          483          36            20        0         0       0
  WALKING_UP         1         428            44        0         0       0
  WALKING_DOWN      12           7           356        0         0       0
  SITTING            0           0             0      433        45       0
  STANDING           0           0             0       58       487       0
  LAYING             0           0             0        0         0     537

Accuracy: 0.9243, 526 variables

              Reference
Prediction     WALKING  WALKING_UP  WALKING_DOWN  SITTING  STANDING  LAYING
  WALKING          482          40            16        0         0       0
  WALKING_UP         8         425            41        0         0       0
  WALKING_DOWN       6           6           363        0         0       0
  SITTING            0           0             0      429        44       0
  STANDING           0           0             0       62       488       0
  LAYING             0           0             0        0         0     537
The upper variant makes fewer errors in the first two classes (and in the fourth), the lower one in the third and the fifth.
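The trade-off between classes becomes explicit when per-class recall (correct predictions divided by the number of reference cases of that class) is computed for both matrices. The sketch retypes the counts from the two tables; the names to_cm, cm561 and cm526 are mine.

```r
classes <- c("WALKING", "WALKING_UP", "WALKING_DOWN", "SITTING", "STANDING", "LAYING")

# Helper: build a labeled confusion matrix from counts listed row by row
to_cm <- function(v) {
  matrix(v, nrow = 6, byrow = TRUE,
         dimnames = list(Prediction = classes, Reference = classes))
}

cm561 <- to_cm(c(483, 36, 20, 0, 0, 0,    1, 428, 44, 0, 0, 0,
                 12, 7, 356, 0, 0, 0,     0, 0, 0, 433, 45, 0,
                 0, 0, 0, 58, 487, 0,     0, 0, 0, 0, 0, 537))
cm526 <- to_cm(c(482, 40, 16, 0, 0, 0,    8, 425, 41, 0, 0, 0,
                 6, 6, 363, 0, 0, 0,      0, 0, 0, 429, 44, 0,
                 0, 0, 0, 62, 488, 0,     0, 0, 0, 0, 0, 537))

# Recall per class: diagonal over column (reference) totals
recall <- cbind(vars561 = diag(cm561) / colSums(cm561),
                vars526 = diag(cm526) / colSums(cm526))
print(round(recall, 3))
```

Both matrices have the same number of correct predictions overall, so every gain in one class is paid for in another.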
To summarize:
1. SVM beat Random Forest on my task: it runs about twice as fast and gives a better model.
2. It would be good to understand the physical meaning of the variables. It also looks as if some devices could do without a gyroscope.
3. Variable importance from RF can be used to select variables for other learning methods.
4. Filter-based feature selection does not always improve model quality, but it does make it possible to reduce the number of features and shorten training time at a small cost in quality (with the 20% most important variables, SVM accuracy was only 3% below the maximum).
The code for this article is in my repository.
A few related links:
- Feature selection with the caret R package
- Random Forests, Leo Breiman and Adele Cutler
UPDïŒ
On kenoma's advice I ran principal component analysis (PCA), using the preProcess function from caret:
pca_mod <- preProcess(train_data[, -1], method = "pca", thresh = 0.95)
pca_train_data <- predict(pca_mod, newdata = train_data[, -1])
dim(pca_train_data)
# [1] 7352 102
pca_train_data$Activity <- train_data$Activity
pca_test_data <- predict(pca_mod, newdata = test_data[, -1])
pca_test_data$Activity <- test_data$Activity
With a cutoff of 0.95, this produced 102 components.
RF accuracy on the test set: 0.8734 (5% lower than on the full feature set).
SVM accuracy: 0.9386 (about one percent lower). I think this is a pretty good result, so the tip was useful.
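What a variance cutoff such as thresh = 0.95 means can be illustrated with base R's prcomp(). This is a sketch on the built-in iris measurements, not on the activity data, and it is not caret's internal code; the variable names are mine.

```r
x <- iris[, 1:4]   # stand-in numeric features

# Center and scale, then rotate onto principal components
pca <- prcomp(x, center = TRUE, scale. = TRUE)

# Share of total variance carried by each component
var_share <- pca$sdev^2 / sum(pca$sdev^2)

# Smallest number of leading components whose cumulative share reaches the cutoff
n_comp <- which(cumsum(var_share) >= 0.95)[1]

# Scores on the retained components replace the original features
scores <- pca$x[, seq_len(n_comp), drop = FALSE]
print(n_comp)
print(dim(scores))
```

The retained components are fitted on the training features only and then applied to the test features, which is what the two predict() calls above do.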