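In this article I would like to talk about how exactly principal component analysis (PCA) works, from the point of view of the intuition behind its mathematical apparatus. As simply as possible, but in detail.
Mathematics in general is a very beautiful and elegant science, but its beauty is sometimes hidden behind many layers of abstraction. In the end, everything turns out to be much simpler than it looks at first glance; the most important thing is to understand it and picture it.
As in any other kind of analysis, in data analysis it is often useful to build a simplified model that describes the real situation as accurately as possible. Features frequently depend on one another quite strongly, and having all of them present at once is redundant.
For example, where I live fuel consumption is measured in litres per 100 km, while in the US it is measured in miles per gallon. At first glance the values are different, but in fact they depend on each other strictly. A mile is about 1600 m and a gallon about 3.8 l. One feature is strictly determined by the other: knowing one, we know the other.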
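To make the "strict dependence" concrete, here is a minimal sketch of the conversion, using the rounded constants from the paragraph above. The function name `mpg_to_l_per_100km` is made up for this illustration.

```python
# Rounded constants taken from the text above (not exact physical values).
KM_PER_MILE = 1.6
LITRES_PER_GALLON = 3.8


def mpg_to_l_per_100km(mpg):
    """Convert miles per gallon to litres per 100 km."""
    return 100.0 * LITRES_PER_GALLON / (mpg * KM_PER_MILE)


for mpg in (20, 30, 40):
    print(mpg, "mpg ->", round(mpg_to_l_per_100km(mpg), 2), "l/100 km")
```

The two features carry exactly the same information: their product is a constant, so keeping both in a model adds nothing.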
Much more often, however, features depend on each other less strictly and (this is important!) not explicitly. Engine volume on the whole has a positive effect on acceleration to 100 km/h, but this is not always true. It may also turn out that, once factors that are not visible at first glance are taken into account (better fuel quality, lighter materials and other modern advances), the year of the car has a weak but real effect on acceleration as well.
Knowing the dependencies and their strength, we can express several features through one, merge them so to speak, and work with a simpler model. Of course, some loss of information is most likely unavoidable, but PCA is precisely what helps us keep it to a minimum.
Put more strictly, the method approximates the n-dimensional cloud of observations by an ellipsoid (also n-dimensional) whose semi-axes become the future principal components. When we project onto such axes (dimensionality reduction), the greatest amount of information is preserved.
Step 1. Data preparation
To keep the example simple, I will not use a real training dataset with dozens of features and hundreds of observations; instead I will build the simplest possible toy example. Two features and 10 observations are quite enough to describe what happens in the guts of the algorithm and, most importantly, why.
Let's generate a sample:
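import numpy as np

x = np.arange(1, 11)
y = 2 * x + np.random.randn(10) * 2  # no seed is set, so the values differ from run to run
X = np.vstack((x, y))                # features in rows, observations in columns
print(X)
OUT:
[[ 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. ]
 [ 2.73446908 4.35122722 7.21132988 11.24872601 9.58103444 12.09865079 13.78706794 13.85301221 15.29003911 18.0998018 ]]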
In this sample we have two features that correlate strongly with each other. With the PCA algorithm we can easily find a combined feature and, at the cost of some information, express both of these features with a single new one. So let's figure it out!
To begin with, a little statistics. Recall that a random variable is described by its moments. The ones we need are the expectation and the variance. Roughly speaking, the expectation is the "centre of gravity" of the variable and the variance is its "size": the expectation sets the position of the random variable, and the variance determines its size (more precisely, its spread).
Projecting onto a vector does not affect the mean values, because to minimise the loss of information our vector must pass through the centre of the sample. So there is nothing wrong with centring the sample, that is, shifting it linearly so that the mean of each feature is 0. This greatly simplifies the later computations (although it is worth noting that one can do without centring).
The inverse of the shift operator is simply the vector of the original means; we will need it to restore the sample in its original dimensionality.
Xcentered = (X[0] - x.mean(), X[1] - y.mean())  # subtract each feature's mean
m = (x.mean(), y.mean())                        # kept to undo the centring later
print(Xcentered)
print("Mean vector: ", m)
OUT:
(array([-4.5, -3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5, 4.5]), array([-8.44644233, -8.32845585, -4.93314426, -2.56723136, 1.01013247, 0.58413394, 1.86599939, 7.00558491, 4.21440647, 9.59501658]))
Mean vector: (5.5, 10.314393916)
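The variance, on the other hand, depends strongly on the order of magnitude of the values of the random variable, i.e. it is sensitive to scaling. So if the units of measurement of the features differ greatly in magnitude, it is strongly recommended to standardise them. In our case the magnitudes do not differ much, so to keep the example simple we will skip this operation.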
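For completeness, a minimal sketch of the standardisation that is skipped here. It assumes the same layout as in the article (features in rows); the helper name `standardize` and the two example features are invented for illustration.

```python
import numpy as np


def standardize(X):
    """Centre each row (feature) and scale it to unit sample variance."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)  # ddof=1 matches np.cov's default
    return (X - mean) / std


# Hypothetical features on very different scales (values invented for illustration).
engine_volume_l = [1.2, 1.6, 2.0, 2.5, 3.0]
price_usd = [9000, 12000, 18000, 25000, 40000]

Z = standardize([engine_volume_l, price_usd])
print(np.cov(Z))  # both diagonal entries are 1 after standardisation
```

After this step no feature dominates the covariance matrix merely because of its units.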
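Step 2. Covariance matrix
For a multidimensional random variable (a random vector), the position of the centre is still given by the expectations of its projections onto the axes. But to describe its shape, the variances along the axes alone are no longer enough. Look at these plots: all three random variables have the same expectations and variances, and their projections onto the axes are on the whole the same!
To describe the shape of a random vector we need a covariance matrix.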
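It is a matrix whose (i, j) element is the covariance of the features (X_i, X_j). Recall the covariance formula:
$$\mathrm{Cov}(X_i, X_j) = E\big[(X_i - E(X_i))(X_j - E(X_j))\big] = E(X_i X_j) - E(X_i)\,E(X_j)$$
In our case it simplifies, because the sample is centred and E(X_i) = E(X_j) = 0:
$$\mathrm{Cov}(X_i, X_j) = E(X_i X_j)$$
Note that when X_i = X_j:
$$\mathrm{Cov}(X_i, X_i) = \mathrm{Var}(X_i)$$
and this holds for any random variable.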
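Thus the diagonal of the matrix holds the variances of the features (since i = j), and the remaining cells hold the covariances of the corresponding pairs of features. And because covariance is symmetric, the matrix is symmetric as well.
Note: the covariance matrix is the generalisation of variance to multidimensional random variables; like variance, it describes the shape (spread) of the random variable.
Indeed, the variance of a one-dimensional random variable is a 1x1 covariance matrix whose only entry is given by Cov(X, X) = Var(X).
So, let's build the covariance matrix Σ for our sample. To do this we compute the variances of X_i and X_j and their covariance. We could use the formula above, but since we have armed ourselves with Python, it would be a sin not to use the numpy.cov(X) function. It takes as input a list of all the features of the random variable and returns its covariance matrix, where X is an n-dimensional random vector (n is the number of rows). The function works equally well for computing an unbiased variance, the covariance of two variables, and a full covariance matrix.
(Recall that in Python a matrix is represented as an array of row arrays.)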
covmat = np.cov(Xcentered)  # rows are features; divides by n - 1 by default
print(covmat, "\n")
print("Variance of X: ", covmat[0, 0])
print("Variance of Y: ", covmat[1, 1])
print("Covariance X and Y: ", covmat[0, 1])
OUT:
[[ 9.16666667 17.93002811]
 [ 17.93002811 37.26438587]]
Variance of X: 9.16666666667
Variance of Y: 37.2643858743
Covariance X and Y: 17.9300281124
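As a cross-check, the same matrix can be computed by hand. For a centred sample the covariance reduces to E(X_i X_j), so a single matrix product divided by n - 1 should be enough. The sketch below copies the centred values from the output above; the names `Xc` and `cov_manual` are introduced only for this illustration.

```python
import numpy as np

# Centred sample copied from the output above (features in rows).
Xc = np.array([
    [-4.5, -3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5, 4.5],
    [-8.44644233, -8.32845585, -4.93314426, -2.56723136, 1.01013247,
     0.58413394, 1.86599939, 7.00558491, 4.21440647, 9.59501658],
])

n_obs = Xc.shape[1]
# For centred data Cov(X_i, X_j) = E(X_i X_j); n - 1 gives the unbiased estimate.
cov_manual = Xc @ Xc.T / (n_obs - 1)

print(cov_manual)
print(np.allclose(cov_manual, np.cov(Xc)))
```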
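Step 3. Eigenvectors and eigenvalues (eigenpairs)
Okay, we now have a matrix that describes the shape of our random variable, from which we can obtain its size along x and y (i.e. X_1 and X_2) as well as its approximate shape on the plane. Next we need to find a vector (just one in our case) such that the size (variance) of the projection of our sample onto it is maximal.
Note: the generalisation of variance to higher dimensions is the covariance matrix, and the two concepts are equivalent. When projecting onto a vector, the variance of the projection is maximised; when projecting onto a space of higher dimension, its whole covariance matrix is.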
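So, take a unit vector $v$ onto which we will project our random vector $X$. The projection onto it is then $v^T X$, and the variance of the projection is $\mathrm{Var}(v^T X)$. In general vector form (for centred variables) the variance is expressed as:
$$\mathrm{Var}(X) = \Sigma = E(X X^T)$$
Accordingly, the variance of the projection is:
$$\mathrm{Var}(v^T X) = E\big((v^T X)(v^T X)^T\big) = E(v^T X X^T v) = v^T E(X X^T)\, v = v^T \Sigma v$$
It is easy to see that the variance is maximised when $v^T \Sigma v$ takes its maximal value. Here the Rayleigh quotient helps us. Without going too deep into the mathematics, I will just say that the Rayleigh quotient has a special case for covariance matrices:
$$R(M, x) = \frac{x^T M x}{x^T x} = \lambda \frac{x^T x}{x^T x} = \lambda$$
and
$$M x = \lambda x$$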
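The last formula should be familiar from the topic of decomposing a matrix into eigenvectors and eigenvalues: x is an eigenvector and λ is an eigenvalue. The number of eigenvectors and eigenvalues equals the size of the matrix (and the values may repeat).
By the way, the English terms are "eigenvalues" and "eigenvectors".
To my ear that sounds much more beautiful (and more concise) than the terms in my own language.
Thus the direction of maximal variance of the projection always coincides with the eigenvector that has the largest eigenvalue, and that eigenvalue is equal to the value of this variance.
This also holds for projections onto more dimensions: the variance (covariance matrix) of a projection onto an m-dimensional space is maximal in the directions of the m eigenvectors with the largest eigenvalues.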
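The claim can be checked numerically. The sketch below is an illustration rather than part of the algorithm: it scans unit vectors in the plane, evaluates the Rayleigh quotient for the covariance matrix printed in step 2, and compares the best value with the largest eigenvalue. The helper `projected_variance` and the grid resolution are choices made for this example.

```python
import numpy as np

# Covariance matrix copied from the output in step 2.
cov = np.array([[9.16666667, 17.93002811],
                [17.93002811, 37.26438587]])


def projected_variance(v, cov):
    """Rayleigh quotient v^T cov v / v^T v: the variance of the projection onto v."""
    v = np.asarray(v, dtype=float)
    return v @ cov @ v / (v @ v)


# Scan directions over half a turn; v and -v give the same variance.
angles = np.linspace(0.0, np.pi, 1801)
variances = np.array([projected_variance([np.cos(a), np.sin(a)], cov)
                      for a in angles])

best = angles[variances.argmax()]
best_direction = np.array([np.cos(best), np.sin(best)])
largest_eigenvalue = np.linalg.eigvalsh(cov)[-1]  # eigvalsh sorts ascending

print("best direction:   ", best_direction)
print("best variance:    ", variances.max())
print("largest eigenvalue:", largest_eigenvalue)
```

No direction on the grid beats the largest eigenvalue, and the best direction found lines up with the corresponding eigenvector up to the grid step.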
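Our sample is two-dimensional, so it has two eigenvectors. Let's find them.
The numpy library implements the function numpy.linalg.eig(X), where X is a square matrix. It returns two arrays: an array of eigenvalues and an array of eigenvectors (as column vectors). The vectors are normalised, i.e. their length is 1, which is exactly what we need. These two vectors define a new basis for the sample, such that its axes coincide with the semi-axes of the ellipse approximating our sample.
In this plot we approximated the sample with an ellipse with radii of 2 sigma. (In one dimension 2 sigma covers about 95% of observations; a two-dimensional 2-sigma ellipse covers somewhat less, roughly 86% for a normal sample, which is broadly what we see here.) I flipped the larger vector, because eig(X) pointed it the opposite way; what matters to us is the direction of the vector, not its orientation.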
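Since `eig` promises neither an ordering of the eigenvalues nor a particular sign for the vectors, it can help to normalise both explicitly. The sketch below uses `numpy.linalg.eigh`, which is intended for symmetric matrices and returns eigenvalues in ascending order. The sign rule (largest-magnitude entry made positive) is just a convention picked for this example, not something the method requires.

```python
import numpy as np

# Covariance matrix copied from the output in step 2.
cov = np.array([[9.16666667, 17.93002811],
                [17.93002811, 37.26438587]])

# eigh is meant for symmetric matrices; eigenvalues come back in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Reorder so that the first column is the direction of largest variance.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Fix the sign of each column (a convention chosen here for reproducibility).
for k in range(eigenvectors.shape[1]):
    pivot = np.argmax(np.abs(eigenvectors[:, k]))
    if eigenvectors[pivot, k] < 0:
        eigenvectors[:, k] = -eigenvectors[:, k]

print(eigenvalues)
print(eigenvectors)
```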
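Step 4. Dimensionality reduction (projection)
The largest vector points in a direction similar to the regression line, and by projecting our sample onto it we lose information comparable to the sum of the residual terms of the regression (only the distance is now Euclidean rather than a delta in Y). In our case the dependence between the features is very strong, so the loss of information will be minimal. The "price" of the projection, the variance along the smaller eigenvector, is very small, as the previous plot shows.
Note: the diagonal elements of the covariance matrix show the variances along the original basis, and its eigenvalues show the variances along the new one (along the principal components).
It is often necessary to estimate the amount of lost (and retained) information. It is most convenient to express it as a percentage: we take the variance along each axis and divide it by the total variance over all axes (i.e. the sum of all eigenvalues of the covariance matrix).
Thus our larger vector describes 45.994 / 46.431 * 100% = 99.06%, and the smaller one accordingly about 0.94%. By discarding the smaller vector and projecting the data onto the larger one, we lose less than 1% of the information. An excellent result!
Note: in practice, in most cases, if the total loss of information does not exceed 10-20%, you can safely reduce the dimensionality.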
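The percentage computation above can be sketched as follows. Both helper functions and the `keep` threshold are hypothetical names introduced for this example; the threshold simply encodes a rule of thumb such as "retain at least 90% of the variance".

```python
import numpy as np


def explained_variance_ratio(cov):
    """Share of the total variance carried by each principal axis, largest first."""
    eigenvalues = np.linalg.eigvalsh(cov)[::-1]  # eigvalsh returns ascending order
    return eigenvalues / eigenvalues.sum()


def components_needed(ratios, keep=0.9):
    """Smallest number of leading components whose cumulative share reaches `keep`."""
    cumulative = np.cumsum(ratios)
    count = int(np.searchsorted(cumulative, keep)) + 1
    return min(count, len(ratios))


# Covariance matrix copied from the output in step 2.
cov = np.array([[9.16666667, 17.93002811],
                [17.93002811, 37.26438587]])

ratios = explained_variance_ratio(cov)
print(ratios)
print(components_needed(ratios, keep=0.9))
```

For this sample a single component already clears a 90% threshold, which matches the 99.06% figure computed by hand.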
To perform the projection, as mentioned in step 3, we need to carry out the operation $v^T X$ (the vector must have length 1). Or, if we have not a single vector but a hyperplane, then instead of the vector $v^T$ we take the matrix of basis vectors $V^T$. The resulting vector (or matrix) is the array of projections of our observations.
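l, vecs = np.linalg.eig(covmat)  # eigenvalues and eigenvectors (as columns)
v = -vecs[:, 1]                  # vector of the largest eigenvalue in this run, sign flipped
Xnew = np.dot(v, Xcentered)      # projection v^T X
print(Xnew)
OUT:
[ -9.56404107 -9.02021624 -5.52974822 -2.96481262 0.68933859 0.74406645 2.33433492 7.39307974 5.3212742 10.59672425]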
np.dot(X, Y) is the dot (matrix) product; this is how we multiply vectors and matrices in Python.
It is easy to see that the values of the projections match the picture in the previous plot.
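The toy sample only allows a projection onto a single vector, so here is a sketch of the hyperplane case with $V^T$. It uses a synthetic three-feature sample generated with a fixed seed; the data and all variable names are assumptions made for this illustration.

```python
import numpy as np

# Synthetic sample, invented for illustration: 3 features (rows), 200 observations.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.vstack([
    t,
    2 * t + rng.normal(scale=0.5, size=200),  # strongly tied to the first feature
    rng.normal(scale=0.3, size=200),          # low-variance independent feature
])
Xc = X - X.mean(axis=1, keepdims=True)        # centre each feature

eigenvalues, eigenvectors = np.linalg.eigh(np.cov(Xc))  # ascending eigenvalues
V = eigenvectors[:, ::-1][:, :2]  # columns: the two leading eigenvectors
X_proj = V.T @ Xc                 # V^T X, shape (2, 200)

print(X_proj.shape)
print(np.cov(X_proj))             # diagonal: the two largest eigenvalues
```

The covariance matrix of the projection is diagonal, with the two largest eigenvalues on the diagonal: the new axes are uncorrelated, and they carry the largest variances.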
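Step 5. Data restoration
It is convenient to work with the projection, to build hypotheses on it and develop models. But the principal components we obtain will not always have a clear meaning that an outsider can understand. Sometimes it is useful to decode, for example, detected outliers in order to see which observations stand behind them.
This is very simple. We have all the necessary information: the coordinates of the basis vectors in the original basis (the vectors onto which we projected) and the vector of means (to undo the centring). Take, for example, the largest value, 10.596..., and decode it. To do this, multiply it on the right by the transposed vector and add the vector of means, or, in general form for the whole sample: $X^T v^T + m$, where X here denotes the projected sample.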
n = 9  # index of the observation to restore
Xrestored = np.dot(Xnew[n], v) + m  # back to the original basis, then undo the centring
print('Restored: ', Xrestored)
print('Original: ', X[:, n])
OUT:
Restored: [ 10.13864361 19.84190935]
Original: [ 10. 19.9094105]
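The difference is small, but it is there. After all, the lost information cannot be restored. Nevertheless, if simplicity matters more than accuracy, the restored value approximates the original one perfectly well.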
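The general form for the whole sample can be sketched with an outer product. The numbers below are copied from the outputs shown earlier (projection, vector and means); the name `X_restored` is introduced for this example.

```python
import numpy as np

# Values copied from the outputs above.
v = np.array([0.43774316, 0.89910006])   # projection vector
m = np.array([5.5, 10.314393916])        # vector of means
Xnew = np.array([-9.56404107, -9.02021624, -5.52974822, -2.96481262, 0.68933859,
                 0.74406645, 2.33433492, 7.39307974, 5.3212742, 10.59672425])

# Each projected value scales v; adding m undoes the centring.
X_restored = np.outer(Xnew, v) + m       # one restored observation per row

print(X_restored)
```

Every restored point lies on the line through the mean along v, which is exactly the information the projection kept.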
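Instead of a conclusion: checking the algorithm
So, we have taken the algorithm apart and shown how it works on a toy example. All that remains is to compare it with the PCA implemented in sklearn, since that is what we will actually be using.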
from sklearn.decomposition import PCA

pca = PCA(n_components=1)
XPCAreduced = pca.fit_transform(np.transpose(X))  # sklearn expects observations in rows
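The n_components parameter specifies the number of dimensions onto which the projection is made, i.e. the number of dimensions to which we want to reduce the dataset. In other words, these are the n eigenvectors with the largest eigenvalues. Let's check the result of the dimensionality reduction: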
print('Our reduced X: \n', Xnew)
print('Sklearn reduced X: \n', XPCAreduced)
OUT:
Our reduced X:
[ -9.56404106 -9.02021625 -5.52974822 -2.96481262 0.68933859 0.74406645 2.33433492 7.39307974 5.3212742 10.59672425]
Sklearn reduced X:
[[ -9.56404106]
 [ -9.02021625]
 [ -5.52974822]
 [ -2.96481262]
 [ 0.68933859]
 [ 0.74406645]
 [ 2.33433492]
 [ 7.39307974]
 [ 5.3212742 ]
 [ 10.59672425]]
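We returned the result as a matrix of observation column vectors (the more canonical form from the point of view of linear algebra), whereas PCA in sklearn returns a vertical array, with one observation per row.
In principle this is not critical. It is just worth noting that in linear algebra it is canonical to write matrices in terms of column vectors, while in data analysis (and other database-related fields) observations (transactions, records) are usually written in rows.
Let's check the other parameters of the model as well. The class has a number of attributes that give access to intermediate variables:
- mean vector: mean_
- projection vector (matrix): components_
- variances along the projection axes: explained_variance_
- share of information (share of the total variance): explained_variance_ratio_
Note: in the output below, explained_variance_ is the variance computed with division by n, whereas the cov() function computes unbiased variances (division by n - 1) when building the covariance matrix!
Let's compare the values we obtained with those of the library function.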
print('Mean vector: ', pca.mean_, m)
print('Projection: ', pca.components_, v)
print('Explained variance: ', pca.explained_variance_, l[1])
print('Explained variance ratio: ', pca.explained_variance_ratio_, l[1] / sum(l))
OUT:
Mean vector: [ 5.5 10.31439392] (5.5, 10.314393916)
Projection: [[ 0.43774316 0.89910006]] (0.43774316434772387, 0.89910006232167594)
Explained variance: [ 41.39455058] 45.9939450918
Explained variance ratio: [ 0.99058588] 0.990585881238
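The only difference is in the variances. As already mentioned, we used the cov() function, which computes the unbiased variance, whereas the explained_variance_ value shown above is the biased one. They differ only in that the first divides by (n - 1) and the second divides by n. It is easy to check that 45.99 ∙ (10 - 1) / 10 = 41.39. (This reflects the scikit-learn version used for this output; as far as I know, later releases divide by n - 1, in which case the two values coincide.)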
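The n versus n - 1 relationship can be reproduced with numpy alone, without relying on any particular scikit-learn version. The sketch below takes the projected values from the output above and computes both estimates via the `ddof` argument.

```python
import numpy as np

# Projection values copied from the output above.
projection = np.array([-9.56404107, -9.02021624, -5.52974822, -2.96481262,
                       0.68933859, 0.74406645, 2.33433492, 7.39307974,
                       5.3212742, 10.59672425])

n = projection.size
unbiased = projection.var(ddof=1)  # divides by n - 1, like np.cov by default
biased = projection.var(ddof=0)    # divides by n

print("unbiased:", unbiased)
print("biased:  ", biased)
print("unbiased * (n - 1) / n:", unbiased * (n - 1) / n)
```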
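All the other values match, which means our algorithms are equivalent. Finally, note that the attributes of the library algorithm are printed with lower precision, probably because it is optimised for speed or simply rounds the values for convenience (or I have some glitch on my side).
Note: the library method automatically projects onto the axes that maximise variance. This is not always reasonable. For example, in this figure a careless reduction of the dimensionality would make classification impossible. Projecting onto the smaller vector, however, successfully reduces the dimensionality and keeps the classes separable.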
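A small synthetic illustration of this caveat: two classes that differ only along the low-variance direction. The data are constructed deterministically for this example (they are not the data behind the figure), and `ranges_overlap` is a deliberately crude, hypothetical separability check.

```python
import numpy as np

# Two classes sharing the same wide spread in x, separated only in y.
x = np.linspace(-5.0, 5.0, 50)
wiggle = 0.2 * np.cos(3.0 * x)             # small deterministic variation in y
class_a = np.vstack([x, -1.0 + wiggle])
class_b = np.vstack([x, 1.0 + wiggle])
X = np.hstack([class_a, class_b])          # features in rows, shape (2, 100)
labels = np.array([0] * 50 + [1] * 50)

Xc = X - X.mean(axis=1, keepdims=True)
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(Xc))  # ascending eigenvalues
small_axis, large_axis = eigenvectors[:, 0], eigenvectors[:, 1]


def ranges_overlap(a, b):
    """True if the value ranges of the two 1-D arrays intersect."""
    return a.min() <= b.max() and b.min() <= a.max()


for name, axis in (("largest", large_axis), ("smallest", small_axis)):
    projection = axis @ Xc
    overlap = ranges_overlap(projection[labels == 0], projection[labels == 1])
    print(name, "variance axis: classes overlap =", overlap)
```

On the axis of largest variance the two classes land on top of each other, while on the axis of smallest variance they stay cleanly apart, so the "less informative" component is the useful one for this classification task.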
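So, we have examined the principles behind the PCA algorithm and its implementation in sklearn. I hope this article was clear enough for those who are just getting started with data analysis, and at least a little informative for those who already know the algorithm well. An intuitive picture is extremely helpful for understanding how a method works, and understanding is very important for configuring the chosen model properly. Thank you for your attention!
P.S.: Please do not scold the author for possible inaccuracies. The author is himself still getting acquainted with data analysis and wants to help others like him who are mastering this amazing field of knowledge. Constructive criticism and varied experience, however, are welcome in every way!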