How relevant is this approach in our country and in the West? One way to judge is by the articles on Habr and Medium. Habr has almost no articles on solving predictive maintenance problems, whereas Medium has a whole set of them, several of which describe the goals and advantages of the approach in detail.
From this article you will learn:
- why this technology is needed;
- which machine learning approaches are commonly used for predictive maintenance;
- how I tried out one of the techniques on a simple example.

What does predictive maintenance offer?
- a controlled repair process: work is carried out only when needed, which saves money, and without haste, which improves its quality;
- identification of the specific malfunction in the equipment (being able to buy the exact replacement part while the equipment is still running is a big advantage);
- optimization of equipment operation, load and so on;
- lower costs for regular equipment shutdowns.
One of the Medium articles describes in detail the questions you need to answer in order to understand how to approach the problem in a specific case.
When collecting data, or when selecting data to build a model, it is important to answer three groups of questions:
- Can all of the system's problems be predicted? Which predictions matter most?
- What does the failure process look like? Does the whole system stop working, or does only its operating mode change? Is the process fast, instantaneous, or a gradual degradation?
- Do the system's indicators adequately reflect its performance? Do they relate to individual parts of the system or to the system as a whole?
It is also important to understand in advance what you want to predict, what can be predicted and what cannot.
The Medium article also lists questions that help pin down the specific goal:
- What needs to be predicted? The remaining lifetime, whether the behavior is anomalous, or the probability of failure in the next N hours/days/weeks?
- Is there enough historical data?
- Is it known when the system produced anomalous readings and when it did not? Can such readings be labeled?
- How far ahead should the model look? How independent are the readings that reflect the system's behavior over an hour, a day or a week?
- What should be optimized? Should the model catch as many violations as possible at the cost of false alarms, or is it enough to catch a few events with no false positives?
Hopefully the situation will improve in the future. For now, predictive maintenance faces several difficulties: there are few examples of system malfunctions, or there are enough of them but they are not labeled, and the failure process is unknown.
The main way to work around these difficulties is to use anomaly detection methods. Such algorithms do not need labels for training, although some form of labeling is still required to test and debug them. Their limitation is that they do not predict a specific failure; they only signal that the indicators look abnormal.
Even that is already useful.

Methods
Next I would like to describe some features of anomaly detection approaches, and then we will test a few simple algorithms in practice.
In any specific situation you will have to try several anomaly detection algorithms and pick the best one, but it is still possible to outline the strengths and weaknesses of the main techniques used in this field.
First of all, it is important to understand in advance what share of the data is anomalous.
If we are talking about a semi-supervised variant (train only on "normal" data, then work (test) on data that contains anomalies), the best choice is the one-class support vector machine (One-Class SVM). With a radial basis function kernel, the algorithm builds a nonlinear surface that encloses the normal data (formally, it separates the data from the origin in the kernel feature space). The cleaner the training data, the better it works.
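As a rough illustration of the semi-supervised setup described above, the sketch below fits a One-Class SVM on clean data only and then scores a test set that contains anomalies. The synthetic clusters and the values `nu=0.05` and `gamma="scale"` are assumptions chosen for the demo; they do not come from the article's dataset.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Hypothetical data: "normal" readings form a cloud around (0, 0).
X_train = rng.normal(loc=0.0, scale=1.0, size=(500, 2))          # clean training set
X_test_normal = rng.normal(loc=0.0, scale=1.0, size=(100, 2))
X_test_anomalies = rng.normal(loc=6.0, scale=0.5, size=(20, 2))  # far from the cloud

# nu is an upper bound on the fraction of training points treated as outliers.
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(X_train)

# predict() returns +1 for inliers and -1 for outliers.
share_normal_flagged = float(np.mean(model.predict(X_test_normal) == -1))
share_anomalies_flagged = float(np.mean(model.predict(X_test_anomalies) == -1))
print(share_normal_flagged, share_anomalies_flagged)
```

With training data this clean, a small `nu` keeps the boundary tight around the normal cloud; on contaminated training data the same setting would let anomalies shape the boundary.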
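In other cases you also need to know the ratio of anomalous to "normal" points in order to set the cut-off threshold.
If anomalies make up more than 5% of the data and are fairly well separated from the main sample, standard anomaly detection methods can be used.
In this case the Isolation Forest method is the most stable in terms of quality. An isolation forest splits the data at random. A more typical reading is likely to end up deeper in the tree, whereas anomalous readings are separated from the rest of the sample in the first few splits.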
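The isolation idea can be checked on toy data. The sketch below is a minimal example, assuming a synthetic sample with about 5% scattered outliers; the sample sizes, `n_estimators` and `contamination` values are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)

# Hypothetical sample: 950 "normal" points plus 50 scattered outliers (~5%).
X_normal = rng.normal(0.0, 1.0, size=(950, 3))
signs = rng.choice([-1.0, 1.0], size=(50, 3))
X_outliers = signs * rng.uniform(5.0, 12.0, size=(50, 3))
X = np.vstack([X_normal, X_outliers])

forest = IsolationForest(n_estimators=100, contamination=0.05, random_state=0).fit(X)

# score_samples(): lower scores mean the point is isolated after fewer random splits.
scores = forest.score_samples(X)
mean_score_normal = float(scores[:950].mean())
mean_score_outliers = float(scores[950:].mean())

labels = forest.predict(X)  # -1 = anomaly, +1 = normal
print(mean_score_normal, mean_score_outliers)
```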
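Other algorithms work better when they "fit" the specifics of the data.
If the data are normally distributed, the Elliptic Envelope method is suitable: it approximates the data with a multivariate normal distribution. The less likely a point is to belong to that distribution, the more likely it is to be an anomaly.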
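A small sketch of the elliptic envelope idea: two points with the same Euclidean norm get very different Mahalanobis distances once the correlation between features is taken into account. The covariance matrix, the probe points and `contamination=0.05` are assumptions made for the example.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope

rng = np.random.default_rng(2)

# Hypothetical correlated Gaussian data.
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=1000)

envelope = EllipticEnvelope(contamination=0.05, random_state=0).fit(X)

typical = np.array([[1.5, 1.5]])    # follows the correlation
atypical = np.array([[1.5, -1.5]])  # same norm, but goes against the correlation

# mahalanobis() returns squared Mahalanobis distances under the fitted robust Gaussian.
d_typical = float(envelope.mahalanobis(typical)[0])
d_atypical = float(envelope.mahalanobis(atypical)[0])

label_typical = int(envelope.predict(typical)[0])    # +1 = inlier
label_atypical = int(envelope.predict(atypical)[0])  # -1 = outlier
print(d_typical, d_atypical, label_typical, label_atypical)
```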
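If the data are represented so that the relative position of points reflects their differences well, metric methods look like a good choice: for example k nearest neighbors, the k-th nearest neighbor, ABOD (angle-based outlier detection) or LOF (local outlier factor).
All of these methods assume that "correct" readings are concentrated in one region of the multidimensional space. If all k nearest neighbors (or the k-th one) are far from the target point, the point is an anomaly. For ABOD the reasoning is similar: if all k nearest points lie in the same sector of space relative to the point in question, the point is an anomaly. For LOF: if the local density of a point (computed in advance for every point from its k nearest neighbors) is lower than that of its k nearest neighbors, the point is an anomaly.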
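The two simplest metric criteria mentioned above can be sketched as follows: the distance to the k-th nearest neighbor, and LOF in novelty mode. The training cloud, the two probe points and `k = 20` are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor, NearestNeighbors

rng = np.random.default_rng(3)

X_train = rng.normal(0.0, 1.0, size=(300, 2))  # hypothetical "normal" readings
X_new = np.array([[0.1, -0.2],                 # inside the dense region
                  [7.0, 7.0]])                 # far away from it

k = 20

# Distance to the k-th nearest neighbour: large values suggest an anomaly.
nn = NearestNeighbors(n_neighbors=k).fit(X_train)
distances, _ = nn.kneighbors(X_new)
kth_distance = distances[:, -1]

# LOF compares the local density of a point with that of its neighbours.
lof = LocalOutlierFactor(n_neighbors=k, novelty=True).fit(X_train)
lof_labels = lof.predict(X_new)  # +1 = inlier, -1 = outlier
print(kth_distance, lof_labels)
```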
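If the data cluster well, methods based on cluster analysis are a good choice. If a point is equidistant from the centers of several clusters, it is an anomaly.
If the directions of greatest variance are clearly visible in the data, searching for anomalies with principal component analysis looks like a good choice. In that case the deviations from the mean along the n1 most "principal" components and the n2 least "principal" components serve as the measure of anomaly.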
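One possible way to turn the PCA criterion into a score is to sum the squared, variance-normalized coordinates along the first n1 and the last n2 components. The function `pca_anomaly_score`, the synthetic data and the defaults `n1=1`, `n2=2` are hypothetical and only meant to show the mechanics.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)

# Hypothetical data: most of the variance lies along one direction.
latent = rng.normal(size=(1000, 1))
X = np.hstack([3.0 * latent, 3.0 * latent, latent]) + 0.1 * rng.normal(size=(1000, 3))

pca = PCA().fit(X)

def pca_anomaly_score(points, n1=1, n2=2):
    """Sum of squared, variance-normalised deviations along the n1 most
    and n2 least principal components."""
    z = pca.transform(points)                     # centred coordinates in the PC basis
    normalized = z ** 2 / pca.explained_variance_
    return normalized[:, :n1].sum(axis=1) + normalized[:, -n2:].sum(axis=1)

on_axis = np.array([[3.0, 3.0, 1.0]])    # follows the main direction of variance
off_axis = np.array([[3.0, -3.0, 1.0]])  # breaks the correlation structure

score_on = float(pca_anomaly_score(on_axis)[0])
score_off = float(pca_anomaly_score(off_axis)[0])
print(score_on, score_off)
```

The point that breaks the correlation structure deviates mostly along the low-variance components, which is why those components are included in the score.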
As an example, I suggest looking at the datasets of The Prognostics and Health Management Society (PHM Society). This non-profit organization holds a competition every year. In 2018, for instance, the task was to predict operating faults and the time to failure of an ion beam etching plant. We will use the 2015 dataset. It contains readings from several sensors for 30 installations (the training sample), and the task is to predict when a fault will occur and which one.
I could not find the answers for the test sample online, so we will only work with the training sample.
In general the installations are similar, but they differ, for example, in the number of components and the number of anomalies. Training on the first 20 installations and testing on the rest therefore makes little sense.
So we pick one installation, load it and look at the data. This article is not about feature engineering, so we will not dig too deep.
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
from sklearn.covariance import EllipticEnvelope
from sklearn.neighbors import LocalOutlierFactor
from sklearn.ensemble import IsolationForest
from sklearn.svm import OneClassSVM

dfa = pd.read_csv('plant_12a.csv',
                  names=['Component number', 'Time', 'S1', 'S2', 'S3', 'S4',
                         'S1ref', 'S2ref', 'S3ref', 'S4ref'])
dfa.head(10)

As you can see, there are 7 components, each with readings from 4 sensors taken every 15 minutes. The competition description lists S1ref-S4ref as reference values, but they differ greatly from the sensor readings. To avoid wasting time on what they mean, we drop them. Looking at the distribution of each feature (S1-S4), the distributions of S1, S2 and S4 are continuous, while that of S3 is discrete. In addition, the joint distribution of S2 and S4 shows that they are inversely proportional.

Deviations from this direct dependence may indicate a fault, but we will not check that and will simply drop S4.
We process the dataset once more, keeping S1, S2 and S3. S1 and S2 are scaled with StandardScaler (subtract the mean and divide by the standard deviation), and S3 is converted with OHE (One Hot Encoding). The readings of all components of the installation are stitched into a single row. That gives 89 features in total: 2 * 7 = 14 readings of S1 and S2 for the 7 components, plus 75 unique values of S3. There are 56 thousand such rows.
Now load the file with the faults.
dfc = pd.read_csv('plant_12c.csv', names=['Start Time', 'End Time', 'Type'])
dfc.head()

Before trying the algorithms on our dataset, one more small digression: we need a way to test them. I suggest taking the start and end times of the faults, treating every reading inside these intervals as anomalous and every reading outside them as normal. This approach has many drawbacks, one in particular: anomalous behavior most likely begins before the fault is recorded. To be safe, we shift the anomaly window 30 minutes back in time. We will evaluate the F1 score, precision and recall.
Code for preparing the features and evaluating model quality:
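import numpy as np
from sklearn import preprocessing
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from tqdm import tqdm_notebook

def load_and_preprocess(plant_num):
    # Load the sensor readings and the fault log for the chosen plant
    dfa = pd.read_csv('plant_{}a.csv'.format(plant_num),
                      names=['Component number', 'Time', 'S1', 'S2', 'S3', 'S4',
                             'S1ref', 'S2ref', 'S3ref', 'S4ref'])
    # The first row of the fault log is dropped, as in the original code
    dfc = pd.read_csv('plant_{}c.csv'.format(plant_num),
                      names=['Start Time', 'End Time', 'Type']).drop(0, axis=0)
    N_comp = len(dfa['Component number'].unique())
    # Round the timestamps to 15 minutes
    dfa['Time'] = pd.to_datetime(dfa['Time']).dt.round('15min')
    # Drop faults of type 6
    dfc = dfc[dfc['Type'] != 6]
    dfc['Start Time'] = pd.to_datetime(dfc['Start Time'])
    dfc['End Time'] = pd.to_datetime(dfc['End Time'])
    # Stitch the readings of all components into one row per timestamp.
    # cumcount() == i picks the i-th row of every 'Time' group, which matches
    # groupby('Time').nth(i) and behaves the same across pandas versions.
    position = dfa.groupby('Time').cumcount()
    dfa = pd.concat(
        [dfa[position == i].set_index('Time')[['S1', 'S2', 'S3']]
             .rename(columns={'S1': 'S1_{}'.format(i),
                              'S2': 'S2_{}'.format(i),
                              'S3': 'S3_{}'.format(i)})
         for i in range(N_comp)],
        axis=1).dropna().reset_index()
    # One-hot encode the third sensor of every component
    for k in range(N_comp):
        dfa = pd.concat([dfa.drop('S3_' + str(k), axis=1),
                         pd.get_dummies(dfa['S3_' + str(k)], prefix='S3_' + str(k))],
                        axis=1).reset_index(drop=True)
    # Chronological train/test split, then scale S1 and S2 using training statistics
    df_train, df_test = (part.copy() for part in
                         train_test_split(dfa, test_size=0.25, shuffle=False))
    cols_to_scale = df_train.filter(regex='S[12]').columns
    scaler = preprocessing.StandardScaler().fit(df_train[cols_to_scale])
    df_train[cols_to_scale] = scaler.transform(df_train[cols_to_scale])
    df_test[cols_to_scale] = scaler.transform(df_test[cols_to_scale])
    return df_train, df_test, dfc

def get_true_labels(measure_times, dfc, shift_delta):
    # Label a reading 1 if it falls inside a fault window shifted back
    # by shift_delta minutes, otherwise 0
    idxSet = set()
    dfc['Start Time'] -= pd.Timedelta(minutes=shift_delta)
    dfc['End Time'] -= pd.Timedelta(minutes=shift_delta)
    for idx, mes_time in tqdm_notebook(enumerate(measure_times),
                                       total=measure_times.shape[0]):
        intersect = (np.array(dfc['Start Time'] < mes_time).astype(int)
                     * np.array(dfc['End Time'] > mes_time).astype(int))
        idxs = np.where(intersect)[0]
        if idxs.shape[0]:
            idxSet.add(idx)
    # Restore the original fault windows
    dfc['Start Time'] += pd.Timedelta(minutes=shift_delta)
    dfc['End Time'] += pd.Timedelta(minutes=shift_delta)
    true_labels = pd.Series(index=measure_times.index, dtype=float)
    true_labels.iloc[list(idxSet)] = 1
    true_labels.fillna(0, inplace=True)
    return true_labels

def check_model(model, df_train, df_test, filt='S[123]'):
    # Fit on the training features and report F1, precision and recall on the test set
    model.fit(df_train.drop('Time', axis=1).filter(regex=(filt)))
    # The detectors return -1 for anomalies and +1 for normal points
    y_preds = pd.Series(model.predict(
        df_test.drop(['Time', 'Label'], axis=1).filter(regex=(filt)))).map({-1: 1, 1: 0})
    print('F1 score: {:.3f}'.format(f1_score(df_test['Label'], y_preds)))
    print('Precision score: {:.3f}'.format(precision_score(df_test['Label'], y_preds)))
    print('Recall score: {:.3f}'.format(recall_score(df_test['Label'], y_preds)))
    score = model.decision_function(
        df_test.drop(['Time', 'Label'], axis=1).filter(regex=(filt)))
    # distplot is deprecated in recent seaborn releases; histplot(kde=True) replaces it
    sns.distplot(score[df_test['Label'] == 0])
    sns.distplot(score[df_test['Label'] == 1])

df_train, df_test, anomaly_times = load_and_preprocess(12)
# Use the fault log returned above, whose times are already parsed as datetimes
df_test['Label'] = get_true_labels(df_test['Time'], anomaly_times, 30)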

Results of testing simple anomaly detection algorithms on the PHM 2015 Data Challenge dataset
Back to the algorithms. Let's try One Class SVM (OCSVM), IsolationForest (IF), EllipticEnvelope (EE) and LocalOutlierFactor (LOF) on our data. To begin with, we do not set any parameters. Note that LOF can work in two modes. With novelty=False it can only look for anomalies in the training set (only fit_predict is available); with novelty=True it is meant for finding anomalies outside the training set (fit and predict can be called separately). IF has an "old" and a "new" behaviour mode. We use the new one; it gives better results.
OCSVM detects the anomalies well, but produces too many false positives. With the other methods the results are even worse.
But suppose we know the share of anomalies in the data; in our case it is 27%. OCSVM has the parameter nu, an upper bound on the fraction of errors and a lower bound on the fraction of support vectors. The other methods have contamination, the share of errors in the data. IF and LOF determine it automatically, EE uses 0.1 by default, and in scikit-learn nu defaults to 0.5 for OCSVM. Let's set contamination (nu) to 0.27. EE now gives the best result.
Code for checking the models:
def check_model(model, df_train, df_test, filt='S[123]'):
    # model is a (name, estimator) pair so that the output can be labelled
    model_type, model = model
    model.fit(df_train.drop('Time', axis=1).filter(regex=(filt)))
    # The detectors return -1 for anomalies and +1 for normal points
    y_preds = pd.Series(model.predict(
        df_test.drop(['Time', 'Label'], axis=1).filter(regex=(filt)))).map({-1: 1, 1: 0})
    print('F1 score for {}: {:.3f}'.format(model_type, f1_score(df_test['Label'], y_preds)))
    print('Precision score for {}: {:.3f}'.format(model_type, precision_score(df_test['Label'], y_preds)))
    print('Recall score for {}: {:.3f}'.format(model_type, recall_score(df_test['Label'], y_preds)))
    score = model.decision_function(
        df_test.drop(['Time', 'Label'], axis=1).filter(regex=(filt)))
    # Distribution of the decision score for normal and anomalous readings
    sns.distplot(score[df_test['Label'] == 0])
    sns.distplot(score[df_test['Label'] == 1])
    plt.title('Decision score distribution for {}'.format(model_type))
    plt.show()
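The checker above expects (name, estimator) pairs. A minimal sketch of how such pairs could be built with the anomaly share passed as nu or contamination is shown below. The synthetic data stands in for the plant readings and is an assumption, as are the `gamma` and `random_state` settings; the resulting scores say nothing about the real dataset.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

ANOMALY_SHARE = 0.27  # share of anomalies quoted in the text

models = [
    ("OCSVM", OneClassSVM(nu=ANOMALY_SHARE, gamma="scale")),
    ("IF", IsolationForest(contamination=ANOMALY_SHARE, random_state=0)),
    ("EE", EllipticEnvelope(contamination=ANOMALY_SHARE, random_state=0)),
    ("LOF", LocalOutlierFactor(contamination=ANOMALY_SHARE, novelty=True)),
]

rng = np.random.default_rng(5)

def make_set(n):
    """Hypothetical stand-in for the plant data: two Gaussian clouds."""
    n_anom = int(n * ANOMALY_SHARE)
    X = np.vstack([rng.normal(0.0, 1.0, (n - n_anom, 2)),
                   rng.normal(5.0, 1.0, (n_anom, 2))])
    y = np.r_[np.zeros(n - n_anom), np.ones(n_anom)]
    return X, y

X_train, _ = make_set(1000)
X_test, y_test = make_set(400)

results = {}
for name, model in models:
    model.fit(X_train)
    y_pred = (model.predict(X_test) == -1).astype(int)  # -1 means anomaly
    results[name] = (f1_score(y_test, y_pred),
                     precision_score(y_test, y_pred, zero_division=0),
                     recall_score(y_test, y_pred))

for name, (f1, precision, recall) in results.items():
    print("{}: F1={:.3f} precision={:.3f} recall={:.3f}".format(name, f1, precision, recall))
```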
It is interesting to look at the distribution of the anomaly score for the different methods. LOF clearly works poorly on this data. EE has points that the algorithm considers extremely anomalous, but normal points fall there too. IsoFor and OCSVM show how important the choice of the cut-off threshold (contamination/nu) is, because it shifts the trade-off between precision and recall.

It is logical that the sensor readings stay close to their stationary values and are close to normally distributed. If we have a labeled test sample, and ideally a validation sample as well, the contamination value can be tuned. The next question is which errors are more acceptable: false positives or false negatives?
The LOF result is very low, which is not impressive. Keep in mind, however, that the OHE variables are fed to the input together with the variables transformed by StandardScaler, and the default distance is Euclidean. If only the S1 and S2 variables are used, the situation improves and the result becomes comparable to the other methods. It is also important to understand that one of the key parameters of the metric methods listed above is the number of neighbors. It strongly affects quality and has to be tuned. The distance metric itself is also worth choosing carefully.
Now let's try to combine two models. First we remove anomalies from the training set with one model, and then train OCSVM on the "cleaner" training set. According to the previous results, EE showed the highest recall. We clean the training sample with EE, train OCSVM on it and get F1 = 0.50, precision = 0.34, recall = 0.95. Not impressive. But we had set nu = 0.27, while the data we now have are more or less "clean". If we assume that the recall of EE on the training set is the same, about 5% of the errors remain. Setting nu accordingly gives F1 = 0.69, precision = 0.59, recall = 0.82. Excellent. Note that such a combination does not work with the other methods, because they assume that the share of anomalies is the same in the training set and in the test set. When training those methods on a clean training set you would have to specify a contamination that is lower than in the real data but not close to zero; it is best chosen by cross-validation.
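The two-stage combination can be sketched like this: EllipticEnvelope removes the suspected anomalies from the training set, and OneClassSVM is then fitted on what remains with a small nu. The synthetic data is an assumption; `contamination=0.27` and `nu=0.05` mirror the shares quoted in the text, and the scores on this toy data are not the article's results.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(6)

# Hypothetical training set with about 27% anomalies.
X_train = np.vstack([rng.normal(0.0, 1.0, (730, 2)),
                     rng.normal(5.0, 1.0, (270, 2))])

# Stage 1: EE flags roughly 27% of the training set; keep only the rest.
cleaner = EllipticEnvelope(contamination=0.27, random_state=0).fit(X_train)
keep = cleaner.predict(X_train) == 1
X_clean = X_train[keep]

# Stage 2: train OCSVM on the cleaner data, assuming ~5% residual anomalies.
detector = OneClassSVM(nu=0.05, gamma="scale").fit(X_clean)

X_test = np.vstack([rng.normal(0.0, 1.0, (100, 2)),
                    rng.normal(5.0, 1.0, (30, 2))])
y_pred = (detector.predict(X_test) == -1).astype(int)  # 1 = flagged as anomaly

share_kept = float(keep.mean())
recall_on_anomalies = float(y_pred[100:].mean())
print(share_kept, recall_on_anomalies)
```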
It is interesting to look at the detection results along the sequence of readings:

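The figure shows a segment of the readings of the first and second sensors for the 7 components. The legend gives the colors of the corresponding faults (the start and end of a fault are drawn as vertical lines of the same color). The dots mark the predictions: green for true positives, red for false positives, violet for false negatives. It is hard to determine the fault time visually from the figure, yet the algorithm copes with the task quite well. Keep in mind that the readings of the third sensor are not shown here. In addition, there are false positives after the end of a fault. In other words, the algorithm sees that the values there are still abnormal, whereas we labeled that region as fault-free. The right side of the figure shows a region before a fault that we labeled as faulty (30 minutes before the fault) but that was recognized as fault-free, which leads to false negatives. In the center of the figure a contiguous segment is recognized as a fault. The conclusion is as follows: when solving an anomaly detection problem you need to work closely with engineers who understand the nature of the systems whose output you are predicting, because checking the algorithms against the available labels does not fully reflect reality and does not simulate the conditions under which such algorithms would actually be used.
Code for plotting the chart: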
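import numpy as np

def plot_time_course(df_test, dfc, y_preds, start, end, vert_shift=4):
    # Assumes y_preds is a DataFrame that shares df_test's index and holds
    # the predicted labels (1 = anomaly) in column 0
    plt.figure(figsize=(15, 10))
    cols = df_test.filter(regex=('S[12]')).columns
    add = 0
    preds_window = y_preds.iloc[start:end]
    test_window = df_test.iloc[start:end, :]
    preds_idx = preds_window[preds_window[0] == 1].index
    true_idx = test_window[test_window['Label'] == 1].index
    # Index lists for true positives, false negatives and false positives
    tp_idx = sorted(set(true_idx.values).intersection(set(preds_idx.values)))
    fn_idx = sorted(set(true_idx.values).difference(set(preds_idx.values)))
    fp_idx = sorted(set(preds_idx.values).difference(set(true_idx.values)))
    xtime = df_test['Time'].iloc[start:end]
    for col in cols:
        # Each sensor trace is shifted vertically so the lines do not overlap
        plt.plot(xtime, df_test[col].iloc[start:end] + add)
        plt.scatter(xtime.loc[tp_idx].values, df_test.loc[tp_idx, col] + add, color='green')
        plt.scatter(xtime.loc[fn_idx].values, df_test.loc[fn_idx, col] + add, color='violet')
        plt.scatter(xtime.loc[fp_idx].values, df_test.loc[fp_idx, col] + add, color='red')
        add += vert_shift
    # Faults that start inside the plotted window, one random colour per fault type
    failures = dfc[(dfc['Start Time'] > xtime.iloc[0]) & (dfc['Start Time'] < xtime.iloc[-1])]
    unique_fails = np.sort(failures['Type'].unique())
    colors = np.array([np.random.rand(3) for fail in unique_fails])
    for fail_idx in failures.index:
        c = colors[np.where(unique_fails == failures.loc[fail_idx, 'Type'])[0]][0]
        plt.axvline(failures.loc[fail_idx, 'Start Time'], color=c)
        plt.axvline(failures.loc[fail_idx, 'End Time'], color=c)
    leg = plt.legend(unique_fails)
    # legendHandles was renamed to legend_handles in newer matplotlib releases
    handles = leg.legend_handles if hasattr(leg, 'legend_handles') else leg.legendHandles
    for i in range(len(unique_fails)):
        handles[i].set_color(colors[i])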
If anomalies make up less than 5% of the data and/or are poorly separated from the "normal" readings, the methods above work badly and it is worth using algorithms based on neural networks. In the simplest case these are:
- autoencoders (a high reconstruction error of a trained autoencoder signals an anomalous reading);
- recurrent networks (trained on a sequence to predict the latest reading; if the difference is large, the point is an anomaly).
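To make the autoencoder criterion concrete, here is a deliberately tiny sketch that uses scikit-learn's `MLPRegressor` as a linear autoencoder with a two-unit bottleneck. This is a stand-in chosen to keep the example short, not the architecture one would use in practice; the synthetic data, the bottleneck size and the 99th-percentile threshold are all assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(8)

# Hypothetical readings: 5 features that really depend on 2 hidden factors.
W = rng.normal(size=(2, 5))
latent = rng.normal(size=(2000, 2))
X_train = latent @ W + 0.05 * rng.normal(size=(2000, 5))

# A network trained to reproduce its own input through a 2-unit bottleneck.
autoencoder = MLPRegressor(hidden_layer_sizes=(2,), activation="identity",
                           solver="lbfgs", max_iter=2000, random_state=0)
autoencoder.fit(X_train, X_train)

def reconstruction_error(points):
    """Mean squared difference between the input and its reconstruction."""
    return np.mean((autoencoder.predict(points) - points) ** 2, axis=1)

# Readings whose error exceeds what is usual on the training data are flagged.
threshold = float(np.percentile(reconstruction_error(X_train), 99))

_, _, vt = np.linalg.svd(W)                  # vt[-1] is orthogonal to the data plane
on_plane = np.array([[1.0, -0.5]]) @ W       # consistent with the hidden factors
off_plane = on_plane + 3.0 * vt[-1]          # violates the learned structure

error_on = float(reconstruction_error(on_plane)[0])
error_off = float(reconstruction_error(off_plane)[0])
print(threshold, error_on, error_off)
```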
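Separately, it is worth noting the specifics of working with time series. It is important to understand that most of the algorithms above (apart from autoencoders and isolation forests) are likely to lose quality when lag features (readings from previous points in time) are added.
Let's try adding lag features in our example. The competition description says that values more than 3 hours before a fault have nothing to do with the fault. So we add lag features covering 3 hours, which gives 259 features in total.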
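Lag features of this kind can be built with `shift`. The helper `add_lag_features` and the toy frame below are hypothetical; 12 lags correspond to 3 hours of 15-minute readings, and the sketch does not try to reproduce the exact feature count quoted above.

```python
import pandas as pd

def add_lag_features(df, columns, n_lags):
    """Append copies of `columns` shifted by 1..n_lags rows and drop the
    leading rows that have no complete history."""
    parts = [df]
    for lag in range(1, n_lags + 1):
        shifted = df[columns].shift(lag)
        shifted.columns = ["{}_lag{}".format(col, lag) for col in columns]
        parts.append(shifted)
    return pd.concat(parts, axis=1).dropna().reset_index(drop=True)

# Hypothetical readings taken every 15 minutes.
readings = pd.DataFrame({
    "Time": pd.date_range("2015-01-01", periods=20, freq="15min"),
    "S1_0": range(20),
    "S2_0": range(100, 120),
})

N_LAGS = 12  # 3 hours of 15-minute readings
with_lags = add_lag_features(readings, ["S1_0", "S2_0"], N_LAGS)
print(with_lags.shape)
```

Each row now carries the current reading together with the previous 3 hours, at the cost of losing the first `N_LAGS` rows.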
As a result, the scores of OCSVM and IsolationForest hardly changed, while those of Elliptic Envelope and LOF dropped.
To make use of information about the system's dynamics, you should use autoencoders built on recurrent or convolutional neural networks. Another option is to combine autoencoders, which compress the information, with conventional approaches that look for anomalies in the compressed representation. The reverse approach also looks promising: first screen out the most uncharacteristic points with standard algorithms, and then train the autoencoder on the cleaner data.

There is a set of techniques for working with one-dimensional time series. All of them aim to predict future readings, and points that diverge from the forecast are treated as anomalies.
Holt-Winters model
Triple exponential smoothing splits a series into three components: level, trend and seasonality. Accordingly, if the series can be represented in this form, the method works well. Facebook Prophet works on a similar principle but estimates the components themselves in a different way.
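The forecast-and-compare idea can be sketched with a hand-written additive Holt-Winters smoother. The function `holt_winters_one_step`, its simple initialization from the first two seasons, the smoothing constants and the threshold of four standard deviations are all assumptions made for illustration; a library implementation would normally fit the constants to the data.

```python
import numpy as np

def holt_winters_one_step(y, m, alpha=0.3, beta=0.05, gamma=0.1):
    """One-step-ahead forecasts of additive Holt-Winters smoothing with
    season length m. Needs at least two full seasons of data."""
    y = np.asarray(y, dtype=float)
    season_mean = y[:m].mean()
    trend = (y[m:2 * m].mean() - season_mean) / m
    offsets = np.arange(m) - (m - 1) / 2.0
    seasonal = y[:m] - (season_mean + trend * offsets)      # detrended first season
    level = season_mean - trend * ((m - 1) / 2.0 + 1.0)     # level just before y[0]
    forecasts = np.empty_like(y)
    for t in range(len(y)):
        s_prev = seasonal[t % m]
        forecasts[t] = level + trend + s_prev
        new_level = alpha * (y[t] - s_prev) + (1 - alpha) * (level + trend)
        new_trend = beta * (new_level - level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (y[t] - level - trend) + (1 - gamma) * s_prev
        level, trend = new_level, new_trend
    return forecasts

# Hypothetical series: linear trend + daily seasonality + noise + one injected spike.
rng = np.random.default_rng(7)
t = np.arange(240)
series = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 24) + rng.normal(0.0, 0.2, size=240)
series[150] += 10.0

residuals = series - holt_winters_one_step(series, m=24)
threshold = 4.0 * residuals.std()
anomalies = np.flatnonzero(np.abs(residuals) > threshold)
print(anomalies)
```

Points right after a large spike can also be flagged, because the smoothed level needs a few steps to recover.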
S(ARIMA)
In this method the forecasting model is based on autoregression and a moving average. The S(ARIMA) extension additionally makes it possible to account for seasonality.
Other predictive maintenance approaches
When it comes to time series and the times at which faults occurred are known, supervised learning methods can be applied. Besides the need for labeled data, it is important to understand that fault prediction in this case depends on the nature of the fault. If there are many faults of different nature, each will most likely have to be predicted separately, which requires even more labeled data, but the prospects are more attractive.
There are other ways to use machine learning in predictive maintenance. One example is predicting a system failure within the next N days (a classification task). It is important to understand that this approach requires a failure to be preceded by a period of degradation (not necessarily a gradual one). In this case the most successful approach appears to be neural networks with convolutional and/or recurrent layers. Separately, it is worth noting methods for augmenting time series. To me, two approaches seem the most interesting and at the same time simple:
- a contiguous part of the series is selected (for example 70%, the rest is discarded) and stretched to the original size;
- a contiguous part of the series (for example 20%) is selected and stretched or compressed, after which the whole series is compressed or stretched back to the original size.
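Both augmentations can be sketched with linear interpolation. The function names `window_slice` and `window_warp`, the default shares and the `scale` factor are hypothetical choices for the example.

```python
import numpy as np

def resample_to_length(series, length):
    """Linearly interpolate a 1-D series onto `length` evenly spaced points."""
    old_x = np.linspace(0.0, 1.0, num=len(series))
    new_x = np.linspace(0.0, 1.0, num=length)
    return np.interp(new_x, old_x, series)

def window_slice(series, keep_share=0.7, rng=None):
    """Keep a random contiguous part of the series and stretch it to full length."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(series)
    width = max(2, int(round(n * keep_share)))
    start = rng.integers(0, n - width + 1)
    return resample_to_length(series[start:start + width], n)

def window_warp(series, window_share=0.2, scale=2.0, rng=None):
    """Stretch (scale > 1) or compress (scale < 1) a random window, then
    resample the whole series back to its original length."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(series)
    width = max(2, int(round(n * window_share)))
    start = rng.integers(0, n - width + 1)
    warped = resample_to_length(series[start:start + width],
                                max(2, int(round(width * scale))))
    stitched = np.concatenate([series[:start], warped, series[start + width:]])
    return resample_to_length(stitched, n)

original = np.sin(np.linspace(0.0, 2.0 * np.pi, 100))
sliced = window_slice(original, rng=np.random.default_rng(0))
warped = window_warp(original, rng=np.random.default_rng(0))
print(len(sliced), len(warped))
```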
There is also the option of predicting the remaining lifetime of the system (a regression task). A separate approach can be singled out here, in which the prediction is not the lifetime itself but the parameters of a Weibull distribution.
This distribution has two parameters, α and β. α indicates when the event will occur, and β indicates how confident the algorithm is. The approach is promising, but training neural networks is difficult in this case, because at first it is easier for the algorithm to be unsure than to predict an adequate lifetime.
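For reference, the usual two-parameter form of the Weibull distribution is written out below. The text does not state its parametrization, so treating α as the scale and β as the shape is an assumption; under it, a larger β concentrates the probability mass around α, which matches the reading of β as confidence.

```latex
% Weibull distribution with scale \alpha > 0 and shape \beta > 0 (assumed parametrization)
f(t) = \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta-1}
       \exp\!\left[-\left(\frac{t}{\alpha}\right)^{\beta}\right],
\qquad
S(t) = P(T > t) = \exp\!\left[-\left(\frac{t}{\alpha}\right)^{\beta}\right],
\qquad t \ge 0 .
```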
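Separately, it is worth noting Cox regression. It makes it possible to predict the reliability of the system at every moment after diagnosis, representing it as the product of two functions. The first function is the degradation of the system that does not depend on its parameters, that is, it is common to all such systems. The second is an exponential dependence on the parameters of the specific system. For people, for example, there is a common function associated with aging that is roughly the same for everyone, but the deterioration of health also depends on the state of the internal organs, which differs from person to person.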
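The product of two functions described above is usually written for the hazard, the instantaneous failure rate. The notation below is the standard proportional hazards form and is an assumption in the sense that the text itself gives no formula; θ denotes the regression coefficients and x the parameters of a specific system.

```latex
% Cox proportional hazards model: baseline hazard times a system-specific factor
h(t \mid x) = h_0(t)\,\exp\!\left(\theta^{\top} x\right),
\qquad
S(t \mid x) = S_0(t)^{\exp\left(\theta^{\top} x\right)} .
```

Here h_0(t) is the degradation shared by all systems, and the exponential term scales it for an individual system.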
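I hope you now know a little more about predictive maintenance. You will probably have questions about the machine learning methods most often used for this technology, and I will be happy to answer each of them in the comments. If you would like not only to ask about what is written here but also to do something similar yourself, our CleverDATA team is always glad to meet talented and enthusiastic specialists.
Are there open positions? Of course!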