To find a good configuration, vw-hyperopt uses algorithms from the Hyperopt Python library and can optimize hyperparameters adaptively with the Tree-Structured Parzen Estimators (TPE) method. This lets it find better optima than plain grid search in the same number of iterations.
This article will interest everyone who works with Vowpal Wabbit, and especially those frustrated that the source code offers no way to tune the many model knobs, so that they must either tune them by hand or code the optimization themselves.
Hyperparameters
What are hyperparameters? They are all the "degrees of freedom" of an algorithm that it does not optimize directly but on which the result depends. Sometimes the result depends on them only slightly, and then, unless you are on Kaggle, you can get by with the default values or pick them by hand. But sometimes a poor configuration ruins everything: the algorithm either overfits badly or, conversely, fails to use most of the information.
In the narrow sense, hyperparameters often mean only regularization and other "obvious" settings of a machine learning method. In the broad sense, a hyperparameter is any manipulation of the data that can affect the result: feature engineering, observation weighting, undersampling, and so on.
Grid Search
Of course, it would be nice to have an algorithm that optimizes hyperparameters in addition to parameters. Better still if we could trust that algorithm more than our intuition. Some steps in this direction have, of course, been taken. Naive methods are built into many machine learning libraries: grid search, which walks over a grid, and random search, which samples points from a fixed distribution (the best-known instances are GridSearchCV and RandomizedSearchCV in sklearn). The advantage of walking a grid is that it is easy to code yourself and easy to parallelize. However, it also has serious drawbacks:
- It visits many points that are obviously bad. Suppose we already have a set of configurations together with their results or other information. A person can tell which configurations will certainly produce garbage and will know not to check those regions again. Grid search cannot do this.
- With many hyperparameters, the grid "cell" has to be made too large, and a good optimum can be missed. So if the search space includes many superfluous hyperparameters that do not affect the result, grid search performs much worse for the same number of iterations. This is less true of random search, however.
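The second drawback can be illustrated with a small sketch. It assumes a toy two-dimensional search space in which only the first hyperparameter matters; the function names and the 3x3 grid are made up for illustration and do not come from any library.

```python
import itertools
import random


def grid_points(values_per_axis, n_axes):
    """All combinations of the given values along every axis."""
    return list(itertools.product(values_per_axis, repeat=n_axes))


def random_points(n_points, n_axes, rng):
    """Points sampled uniformly from the unit cube."""
    return [tuple(rng.random() for _ in range(n_axes)) for _ in range(n_points)]


# Same budget for both strategies: 9 evaluations in a 2-D space.
grid = grid_points([0.0, 0.5, 1.0], n_axes=2)
rand = random_points(len(grid), n_axes=2, rng=random.Random(0))

# Assume only axis 0 affects the result. Count how many distinct
# values of that important hyperparameter each strategy tries.
distinct_grid = len({p[0] for p in grid})
distinct_rand = len({p[0] for p in rand})
print(distinct_grid, distinct_rand)
```

With the same nine evaluations, the grid probes only three distinct values of the hyperparameter that matters, while random sampling probes nine.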
Bayesian Methods
To reduce the number of iterations needed to find a good configuration, adaptive Bayesian methods were devised. They choose the next point to evaluate by taking into account the results at the points already evaluated. The idea is to find, at each step, a compromise between (a) exploring the regions near the best points found so far and (b) exploring regions of high uncertainty, where even better points may lie. This is often called the explore-exploit or "learning vs. earning" dilemma. As a result, when evaluating each new point is expensive (in machine learning, one evaluation = training + validation), the global optimum can be approached in far fewer steps.
Variations of such algorithms are implemented in the tools MOE, Spearmint, SMAC, BayesOpt, and Hyperopt. We will look at the last of these in more detail, since vw-hyperopt is a wrapper around Hyperopt, but first a few words about Vowpal Wabbit.
Vowpal Wabbit
Many of you have probably used this tool or at least heard of it. In short, it is one of the fastest machine learning libraries in the world, if not the fastest. Training a CTR predictor (binary classification) on 30 million observations and hundreds of thousands of features takes only a few gigabytes of RAM and 6 minutes on a single core. Vowpal Wabbit implements several online algorithms:
- Stochastic gradient descent with various bells and whistles;
- FTRL-Proximal, which you can read about here;
- An online analogue of SVM;
- Online boosting;
- Factorization machines.
In addition, it implements feed-forward neural networks, batch optimization (BFGS), and LDA. You can run Vowpal Wabbit in the background and have it receive a data stream as input, either learning from it or just producing predictions.
FTRL and SGD can solve both regression and classification problems; which one is determined solely by the loss function. These algorithms are linear in the features, but nonlinearity is easy to obtain with polynomial features. There is also a very handy early-stopping mechanism that protects against overfitting if you specify too many epochs.
Vowpal Wabbit is also famous for its feature hashing, which acts as additional regularization when there are very many features. Thanks to it, you can learn from categorical features with billions of rare categories and still fit the model in RAM without sacrificing quality.
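As a rough illustration of the hashing trick, here is a minimal sketch. It assumes CRC32 as a stand-in hash (Vowpal Wabbit itself uses a different hash function) and a table of 2**18 weights; the helper names are hypothetical and are not part of Vowpal Wabbit.

```python
import zlib


def hashed_index(namespace, feature, bits=18):
    """Map a (namespace, feature) pair to one of 2**bits weight slots.

    CRC32 is only a stand-in here; the real library uses another hash.
    """
    key = f"{namespace}^{feature}".encode("utf-8")
    return zlib.crc32(key) & ((1 << bits) - 1)


def hashed_vector(features, bits=18):
    """Sum feature values into hashed slots; colliding features share a weight."""
    vec = {}
    for (namespace, name), value in features.items():
        idx = hashed_index(namespace, name, bits)
        vec[idx] = vec.get(idx, 0.0) + value
    return vec


# A categorical feature with many rare values still needs at most 2**bits weights.
observation = {("S", f"site_{i}"): 1.0 for i in range(100)}
print(len(hashed_vector(observation, bits=4)))  # never more than 16 slots
```

The model size is fixed by the number of bits, not by the number of distinct categories, which is why rare categories do not blow up memory. Collisions merge some features, and this is where the regularizing effect comes from.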
Vowpal Wabbit requires a special input data format, but it is easy to understand. It is naturally sparse and takes little space. Only one observation (or several, for LDA) is loaded into RAM at a time. Training is easiest to run from the console.
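For readers who have not seen the format, the sketch below builds one text line of the form `label |namespace feature:value ...`. The helper `to_vw_line` and the namespace and feature names are hypothetical and chosen only for illustration. The sketch covers only the basic parts of the format (label, optional importance weight, namespaces, features).

```python
def to_vw_line(label, namespaces, importance=None):
    """Format one observation as a Vowpal Wabbit text line.

    `namespaces` maps a namespace name to a {feature: value} dict.
    Feature names must not contain spaces, ':' or '|'.
    """
    head = str(label) if importance is None else f"{label} {importance}"
    parts = [head]
    for namespace, features in namespaces.items():
        # A value of 1 is implied in the format, so it can be omitted.
        tokens = [name if value == 1 else f"{name}:{value}"
                  for name, value in features.items()]
        parts.append("|" + namespace + " " + " ".join(tokens))
    return " ".join(parts)


# Logistic loss expects labels -1 and 1.
print(to_vw_line(-1, {"S": {"site_42": 1}, "E": {"hour": 13}}))
# -1 |S site_42 |E hour:13
```

Only the features that are present are written out, which is why the format is sparse by construction.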
If you are interested, you can read the tutorial and the other examples and articles in the repository, as well as the presentation. You can learn more about the internals of Vowpal Wabbit from John Langford's publications and his blog. There is also a relevant post on Habr. The list of arguments can be obtained with vw --help, or you can read the detailed description. As the description shows, there is a large number of arguments, and many of them can be regarded as hyperparameters to optimize.
Vowpal Wabbit ships with the vw-hypersearch module, which can tune a single hyperparameter by golden-section search. However, if there are several local minima, this method may end up far from the best option. Besides, one often needs to optimize many hyperparameters at once, which vw-hypersearch does not support. A few months ago I tried to write a multidimensional golden-section method, but the number of steps it needed to converge exceeded that of grid search, so that option was dropped. I decided to use Hyperopt.
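For reference, a generic golden-section search over one variable can be sketched as follows. This is a textbook version written for illustration, not the code of vw-hypersearch. It assumes the function has a single minimum on the interval; with several local minima it simply converges to one of them.

```python
import math


def golden_section_minimize(f, low, high, tol=1e-6):
    """Shrink [low, high] around a minimum of a unimodal function f."""
    inv_phi = (math.sqrt(5) - 1) / 2  # about 0.618
    a, b = low, high
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # The minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # The minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2


# Toy stand-in for "validation loss as a function of one hyperparameter".
best = golden_section_minimize(lambda x: (x - 2.0) ** 2, 0.0, 5.0)
print(round(best, 4))
```

Each step reuses one of the two previous evaluations, so only one new (expensive) evaluation is needed per iteration, but the method has no natural extension to many hyperparameters.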
Hyperopt
This Python library implements the Tree-Structured Parzen Estimators (TPE) optimization algorithm. Its advantage is that it can work with very "awkward" spaces: one hyperparameter is continuous, another is categorical, a third is discrete but with neighboring values that are correlated with each other, and, finally, some combinations of parameter values may simply make no sense. TPE takes as input a hierarchical search space with prior probabilities and, at each step, mixes them with Gaussian distributions centered at the new points. Its author, James Bergstra, claims that the algorithm handles the explore-exploit problem well and outperforms both grid search and expert search, at least for deep learning tasks, where there are many hyperparameters. You can read more about it here and here, and about the TPE algorithm itself here. Perhaps a detailed post about it will follow in the future.
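To make the idea more concrete, here is a heavily simplified one-dimensional sketch of a TPE-style loop. It is not Hyperopt's implementation. The fixed kernel bandwidth, the `gamma` quantile, the number of candidates, and all function names are assumptions made for illustration. Past points are split into "good" and "bad" by loss, each group is modeled with a Parzen (Gaussian kernel) density, and the next point is the candidate that looks most like the good group and least like the bad one.

```python
import math
import random


def parzen_density(x, centers, bandwidth):
    """Average of Gaussian kernels centered on previously observed points."""
    norm = bandwidth * math.sqrt(2 * math.pi)
    total = sum(math.exp(-0.5 * ((x - c) / bandwidth) ** 2) for c in centers)
    return total / (len(centers) * norm)


def tpe_minimize(objective, low, high, max_evals=60, n_startup=20,
                 gamma=0.25, n_candidates=24, seed=0):
    rng = random.Random(seed)
    bandwidth = (high - low) / 20.0  # fixed width: a simplifying assumption
    history = []  # list of (loss, x)
    for i in range(max_evals):
        if i < n_startup:
            x = rng.uniform(low, high)  # start by sampling from the uniform prior
        else:
            ranked = sorted(history)
            n_good = max(1, int(gamma * len(ranked)))
            good = [px for _, px in ranked[:n_good]]
            bad = [px for _, px in ranked[n_good:]]
            # Draw candidates from the "good" density, clipped to the bounds.
            candidates = [
                min(max(rng.gauss(rng.choice(good), bandwidth), low), high)
                for _ in range(n_candidates)
            ]
            # Keep the candidate with the highest ratio l(x) / g(x).
            x = max(candidates,
                    key=lambda c: parzen_density(c, good, bandwidth)
                    / (parzen_density(c, bad, bandwidth) + 1e-12))
        history.append((objective(x), x))
    return min(history)


best_loss, best_x = tpe_minimize(lambda x: (x - 2.0) ** 2, -10.0, 10.0)
print(best_loss, best_x)
```

The real algorithm also handles categorical and conditional (tree-structured) parameters and adapts the kernel widths, which this sketch leaves out.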
Hyperopt has not been built into the source code of the well-known machine learning libraries, but many people use it. For example, there is an excellent tutorial on hyperopt + sklearn, and here is an application of hyperopt + xgboost. My whole contribution is a similar wrapper for Vowpal Wabbit, a more or less tolerable syntax for specifying the search space, and a way to launch all of this from the command line. Vowpal Wabbit did not yet have such functionality, so Langford liked my module and it was merged. In fact, anyone can try Hyperopt with their favorite machine learning tool: it is easy to do, and everything you need is in this tutorial.
vw-hyperopt
Let's move on to using vw-hyperopt. First, you need to install the latest version of Vowpal Wabbit from GitHub. The module is in the utl folder.
Note: as of December 15, the latest changes (in particular, the new command syntax) have not yet been merged into the main repository. I hope this will be resolved in the coming days; for now, you can use the latest version of the code from my branch. EDIT: on December 22 the changes were merged, and you can now use the main repository.
Usage example:
./vw-hyperopt.py --train ./train_set.vw --holdout ./holdout_set.vw --max_evals 200 --outer_loss_function logistic --vw_space '--algorithms=ftrl,sgd --l2=1e-8..1e-1~LO --l1=1e-8..1e-1~LO -l=0.01..10~L --power_t=0.01..1 --ftrl_alpha=5e-5..8e-1~L --ftrl_beta=0.01..1 --passes=1..10~I --loss_function=logistic -q=SE+SZ+DR,SE~O --ignore=T~O' --plot
The module needs training and validation samples, as well as prior distributions of the hyperparameters (quoted inside --vw_space). You can specify integer, continuous, or categorical hyperparameters. For all of them except categorical ones, you can specify a uniform or log-uniform distribution. Inside vw-hyperopt, the search space from the example is converted into roughly the following (if you have read the Hyperopt tutorial, you will understand it):
    from math import log

    from hyperopt import hp

    # 'empty' means the optional argument is left out of the vw command line.
    prior_search_space = hp.choice('algorithm', [
        {'type': 'sgd',
         '--l1': hp.choice('sgd_l1_outer', ['empty', hp.loguniform('sgd_l1', log(1e-8), log(1e-1))]),
         '--l2': hp.choice('sgd_l2_outer', ['empty', hp.loguniform('sgd_l2', log(1e-8), log(1e-1))]),
         '-l': hp.loguniform('sgd_l', log(0.01), log(10)),
         '--power_t': hp.uniform('sgd_power_t', 0.01, 1),
         '-q': hp.choice('sgd_q_outer', ['empty', hp.choice('sgd_q', ['-q SE -q SZ -q DR', '-q SE'])]),
         '--loss_function': hp.choice('sgd_loss', ['logistic']),
         '--passes': hp.quniform('sgd_passes', 1, 10, 1),
         },
        {'type': 'ftrl',
         '--l1': hp.choice('ftrl_l1_outer', ['empty', hp.loguniform('ftrl_l1', log(1e-8), log(1e-1))]),
         '--l2': hp.choice('ftrl_l2_outer', ['empty', hp.loguniform('ftrl_l2', log(1e-8), log(1e-1))]),
         '-l': hp.loguniform('ftrl_l', log(0.01), log(10)),
         '--power_t': hp.uniform('ftrl_power_t', 0.01, 1),
         '-q': hp.choice('ftrl_q_outer', ['empty', hp.choice('ftrl_q', ['-q SE -q SZ -q DR', '-q SE'])]),
         '--loss_function': hp.choice('ftrl_loss', ['logistic']),
         '--passes': hp.quniform('ftrl_passes', 1, 10, 1),
         # hp.loguniform expects its bounds in log space
         '--ftrl_alpha': hp.loguniform('ftrl_alpha', log(5e-5), log(8e-1)),
         '--ftrl_beta': hp.uniform('ftrl_beta', 0.01, 1.),
         },
    ])
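The mapping from the --vw_space string to this space can be sketched with a small parser for a single token. This is not the module's actual parsing code. The function name is hypothetical, and the flag meanings (L for log-uniform, I for integer, O for optional) are inferred from comparing the usage example with the converted space.

```python
def parse_vw_space_token(token):
    """Parse one token such as '--l2=1e-8..1e-1~LO' into a description dict."""
    name, _, spec = token.partition("=")
    spec, _, flags = spec.partition("~")
    result = {
        "name": name,
        "log": "L" in flags,       # assumed: log-uniform instead of uniform
        "integer": "I" in flags,   # assumed: integer-valued range
        "optional": "O" in flags,  # assumed: the argument may be omitted
    }
    if ".." in spec:
        low, high = spec.split("..")
        result.update(kind="range", low=float(low), high=float(high))
    else:
        # Comma separates alternatives; '+' joins values used together.
        options = [option.split("+") for option in spec.split(",")]
        result.update(kind="choice", options=options)
    return result


for token in ["--l2=1e-8..1e-1~LO", "--passes=1..10~I", "-q=SE+SZ+DR,SE~O"]:
    print(parse_vw_space_token(token))
```

Under these assumptions, a range token with L and O corresponds to an hp.choice between 'empty' and hp.loguniform, an I range to hp.quniform, and a comma-separated list to hp.choice over the alternatives.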
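Optionally, you can change the loss function on the validation sample and the maximum number of iterations (--outer_loss_function and --max_evals; the latter defaults to 100). You can also save the results of each iteration and build plots with --plot, if you have matplotlib and, preferably, seaborn installed.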
Documentation
It is not customary to put detailed documentation here, so I will limit myself to a link. You can read about all the semantics in the Russian-language wiki in my fork, or wait for the English version in the main Vowpal Wabbit repository.
Plans
In the future, I plan to add to the module:
- Support for regression and multiclass classification tasks.
- Support for a "warm start": give Hyperopt points that were evaluated in advance and start the optimization taking their results into account.
- An option to evaluate the error at each step on a separate test sample (without optimizing hyperparameters on it). This is needed to better assess generalization ability: have we overfitted?
- Support for binary parameters that take no value, such as --lrqdropout, --normalized, --adaptive, and so on. Right now you can, in principle, write --adaptive=\ ~O, but this is not at all intuitive. Something like --adaptive=~B or --adaptive=~BO could be done instead.
誰ããã¢ãžã¥ãŒã«ã䜿çšãã圌ã誰ããå©ããŠãããããšãŠãããããã§ãã ææ¡ãã¢ã€ãã¢ããŸãã¯çºèŠããããã°ã«åãã§å¯Ÿå¿ããŸãã ããã«ããããæžãããgithubã§åé¡ãäœæã§ããŸãã
ã¢ããããŒã12.22.2015
ææ°ã®å€æŽãå«ããã«ãªã¯ãšã¹ããã¡ã€ã³ã®Vowpal Wabbitãªããžããªã«è¿œå ãããããããã©ã³ãã§ã¯ãªã䜿çšã§ããããã«ãªããŸããã