At some point you will inevitably need to work with Excel data, and there is no guarantee that working with this storage format will be pleasant. For that reason, Python developers have implemented convenient ways to read, edit, and otherwise manipulate not only Excel files but other file types as well.
Starting point: data availability
Translation
Original article: www.datacamp.com/community/tutorials/python-excel-tutorial
Posted by Karlijn Willems
When you start a data analysis project, you often work with statistics you have gathered yourself, perhaps with counters, or with data downloaded from systems such as Kaggle or Quandl. Much of the data, however, still sits in Google or in repositories shared by other users. This data may come in Excel format or as a file with the .csv extension.
You have data, a lot of data. So where do you start the analysis? The first step in data analysis is verification: you need to check the quality of the data you received.
If the data is stored in a table, checking its quality (making sure the data answers the questions your research poses) is not enough: you also need to assess whether the data can be trusted.
Checking table quality
A simple checklist is usually enough to check the quality of a table. Does the table meet the following conditions?
- The data is static.
- The different kinds of data it holds (timing, calculations, results) are identifiable.
- The data is complete and consistent, the data structure in the table is systematic, and any formulas present work.
The answers to these simple questions help you decide whether the table departs from the standard. The checklist is of course not exhaustive: there are many more rules you can use to check a table's data and make sure the table is not an "ugly duckling". Still, if the goal is to confirm that the table contains quality data, the checklist above is the most relevant.
Best practices for tabular data
Reading tabular data in Python is all well and good, but you will also want to edit it. Whatever editing you do, the data in the table should meet the following conditions:
- The first row of the table is reserved for the header, and the first column is used to identify the sampling unit.
- Avoid names, values, or fields that contain spaces. Otherwise each word is interpreted as a separate variable, which leads to errors about the number of elements per row in the dataset. Consider using underscores, camel case (capitalizing the first letter of each section of the text), or concatenated words instead.
- Prefer short names.
- Avoid names that contain any of these characters: ?, $, %, ^, &, *, (, ), -, #, the comma, <, >, /, |, \, [, ], {, and }.
- Delete any comments you made in the file to avoid extra columns or fields with NA values.
- Make sure missing values in the dataset are shown as NA.
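The naming rules above can be sketched in a few lines of Python. This is only an illustration: the helper name `clean_column_name` and the sample headers are assumptions made for the example, and the rule it applies (replace anything that is not a letter, digit, or underscore) is one reasonable reading of the list, not a standard.

```python
import re

def clean_column_name(name: str) -> str:
    """Return a header-friendly version of a column name (illustrative helper)."""
    # Replace every run of characters other than letters, digits, or
    # underscores (spaces, $, %, -, brackets, ...) with a single underscore.
    cleaned = re.sub(r"[^0-9A-Za-z_]+", "_", name.strip())
    # Drop underscores left over at either end.
    return cleaned.strip("_")

# Made-up headers that break the rules in different ways.
headers = ["Unit ID", "Price ($)", "growth-rate %", "notes"]
print([clean_column_name(h) for h in headers])
# ['Unit_ID', 'Price', 'growth_rate', 'notes']
```

Running the headers through a helper like this before loading keeps every column name usable as a single identifier.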
After making the necessary changes (or after checking the data carefully), make sure the changes are saved. This matters because you can then return to the data to check, edit, supplement, or change it as needed, while preserving any formulas that may have been used in calculations.
If you use Microsoft Excel, you probably know that it offers many options for saving a file besides the default .xls or .xlsx extension: go to the File tab, choose Save As, and pick a format suited to analysis, such as .csv or tab-delimited text. Depending on the option you choose, the data fields are separated by tabs or by commas; this separator is the file's "delimiter character". The data is now checked and saved, so it is time to prepare the workspace.
Preparing the workspace
Preparing the workspace is one of the first things to do if you want to be sure of a quality analysis.
The first step is to check the working directory.
If you work in a terminal, you can first navigate to the directory that contains the file and then start Python. In that case, make sure the file really is in the directory you want to work from.
To check, enter the following commands:
    # Import `os`
    import os

    # Retrieve current working directory (`cwd`)
    cwd = os.getcwd()
    cwd

    # Change directory
    os.chdir("/path/to/your/folder")

    # List all files and directories in current directory
    os.listdir('.')
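Before loading anything, it can also help to confirm that the working directory actually contains spreadsheet-like files. The following is a small sketch using the standard-library `pathlib` module; the helper name `find_data_files` and the list of extensions are assumptions chosen for this example.

```python
from pathlib import Path

def find_data_files(folder=".", extensions=(".xlsx", ".xls", ".csv")):
    """List files in `folder` whose extension suggests spreadsheet data.

    Illustrative helper; the default extensions are an assumption.
    """
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )

# Show where Python is currently working and which data files it can see.
print("Working directory:", Path.cwd())
print("Data files found:", find_data_files())
```

An empty list here is an early hint that you are in the wrong folder, before any import call fails with a "file not found" error.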
These commands matter not only for loading data but also for further analysis. So: you have run all the checks, saved the data, and prepared the workspace. Can you start reading data in Python yet? :) Unfortunately not. There is one last thing to do.
Installing packages for reading and writing Excel files
Even though you do not yet know which libraries you will need to import the data, you should make sure everything is in place to install them. If you have Python 2 >= 2.7.9 or Python 3 >= 3.4, there is nothing to worry about: these versions normally come with pip already set up. Just be sure to upgrade to the latest versions :)
To do that, run the following commands on your computer:
    # For Linux/OS X
    pip install -U pip setuptools

    # For Windows
    python -m pip install -U pip setuptools
If pip is not installed yet, run the get-pip.py script with python get-pip.py; the page that hosts the script also carries installation instructions and help.
Installing Anaconda
If you use Python for data analysis, you can install the Anaconda Python distribution. It is a quick and easy way to get started with data analysis, because you do not have to install the packages needed for data science one by one.
This is especially convenient for beginners, but experienced developers often take this route too, since Anaconda is a handy way to test something quickly without installing each package separately.
Anaconda includes 100 of the most popular Python, R, and Scala libraries for data analysis, along with several open source development environments such as Jupyter and Spyder. To get started with Jupyter Notebook, consult the Jupyter documentation.
To install Anaconda, download the installer from the Anaconda website.
Loading Excel files as Pandas DataFrames
So, you have done everything needed to set up your environment! Now it is time to start importing files.
One way you will often use to import files for data analysis is the Pandas library (Pandas is a Python library for processing and analyzing data). Pandas is built on top of NumPy, a lower-level library. It is powerful and flexible, and it is used very frequently to structure data so that analysis becomes easier.
If you already have Pandas through Anaconda, you can simply use pd.ExcelFile() to load the file into Pandas DataFrames.
    # Import pandas
    import pandas as pd

    # Assign spreadsheet filename to `file`
    file = 'example.xlsx'

    # Load spreadsheet
    xl = pd.ExcelFile(file)

    # Print the sheet names
    print(xl.sheet_names)

    # Load a sheet into a DataFrame by name: df1
    df1 = xl.parse('Sheet1')
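If you did not install Anaconda, run pip install pandas to install the Pandas package in your environment, and then run the commands above. Reading .xlsx files additionally requires an Excel engine such as openpyxl to be installed.
To read .csv files, there is a similar function for loading data into a DataFrame: read_csv(). Here is an example of how to use it: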
    # Import pandas
    import pandas as pd

    # Load csv
    df = pd.read_csv("example.csv")
By default this function treats the comma as the delimiter, but you can optionally specify a different one. To find out which other arguments you can pass when importing, see the pandas documentation for read_csv().
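As a small illustration of overriding the delimiter, the sketch below reads semicolon-separated text through the `sep` argument of `read_csv()`. The in-memory `io.StringIO` buffer stands in for a file on disk, and the column names and values are made up for the example.

```python
import io

import pandas as pd

# Stand-in for a semicolon-delimited file on disk (made-up sample data).
raw = io.StringIO("name;score\nalpha;1\nbeta;2\n")

# `sep` overrides the default comma delimiter.
df = pd.read_csv(raw, sep=";")

print(df.shape)
print(list(df.columns))
```

With a real file you would pass its path instead of the buffer, for example `pd.read_csv("example.csv", sep=";")`.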
Writing Pandas DataFrames to Excel files
Suppose that, after analyzing the data, you want to write it to a new file. Pandas DataFrames can be written out with the to_excel() function. Before you use it, though, make sure XlsxWriter is installed if you want to write data to multiple sheets of an .xlsx file.
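    # Install `XlsxWriter` first (shell command, not Python):
    #   pip install XlsxWriter
    import pandas as pd

    # Specify a writer; the `with` block saves and closes the file on exit
    # (`writer.save()` was removed in pandas 2.0)
    with pd.ExcelWriter('example.xlsx', engine='xlsxwriter') as writer:
        # Write your DataFrame (`yourData`) to a sheet of the workbook
        yourData.to_excel(writer, sheet_name='Sheet1')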
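Note that the code snippet uses an ExcelWriter object to output the DataFrame: you pass the writer variable to to_excel() and specify the sheet name. This adds a sheet containing your data to the workbook that the writer manages; by default the writer creates the file anew rather than appending to one already on disk. You can also use ExcelWriter to save several different DataFrames to a single workbook.
In other words, if you only want to save a single DataFrame to a file, you can do so without installing the XlsxWriter library. Simply leave out the engine argument when you call pd.ExcelWriter(); the remaining steps stay the same. Pandas then falls back to a default engine, which still has to be installed (openpyxl, for example).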
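To make the multi-sheet case concrete, here is a minimal sketch that writes two DataFrames to one workbook. The helper name `write_sheets`, the sheet names, and the sample values are all illustrative assumptions, not part of the original tutorial. No `engine` is passed, so Pandas uses whichever supported Excel engine is installed.

```python
import pandas as pd

def write_sheets(path, frames):
    """Write each DataFrame in `frames` (sheet name -> DataFrame) to one workbook.

    Hypothetical helper for illustration; it needs an Excel engine such as
    openpyxl or XlsxWriter to be installed.
    """
    # Leaving the `with` block saves and closes the workbook.
    with pd.ExcelWriter(path) as writer:
        for sheet_name, frame in frames.items():
            frame.to_excel(writer, sheet_name=sheet_name, index=False)

# Made-up sample data, one DataFrame per sheet.
frames = {
    "Summary": pd.DataFrame({"metric": ["rows", "columns"], "value": [2, 2]}),
    "Details": pd.DataFrame({"name": ["alpha", "beta"], "score": [1, 2]}),
}
```

Calling `write_sheets("example.xlsx", frames)` would then produce a workbook with a "Summary" sheet and a "Details" sheet.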
Just like the function for reading .csv files, there is a to_csv() function that writes the results back to a comma-separated file. It works much like the function you used to load the file:
    # Write the DataFrame to csv
    df.to_csv("example.csv")
If you need a tab-separated file instead, pass \t to the sep argument. Note that Pandas offers various other functions for writing files; they are listed in the pandas documentation.
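The tab-separated variant can be sketched as follows. The DataFrame contents are made up, and `to_csv()` is called without a path so that it returns the text instead of writing a file, which makes the effect of `sep` easy to inspect.

```python
import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame({"name": ["alpha", "beta"], "score": [1, 2]})

# With no path given, `to_csv` returns the output as a string;
# `sep="\t"` switches the delimiter from a comma to a tab.
tsv_text = df.to_csv(sep="\t", index=False)

print(tsv_text)
```

Passing a filename as the first argument, for example `df.to_csv("example.tsv", sep="\t", index=False)`, writes the same content to disk.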
Using a virtual environment
A general tip for installing libraries is to install them into a virtual Python environment that has no access to the system libraries. You can use virtualenv to create such a Python sandbox: it creates a folder that contains everything needed to use the libraries your Python project requires.
To get started with virtualenv, you first need to install it. Then go to the directory where your project will live. Create a virtualenv in that folder and, if needed, point it at a specific version of Python. After that, activate the virtual environment. You can then start installing other libraries and working with them.
When you are done, do not forget to deactivate the environment!
    # Install virtualenv
    $ pip install virtualenv

    # Go to the folder of your project
    $ cd my_folder

    # Create a virtual environment `venv`
    $ virtualenv venv

    # Indicate the Python interpreter to use for `venv`
    $ virtualenv -p /usr/bin/python2.7 venv

    # Activate `venv`
    $ source venv/bin/activate

    # Deactivate `venv`
    $ deactivate
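If you are unsure whether the environment is currently active, a quick check from inside Python can help. This is a sketch rather than an official virtualenv feature: the function name `in_virtual_environment` is an assumption. It relies on `sys.prefix` differing from `sys.base_prefix` inside an environment, with a fallback for older virtualenv releases that set `sys.real_prefix` instead.

```python
import sys

def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases mark the environment with `sys.real_prefix`.
    if hasattr(sys, "real_prefix"):
        return True
    # Otherwise compare the environment prefix with the base installation.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return base_prefix != sys.prefix

print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtual_environment())
```

Running this before `pip install` is a cheap way to avoid installing packages into the system Python by accident.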
If you are taking your first steps in data analysis with Python, virtual environments may look like a hassle at first. Especially when you have only one project, it may be hard to see why you would need one at all.
But what if you are running several projects at the same time and do not want them to share the same Python installation? Or what if your projects have conflicting requirements? In those cases, virtual environments are an ideal solution.
The second part of the article covers the main libraries for data analysis.
To be continued...