Application of machine learning and data science in industry
Hello, Habr. I translated a post that belongs straight (!) in your bookmarks and is worth passing on to colleagues. It lists notebooks and ML and Data Science libraries for various industries. All the code is in Python and is hosted on GitHub. These resources are useful both for broadening your horizons and for launching an interesting startup.
I'll note that if any readers want to help by adding a suitable project to any of the subsections, please contact me and I will add it to the list. So, let's start exploring the list.
1. Accommodation and food
1.1. Nutrition
1.2. Restaurants
1.3. Accommodation
2. Accounting
2.1. Machine learning
2.2. Analytics
2.3. Text analysis
2.4. Data, Parsing and API
2.5. Research and articles
2.6. Web sites
2.7. Courses
3. Agriculture
3.1. Economy
Prices - agricultural price forecasting 1; Prices 2 - agricultural price forecasting 2; Yield - analysis of agricultural yields in Ukraine; Recovery - strategic land use in agriculture that accounts for ecosystem restoration; MPR - agricultural price reporting data from the US Department of Agriculture.
3.2. Development
Segmentation - segmentation of agricultural fields using satellite imagery; Water Table - predicting groundwater depth in agricultural areas; Assistant - notebooks from the virtual Agriculture Assistant; Eco-evolutionary - eco-evolutionary dynamics; Diseases - identification of crop diseases and pests from images using a deep learning framework; Irrigation and Pest Prediction - irrigation analysis and prediction of the likelihood of pests.
4. Banking and insurance
4.1. Consumer finance
4.2. Management and operations
Credit Card - CLV estimation for credit card customers; Survival Analysis - customer LTV analysis; Next Transaction - a deep learning model for predicting the transaction amount and the number of days until the next transaction; Credit Card Churn - predicting churn among credit card customers; Bank of England Minutes - the basics of text preprocessing, using the minutes of the Bank of England Monetary Policy Committee meetings; CEO - an analysis of the correlation between the pay of male and female CEOs.
4.3. Rating
4.4. Fraud
4.5. Insurance and Risks
4.6. Useful
5. Biotechnology and science
5.1. General
Programming - programming for biologists in Python; Introduction DL - a primer on deep learning in genomics; Pose - animal pose estimation using DL; Privacy - sharing clinical data while preserving confidentiality; Population Genetics - population genetic inference; Bioinformatics Course - course materials in computational biology and bioinformatics; Applied Stats - applied statistics for high-throughput biology; Scripts - Python scripts for biologists; Molecular NN - a mini-framework for building and training neural networks for molecular biology; Systems Biology Simulations - practical systems biology through writing simulations with F# and Z3; Cell Movement - an LSTM for predicting the biological movement of cells; Deepchem - deep learning for drug discovery, quantum chemistry, materials science and biology.
5.2. Sequence
5.3. Chemoinformatics and drug discovery
Novel Molecules - a convolutional network that can learn features; Automating Chemical Design - generating new molecules for efficient exploration; GAN drug Discovery - a method that combines generative models with reinforcement learning; RL - generating compounds predicted to be active; One-shot learning - applying machine learning to drug discovery in a simple and convenient way.
5.4. Genomic
5.5. Science
Plants Disease - an application that identifies plant diseases using a deep learning model; Leaf Identification - identification of plants from their leaves based on shape, color and texture; Crop Analysis - an imaging library for detecting and tracking the future position of ears on corn plants; Seedlings - classification of plant seedlings, from Kaggle; Plant Stress - an ontology of plant stresses; Animal Hierarchy - a package for calculating animal dominance hierarchies; Animal Identification - deep learning for animal identification; Species - big data analysis of various animal species; Animal Vocalisations - a generative network for animal vocalizations; Evolutionary - a tool for evolution strategies; Glaciers - educational material about glaciers.
6. Construction and engineering
6.1. Building
6.2. Engineering
6.3. Materials Science
7. Economics
7.1. General
7.2. Machine learning
EconML - automated learning and analysis of cause-and-effect relationships; Auctions - optimal auctions using deep learning.
7.3. Calculations
8. Education and research
8.1. Students
8.2. School
9. Emergencies
9.1. Prevention
9.2. Crime
9.3. Ambulance
Ambulance Analysis - a study of changes in ambulance arrival times in Victoria; Site Location - siting of ambulance stations; Dispatching - applying game theory and discrete-event simulation to find the optimal solution for dispatching ambulances; Ambulance Allocation - time series analysis of ambulance dispatches in the city of San Diego; Response Time - analysis of improvements in ambulance response time; Optimal Routing - a project to find the optimal routing of ambulances; Crash Analysis - predicting the probability of accidents on a given road segment at a given time.
9.4. Disaster management
10. Finance
10.1. Trade and investment
10.2. Data
Datastream - Datastream from Thomson Reuters, accessible through Python; AlphaVantage - an API wrapper that simplifies obtaining free financial data; FSA - a project to transform financial data from SEC EDGAR filings into custom financial statement analysis models; TradeConnector - connections to market data providers; Employee Count SEC Filings - exact employee counts for companies, taken from SEC filings; SEC Parsing - NLP for finding and extracting specific information from long unstructured documents; Open Edgar - OpenEDGAR; Rating Industries - rating histories from several agencies, converted to CSV format.
11. Health
11.1. General
12. Justice, law and regulation
12.1. Tools
12.2. Policy and Regulation
12.3. Arbitration practice
13. Production
13.1. General
13.2. Maintenance
13.3. Mistakes
13.4. Quality
14. Media and publishing
14.1. Marketing
15. Physics
15.1. General
15.2.
16.
16.1. Social politics
16.2. Charity
16.3. Election analysis
16.4. Politics
Congressional politics - the House of Representatives of the US Congress; Politico - a platform for profiling public figures in Brazilian politics; Bots - tools and algorithms for analyzing Paraguayan tweets during elections; Gerrymander tests - a range of metrics for quantifying gerrymandering; Sentiment - analysis of newspapers' political leanings using the subjective sentiments of party representatives; DL Politics - a comparison of a socialist party and a popular party in Brazil; PAC Money - the influence of PAC money on US politics; Power Networks - building a watchdog for Indian corporate and political networks; Elite - the political elite in the USA; Debate Analysis - a program for analyzing political debates; Political Affiliation - predicting political affiliation from Twitter metadata; Political Ads - an investigation of political ads and targeting on Facebook; Political Identity - a multi-axis model of political identity; YT Politics - mapping politics on YouTube; Political Ideology - unsupervised learning of political ideology using word vector projections.
17. Real estate, rental and leasing
17.1. Real estate
17.2. Rent and leasing
18. Utilities
18.1. Electric power
18.2. Coal, Oil and Gas
18.3. Water pollution
18.4. Logistics
19. Wholesale and retail trade
19.1. Wholesale
19.2. Retail
That concludes our post on the application of ML and DS in industry. I hope you learned something new. If you have something of your own to share, write in the comments.
More about machine learning and Data Science is available on my Habr account and in the Telegram channel Neuron; subscribe so you don't miss future articles.
Knowledge to all!