Tools for Software Developers: Open-Source Frameworks and Libraries for Machine Learning

We continue our series on open-source developer tools. Today we look at frameworks and libraries for machine learning (ML): Transformers, Accord.NET, and MLflow.





Photos - Franck V. - Unsplash




Transformers




This is a library of natural language processing models for TensorFlow 2.0 and PyTorch. It contains more than 32 pre-trained models, including BERT, DistilBERT, XLM, GPT-2, and XLNet.



The library's authors are engineers at Hugging Face, a company that develops NLP algorithms. They are the ones who introduced Hierarchical Multi-Task Learning (HMTL), a multi-task learning model that took another step toward solving the problem of "catastrophic forgetting". HMTL was presented at AAAI 2019, an international academic conference on artificial intelligence.



A key feature of Transformers is the ability to share trained models and move them from one framework to the other, TensorFlow 2.0 or PyTorch. The developers note that their solution lets you describe the model training procedure in three lines of code.
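As a rough illustration of moving a model between frameworks, here is a minimal sketch. It assumes a Transformers release that still ships the TensorFlow classes, with both torch and tensorflow installed. The function name, the directory arguments, and the tf_auto_cls parameter are hypothetical; the parameter exists only so the loader can be swapped out. The from_pt=True argument of from_pretrained is the library's switch for reading a PyTorch checkpoint into a TensorFlow model.

```python
def pytorch_checkpoint_to_tf(src_dir, dst_dir, tf_auto_cls=None):
    """Load PyTorch weights from src_dir into a TF 2.0 model and save them to dst_dir."""
    if tf_auto_cls is None:
        # Lazy import: this path needs transformers (a release with TF classes),
        # plus torch and tensorflow.
        from transformers import TFAutoModel as tf_auto_cls
    # from_pt=True asks the TensorFlow class to read a PyTorch checkpoint.
    model = tf_auto_cls.from_pretrained(src_dir, from_pt=True)
    # Write the weights back out in TensorFlow format.
    model.save_pretrained(dst_dir)
    return model
```

A call might look like pytorch_checkpoint_to_tf("./bert-pt", "./bert-tf"), where both directories are placeholders for a locally saved checkpoint and an output location.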



A large community has formed around the library: it has almost 15 thousand stars on GitHub. You can try out Transformers yourself on the project website, where the developers have taught a neural network to complete sentences for you.




Accord.NET




A framework built for C# that provides basic tools for data analysis and machine learning, from statistical hypothesis testing to building computer vision and image processing models. Accord.NET is one of the most popular ML solutions in the .NET ecosystem. It started as an extension of the AForge.NET library but later absorbed it.



The tool offers probability distributions, kernel functions, and measures for evaluating model performance. Accord.NET is divided into libraries available as executable installers, compressed archives, or NuGet packages. Among them are Math for working with matrices, Imaging for image processing, and Audio for sound processing. Also worth noting is Neuro, which includes the Levenberg-Marquardt algorithm and deep learning algorithms.



Accord.NET has been used for research by engineers from universities in the UK, Egypt, China, and other countries. In general, a fairly large number of developers use the framework: it has more than 3.5 thousand stars on GitHub.



Among its shortcomings is confusing documentation that is difficult for beginners, although a quick start guide and detailed comments in the code ease the situation somewhat. Further information on Accord.NET can also be found in books. The developers themselves recommend Machine Learning Projects for .NET Developers, F# for Machine Learning Essentials, and a couple of others.









MLflow




MLflow is a platform for the full machine learning lifecycle that simplifies developing, deploying, and sharing models. It offers a set of APIs that work with any library (TensorFlow, PyTorch, XGBoost, etc.) and in any environment, including the cloud. MLflow is developed by programmers at Databricks, a startup founded by the creators of Apache Spark.
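To give a feel for what a library-agnostic API looks like in practice, here is a minimal sketch of recording one training run with MLflow's tracking functions. It assumes MLflow is installed and uses a local directory as the tracking store. The function name log_training_run and its arguments are hypothetical; the mlflow calls themselves (set_tracking_uri, start_run, log_params, log_metric) are part of the tracking API. Nothing here depends on which ML library produced the numbers.

```python
from pathlib import Path


def log_training_run(params, metrics, tracking_dir):
    """Record one run's parameters and metrics in a local MLflow file store."""
    import mlflow  # imported here so the sketch loads even without MLflow installed

    # Point MLflow at a local directory instead of a tracking server.
    mlflow.set_tracking_uri(Path(tracking_dir).resolve().as_uri())
    with mlflow.start_run() as run:
        # Parameters are logged once; metrics are logged one key at a time.
        mlflow.log_params(params)
        for name, value in metrics.items():
            mlflow.log_metric(name, value)
    return run.info.run_id
```

For example, log_training_run({"lr": 0.01}, {"accuracy": 0.93}, "./mlruns") would store a run under the placeholder ./mlruns directory and return its run ID.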



MLflow has built-in integrations with Docker, TensorFlow, PyTorch, Kubernetes, Java, Spark, and other open source projects. It is used by organizations such as Microsoft, Accenture, SK Telecom, and even Washington University.



Among MLflow's disadvantages is the lack of support for R and Java, despite their popularity in machine learning. This comes down to the relative youth of the project, and the developers promise to add the corresponding APIs in the future. The tool's youth shows in another way too: it still has bugs.



If you want to evaluate MLflow in action yourself, the official documentation is a good place to start. If you have questions, a relatively small but active community on Stack Overflow and Google Groups can help answer them.



Our other collections:



Save time when working with the command line

Benchmarks for servers on Linux: a selection of open tools



What we write about on Habr:



What is known about VMworld 2019

Understanding application and service privacy policies will help neural networks

EU court opposes cookies by default - there should be no pre-checked checkboxes








