Julia. Where to start the project? ...







Data analysis and data preparation tasks are often solved with one-off scripts that nobody intends to maintain or develop further. This approach has a right to exist, especially among students. However, once more than one person works with the code, or the code has to be maintained for more than a single working day, organizing the work as a heap of files is no longer acceptable.

Therefore, today we'll talk about an important topic: creating a project from scratch in Julia, what to fill it with, and which tools exist to support development.







Project



As already mentioned, one-off scripts and Jupyter notebooks have a right to exist on one person's desktop, especially when the programming language is used as an advanced calculator. But this approach is completely unsuitable for projects that have to be developed and operated for years. Julia, as a technology platform, naturally has tools that give developers this capability.







For starters, a few general points. Julia has a Pkg module for package management. Any Julia library is a module. If a module is not part of Julia's standard distribution, it is shipped as a separate package. Each package has a project file, Project.toml, which contains a description of the project and its dependencies on other packages. There is a second file, Manifest.toml, which, unlike Project.toml, is generated automatically and lists all required dependencies with the version numbers of the packages. TOML, the file format, stands for Tom's Obvious, Minimal Language.







Package Naming Rules



According to the documentation, a package name may consist of Latin letters and numbers. The name should be chosen so that it is clear to most Julia users, not only to experts in a narrow subject area.









At the same time, the name of the git repository usually has the suffix ".jl".







Package generation



The easiest way to create a package is to use the generator built into Julia. In the console, go to the directory in which the package should be created, start julia, and switch it to package management mode:







 julia> ]
      
      





The final step is to start the package generator by specifying the name that we want to give the package.







 (v1.2) pkg> generate HelloWorld
      
      





As a result, a new directory named after the package appears in the current directory. Its contents can be inspected with the tree command (if installed):







    shell> cd HelloWorld

    shell> tree .
    .
    ├── Project.toml
    └── src
        └── HelloWorld.jl

    1 directory, 2 files
      
      





This is a minimal set of files, but not a sufficient one for a well-organized project. For more details see https://julialang.github.io/Pkg.jl/v1/creating-packages/ .
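The same generator can also be called from code rather than from the package prompt. The following is a minimal sketch, assuming that Pkg.generate accepts a target path; the temporary directory is only there so that the example leaves nothing behind in the working directory.

```julia
using Pkg

# Scratch location for the demo; a real project would use its own directory.
workdir = mktempdir()
pkgpath = joinpath(workdir, "HelloWorld")

# Same effect as `pkg> generate HelloWorld` run inside `workdir`.
Pkg.generate(pkgpath)

# The generator creates exactly two files: Project.toml and src/HelloWorld.jl.
project_exists = isfile(joinpath(pkgpath, "Project.toml"))
source_exists = isfile(joinpath(pkgpath, "src", "HelloWorld.jl"))
println((project_exists, source_exists))
```

Calling the generator from code is handy in setup scripts, where switching the REPL into package mode is not an option.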







An alternative way to create packages is the PkgTemplates.jl generator. Unlike the built-in generator, it can immediately generate a complete set of auxiliary files for maintaining the package. Its only drawback is that it has to be installed as a package itself.







The procedure for creating a package with it is as follows. First, load the package:







 julia> using PkgTemplates
      
      





Then create a template that includes the list of authors, the license, the required Julia version, and a list of plugins for continuous integration systems (an example from the PkgTemplates documentation):







    julia> t = Template(;
               user="myusername",        # GitHub user name
               license="ISC",            # license type
               authors=["Chris de Graaf", "Invenia Technical Computing Corporation"],
               dir="~/code",             # directory in which the package is created
               julia_version=v"1.0",     # minimum Julia version
               plugins=[                 # plugins for development tools
                   TravisCI(),
                   Codecov(),
                   Coveralls(),
                   AppVeyor(),
                   GitHubPages(),
                   CirrusCI(),
               ],
           )
      
      





We get the template:







    Template:
      → User: myusername
      → Host: github.com
      → License: ISC (Chris de Graaf, Invenia Technical Computing Corporation 2018)
      → Package directory: ~/code
      → Minimum Julia version: v0.7
      → SSH remote: No
      → Commit Manifest.toml: No
      → Plugins:
        • AppVeyor:
          → Config file: Default
          → 0 gitignore entries
        • Codecov:
          → Config file: None
          → 3 gitignore entries: "*.jl.cov", "*.jl.*.cov", "*.jl.mem"
        • Coveralls:
          → Config file: None
          → 3 gitignore entries: "*.jl.cov", "*.jl.*.cov", "*.jl.mem"
        • GitHubPages:
          → 0 asset files
          → 2 gitignore entries: "/docs/build/", "/docs/site/"
        • TravisCI:
          → Config file: Default
          → 0 gitignore entries
      
      





Now, using this template, we can create packages by simply specifying their name:







 julia> generate(t, "MyPkg1")
      
      





In a minimal version, the template may look like this:







    julia> t = Template(; user="rssdev10", authors=["rssdev10"])
    Template:
      → User: rssdev10
      → Host: github.com
      → License: MIT (rssdev10 2019)
      → Package directory: ~/.julia/dev
      → Minimum Julia version: v1.0
      → SSH remote: No
      → Add packages to main environment: Yes
      → Commit Manifest.toml: No
      → Plugins: None
      
      





If we create a package named MyPkg2 from this template:







 julia> generate(t, "MyPkg2")
      
      





Then we can check the result directly from Julia:







    julia> run(`git -C $(joinpath(t.dir, "MyPkg2")) ls-files`);
    .appveyor.yml
    .gitignore
    .travis.yml
    LICENSE
    Project.toml
    README.md
    REQUIRE
    docs/Manifest.toml
    docs/Project.toml
    docs/make.jl
    docs/src/index.md
    src/MyPkg2.jl
    test/runtests.jl
      
      





The following fields should be noted:









After the project is created, a sufficient set of files has been generated and a git repository has been initialized. All generated files are added to this repository automatically.







Typical file location in a project



We'll borrow a picture with a typical arrangement of files and their contents from https://en.wikibooks.org/wiki/Introducing_Julia/Modules_and_packages , but expand it a bit:







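    Calculus.jl/                          # main directory of the Calculus package
      deps/                               # files needed to build the package
      docs/                               # files for building the documentation
      src/                                # source code
        Calculus.jl                       # main module file; capitalized by convention
          module Calculus                 # the module name is declared inside this file
            import Base.ctranspose        # other packages can be imported,
            export derivative, check_gradient,   # and the package's own functions exported
            ...
            include("derivative.jl")      # contents of other files are included in the module
            include("check_derivative.jl")
            include("integrate.jl")
            import .Derivative
          end                             # end of the Calculus.jl file
        derivative.jl                     # code for working with derivatives,
          module Derivative               #   included by Calculus.jl
            export derivative
            function derivative()
              ...
            end
            …
          end
        check_derivative.jl               # code for checking derivatives,
          function check_derivative(f::...)   #   pulled in by the line
            ...                           #   "include("check_derivative.jl")" in Calculus.jl
          end
          …
        integrate.jl                      # code for integration,
          function adaptive_simpsons_inner(f::Funct   #   included by Calculus.jl
            ...
          end
          ...
        symbolic.jl                       # a stand-alone file that Calculus.jl does not include
          export processExpr, BasicVariable, ...   # these names are defined and usable
          import Base.show, ...           #   within this file
          type BasicVariable <: AbstractVariable   # ...
            ...
          end
          function process(x::Expr)
            ...
          end
          ...
      test/                               # tests for the Calculus module
        runtests.jl                       # the file that runs the tests
          using Calculus                  # the tests use the Calculus module...
          using Test                      # ...and the Test module...
          tests = ["finite_difference", ...   # test file names are stored as strings...
          for t in tests
            include("$(t).jl")            # ...so that they can be included in a loop
          end
          ...
        finite_difference.jl              # tests for some of the functions;
          @test ...                       # ...it is included by runtests.jl
          ...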
      
      





We add that the deps directory may contain files needed to build the package correctly. For example, deps/build.jl is a script that runs automatically when the package is installed. The script can contain any code for preparing data (downloading a data set or preprocessing it) or for setting up other programs needed for work.







Note that a project can have only one main module; in the example above it is Calculus. The same example, however, contains a nested module, Derivative, which is brought in via include. Pay attention to this: include inserts the file as text, not as a module, which is what happens with using or import. The latter two are statements that not only bring the module into scope but also make Julia load it as a separate entity. In addition, Julia will try to find this module among the dependency packages and will warn that it is missing from Project.toml. Therefore, if the task is to provide hierarchical access to functions, separating them by namespace, then files are included with include and the module is activated with a leading dot, which marks it as local. That is:







    module Calculus
      include("derivative.jl")
      import .Derivative
      ...
    end
      
      





The derivative function exported from the Derivative module is then available as Calculus.Derivative.derivative().
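The nested-module arrangement can be tried out in a single file. In the sketch below the body of the Derivative module is written inline instead of being pulled in with include("derivative.jl"), which is equivalent because include inserts the file as text. The finite-difference body of derivative is an illustrative stand-in, not the code of the real Calculus package.

```julia
module Calculus

# In a real package this block would live in src/derivative.jl and be
# inserted here with include("derivative.jl").
module Derivative
export derivative
# Central finite difference; a stand-in implementation for illustration.
derivative(f, x; h=1e-6) = (f(x + h) - f(x - h)) / (2h)
end

# The leading dot marks Derivative as a module local to Calculus,
# so Julia does not look for it among the package dependencies.
import .Derivative

end # module Calculus

# Access goes through the full namespace path.
slope = Calculus.Derivative.derivative(x -> x^2, 3.0)
println(slope)
```

Without the leading dot, import Derivative would be treated as a request for a separate package of that name.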









Project.toml Project File



The project file is a text file. Its main sections are described at https://julialang.github.io/Pkg.jl/v1/toml-files/ .







After the file is generated, all required fields are already present in it. However, you may need to change parts of the description, the set of packages and their versions, or dependencies specific to particular operating systems or configurations.







The main fields are:







    name = "Example"
    uuid = "7876af07-990d-54b4-ab0e-23690620f79a"
    version = "1.2.5"
      
      





name is the package name, chosen according to the naming rules. uuid is a universally unique identifier; it can be produced by the package generator or by any other UUID generator. version is the package version number: three decimal numbers separated by periods, following Semantic Versioning 2.0.0. Before version 1.0.0 is declared, any changes to the programming interface are allowed. After that release, the package owner must follow the compatibility rules: compatible changes are reflected in the minor number (to the right of the major one), and incompatible changes must be accompanied by a change of the major number. Naturally, nothing enforces the versioning rule automatically, but ignoring it simply leads users to abandon the package en masse and migrate to one whose authors do follow the rule.
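The three main fields map directly onto Julia types, which makes them easy to inspect. The sketch below assumes Julia 1.6 or newer, where TOML is available as a standard library; the project text is the example shown above, embedded as a string rather than read from disk.

```julia
using TOML, UUIDs

# The main fields of a Project.toml, embedded as text for the demo.
project = TOML.parse("""
name = "Example"
uuid = "7876af07-990d-54b4-ab0e-23690620f79a"
version = "1.2.5"
""")

id = UUID(project["uuid"])                 # parsed identifier
ver = VersionNumber(project["version"])    # parsed semantic version
components = (Int(ver.major), Int(ver.minor), Int(ver.patch))

# A compatible change keeps the major number; a breaking change bumps it.
compatible = v"1.3.0".major == ver.major
breaking = v"2.0.0".major != ver.major

# A fresh random identifier, as any UUID generator would produce.
fresh_id = uuid4()
println(components)
```

VersionNumber values compare according to semantic versioning, so v"1.2.5" < v"1.3.0" < v"2.0.0" holds without any string handling.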







All package dependencies are listed in the [deps] section.







    [deps]
    Example = "7876af07-990d-54b4-ab0e-23690620f79a"
    Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
      
      





This section lists the direct dependencies of our package. Transitive dependencies are recorded in the Manifest.toml file, which is generated automatically in the project directory. Every dependency is written as a name = uuid pair. Usually this part is not edited by hand; the functions of the Pkg package exist for that. Most often this is done from the REPL by switching it to package management mode with ] and then using the add, rm, st and similar operations, always in the context of the current package. If the current package is not active, run activate . first.
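The package-mode commands have counterparts in the Pkg API, which is useful in scripts. The sketch below activates a scratch directory instead of a real package so that it has no lasting effect; Pkg.add and Pkg.rm are mentioned only in a comment because they need access to a package registry.

```julia
using Pkg

# Scratch directory standing in for a package directory.
envdir = mktempdir()

Pkg.activate(envdir)             # same as `pkg> activate <dir>`
active = Base.active_project()   # path of the Project.toml now in effect
Pkg.status()                     # same as `pkg> st`; the new environment is empty

# Pkg.add("Example") and Pkg.rm("Example") would mirror `add` and `rm`.

Pkg.activate()                   # return to the default environment
println(active)
```

Base.active_project() is a quick way to confirm which Project.toml the following operations will modify.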







Manifest.toml can be committed to the git version control system. This two-file approach makes it possible to pin the packages of the dependency tree while the product is being tested, which guarantees that when our package is deployed in a new location, the same versions of third-party packages are installed there. Conversely, when Manifest.toml is absent, any available versions that satisfy the basic constraints may be used.







The [compat] section lets you specify which versions of packages we require.







    [deps]
    Example = "7876af07-990d-54b4-ab0e-23690620f79a"

    [compat]
    Example = "1.2"
    julia = "1.1"
      
      





Packages are identified by the names used earlier in the [deps] section. julia refers to the version of Julia itself.







When specifying versions, the rules listed at https://julialang.github.io/Pkg.jl/dev/compatibility/ apply. These rules follow Semantic Versioning.







There are several versioning rules. For example:







    [compat]
    Example = "1.2, 2"
      
      





means that any version in the range [1.2.0, 3.0.0) is suitable, not including 3.0.0. Each comma-separated entry follows the same simpler rule that applies to a single version number:







    [compat]
    Example = "1.2"
      
      





Moreover, a bare version number is shorthand for the caret form, "^1.2". Examples of its use:







    [compat]
    PkgA = "^1.2.3" # [1.2.3, 2.0.0)
    PkgB = "^1.2"   # [1.2.0, 2.0.0)
    PkgC = "^1"     # [1.0.0, 2.0.0)
    PkgD = "^0.2.3" # [0.2.3, 0.3.0)
    PkgE = "^0.0.3" # [0.0.3, 0.0.4)
    PkgF = "^0.0"   # [0.0.0, 0.1.0)
    PkgG = "^0"     # [0.0.0, 1.0.0)
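The caret table can be reproduced with a few lines of code, which may help to see the rule behind it: the leftmost non-zero component must not change, and if every given component is zero, the last given one is the limit. The helper caret_range below is hypothetical and written only for illustration; it is not part of Pkg and is not how Pkg parses specifiers.

```julia
# Hypothetical helper (not part of Pkg): the half-open range [lower, upper)
# that a caret specifier such as "^1.2.3" stands for.
function caret_range(spec::AbstractString)
    parts = parse.(Int, split(lstrip(spec, '^'), '.'))
    lower = VersionNumber(parts...)   # omitted components default to 0
    # Position of the component that has to stay fixed.
    i = something(findfirst(!iszero, parts), length(parts))
    # The upper bound bumps that component and drops everything after it.
    upper = VersionNumber([parts[1:i-1]; parts[i] + 1]...)
    return lower, upper
end

# True when version v satisfies the caret specifier.
function satisfies(v::VersionNumber, spec::AbstractString)
    lower, upper = caret_range(spec)
    return lower <= v < upper
end

for spec in ["^1.2.3", "^1.2", "^1", "^0.2.3", "^0.0.3", "^0.0", "^0"]
    lower, upper = caret_range(spec)
    println(rpad(spec, 8), " [", lower, ", ", upper, ")")
end
```

The special treatment of leading zeros reflects the convention that before 1.0.0 a minor bump may already be a breaking change.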
      
      





If stricter restrictions are needed, use the tilde form.







    [compat]
    PkgA = "~1.2.3" # [1.2.3, 1.3.0)
    PkgB = "~1.2"   # [1.2.0, 1.3.0)
    PkgC = "~1"     # [1.0.0, 2.0.0)
    PkgD = "~0.2.3" # [0.2.3, 0.3.0)
    PkgE = "~0.0.3" # [0.0.3, 0.0.4)
    PkgF = "~0.0"   # [0.0.0, 0.1.0)
    PkgG = "~0"     # [0.0.0, 1.0.0)
      
      





And, of course, equality and inequality specifiers are available:







    [compat]
    PkgA = ">= 1.2.3" # [1.2.3, ∞)
    PkgB = "≥ 1.2.3"  # [1.2.3, ∞)
    PkgC = "= 1.2.3"  # [1.2.3, 1.2.3]
    PkgD = "< 1.2.3"  # [0.0.0, 1.2.2]
      
      





The [targets] section makes it possible to specify several dependency configurations. Traditionally, before Julia 1.2, it was used to separate the dependencies needed to use the package from those needed to run its tests. For this, the additional packages were listed in the [extras] section, and [targets] listed the target configurations with the package names.







    [extras]
    Markdown = "d6f4376e-aef5-505a-96c1-9c027394607a"
    Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

    [targets]
    test = ["Markdown", "Test"]
      
      





Starting with Julia 1.2, it is recommended to simply add a separate project file for the tests, test/Project.toml.







Additional Dependencies



Additional dependencies can be fetched through deps/build.jl; however, the Julia project structure also provides the Artifacts.toml file. The Pkg.Artifacts module provides functions for automating the loading of additional dependencies. An example of such a file:







    # Example Artifacts.toml file
    [socrates]
    git-tree-sha1 = "43563e7631a7eafae1f9f8d9d332e3de44ad7239"
    lazy = true

        [[socrates.download]]
        url = "https://github.com/staticfloat/small_bin/raw/master/socrates.tar.gz"
        sha256 = "e65d2f13f2085f2c279830e863292312a72930fee5ba3c792b14c33ce5c5cc58"

        [[socrates.download]]
        url = "https://github.com/staticfloat/small_bin/raw/master/socrates.tar.bz2"
        sha256 = "13fc17b97be41763b02cbb80e9d048302cec3bd3d446c2ed6e8210bddcd3ac76"

    [[c_simple]]
    arch = "x86_64"
    git-tree-sha1 = "4bdf4556050cb55b67b211d4e78009aaec378cbc"
    libc = "musl"
    os = "linux"

        [[c_simple.download]]
        sha256 = "411d6befd49942826ea1e59041bddf7dbb72fb871bb03165bf4e164b13ab5130"
        url = "https://github.com/JuliaBinaryWrappers/c_simple_jll.jl/releases/download/c_simple+v1.2.3+0/c_simple.v1.2.3.x86_64-linux-musl.tar.gz"

    [[c_simple]]
    arch = "x86_64"
    git-tree-sha1 = "51264dbc770cd38aeb15f93536c29dc38c727e4c"
    os = "macos"

        [[c_simple.download]]
        sha256 = "6c17d9e1dc95ba86ec7462637824afe7a25b8509cc51453f0eb86eda03ed4dc3"
        url = "https://github.com/JuliaBinaryWrappers/c_simple_jll.jl/releases/download/c_simple+v1.2.3+0/c_simple.v1.2.3.x86_64-apple-darwin14.tar.gz"

    [processed_output]
    git-tree-sha1 = "1c223e66f1a8e0fae1f9fcb9d3f2e3ce48a82200"
      
      





We will not go into more detail, since the rest depends on the specific use case. The available library functions include artifact_hash, download, create_artifact, and bind_artifact!. See the documentation at https://julialang.github.io/Pkg.jl/dev/artifacts/ for more details.
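To give an impression of how these functions fit together, here is a minimal sketch that creates a tiny local artifact, binds it to a name, and looks it up again. It assumes Julia 1.3 or newer, where Pkg.Artifacts is available. The artifact name "greeting", its file, and the temporary location of Artifacts.toml are made up for the example; note that create_artifact stores the result in the artifacts directory of the Julia depot.

```julia
using Pkg.Artifacts

# Create a small artifact; its content determines the resulting tree hash.
tree_hash = create_artifact() do dir
    write(joinpath(dir, "greeting.txt"), "hello")
end

# Record the artifact under a name in an Artifacts.toml file.
# A package would keep this file next to its Project.toml.
artifacts_toml = joinpath(mktempdir(), "Artifacts.toml")
bind_artifact!(artifacts_toml, "greeting", tree_hash)

# Later: look the artifact up by name and read a file from it.
found = artifact_hash("greeting", artifacts_toml)
text = read(joinpath(artifact_path(found), "greeting.txt"), String)
println(text)
```

For artifacts that are downloaded rather than created locally, the download entries shown in the example file above supply the URLs and checksums.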







Main code implementation and debugging



Of course, we specify the development directory, explicitly or implicitly, when creating the package, and it can be changed if necessary. If the package was generated by PkgTemplates with default parameters, look for it in ~/.julia/dev. Although the directory is hidden, a file manager can reach it by a direct path; in Finder on macOS, for example, press Command + Shift + G. If the package was created in any other directory, just open it in a text editor. The best editor for working with Julia code is Atom with everything the uber-juno plugin supports. With it you get a text editor with automatic code formatting, a REPL console for interactive execution, the ability to run only selected code fragments and view the results, including plots, and a step-by-step debugger. Admittedly, the debugger is quite slow at the moment, so the practical debugging workflow is: first decide what to check and add debug output, then run the test.







It is worth looking at common design patterns for dynamic programming languages, as well as the book "Hands-On Design Patterns with Julia 1.0" by Tom Kwong and its sample code. When implementing programs, follow the recommendations of the Julia Style Guide.







Among the finer points of debugging, the Revise.jl package deserves a mention. Its activation can be configured in the file .julia/config/startup.jl for interactive mode only, which is the mode in which the REPL is started from the Atom editor. Revise lets you edit the code of functions inside our package without restarting the REPL session; each run of tests that load the package with using or import picks up these updates.
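A startup file along these lines could look as follows. This is a minimal sketch, assuming Revise is installed in the default environment; the try block keeps the REPL usable when it is not, and the isinteractive() check restricts loading to interactive sessions.

```julia
# Possible contents of ~/.julia/config/startup.jl

if isinteractive()
    try
        using Revise   # track source changes for the rest of the session
    catch err
        # Revise is missing or failed to load; continue without it.
        @warn "Revise is not available" exception = err
    end
end
```

In scripts and test runs started from the command line the block is skipped, so it adds no start-up cost there.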







For effective development, it is recommended to develop the main code and the tests for it in parallel. This lets you implement only what is really needed, because the tests make unneeded functions obvious, and those should be removed. Julia does not prescribe any particular development methodology. The emphasis on development through unit testing is made here because Julia compiles code rather slowly and performance drops considerably in step-by-step debugging mode. In other words, how quickly the package is debugged and verified depends on how the tests are written and organized.







Tests



Tests are typically located in the test directory. The file test/runtests.jl is the entry point for all tests.







For the example mentioned above, the file typically looks like this:







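    using Calculus                          # the tests use the Calculus module...
    using Test                              # ...and the Test module...
    tests = ["finite_difference", "..."]    # test file names are stored as strings...
    for t in tests
      include("$(t).jl")                    # ...so that they can be included in a loop
    end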
      
      





It is recommended to organize the test files according to the groups of functions under test. For example, the Calculus module mentioned above may contain various algorithms for computing derivatives, integrals, and so on. It is logical to test them with separate tests located in different files.







For unit testing, Julia provides the Test module from the standard library. This module defines the @test macro, which verifies that the given expression is true. Examples:







    julia> @test true
    Test Passed

    julia> @test [1, 2] + [2, 1] == [3, 3]
    Test Passed

    julia> @test π ≈ 3.14 atol=0.01
    Test Passed
      
      





Note the full form of the approximate comparison operator ≈ (isapprox), here with an explicit absolute tolerance atol.
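The difference between the operator with and without an explicit tolerance is easy to miss, so here is a short sketch that can be run as is. It uses only the Test standard library and the isapprox function from Base.

```julia
using Test

@testset "approximate comparison" begin
    # Function form of the check `π ≈ 3.14 atol=0.01`.
    @test isapprox(π, 3.14; atol=0.01)
    # Without atol the default tolerance is relative and much tighter.
    @test !(π ≈ 3.14)
    # The operator absorbs floating-point rounding...
    @test 0.1 + 0.2 ≈ 0.3
    # ...which exact comparison does not.
    @test 0.1 + 0.2 != 0.3
end
```

When comparing against zero, an absolute tolerance is required, since a relative tolerance of zero is zero.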







The assertion that checks that an exception is thrown is @test_throws. As an example, create an array and access an index beyond its end:







    julia> @test_throws BoundsError [1, 2, 3][4]
    Test Passed
          Thrown: BoundsError
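The same macro works for exceptions thrown by your own code. In the sketch below, safe_sqrt is a hypothetical function invented for the example; it rejects negative input explicitly so that the test has a specific exception type to check for.

```julia
using Test

# Hypothetical function under test: rejects negative input explicitly.
function safe_sqrt(x::Real)
    x < 0 && throw(DomainError(x, "input must be non-negative"))
    return sqrt(x)
end

@testset "error handling" begin
    @test safe_sqrt(4.0) == 2.0
    @test_throws DomainError safe_sqrt(-1.0)   # our own check fires
    @test_throws BoundsError [1, 2, 3][4]      # built-in bounds check
end
```

Naming the concrete exception type keeps the test from passing on an unrelated error.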
      
      





A useful construct is @testset. It allows you to group individual assertions into a logically connected test. For example:







    julia> @testset "trigonometric identities" begin
               θ = 2/3*π
               @test sin(-θ) ≈ -sin(θ)
               @test cos(-θ) ≈ cos(θ)
               @test sin(2θ) ≈ 2*sin(θ)*cos(θ)
               @test cos(2θ) ≈ cos(θ)^2 - sin(θ)^2
           end;
    Test Summary:            | Pass  Total
    trigonometric identities |    4      4
      
      





For each set declared with @testset, a separate table of test results is produced. Test sets can be nested. If they all pass, a summary table is printed; on failure, separate statistics are printed for each group of tests.







    julia> @testset "Foo Tests" begin
               @testset "Animals" begin
                   @testset "Felines" begin
                       @test foo("cat") == 9
                   end
                   @testset "Canines" begin
                       @test foo("dog") == 9
                   end
               end
               @testset "Arrays" begin
                   @test foo(zeros(2)) == 4
                   @test foo(fill(1.0, 4)) == 15
               end
           end

    Arrays: Test Failed
      Expression: foo(fill(1.0, 4)) == 15
       Evaluated: 16 == 15
    [...]
    Test Summary: | Pass  Fail  Total
    Foo Tests     |    3     1      4
      Animals     |    2            2
      Arrays      |    1     1      2
    ERROR: Some tests did not pass: 3 passed, 1 failed, 0 errored, 0 broken.
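The transcript above does not show how foo is defined. A runnable variant is sketched below under the assumption that foo(x) = length(x)^2, which is consistent with every value in the transcript; the last check uses 16 so that the whole suite passes.

```julia
using Test

# Assumed definition; it matches the values in the transcript above.
foo(x) = length(x)^2

@testset "Foo Tests" begin
    @testset "Animals" begin
        @testset "Felines" begin
            @test foo("cat") == 9
        end
        @testset "Canines" begin
            @test foo("dog") == 9
        end
    end
    @testset "Arrays" begin
        @test foo(zeros(2)) == 4
        @test foo(fill(1.0, 4)) == 16   # 4^2; the value 15 above fails on purpose
    end
end
```

With all assertions passing, only the summary line of the outermost set is printed.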
      
      





For tests that are known to fail or that should not be run, there are the @test_broken and @test_skip macros.









Code coverage and memory allocation can be measured with options of the julia executable:







    --code-coverage={none|user|all}, --code-coverage
                              Count executions of source lines (omitting setting is equivalent to "user")
    --code-coverage=tracefile.info
                              Append coverage information to the LCOV tracefile (filename supports format tokens).
    --track-allocation={none|user|all}, --track-allocation
                              Count bytes allocated by each source line (omitting setting is equivalent to "user")
      
      





code-coverage counts how many times each source line is executed. The results are saved in files with the .cov extension. For example:







            - function vectorize(str::String)
           96     tokens = str |> tokenizer |> wordpiece
           48     text = ["[CLS]"; tokens; "[SEP]"]
           48     token_indices = vocab(text)
           48     segment_indices = [fill(1, length(tokens) + 2);]
           48     sample = (tok = token_indices, segment = segment_indices)
           48     bert_embedding = sample |> bert_model.embed
           48     collect(sum(bert_embedding, dims=2)[:])
            - end
      
      





track-allocation counts the bytes allocated by each source line. The results are saved in files with the .mem extension. For example:







            - function vectorize(str::String)
            0     tokens = str |> tokenizer |> wordpiece
      6766790     text = ["[CLS]"; tokens; "[SEP]"]
            0     token_indices = vocab(text)
        11392     segment_indices = [fill(1, length(tokens) + 2);]
         1536     sample = (tok = token_indices, segment = segment_indices)
            0     bert_embedding = sample |> bert_model.embed
       170496     collect(sum(bert_embedding, dims=2)[:])
            - end
      
      





To collect both kinds of data, the tests can be run like this:







 julia --project=@. --code-coverage --track-allocation test/runtests.jl
      
      





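Another profiling tool is the Profile.jl module with its @profile macro. See https://julialang.org/blog/2019/09/profilers for details. The @noinline annotation keeps the compiler from inlining the functions, so that they stay visible in the profile. In the example, the functions fib and fib_r call each other.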







    julia> @noinline function fib(n)
               return n > 1 ? fib_r(n - 1) + fib_r(n - 2) : 1
           end

    julia> @noinline fib_r(n) = fib(n)

    julia> @time fib(40)
      0.738735 seconds (3.16 k allocations: 176.626 KiB)
    165580141

    julia> using Profile

    julia> @profile fib(40)
    165580141

    julia> Profile.print(format=:flat, sortedby=:count)
     Count File      Line Function
        12 int.jl      52 -
        14 int.jl      53 +
       212 boot.jl    330 eval
      5717 REPL[2]      1 fib_r
      6028 REPL[1]      2 fib

    julia> count(==(0), Profile.fetch())
    585
      
      





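@profile fib(40) runs the call under the profiler. Profile.print(format=:flat, sortedby=:count) prints the collected samples as a flat list sorted by count. Most of the samples fall in fib_r and fib. The same data can also be printed as a tree: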







    julia> Profile.print(format=:tree)
    260 REPL[1]:2; fib(::Int64)
    112 REPL[1]:1; fib_r(::Int64)
    212 task.jl:333; REPL.var"##26#27"
     212 REPL.jl:118; macro expansion
      212 REPL.jl:86; eval_user_input
       212 boot.jl:330; eval
        ╎ 210 REPL[1]:2; fib
        ╎  210 REPL[1]:1; fib_r
        ╎   210 REPL[1]:2; fib
        ╎    210 REPL[1]:1; fib_r
        ╎     210 REPL[1]:2; fib
        ╎     ╎ 210 REPL[1]:1; fib_r
        ╎     ╎  210 REPL[1]:2; fib
        ╎     ╎   210 REPL[1]:1; fib_r
        ╎     ╎    210 REPL[1]:2; fib
        ╎     ╎     210 REPL[1]:1; fib_r
        ╎     ╎     ╎ 210 REPL[1]:2; fib
        ╎     ╎     ╎  210 REPL[1]:1; fib_r
        ╎     ╎     ╎   210 REPL[1]:2; fib
        ╎     ╎     ╎    210 REPL[1]:1; fib_r
        ╎     ╎     ╎     210 REPL[1]:2; fib
        ...
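The session above can be condensed into a script. The sketch below uses a smaller argument so that it finishes quickly, runs the function once beforehand so that compilation is not profiled, and clears old samples before measuring. The number of samples collected depends on the machine, so the script prints it without assuming a particular value.

```julia
using Profile

# Mutually recursive functions, kept out of line so both appear in the profile.
@noinline function fib(n)
    return n > 1 ? fib_r(n - 1) + fib_r(n - 2) : 1
end
@noinline fib_r(n) = fib(n)

fib(10)            # warm-up call: compile before profiling
Profile.clear()    # drop samples from earlier runs

result = @profile fib(30)

# Raw sample buffer; its length varies from run to run.
samples = Profile.fetch()
println("result = ", result, ", buffer length = ", length(samples))

# Flat report sorted by sample count, as in the session above.
Profile.print(format=:flat, sortedby=:count)
```

For very short runs the buffer may be empty; a larger argument or a repeated call yields more samples.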
      
      





The profile can also be visualized with PProf.jl. See https://github.com/vchuravy/PProf.jl for details.







Documentation



Documentation is placed in the doc directory. See https://habr.com/ru/post/439442/ for more on documentation in Julia.
















One way to deploy the project is git clone; another is PackageCompiler.jl.







Building C libraries and similar steps are handled in the deps directory, by deps/build.jl. It is also convenient to have a build.jl script with the following content:







    #!/usr/bin/env julia --project=@.
    using Pkg
    Pkg.activate(".")
    Pkg.build() # Pkg.build(; verbose = true) for Julia 1.1 and up
    Pkg.test() # (coverage=true)
      
      





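The line julia --project=@. tells Julia to use the Project.toml of the current directory. For the script to be started directly, build.jl must be made executable. Otherwise, run julia --project=@. build.jl.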







Pkg.activate(".") activates the project in the current directory (the one containing Project.toml).







Pkg.build() builds the dependencies, including C libraries, by running deps/build.jl where it exists.







Pkg.test() runs the tests. The coverage=true option enables the collection of code coverage data.







Configuration for continuous integration can be generated by PkgTemplates. The services in question include Gitlab CI, Travis CI, and GitHub.













