Introduction to Julia
What is Julia?
“Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments.”
It provides a sophisticated compiler and an extensive mathematical function library.
Like R, it has a wide range of statistical packages, and like Python, it is easy to learn and write.
Benchmark times of Julia relative to C (smaller is better; C performance = 1.0):
[Image source: http://julialang.org/benchmarks/]
Growing importance of Julia:
The ambition of the Julia project team is to create an open source language that is general purpose and excels at numerical computing and data science. These goals excite researchers and analysts because many have firsthand experience with the difficulty of writing high-performance code.
Julia also offers a distinctive feature set:
- Syntax similar to MATLAB.
- Free and open source.
- Sophisticated compiler with just-in-time (JIT) compilation.
- Designed for parallelism and distributed computation.
- C functions can be called directly, with no special APIs needed.
- A powerful mathematical function library written in Julia.
- The PyCall package can call Python functions from Julia.
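As a small illustration of calling C directly, here is a minimal sketch that invokes the C standard library's strlen through Julia's built-in ccall. The helper name c_strlen is made up for this example, and the sketch assumes a platform (typically Linux or macOS) where strlen is visible in the running process.

```julia
# Hypothetical helper: forwards a Julia string to C's strlen via ccall.
# Arguments to ccall: function name, return type, tuple of argument types, values.
c_strlen(s::String) = ccall(:strlen, Csize_t, (Cstring,), s)

n = c_strlen("hello")   # length in bytes, returned as an unsigned integer
println(n)
```

No wrapper code or build step is involved; the type annotations in the ccall tell Julia how to convert the arguments and the return value.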
Setting up the Julia environment:
- Install the current release from http://julialang.org/downloads/
- Add the julia/bin directory to your PATH in .bashrc.
- Open your terminal and type julia.
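Once the prompt appears, a quick check like the following can confirm which installation is running. This is only a sketch and assumes a Julia 1.x release, where VERSION and Sys.BINDIR are available.

```julia
# Run at the julia> prompt after installation.
println("Julia version: ", VERSION)             # version of the running Julia
println("Executable directory: ", Sys.BINDIR)   # directory holding the julia binary
```

The directory printed by Sys.BINDIR should match the julia/bin entry added to your PATH.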
Installing packages in Julia:
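A minimal sketch of installing a package, assuming a Julia 1.x release where the package manager is loaded with using Pkg (older releases exposed Pkg without that line). DataFrames is used here only because it is the package loaded in the next step; network access is required.

```julia
using Pkg                # package manager from the standard library (Julia 1.x)

Pkg.add("DataFrames")    # download and install the package and its dependencies
```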
Using a package in Julia:
julia> using DataFrames
Updating your installed packages:
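A minimal sketch of the update step, again assuming a Julia 1.x release and network access:

```julia
using Pkg       # package manager from the standard library (Julia 1.x)

Pkg.update()    # upgrade installed packages to the newest compatible versions
Pkg.status()    # list installed packages and their versions
```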
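Introductory example of Julia using PyPlot:
Plotting a process
using PyPlot
n = 50                        # number of samples (named n to avoid shadowing Base.length)
epsilon_values = randn(n)     # draw n standard normal values
plot(epsilon_values, "b-")    # plot them as a solid blue line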
Statistical packages in Julia:
Julia provides easy-to-use, open source tools for statistics and machine learning.
Some of the notable packages available for Julia are:
- StatsBase : Basic functionality for statistics
- DataArrays : Arrays that allow missing data
- MultivariateStats : Multivariate statistical analysis
- HypothesisTests : Parametric and nonparametric tests
- MLBase : Swiss Army knife for machine learning
- Clustering : Algorithms for data clustering
- NMF : Nonnegative matrix factorization
- TimeSeries : Time series analysis
- MCMC : Markov Chain Monte Carlo
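To give a feel for statistical code in Julia without installing anything, here is a minimal sketch of summary statistics. It uses the Statistics standard library (an assumption of this example, not one of the packages listed above), and the sample data is made up.

```julia
using Statistics   # standard library; not one of the packages listed above

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # made-up sample data

m   = mean(x)                   # arithmetic mean
s   = std(x; corrected=false)   # population standard deviation
med = median(x)                 # middle value of the sorted data

println("mean = ", m, ", std = ", s, ", median = ", med)
```

The packages in the list build on the same array-based style and add more specialized tools.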