Showing posts with label Big Data Analytics. Show all posts

Thursday, March 19, 2015

Introduction to Julia

"Julia is a fresh approach to technical computing."  boasts the startup message, flourished with colorful circles hovering above a bubbly ASCII Julia logo.  The formatting effort is not wasted, it's an exuberant promise: Julia will make the command line fun again.
apptrain_1@julia:~/workspace $ julia
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation:
   _ _   _| |_  __ _   |  Type "help()" to list help topics
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.2.1 (2014-02-11 06:30 UTC)
 _/ |\__'_|_|_|\__'_|  |  
|__/                   |  x86_64-linux-gnu


Julia was created by four data scientists from MIT who began working on it in 2009 and released it publicly in 2012. The language is beginning to mature at a time when the Data Scientist job title is popping up on resumes as fast as Data Scientist jobs appear. The timing is excellent. R, an offshoot of the S language, is the language of choice for today's mathematical programmer, but it feels clunky, like a car from the last century. Julia may not unseat R in the world of data analysis, but its creators' plans don't stop there.

If you want to code along with the examples in this article, jump to Getting Started with Julia and choose one of the three options to start coding.

Julia is a general-purpose programming language. Its creators have noble goals. They want a language that is fast like C, flexible with cool metaprogramming capabilities like Ruby, capable of parallel and distributed computing like Scala, and able to express true mathematical equations like MATLAB.

Why program in Julia?

1) Julia is Fast

Julia already boasts faster matrix multiplication and sorting than Go and Java. It is built on LLVM, the same compiler infrastructure that powers Clang. Julia uses just-in-time (JIT) compilation to machine code and often achieves C-like performance numbers.
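
A rough way to see JIT compilation at work is to time the same call twice. The sketch below is an illustration, not a benchmark: sumsq is a hypothetical function name, and the code assumes a current Julia 1.x release rather than the 0.2 release shown in the banner above.

```julia
# Sum of squares written as a plain loop (hypothetical example function)
function sumsq(xs)
    total = 0.0
    for x in xs
        total += x * x
    end
    return total
end

data = rand(1_000_000)

# The first call includes the time spent compiling sumsq for Vector{Float64}
first_call = @elapsed sumsq(data)

# The second call reuses the machine code generated by the first
second_call = @elapsed sumsq(data)

println("first call:  ", first_call, " s")
println("second call: ", second_call, " s")
```

On most machines the second call should be noticeably quicker, since it skips compilation, but the exact numbers depend on hardware and Julia version.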

2) Julia is written in Julia

Contributors need only work with a single language, which makes it easier for Julia users to become core contributors.
"As a policy, we try to never resort to implementing things in C. This keeps us honest – we have to make Julia fast enough to allow us to do that." - Stefan Karpinski

And, as the language's co-creator Karpinski notes in the comments of the referenced post, writing the language itself in Julia means that when improvements are made to the compiler, both the system and user code get faster.

3) Julia is Powerful

Like many modern programming languages, its implementation is open source. Anyone can work on the language or the documentation. Julia also has extensive metaprogramming support. Its creators credit Lisp as their inspiration:
Like Lisp, Julia represents its own code as a data structure of the language itself.
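
To make "code as a data structure" concrete, here is a minimal sketch that quotes an expression, inspects it, and rewrites it before evaluating it. The variable names are made up for illustration, and the code assumes Julia 1.x.

```julia
# Quote an expression instead of evaluating it
ex = :(1 + 2 * 3)

println(typeof(ex))   # the expression is an ordinary Expr value
println(ex.head)      # the kind of expression, here a function call
println(ex.args)      # the operator followed by its operands

# Because the code is data, it can be edited like any other structure:
# swap the outer + for a -
ex.args[1] = :-

# Evaluate the rewritten expression, which is now 1 - 2 * 3
result = eval(ex)
println(result)
```

Macros build on the same idea: they receive expressions like ex, transform them, and hand the result back to the compiler.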
a) Optional Strong Typing
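Type annotations can help the compiler generate faster code, but Julia keeps them optional, which frees up programmers who want to write dynamic routines that work on multiple types. The @code_typed macro shows the types the compiler infers, here for a call to sort on arry, an array of Float64 values: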
julia> @code_typed(sort(arry))
1-element Array{Any,1}:
 :($(Expr(:lambda, {:v}, {{symbol("#s1939"),symbol("#s1924")},{{:v,Array{Float64,1},0},{symbol("#s1939"),Array{Any,1},18},{symbol("#s1924"),Array{Any,1},18}},{}}, :(begin $(Expr(:line, 358, symbol("sort.jl"), symbol("")))
        #s1939 = (top(ccall))(:jl_alloc_array_1d,$(Expr(:call1, :(top(apply_type)), :Array, Any, 1))::Type{Array{Any,1}},$(Expr(:call1, :(top(tuple)), :Any, :Int))::(Type{Any},Type{Int64}),Array{Any,1},0,0,0)::Array{Any,1}
        #s1924 = #s1939::Array{Any,1}
        return __sort#77__(#s1924::Array{Any,1},v::Array{Float64,1})::Array{Float64,1}
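
For a smaller illustration of optional typing, the sketch below defines one function without annotations and one with. Both function names (double, halve) are hypothetical, and the code assumes Julia 1.x.

```julia
# No annotation: works for any argument that supports multiplication by 2
double(x) = 2 * x

# Annotated: only accepts integer arguments
halve(n::Integer) = div(n, 2)

println(double(3))        # integer input
println(double(1.5))      # floating-point input
println(double([1, 2]))   # array input
println(halve(10))

# Calling the annotated function with a Float64 finds no matching method
rejected = try
    halve(1.5)
    false
catch err
    err isa MethodError
end
println("halve(1.5) rejected: ", rejected)
```

The untyped version is still compiled to specialized code for each argument type it is called with, so leaving annotations off does not by itself make a function slow.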

b) Introspective

Julia's introspection is awesome, particularly if you enjoy looking at native assembler code. Dissecting assembler code comes in handy when optimizing algorithms, and Julia provides several introspection functions for that purpose. Here the code_native function shows the machine code generated for sort when it is called on an array of integers.
julia> code_native(sort,(Array{Int,1},))
        push    RBP
        mov     RBP, RSP
        push    R14
        push    RBX
        sub     RSP, 48
        mov     QWORD PTR [RBP - 56], 6
        movabs  R14, 139889005508848
        mov     RAX, QWORD PTR [R14]
        mov     QWORD PTR [RBP - 48], RAX
        lea     RAX, QWORD PTR [RBP - 56]
        mov     QWORD PTR [R14], RAX
        xorps   XMM0, XMM0
        movups  XMMWORD PTR [RBP - 40], XMM0
        mov     QWORD PTR [RBP - 24], 0
        mov     RBX, QWORD PTR [RSI]
        movabs  RAX, 139888990457040
        mov     QWORD PTR [RBP - 32], 28524096
        mov     EDI, 28524096
        xor     ESI, ESI
        call    RAX
        lea     RSI, QWORD PTR [RBP - 32]
        movabs  RCX, 139889006084144
        mov     QWORD PTR [RBP - 40], RAX
        mov     QWORD PTR [RBP - 32], RAX
        mov     QWORD PTR [RBP - 24], RBX
        mov     EDI, 128390064
        mov     EDX, 2
        call    RCX
        mov     RCX, QWORD PTR [RBP - 48]
        mov     QWORD PTR [R14], RCX
        add     RSP, 48
        pop     RBX
        pop     R14
        pop     RBP
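
The same kind of inspection is easier to read on a tiny function. This sketch assumes Julia 1.x, where code_native lives in the InteractiveUtils standard library (loaded automatically in the REPL, but not in scripts). The function add_one is a hypothetical example.

```julia
using InteractiveUtils  # provides code_native outside the REPL on Julia 1.x

add_one(x) = x + 1

# Capture the native code generated for an Int argument as a string
asm = sprint(io -> code_native(io, add_one, (Int,)))
println(asm)

# The same function specialized for Float64 compiles to different instructions
asm_float = sprint(io -> code_native(io, add_one, (Float64,)))
println(asm_float)
```

Comparing the two listings shows how Julia compiles a separate specialization of one generic function for each argument type.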

c) Multiple Dispatch

Multiple dispatch gives Julia its object-oriented behavior. Each function can have several methods, each designed to operate on particular parameter types. The appropriate method is dispatched at runtime based on the types of all the arguments.

julia> methods(sort)
# 4 methods for generic function "sort":
sort(r::UnitRange{T<:Real}) at range.jl:533
sort(r::Range) at range.jl:536
sort(v::AbstractArray) at sort.jl:358
sort(A::AbstractArray, dim::Integer) at sort.jl:368
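
Defining your own methods works the same way. The sketch below is a made-up example (the Shape, Circle, and Square types and the describe function are hypothetical) showing that Julia picks a method based on the types of both arguments. It assumes Julia 1.x syntax; older releases spelled the type declarations differently.

```julia
abstract type Shape end

struct Circle <: Shape
    radius::Float64
end

struct Square <: Shape
    side::Float64
end

# Three methods of one generic function, distinguished by argument types
describe(a::Shape, b::Shape) = "two generic shapes"
describe(a::Circle, b::Circle) = "two circles"
describe(a::Circle, b::Square) = "a circle and a square"

println(describe(Circle(1.0), Circle(2.0)))   # most specific method: two circles
println(describe(Circle(1.0), Square(2.0)))   # matches (Circle, Square)
println(describe(Square(1.0), Circle(2.0)))   # no specific method, falls back to (Shape, Shape)

# methods() lists every method of the generic function
println(length(methods(describe)))
```

Note that the choice depends on every argument, not just the first one, which is what separates multiple dispatch from the single dispatch found in most object-oriented languages.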

Thursday, February 12, 2015


Julia is a high-level, high-performance dynamic language for scientific computing. It has been gaining traction as a faster alternative to Matlab, R and NumPy and as a more productive alternative to C, C++ and Fortran. Julia is particularly relevant when both expressiveness and performance are paramount – in areas like machine learning, “big statistics”, linear algebra, bioinformatics, and image analysis.

Tuesday, January 7, 2014

Big Data Analytics by ICGX

The data revolution has only just begun. Everyone is talking about Big Data.

Big Data grows up - Forbes
Business opportunities is Big Data - INC.
Big Data powers evolution decision making - WSJ
How Big Data got so big - NYT
Big Data is hot? Now what? - Forbes
Businesses "freak out" over Big Data - Information Week
2012: The year of Big Data - WSJ
The age of Big Data - NYT

But it's not just hype. The world's data is doubling every 1.2 years. There are 7 billion people in the world, and 5.1 billion of them own a cell phone. Each day, we send over 11 billion texts, watch over 2.8 billion YouTube videos and perform almost 5 billion Google searches.

And we're not just consuming it. We're creating it. We are data agents. We generate over 2.4 quintillion bytes every day from consumer transactions, communication devices, online behavior and streaming services. In 2012, the world's information totaled over 2 zettabytes. That's 2 trillion gigabytes. By 2020, that number will be 35 trillion. We will need 10x more servers, 50x more data management and 75x more files to handle it all.

If you're like most companies, you aren't ready. 80% of this new data is unstructured. It is too large, too complex, and too disorganized to be analyzed by traditional tools. There are 500K computer scientists yet only 30K mathematicians. We will fall short of the talent needed to understand Big Data by at least 100K.

To find opportunities in Big Data, we need new tools and new talent to mine this information and find value. We need Big Data Analytics. Big Data Analytics is more than technology. It's a new way of thinking. It will help companies better understand customers, find hidden opportunities, and even help our government better serve citizens and mitigate fraud. It will inspire hundreds, thousands and even millions of new startups.

It will alter the landscape across virtually every industry and finally answer the questions looming over every CEO's head: "How can my business use Big Data?", "What problems can it solve?", "Who should be leading the charge: the CIO, the CMO, or the Chief Data Scientist?" In every revolution, there are opportunities that will be seized only by those armed with the right tools and the right strategy. We are at the beginning of the Big Data Revolution.

Friday, November 29, 2013
