Thursday, February 26, 2015

The Choices Between Statistical Programming Languages

I recently attended a talk where an audience member asked an experienced economist which programming language is best for statistics.  The expert chose Python over MATLAB and Mathematica, and I think I heard R mentioned as another option, though that wasn't clear.  Young professionals just entering the workforce with high-end skills take the choice between those languages very seriously.

MATLAB builds on the manual matrix calculations most top MBA programs include in their decision science classes.  It interfaces with both modern languages like C++ and legacy languages like Fortran.  Some programmers, like our expert above, believe Python can replace MATLAB in transforming data into graphical displays.  Maybe it depends on the profession.  I recently spoke with a hedge fund manager who swears by Python for its ease of use, which he finds comparable to MS Excel functions.  That's two anecdotal votes for Python in a week.

Mathematica's Wolfram language is almost as old as MATLAB, and just as heavily used in the sciences and engineering.  They both have open source competition.  R is a more recent innovation with open source origins.  I have heard R mentioned at tech conferences where hackathon alums discuss the techniques they used during competition.  The snippets of R syntax I found on the web remind me of the very basic DOS programming I did in an undergraduate business class in the early 1990s.  Weighing all of these against Python's supposed ease of use entices me to learn more about Python first.

Hedge fund quants know financial engineering, where high-level programming languages like Python are necessary.  I have long been skeptical of the value added by hedge funds.  Knowing their preferred lingo should make critiques of their approaches more credible.  I still expect a market cataclysm to shake out almost all hedge funds as redundant.  Any investment firms that survive will attract the best quant talent, and programming will still be a valuable skill.  The key for those few quants who keep their jobs will be knowing the limits of financial engineering in making risk/reward tradeoffs.  Economic events are human-caused, not natural phenomena governed by hard math.  Black Swans can still fly right through any Python script.

I am not yet at a decision point where I am ready to learn a computer programming language, but the changing demands of the finance sector may require such an adaptation.  Those of us in the middle of our careers need to seriously consider learning the basics of some of these languages.  None of the four languages I mentioned were in the curricula of my MBA courses, but they are now recognized as de rigueur for enterprise data professionals.  The concept of coding as basic literacy is on many thought leaders' lips.  Young people who can do it are far ahead of middle-aged people in adapting to a high-value workplace.  Coding as literacy leads to a Big Data career.