Commentary: The Python programming language keeps overcoming challenges to its growth. Here’s why you should expect that to continue.
There are several reasons why the Python programming language shouldn’t exist, and yet there are tens of millions of developers and data scientists who are grateful it does. Python should have forked into at least two different communities, but it didn’t. It should have required significant corporate funding, like Go or Swift, to develop and thrive, but it didn’t. And it probably should have been ignored in data science as people swarmed to R, but it wasn’t.
Instead, Python keeps growing as one of the world’s most dominant programming languages. There are good reasons for this, and for why it didn’t fragment or suffer any of the problems noted above. Anaconda co-founder and CEO Peter Wang talked to me about the sustained, “absolutely explosive” growth of Python, and why it’s unlikely any other programming language will catch up.
There’s something about Python
Python appeals to a broad swath of users, from hard-core data scientists to newbie university students. This is by design, said Wang, whose company has been central to Python’s evolution as a first-class tool for data scientists:
There are two things that Python does that are very different from all other major languages. Number one, it has a pedigree of being a teaching language. It’s easy to use, easy to pick up, kids use it, non-programmers pick it up in a weekend. This is not accidental; it has been a hardcore part of the design from the very beginning and quite intentional…. The second thing that’s interesting about Python is that from the very beginning it’s good as a glue language.
This is also how Python started to find its way into data science, which had hitherto been the domain of R and other “built-for-data-science” languages and tools. It did so not necessarily through people who already knew R or were well versed in MATLAB and were branching out into numerical computing. Rather, it came in through newcomers to data science, Wang said: “It’s the casual, non-developer person. Non CS people. It’s the VP of product, it’s marketing and sports analytics people. It’s everybody. I mean, Python’s competitor is Excel. It’s not Java or Ruby or R or Julia.”
Python, in other words, democratized data science by opening it up to a much wider range of people. As this has happened, and the Python community has innovated to make the language a first-class option for data science, languages like R have declined, according to a Terence Shin analysis of more than 15,000 data scientist job postings.
Python’s strength in data science (and numerical computing, generally) owes a huge debt to the early efforts of scientific computing pioneers. Even as the early Python developer crowd tuned it to be a great competitor to Perl and other web development languages, Wang recalled, Guido van Rossum, the creator of Python, remained friendly with the scientific computing community, encouraging them to improve Python for their needs. This helped to minimize the need for the project to fork.
And so we’re left with a programming language that does many things well. By Wang’s reckoning, it’s unlikely that any other programming language can catch up to Python:
Python has tens of millions of users. I think the press has vastly underreported how broadly Python’s been adopted. And at this point, its adoption is viral, and its adoption is an engine. Schools are teaching it. It’s just the obvious thing to do. If you’re a middle schooler, you’re outgrowing Scratch. You want to do some real programming. JavaScript, of course, gives you nice web pages. But if you want to do machine learning, like data stuff, of course you do Python. So you’ve got universities and you’ve got high schools and middle schools teaching Python. You’ve got VPs of XYZ learning Python to do a little bit of data analysis stuff. At this point, it’s an unstoppable adoption engine. It’s going to be hard for anything else to catch up.
This isn’t to say Python is perfect.
Python’s growing pains
In Wang’s view, Python has long had problems, packaging among them. It’s fantastic that you can take existing libraries written in C++, Fortran, etc., and connect them using the Python glue mentioned above. However, you still have to figure out how to compile all those libraries. A developer working in a web language like Ruby doesn’t really need to worry about this. She doesn’t touch natively compiled libraries, except maybe for SSL and encryption and a few optimized data loaders, because for the most part it’s all interpreted.
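For readers who want to see what “glue” looks like in practice, here is a minimal sketch, not anything Wang or Anaconda showed me. It uses Python’s standard ctypes module to call strlen from the C library that is already loaded into the process. It assumes a POSIX system such as Linux or macOS, and the helper name c_strlen is made up for illustration.

```python
import ctypes

# Assumption: a POSIX system (Linux or macOS), where CDLL(None) exposes the
# symbols already loaded into the running process, including the C library.
libc = ctypes.CDLL(None)

# Declare the C signature: size_t strlen(const char *s)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: str) -> int:
    """Return the UTF-8 byte length of text, as computed by C's strlen."""
    return libc.strlen(text.encode("utf-8"))


if __name__ == "__main__":
    print(c_strlen("Python"))
```

The call itself takes one line because the C library is already compiled and installed. The hard part, as the paragraph above notes, is getting a compiled library built and installed for every user’s machine in the first place.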
According to Wang, van Rossum didn’t want to clutter Python with this capability, so Anaconda took it on, creating its own packaging system for Python. Anaconda’s distribution (a bit like what Red Hat did in Linux) makes it easy to take hard-to-compile things like Fortran and make them work seamlessly with Python. Additionally, there has been increased focus within the community on improving Python performance.
And, of course, there’s a long way to go. Fortunately, Python’s popularity means that there is a large and growing population of contributors eager to tackle any impediments to its growth. In Wang’s words, “The raw amount of users and existing code and valuable business problems out there creates such a potential, lucrative market for people to solve those problems that the Python ecosystem will well overcome [any] hurdles.”
Or, to misquote Linus’s Law, given enough Pythonistas, all Python problems are solvable. Which, of course, will simply lead to even more growth and adoption of Python.
Disclosure: I work for AWS but the views expressed herein are mine.