The RedMonk Programming Language Rankings: September 2012

In December of 2010, Drew Conway decided to explore in quantitative fashion one of the more popular and contentious subjects debated by developers: the relative popularity of programming languages. To do this, he compared the traction of the languages on both GitHub and StackOverflow, two sites that are both popular with developers and yet have somewhat distinct communities. The GitHub rankings are based on GitHub’s own ordering of the individual languages, while the languages on StackOverflow are ranked according to the volume of tags associated with each language.

The result was a plot that featured a high correlation; the popularity on GitHub tended to correlate with the popularity on StackOverflow. Ten months later, we repeated this analysis, and again in February. These analyses have proven very popular with developers; the latter post was linked to on Twitter nearly six hundred times.
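As a rough illustration of what "correlation" between two popularity rankings can mean, here is a minimal sketch. It assumes Spearman's rank correlation as the measure (the original analysis does not say which statistic, if any, was computed), and the language names and counts are invented for the example; they are not RedMonk's or Drew Conway's data.

```python
# Hypothetical counts, invented for illustration only.
github_projects = {"A": 900, "B": 700, "C": 400, "D": 150, "E": 50}
so_tags = {"A": 800, "B": 850, "C": 300, "D": 100, "E": 120}


def to_ranks(counts):
    """Rank languages by count, 1 = most popular (assumes no tied counts)."""
    ordered = sorted(counts, key=counts.get, reverse=True)
    return {lang: position for position, lang in enumerate(ordered, start=1)}


def spearman(ranks_x, ranks_y):
    """Spearman rank correlation for two tie-free rankings of the same items."""
    n = len(ranks_x)
    d_squared = sum((ranks_x[k] - ranks_y[k]) ** 2 for k in ranks_x)
    return 1 - 6 * d_squared / (n * (n * n - 1))


github_ranks = to_ranks(github_projects)
so_ranks = to_ranks(so_tags)
rho = spearman(github_ranks, so_ranks)
print(github_ranks)
print(so_ranks)
print(round(rho, 2))
```

A value near 1 means the two sites order the languages almost identically; a value near 0 means the orderings are unrelated.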

The truth, however, is that with respect to language popularity, very little changes on a month-to-month basis. While we do snapshot the necessary data monthly in case we need it for more detailed analysis, the more interesting insights come when we can examine the data over longer periods of time. Having collected data over a period of years, we are now able to do that.

Here, to begin with, is an up-to-date plot of programming language popularity.

[Plot: programming language popularity on GitHub versus StackOverflow]

With more languages being tracked than previously, it can be difficult to process this plot effectively. As has traditionally been the case, rough groupings or tiers of languages are apparent. And if one compares this plot to previous iterations, it’s possible to detect progress amongst specific languages. Scala, as one example, seems to be gradually progressing to the top of the second language tier.

But because this plot can be difficult to decipher by itself, we’ve extracted a list of the Top 20 programming languages by popularity here.

  1. JavaScript
  2. Java
  3. PHP
  4. Python
  5. Ruby
  6. C#
  7. C++
  8. C
  9. Objective-C
  10. Shell
  11. Perl
  12. Scala
  13. Haskell
  14. ASP
  15. Assembly
  16. ActionScript
  17. R
  18. Visual Basic
  19. CoffeeScript
  20. Groovy

But while there may be a few surprises on this list – the continued traction of Java, as an example, is unexpected for some – by and large this list seems like nothing more or less than a reasonable representation of programming languages in use today. It is an inclusive list, from compiled to interpreted and everything in between, and thus more evidence of the runtime fragmentation that has been rampant for several years [coverage].

What is interesting, on the other hand, is observing how these rankings have changed over time. From December of 2010 to September of 2011, for example, the popularity of ActionScript, Emacs Lisp, Haskell, JavaScript, Objective-C, Ruby, Scala and Shell script remained unchanged. ASP and Groovy, however, each jumped one spot in the rankings, Java jumped two, and Assembly and C# each jumped five. C, C++, PHP, and Python, on the other hand, each dropped one spot, R and Lua two, and Clojure and Perl three.

Comparing this September to last, the big winners were CoffeeScript (9 spots), Visual Basic (5), and ASP, Assembly, C++, Haskell and Scala, which all moved up one place. C#, Java, JavaScript, Objective-C, Perl, PHP, Python, R, Ruby and Shell were unchanged. This year’s losers, meanwhile, include Groovy (dropped 1 spot), C (1), Clojure (3), ActionScript (4), and Emacs Lisp (6).

But what if we compare this September 2012 to Drew’s original analysis in December of 2010, just shy of three years ago? What has changed with these languages overall in three years?

  1. Clojure -6 (Dropped out of the Top 20)
  2. Emacs Lisp -6 (Dropped out of the Top 20)
  3. ActionScript -4
  4. Lua -3 (Dropped out of the Top 20)
  5. Perl -3
  6. C -2
  7. R -2
  8. PHP -1
  9. Python -1
  10. C++ 0
  11. Groovy 0
  12. JavaScript 0
  13. Objective-C 0
  14. Ruby 0
  15. Shell 0
  16. Haskell 1
  17. Scala 1
  18. ASP 2
  19. Java 2
  20. C# 5
  21. Visual Basic 5 (Added to the Top 20)
  22. Assembly 6 (Added to the Top 20)
  23. CoffeeScript 18 (Added to the Top 20)
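Because a change in rank is just the difference between two positions, the two year-over-year moves for a language should add up to its overall move. The sketch below checks that using only the figures stated in this post, restricted to languages whose moves are given for both periods. The helper `moves` and the variable names are ours, and it reads "Assembly and C# 5" as five spots each.

```python
def moves(groups):
    """Expand {change: "lang, lang"} into {lang: change}; positive = moved up."""
    return {
        lang: change
        for change, names in groups.items()
        for lang in names.split(", ")
    }


# December 2010 to September 2011, as stated in the post.
first_period = moves({
    0: "ActionScript, Emacs Lisp, Haskell, JavaScript, Objective-C, Ruby, Scala, Shell",
    1: "ASP, Groovy",
    2: "Java",
    5: "Assembly, C#",
    -1: "C, C++, PHP, Python",
    -2: "R, Lua",
    -3: "Clojure, Perl",
})

# September 2011 to September 2012, as stated in the post.
second_period = moves({
    9: "CoffeeScript",
    5: "Visual Basic",
    1: "ASP, Assembly, C++, Haskell, Scala",
    0: "C#, Java, JavaScript, Objective-C, Perl, PHP, Python, R, Ruby, Shell",
    -1: "Groovy, C",
    -3: "Clojure",
    -4: "ActionScript",
    -6: "Emacs Lisp",
})

# December 2010 to September 2012, from the list above.
stated_total = moves({
    -6: "Clojure, Emacs Lisp",
    -4: "ActionScript",
    -3: "Lua, Perl",
    -2: "C, R",
    -1: "PHP, Python",
    0: "C++, Groovy, JavaScript, Objective-C, Ruby, Shell",
    1: "Haskell, Scala",
    2: "ASP, Java",
    5: "C#, Visual Basic",
    6: "Assembly",
    18: "CoffeeScript",
})

# Only languages with a stated move in both periods can be checked.
both = sorted(set(first_period) & set(second_period))
mismatches = [
    lang for lang in both
    if first_period[lang] + second_period[lang] != stated_total[lang]
]
print(len(both), mismatches)
```

For the twenty languages that appear in both periods, the two moves sum to the stated overall change, so the list of mismatches is empty. Lua, CoffeeScript and Visual Basic are left out because the post gives only one of their two year-over-year moves.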

The more popular languages on this list – JavaScript, Ruby and the like – are notable for their lack of movement. What is very interesting is that the two biggest jumps come from languages that could not be more unlike one another: CoffeeScript is a simplified version of JavaScript that infuriates technologists with its technical compromises, while Assembly is as close to the bare metal as most developers today are likely to get. That this study in contrasts should comprise the biggest gains over a three year period is interesting.

Outside of movement in the Top 20, there have been questions recently around Go, a language introduced late in 2009. Apcera’s Derek Collison, in particular, is bullish on the language, saying:

Prediction: Go will become the dominant language for systems work in IaaS, Orchestration, and PaaS in 24 months. #golang

— Derek Collison (@derekcollison) September 11, 2012

The numbers are not quite so bullish, but do provide some grounds for optimism for advocates of the language. Our rankings have Go jumping from #32 in 2010 to #30 today, a number that sounds modest but means that in that time it has improved more in popularity than Scala or Haskell and as much as Java, at least from a rankings standpoint (obviously growth becomes more difficult the more popular the language becomes). Then there’s its age. At a bit less than three years of age, Go’s position as a solidly second tier language is enviable, given that there are much older languages like Smalltalk that have yet to break that barrier.

Ultimately, these rankings are intended to serve as a datapoint, a snapshot of traction within two particular communities that happen to be substantial centers of gravity from a development perspective. While not strictly representative, they do confirm one of the more important developer trends observed within the past decade: fragmentation. As with so many areas of technology today, the programming language landscape is wildly diverse, with multiple languages being employed simultaneously by individual developers, often on the same project. Whatever your feelings on the specifics of the rankings above or the merits of the languages themselves, be aware that all of the listed languages are present, and present in volume, within today’s developer populations.

20 Comments

Categories: Open Source, Programming Languages.

By sogrady
September 12, 2012 at 7:35 pm
  • twitter.com/t3kcit Andreas Mueller

    I want a rosling-style animation of how the graph evolves over time and how the languages travel! Pretty please.

    • twitter.com/karianna Martijn Verburg

      +1 to this, would be very interesting!

      • twitter.com/sogrady steve o’grady

        We’ve gotten a couple of requests for that, and we may well do it in future. The problem, however, is that it’s not clear that the rankings change sufficiently to make Rosling-style motion charts work well.

  • Brad Wood

    Excellent.  Is the raw data available anywhere?  I’m interested in seeing the trends of some of the other languages who are farther from the top.  ColdFusion, specifically.
    Thanks for the thorough breakdown.

    • twitter.com/sogrady steve o’grady

      We haven’t cleaned the data up for publication, but it looks like ColdFusion went from 27 to 32 over the ~3 year period.

  • Adam Connor

    December 2010 to September 2012 is almost two years, not almost three years…

    • twitter.com/sogrady steve o’grady

      The intent is to present one snapshot for each of the last three years: 2010, 2011, and 2012. 

  • Isaac Gouy

    The “analysis” is just as broken as it was in February.

    The “popularity” of most of those languages is being grossly distorted when
    you convert the “# of Tags” and “# of Projects” data to rankings.

    The range in rank value for the stackoverflow tags was from 1 to 56, but
    the range in “# of Tags” that rank is based upon was from 0 to 82,923
    and the data was so skewed that only 11 of 56 languages had above
    average “# of Tags”.

    Haskell was well below average for “# of Tags” and Java was well above average for “# of Tags” —

    #56 Java = 82,923

    >>> mean = 18,770 <<<

    #40 Haskell = 1,896

    # 1 F# = 0 

    (The story was the same for the github "# of Projects" rank numbers.)

    • twitter.com/sogrady steve o’grady

      Agreed, the rankings are not linearly weighted. 

      • Isaac Gouy

        Do you agree that readers might describe this — “Our rankings have Go jumping from #32 in 2010 to #30 today, a number that sounds modest but…” — not as “modest” but as irrelevant if they knew it was based on ~0.05% of stackoverflow tagcounts?

        The ranking distorts the data, preventing readers from seeing what’s really happening.

        • twitter.com/sogrady steve o’grady

          I don’t believe readers are terribly concerned about the data volume behind the 30th ranked programming language, no. 

  • Isaac Gouy

    >>we’ve extracted a list of the Top 20 programming languages by popularity here<<

    You don't say what you mean by "popularity". You don't say whether that list is based on "# of Tags", or  "# of Projects", or some combination of the two, or something else entirely.

  • Velocity888

    Hold on, C# is clearly the leader on stackoverflow.com, and not many C# projects are hosted on github; they are on codeplex. This ranking is inaccurate.

  • Isaac Gouy

    >>What is interesting, on the other hand, is observing how these rankings have changed over time.<<

    Whether or not rankings were appropriate for Drew Conway's purpose, they are not appropriate as a way of understanding change over time.

    When you talk about "Go jumping from #32 in 2010 to #30 today" you don't show whether that's because Go so.tagcounts are being added at a faster rate or because so.tagcounts for #30 and #31 are being added at a slower rate.

    We don't know whether that's real change for Go or just an artifact of the way you processed the data.

    Instead of rankings, express the so.tagcounts as a fraction of the total so.tagcounts. That way, changes over time will show how much faster or slower so.tagcounts are being added for just one language, compared to the overall rate of growth.

    (You could use the geometric mean to combine "# of Tags" and "# of Projects" data.)

  • twitter.com/alexeiramone Alexei Martchenko

    I’ve been keeping track of this list for some time. Congratulations on the work on this. By the way, with 14 years programming ColdFusion I must say: “Adobe, change your licensing policies or give up on ColdFusion”. It’s an amazing language, condemned to a puny future.

  • twitter.com/fabfas Faisal

    Interesting findings!

  • Drhuffman12

    Next time you do a GitHub vs StackOverflow plot, it would be nice to see a connecting line for the language on GitHub to the matching language on StackOverflow.
