The Time is Nigh

As many of you are already aware, Kate and I are expecting a baby. With a scheduled delivery date of next Monday, today is my last day in the office. As of the close of business, I will be headed out on paternity leave. My return date will depend on how things proceed with mother and baby, but I’m hoping to ramp back up at some point in mid to late January.

For RedMonk customers, not much changes: contact Juliane for any of your engagement needs and Marcia for anything operations related. James, at least, will be around to keep the lights on.

While I’m out of the office, I will be around as usual on Twitter – and I hear parents of newborns have plenty of free time. Until my return then, be well, enjoy your holidays and wish us luck.

Categories: Personal.

By sogrady

— November 25, 2015 at 11:54 am

DVCS and Git Usage in 2015

For many in the industry today, version control and decentralized version control are assumed to be synonymous. Slides covering the DevOps lifecycle, as but one example, may or may not call out Git specifically in the version control portion of the stack depiction, but when the slides are actually presented, Git is in the overwhelming majority of cases what is meant. Git, to some degree, is treated as a de facto standard. Cloud platforms leverage Git as a deployment mechanism, and new collaboration tools built on services built on Git continue to emerge.

Are these assumptions well founded, however? Is Git the version control monster that it appears to be? To find out, we check Open Hub’s (formerly Ohloh) dataset every year around this time to assess, at least amongst its sampled projects, the relative traction for the various version control systems. Built to index public repositories, it gives us insight into the respective usage at least within its broad dataset. In 2010 when we first examined its data, Open Hub was crawling some 238,000 projects, and Git managed just 11% of them. For this year’s snapshot, the number of projects has swelled to over 683,000 – or close to 3X as many. And Git’s playing a much more significant role today than it did then.

Before we get into the findings, more details on the source and issues.

Source

The data in this chart was taken from snapshots of the Open Hub data exposed here.

Objections & Responses

“Open Hub data cannot be considered representative of the wider distribution of version control systems”: This is true, and no claims are made here otherwise. While it necessarily omits enterprise adoption, however, it is believed here that Open Hub’s dataset is more likely to be predictive moving forward than a wider sample.
“Many of the projects Open Hub surveys are dormant”: This is very likely true. But the size of the sample makes it interesting even if potentially limited in specific ways.
“Open Hub’s sampling has evolved over the years, and now includes repositories and forges it did not previously”: Also true. It also, by definition, includes new projects over time. When we first examined the data, Open Hub surveyed fewer than 300,000 projects. Today it’s over 600,000. This is a natural evolution of the survey population, one that’s inclusive of evolving developer behaviors.

With those out of the way, let’s look at a few charts.

[Chart: share of Open Hub projects by version control category, centralized vs. decentralized]

If we group the various version control systems by category – centralized or decentralized – this is each category’s share of projects. Note that 2011 is an assumption because we don’t have hard data for that year, but even over the last four years a trend is apparent. Decentralized tooling has moved from less than one in three projects in 2012 (32%) to closer to one in two in 2015 (43%). That’s the good news for DVCS advocates. The bad news is that this rate has become stagnant in recent years. It was 43% in 2013, actually dipped slightly to 42% in 2014, and returned to 43%, as mentioned, this year.

On the one hand, this suggests that DVCS generally and Git specifically might have plateaued. But the more likely explanation is that this is an artifact of the Open Hub dataset, and our imperfect view of same. It is logical to assume that some portion – possibly a very large one – of the Open Hub surveyed projects are abandoned, and therefore not an accurate reflection of current usage. Many of those, purely as a function of their age, are likely to be centralized projects.

Nor did the Open Hub dataset add many projects in the past calendar year; by our count, it added 9,671 net new projects, or around 1% of the total. That means that even if every newly indexed project were housed in a Git repository, the overall needle wouldn’t move much.
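To make that arithmetic concrete, here is a rough sketch of the best case for decentralized tooling. It assumes a current total of exactly 683,000 projects (the post only says "over 683,000") and uses the rounded 42% DVCS share from 2014 as the starting point; the function and variable names are illustrative, not anything Open Hub exposes.

```python
def share_after_growth(old_total, old_share, new_projects, new_share):
    """Blend an existing share with the share among newly added projects."""
    old_count = old_total * old_share
    new_count = new_projects * new_share
    return (old_count + new_count) / (old_total + new_projects)


NET_NEW = 9_671           # net new projects, from the post
TOTAL_NOW = 683_000       # assumption: the post says "over 683,000"
PRIOR_TOTAL = TOTAL_NOW - NET_NEW
PRIOR_DVCS_SHARE = 0.42   # assumption: rounded 2014 DVCS share from the post

# Best case: every net new project lands in a decentralized system such as Git.
best_case = share_after_growth(PRIOR_TOTAL, PRIOR_DVCS_SHARE, NET_NEW, 1.0)
shift_points = (best_case - PRIOR_DVCS_SHARE) * 100

print(f"net new as share of total: {NET_NEW / TOTAL_NOW:.1%}")
print(f"best-case DVCS share: {best_case:.1%} (+{shift_points:.2f} points)")
```

Under these assumptions the net new projects are a bit over 1% of the total, and even the best case moves the decentralized share by less than one percentage point.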

Overall, however, if we compare the change in individual share of Open Hub projects from 2010 against 2015, these are the respective losses and gains.

[Chart: change in share of Open Hub projects by version control system, 2010 vs. 2015]

Git unsurprisingly is the big winner, CVS the equally unsurprising loser. Nor has any of the data collected suggested material gains for non-Git platforms. DVCS in general has gained considerably and is now close to parity with centralized systems, and Git is overwhelmingly the most popular choice in that segment.

What the specific rate of current adoption is versus the larger body of total projects will require another dataset, or more detailed access to this one. For those who may be curious, we did compare this year’s numbers against last year’s, but as the largest single change was Git’s gain of 0.75% share, it didn’t offer much in the way of new information. Given that existing projects may change their repository, we can’t simply assume that Git captured 75% of the net new projects.

Our annual look at the Open Hub dataset, then, does support the contention that DVCS and Git are effectively mainstream options, but is insufficiently detailed to prove the hypothesis that Git has become a true juggernaut amongst current adoption – even if the anecdotal evidence concluded this a long time ago.

Categories: Version Control.

By sogrady

— November 24, 2015 at 3:37 pm

Changing Tack: Evolving Attitudes to Open Source

Even five years ago, evidence that the role of software was changing was not difficult to find. Microsoft, long the standard bearer for perpetual license software sales, had seen its share price stall for better than a decade. Oracle was in the midst of a multi-year decline in its percentage of revenue derived from the sale of new licenses. Companies that made money with software rather than from it, meanwhile, such as Amazon, Facebook and Google, were ascendant.

It wasn’t that software had become unimportant – quite the contrary. It was becoming more vital by the day, in fact. As Marc Andreessen pointed out later that year, across a wide number of traditional industries, the emerging players were more accurately considered technology companies – and more specifically, software companies – that happened to operate in a given vertical than the reverse.

This counterintuitive trend was what led to the publication of “The Software Paradox,” which attempted to explain why software could become more valuable and less saleable at the same time. And what companies within and outside the software industry should do about it.

One of the most important factors in both making software more difficult to sell and more vital to an organization was open source. In general, organizational attitudes towards open source tended to be informed by a variety of factors, but could be roughly categorized along generational lines. This classification was presented to the OSBC audience in 2011:

First Generation (IBM) “The money is in the hardware, not the software”:
For the early hardware producers, software was less interesting than hardware because the latter was harder to produce than the former and therefore was more highly valued, commercially.
Second Generation (MSFT) “Actually, the money is in the software”:
Microsoft’s core innovation was recognizing, where IBM and others failed to, the commercial value of the operating system. For this single realization, the company realized and continues to realize hundreds of billions of dollars in revenue.
Third Generation (GOOG) “The money is not in the software, but it is differentiating”:
Google’s origins date back to a competition with the early search engines of the web. By leveraging free, open source software and low cost commodity hardware, Google was able to scale more effectively than its competitors. This has led to Google’s complicated relationship with open source; while core to its success, Google also sees its software as competitively differentiating and thus worth protecting.
Fourth Generation (Facebook/Twitter) “Software is not even differentiating, the value is the data”:
With Facebook and Twitter, we have come full circle to a world in which software is no longer differentiating. Consider that Facebook transitioned away from Cassandra – a piece of infrastructure it wrote and released as open source software – for its messaging application to HBase, a Hadoop-based open source database originally written by Powerset. For Facebook, Twitter, et al., the value of software does not generally justify buying it or maintaining it strictly internally.

While it’s certainly possible to debate the minutiae of these classifications, the more interesting question is whether they would persist. Recently, we’ve begun to see the first signs that they will not: second and third generation organizations that believed – at minimum – in software as a protectable asset have begun to evolve away from these beliefs.

Google

Google’s release of TensorFlow was particularly interesting in this regard. Google’s history with open source software was and is complex. The company was built atop it, and as representatives like Chris DiBona are correct to note, in the form of projects such as Android, Google has contributed millions of lines of code to various communities over time. But it tended to be protective of its infrastructure technologies. Rather than release its MapReduce implementation as open source software, for example, it published papers describing the technologies necessary to replicate it, out of which the initial incarnation of Hadoop was born.

With TensorFlow, however, Google declined to protect the asset. Rather than make the code replicable via the release of a paper detailing it, it released the code itself as open source software. As Matt Cutts put it:

In the past, Google has released papers like MapReduce, which described a system for massive parallel processing of data. MapReduce spawned entire cottage industries such as Hadoop as smart folks outside Google wrote code to recreate Google’s paper. But the results still suffered from a telephone-like effect as outside code ran into issues that may have already been resolved within Google. Now Google is releasing its own code. This offers a massive set of possibilities, without reinventing the wheel.

This move is relatively standard at fourth generation companies such as Facebook or Twitter, but it represents a change for Google: a recognition that the benefits of releasing the source code outweigh the costs. While the market’s understanding of and appreciation for the benefits may lag – the WSJ writeup of the news apparently required a quote to confirm that “It’s not a suicidal idea to release this” – Google’s does not.

Microsoft

For many years and across many teams at Microsoft, open source was a third rail issue. In spite of the rational, good work done by open source advocates within the company like Jason Matusow or Sam Ramji, the company’s leadership delivered a continual stream of rhetoric that alienated and antagonized open source communities. Unsurprisingly, this attitude filtered down to rank and file employees, many of whom viewed open source as an existential threat to their employer, and therefore as something to be fought.

With years and a change in leadership, however, Microsoft’s attitude towards open source is perceptibly shifting. While the company has been moving in this direction for years, recent events suggest that the thawing towards open source has begun to accelerate. In November of last year, nine months after Satya Nadella took the reins at Microsoft, large portions of the company’s core .NET technology were released as open source. Last April, the awkward Microsoft Open Technologies construct was decommissioned and brought back into the fold. Seven months after that, Microsoft inked a partnership with open source standard bearer Red Hat, one that Red Hat’s president of product and technology, Paul Cormier, “never would have thought we’d do.” And most recently, the company’s Visual Studio Code project – built on Google’s Chromium among other pieces of existing open source technology – was itself open sourced in a bid to make the editor truly cross-platform.

It can certainly be argued (and was by RedMonk internally) that many of these are simple and logical decisions that should have been made years ago. It’s also important to note that Microsoft’s twin mints, Office and Windows, remain proprietary in spite of public comments contemplating the alternative. All of that being said, however, it’s difficult to dispute that on multiple levels, it is, as engineer Mark Russinovich says in the above linked piece, “a new Microsoft.”

The Net

What does it mean when an organization that saw software as an asset worth protecting commits to open source? Or one that viewed software as the ends rather than the means and had tens of billions of dollars worth of evidence supporting this conclusion? The short answer is that it means that open source is being viewed more rationally and dispassionately than we’ve seen since the first days of the SHARE user group.

Open source is being viewed, increasingly, as neither an existential threat nor an ideological movement but rather an approach whose benefits frequently outweigh its costs. There’s a long way to go before these concepts become truly ubiquitous, of course. Even if the most anti-open source software vendors are beginning to come around, the fact that the announcement of Capital One’s Hygieia project was considered so unusual and newsworthy suggests that enterprises are lagging the vendors that supply them in their appreciation for open source.

But if the above generational classifications begin to break down in favor of nuanced, strategic incorporation of open source, that will be a good thing for the market as a whole, and for the developers that make it run.

Disclosure: Neither Google nor Microsoft is a RedMonk customer at present.

Categories: Open Source.

By sogrady

— November 19, 2015 at 4:50 pm

Crossing the Amazon: IBM in an Age of Disruption

The Wired headline in April of this year read, “Amazon Reveals Just How Huge the Cloud Is for Its Business.” The numbers for AWS were $4.6B for 2014, up 49% from the year before and on track to hit $6.23B by year’s end. The TechCrunch headline from October was “Amazon’s AWS Is Now A $7.3B Business As It Passes 1M Active Enterprise Customers.” Revenue at $7.3B, not $6.23B. A growth rate no longer of 49%, but 81%.

It is the velocity and trajectory of this business that has everyone in the industry spooked and valuations of the business formerly relegated to the “other” revenue category on financial statements accelerating. Even after seeing sales shrink for 14 consecutive quarters, after all, and amidst calls to rebrand the company from Big Blue to Medium Blue, each of IBM’s non-finance business units generated more revenue in 2014 than AWS projects to this year. Three out of the four were a multiple of the seven billion figure: GTS was ~$37B, Software $25B and GBS came in at ~$18B.

But the market and evaluators alike are less concerned, at least in the case of Amazon and IBM, with present day revenue figures than how they project to change over time, hence the euphoric AWS headlines and the quarterly pillorying IBM receives. What IBM is going through at present, in fact, suggests that Michael Dell’s original decision to take his firm private was a wise one.

Market disruption is a violent process, and surviving it can be almost as drawn out and painful as succumbing to it. As IBM knows, of course, having been one of the few companies to reinvent itself more than once. Expecting the same patience from investors, however, is a lot to ask, particularly in an age of activist shareholders carrying Damoclean swords.

If bullish perceptions of cloud native players, Amazon and otherwise, are driven by expectations of future returns driven by current models, however, it is perhaps worth taking a step back and evaluating IBM’s current models rather than current returns. The question is how should IBM, or companies in IBM’s position, respond to the macro-market factors currently disrupting its businesses.

From a high level, all of the incumbent systems players – from Cisco to Dell/EMC to HP to IBM to Oracle – need to recognize, among other market dynamics, the following:

Between the ascendance of ODMs and the explosion of IaaS, the market for premium low end hardware is gone. What hardware growth there is will come from the cloud – just ask Amazon, Google or Microsoft.
Traditional perpetual license software models are not gone, but in systemic decline. Customers instead are shifting to services-based models, with additional value adds from data (both collected and sourced).
Open source and commodity services have offered customers some relief from lock-in, but lock-in remains as closely tied to profit as Shapiro and Varian described in 1999. This implies that while it’s important to offer commodity entrypoints, higher-end proprietary services will be critical to both profit and retention.
New market conditions require new partners.

Measured by these criteria, at least, IBM is making logical adjustments to its businesses.

Low-end hardware businesses have been divested, and investments redirected to potential growth businesses such as Softlayer.
An increasing emphasis within its software business is on services, e.g. Bluemix, acquisitions like Cloudant/Compose/etc, or the just announced Spark-as-a-Service.
Proprietary or exclusive offerings such as Watson or the Twitt
