What is a software forge?

As we know, use of the term “infographic” generally causes involuntary gagging and may result in unwelcome skin irritation.

Paradoxically, open source licensing and vulnerability management solutions company Protecode (pron: pro-ta-code) appears to be using the “information graphic” (to use the old school expression) approach to good effect.


The firm has produced infographics to compare the attributes of open source projects held in organised, tightly governed forges, such as Apache and CodePlex, with free-for-all forges having little or no project governance, such as SourceForge or GitHub.

What is a software forge?

DEFINITION: A software forge is generally defined as a web-hosted collaboration platform designed to facilitate, stimulate and concentrate community (and often, but not always, “independent”) software application development projects. The forge is home to development tools and management functions, including source code management and version control features, and can provide access to a particular piece of software’s SDK and IDE. Forges can be unregulated or supported by a governing organisation.

For example, GitHub is currently the world’s largest code repository and has little or no governance: it does not require projects to conform to “The Open Source Definition” set by the Open Source Initiative (OSI).

SourceForge was the world’s first centralised location for the management of free and open source software projects. SourceForge is another ungoverned forge that encourages developers to choose an OSI approved license for their projects.

A forge state of the nation

The following information is attributed to Protecode and its analysis of the forge state of the nation in the firm’s Global Intellectual Property (GIPS) Database — a database containing over 2.2 million open source packages, gathered from 4,000 sites around the web.

CodePlex: Launched in 2006, CodePlex is a Microsoft-owned repository for open source projects. While the repository hosts a wide variety of projects, the most prominent are Microsoft-driven (.NET, SharePoint).

Apache Software Foundation (ASF): Launched in 1999, the Apache Software Foundation develops and supports a variety of open source projects. Projects that are hosted at Apache are selected and approved by the Apache foundation and are licensed to the ASF with a grant or contributor agreement.

The firm’s latest infographic displays the distribution of five popular license families across all of the forges scanned: MIT, GPL, BSD, Apache and LGPL. Across all forges combined, GPL (in its various versions) appears to be the most widely used license at 43%.

In the two lightly-governed repositories, contributors to GitHub prefer the permissive MIT license, while SourceForge users prefer the copyleft GPL.

As for the two tightly governed forges, each prefers the license of the sponsoring organisation.

CodePlex complexities

In CodePlex, Microsoft Open Source Licenses are preferred, and Apache licenses are prominent among contributors to the Apache Software Foundation’s forge. GPL usage was lowest among the CodePlex community, as GPL has only been an option there since October 2013.

Finally, the total number of permissive and copyleft licenses in each forge was tallied.

According to Protecode, the large number of GPL projects in SourceForge gives it the distinction of being the forge with the highest percentage of copyleft projects. CodePlex is not too far behind due to the prevalence of projects licensed under the copyleft Microsoft Public License (Ms-PL). GitHub users tend to prefer permissive licenses, while Apache has the lowest number of copyleft licenses, since the vast majority of its projects are licensed under the permissive Apache license.
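
For readers who want to see what such a tally involves, here is a minimal sketch in Python. It is not Protecode's method. The grouping of license families into permissive and copyleft follows the description in this article, while the function names, the forge names "ForgeA" and "ForgeB" and the sample records are all hypothetical and invented purely for illustration.

```python
from collections import Counter

# Grouping as described in the article: GPL, LGPL and the Microsoft Public
# License are counted as copyleft; MIT, BSD and Apache as permissive.
LICENSE_CLASS = {
    "GPL": "copyleft",
    "LGPL": "copyleft",
    "Ms-PL": "copyleft",  # Microsoft Public License
    "MIT": "permissive",
    "BSD": "permissive",
    "Apache": "permissive",
}


def tally_by_forge(projects):
    """Count permissive and copyleft projects per forge.

    `projects` is an iterable of (forge, license_family) pairs.
    Unknown license families are counted under "other".
    """
    tallies = {}
    for forge, family in projects:
        kind = LICENSE_CLASS.get(family, "other")
        tallies.setdefault(forge, Counter())[kind] += 1
    return tallies


def copyleft_share(counter):
    """Fraction of a forge's classified projects that are copyleft."""
    classified = counter["copyleft"] + counter["permissive"]
    return counter["copyleft"] / classified if classified else 0.0


# Hypothetical sample records, not real forge data.
sample = [
    ("ForgeA", "GPL"),
    ("ForgeA", "GPL"),
    ("ForgeA", "MIT"),
    ("ForgeB", "MIT"),
    ("ForgeB", "Apache"),
]

tallies = tally_by_forge(sample)
for forge, counts in sorted(tallies.items()):
    print(forge, dict(counts), round(copyleft_share(counts), 2))
```

Comparing forges by share rather than raw count keeps a very large forge from looking more copyleft-heavy simply because it hosts more projects.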

“In the open source domain, GPL and LGPL still rule, but more and more projects are licensed under non-reciprocal licenses such as BSD, MIT, and Apache. Generally, loosely governed forges, such as GitHub or SourceForge, support more copyleft licenses than more tightly governed forges, such as Apache,” said Protecode in a press statement.
