You may have seen our previous posts about projects that Cevo has worked on that have gone on to be open sourced. If not, now’s a great time to catch up before we continue! Information about Watchmen can be found here and here. The second project was called Bakery; head on over here for info on that one.
Why open source?
Open sourcing in enterprise can have many benefits. Here are a few that come to mind for me:
It’s an excellent way to attract programming talent to your organisation. Rather than just talking about the innovative projects that a candidate may work on in a job ad, include a link and actually show them a taste of the great stuff you are producing. An excellent example is Watchmen recently being added to the Thoughtworks technology radar.
The quality of code your organisation releases will increase, even with no external contributors. For a developer, knowing the world will have access to the code you write is a big incentive to do the best possible job you can!
Open sourcing will also force better separation of configuration in an application. Most applications will have some amount of organisation-specific configuration. The easy way to deal with this is usually hard coding, but that makes it extremely time consuming to strip those values out, or add them back in, when dealing with a public repository. Keeping all configuration in a separate directory in source control makes it much easier and faster to manage.
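As a rough illustration of that separation, here is a minimal sketch in Python. The directory layout, the file names (defaults.json and a per-environment override) and the load_config helper are assumptions made for this example, not something taken from Watchmen or Bakery.

```python
import json
from pathlib import Path


def load_config(config_dir, environment):
    """Merge <config_dir>/defaults.json with <config_dir>/<environment>.json.

    Hypothetical layout: defaults.json holds safe placeholder values, while
    organisation-specific overrides live in their own file.
    """
    config_dir = Path(config_dir)
    config = json.loads((config_dir / "defaults.json").read_text())
    override_path = config_dir / f"{environment}.json"
    if override_path.exists():
        # Values in the override file win over the defaults.
        config.update(json.loads(override_path.read_text()))
    return config
```

With a layout like this, only the override file needs to be kept out of the public repository; the application code and the placeholder defaults can be published unchanged.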
For the good of all
Finally, there are some aspects of technology that shouldn’t be considered a competitive advantage: things like compliance, security and reliability. When breaches or massive outages occur, it doesn’t only hurt the company it happens to; it hurts consumer confidence in the rest of the industry as well. The more industry collaboration we can foster on getting the foundations right, the more time we will all have for building things that are commercially beneficial.
Where to start
So you’ve decided that open sourcing a project sounds like a good idea, now what?
Picking a project
I use the term ‘project’ loosely here. The best place to get started will be with a small but useful utility that’s used within your organisation. Of course there is nothing stopping you from going big and pushing out something like Watchmen, but typically it will be much easier to get organisational support for a smaller project.
As I mentioned above, the quality and structure of the project will likely change significantly when moving it to open source. On Watchmen, I would estimate we spent a good month of the team’s time improving the hygiene of the project before going public. Bakery, which was a much smaller project, took half that time.
Open sourcing considerations
When starting out on the open source path for Watchmen, I didn’t really have much idea of what kind of things to consider. Fortunately the kind people at Harvard have, and they provide a very good list to get started on: Harvard Library Open Source Project Considerations
Some areas that are worthy of further discussion include:
1. Licensing
There are a large number of open source licensing options to choose from. Luckily, they fall into two broad categories:
- Copyleft licenses
- Permissive licenses
With a copyleft license, any derivative work must follow the same license as the original. This means that if the original license requires the source code for a program to be made available, then if you take the code, modify it and distribute the result, you must also make the source code available. For this reason, copyleft licenses are generally avoided in enterprise. An example of a copyleft license is the GNU General Public License (GPL).
A permissive license puts few restrictions on the use and redistribution of the original work, typically just that copyright and license notices are retained. Any derivative work can be licensed, or not, at the organisation or individual’s discretion. An example of a permissive license is the Apache License v2.0 (Apache-2.0).
For further information about licensing refer to: https://opensource.org/licenses
2. Managing contributions
Hopefully the software community will find the project you’ve open sourced novel and interesting, and want to contribute to it! Open sourcing covers the license around using, modifying and redistributing the software. It does not cover copyright or ownership of the source code. To avoid possible issues in the future, it is recommended that any contributors outside your organisation first sign an agreement. There are two basic types of agreement:
- Contributor license agreement (CLA)
- Copyright assignment agreement (CAA)
For more information on what these are, head on over to http://wiki.civiccommons.org/Contributor_Agreements/
3. Technical things
There are many people at Cevo that are better placed to comment on different ways to approach technical aspects of software development, however here are three points that I believe are important to cover on your open source journey.
Keeping secrets out of the code is possibly the single most important aspect of open sourcing. There should be absolutely no hint of anything like connection strings, user names, account numbers, passwords, keys or IP addresses present in code that sits on a public repository.
While it’s possible to trust that the development team will do the right thing, humans are humans and mistakes can and will be made. To prevent leaking secrets in any code, include a secret scanning tool as part of your build pipeline. A build should break if any secrets are detected.
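To give a feel for what such a check does, here is a minimal sketch of a regex-based scanner in Python. The three patterns and the function names are illustrative assumptions only; a dedicated secret scanning tool ships a far larger rule set and should be preferred in a real pipeline.

```python
import re
from pathlib import Path

# Illustrative patterns only (assumed for this sketch, far from exhaustive).
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private key header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "hard-coded password": re.compile(r"(?i)\bpassword\s*[:=]\s*['\"][^'\"]+['\"]"),
}


def scan_text(text):
    """Return a (line_number, rule_name) pair for every suspicious line."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings


def main(paths):
    """Scan the given files and return 1 if anything was found, else 0."""
    failed = False
    for path in paths:
        text = Path(path).read_text(errors="ignore")
        for line_number, rule_name in scan_text(text):
            print(f"{path}:{line_number}: possible {rule_name}")
            failed = True
    return 1 if failed else 0
```

A build step could pass the changed files to main and exit with its return value, so that any finding produces a non-zero exit code and breaks the build.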
Static analysis, often in the form of linting, provides another tool to ensure the quality of the code you are releasing is as good as possible. There are many different language-specific tools available that work in different ways.
As part of Watchmen, the team implemented pylint. Pylint gives a score for each file that you point it at. As part of our team practice, we enforced a rule that each pull request had to increase the pylint score of the Watchmen project. This meant we could slowly increase the code quality over time, rather than send one person insane doing it all at once.
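A rule like this can be checked automatically. The sketch below is one possible way to do it, not the Watchmen team's actual tooling: it assumes you have captured pylint's text report for the base branch and for the pull request, and it reads the score from the "Your code has been rated at ..." summary line. The function names are hypothetical.

```python
import re

# Matches pylint's summary line, e.g. "Your code has been rated at 7.50/10".
SCORE_PATTERN = re.compile(r"rated at (-?\d+(?:\.\d+)?)/10")


def parse_score(pylint_output):
    """Extract the overall score from pylint's text report."""
    match = SCORE_PATTERN.search(pylint_output)
    if match is None:
        raise ValueError("no pylint score found in output")
    return float(match.group(1))


def check_ratchet(baseline_output, candidate_output):
    """Return True only when the candidate score is strictly higher."""
    return parse_score(candidate_output) > parse_score(baseline_output)
```

The comparison is strict because the rule described above requires each pull request to increase the score; a team that only wants to prevent regressions could relax it to greater-than-or-equal.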
Showing the full commit history of your project to the world may not be desirable from an organisational perspective. The way we worked around this on Watchmen was to work day to day on a private repo, then push out updates to the public repo on a regular basis (say, once a week).
When pushing out the updates, we squashed the commits since the last update into one single commit. This meant the public repo looked much cleaner.
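There is more than one way to produce that single commit. The sketch below shows one option using git plumbing commands: it creates a new commit that reuses the private branch's tree and has the current public branch tip as its only parent. This is an assumption about how such a workflow could be scripted, not a description of the exact commands used on Watchmen, and the branch names in the usage note are hypothetical.

```python
import subprocess


def run_git(args, cwd="."):
    """Run a git command and return its trimmed stdout; raises on failure."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def publish_snapshot(private_ref, public_branch, message, git=run_git):
    """Add one commit to public_branch whose contents match private_ref."""
    # Resolve the current tip of the public branch to use as the parent.
    parent = git(["rev-parse", public_branch])
    # Create a commit that reuses the private branch's tree as-is.
    new_commit = git(
        ["commit-tree", f"{private_ref}^{{tree}}", "-p", parent, "-m", message]
    )
    # Move the public branch to the new commit. Avoid running this while
    # public_branch is checked out, as the working tree is not updated.
    git(["update-ref", f"refs/heads/{public_branch}", new_commit])
    return new_commit
```

For example, publish_snapshot("main", "public", "Weekly update") would add one commit to a local branch called public, which can then be pushed to the public remote. None of the intermediate private commits appear in that branch's history.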
(Stay tuned for an upcoming blog from Bhushan on strategies for structuring a project to make maintaining public and private repos easy.)
That may seem like a lot to take in at once! However, I really believe the rewards are worth the effort. The whole team I worked with learnt many new things along the way, and the satisfaction of seeing the watches, stars and forks on your repository increasing is fantastic!