Publishing Scientific Software
So you are writing scientific software and are wondering how to tell the world about it. In this blog post I briefly list the things you can do to make your software more visible.
- Use version control to manage your code. These days that is probably Git. Depending on the complexity of your project, version control lets you track your changes, collaborate with others and manage multiple versions of your code.
- Use a public repository for your code so that others can find and contribute to it. Many web platforms support Git, most notably GitHub and GitLab. GitHub has the advantage that many people use it and that it interoperates with other services you might want to use. Bitbucket is similar to GitHub. GitLab, on the other hand, can also be self-hosted, so that you or your organisation keeps full control over the data. The University of Edinburgh runs its own GitLab instance, which you can use for projects that need to stay within the organisation.
- Now that you are making your work public you need to consider how to license it. There are many licences to choose from, ranging from very permissive (do what you want) to very restrictive. You also need to consider intellectual property issues. The Software Sustainability Institute has a good page on choosing an open source licence. If you think there might be commercial interest in your software, check with your employer; in our case that is Edinburgh Innovations.
- Now that your code is publicly available, people need to understand how it works, so you need to document it. There are two aspects to documentation: API documentation and a general overview. At the very least you need to add a README. You can format it using Markdown, which GitHub renders as HTML. The API documentation should be part of the source code, which makes it easier to keep the documentation in sync with the code. The options for API documentation depend on the programming language you are using. Sphinx is a good choice for Python projects. C, C++ or Fortran code can be documented using Doxygen.
- The next piece in the puzzle is publishing your documentation. You can use the Read the Docs web service to host it. Read the Docs is designed to work with Sphinx documentation, and the Breathe package allows you to integrate Doxygen with Sphinx. Read the Docs can also be integrated with GitHub so that the documentation is updated automatically when you push changes to your repository.
- To make sure that your code is working you need to test it. There are different approaches to testing: unit testing, which tests particular aspects of your software, and integration testing, which tests the software as a whole. There are frameworks that allow you to run these tests automatically. I suggest using pytest for Python projects. The tests can be run automatically whenever you push code to your GitHub repository by using a continuous integration service such as GitHub Actions. This also works for other languages.
- When you collaborate with others on writing code it is important to agree on style (number of spaces, where braces go, etc.), otherwise you end up with a lot of noise in your revision history. Linters can help you with this task, and they will also detect some programming errors. I recommend using flake8 for Python projects. Again, this is something that can and should be run automatically.
- Your code is now available, documented and tested, and people are starting to use it. You would like to get some credit for it. First of all you need an ORCID iD, a persistent digital identifier for you as a researcher that can be associated with your work.
- You have created a release of your code and would like to cite it in a paper. For this you need a digital object identifier, or DOI. You can use the Zenodo web service to automatically create a DOI from a GitHub release of your code.
- Once you have a DOI for your code you can also add a citation file (for example CITATION.cff) to your repository. GitHub displays the citation information, which allows others to cite your software properly.
- Articles in journals are still the main currency in university life. There are a number of journals where you can publish a paper specifically on your software, e.g. the Journal of Open Research Software or the Journal of Open Source Software. The Software Sustainability Institute has a list of journals where you can publish articles on software.
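
To make the points on documentation and testing a little more concrete, here is a minimal sketch in Python. The module name `thermo.py` and the function `celsius_to_kelvin` are made up for illustration and are not taken from any real project. The docstring uses the reStructuredText field lists that Sphinx can turn into API documentation, and the two `test_` functions are written so that pytest should be able to discover them.

```python
"""Hypothetical module ``thermo.py``: a documented function plus its tests."""
import math

# Absolute zero expressed in degrees Celsius.
ABSOLUTE_ZERO_CELSIUS = -273.15


def celsius_to_kelvin(temperature):
    """Convert a temperature from degrees Celsius to kelvin.

    :param temperature: temperature in degrees Celsius
    :type temperature: float
    :returns: temperature in kelvin
    :rtype: float
    :raises ValueError: if the temperature is below absolute zero
    """
    if temperature < ABSOLUTE_ZERO_CELSIUS:
        raise ValueError("temperature below absolute zero")
    return temperature - ABSOLUTE_ZERO_CELSIUS


def test_freezing_point():
    # Unit test: checks one particular aspect of the function.
    assert math.isclose(celsius_to_kelvin(0.0), 273.15)


def test_below_absolute_zero_is_rejected():
    # Unphysical input should raise an error rather than return a value.
    try:
        celsius_to_kelvin(-300.0)
    except ValueError:
        return
    raise AssertionError("expected ValueError")
```

Assuming the code is saved as `thermo.py`, running `pytest thermo.py` should collect and run both tests, and `flake8 thermo.py` should check the style. In a real project you would more likely keep the tests in a separate file and let continuous integration run both commands on every push.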