Femke Morsch
Product manager SURFconext & team lead trust and identity @SURF
At SURFconext, we strive for 100% availability. That’s why SURFconext is configured redundantly and the platform runs at 3 different locations. The next thing we want to improve is preventing disruptions after releases. Despite extensive testing and a solid release process, we cannot fully guarantee that everything will keep working for everyone. That’s why we are going to work with rolling updates: updates that are rolled out gradually, per user group. This keeps problems manageable, and old software does not have to be restored if something goes wrong. How do rolling updates work and which problems do they solve?
Over the years, the SURFconext test and release process has been constantly improved. This is because we have kept on refining the process and have regularly performed releases. We have, for example, performed 18 releases in the past academic year.
Yet a lot happens even before the software goes into production:
New releases are always announced a week in advance, so if there are any problems, institutions and service providers know that something has changed on our side.
This process works well, but there are still a number of disadvantages with this method:
We would like to eliminate these drawbacks; to do so we need another way to perform releases: rolling updates.
With a rolling update, a small percentage of users get the new software first. If the monitoring shows no deviations and users report no complaints, this percentage is increased step by step, so that more and more users use the new software. This is a predictable and automated process.
If the new software turns out to cause problems, users can simply be routed back to the old version. Both versions are available alongside one another and can each handle the full load, so restoring old software is no longer necessary. This way, rolling updates remove the drawbacks of the current release method.
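The gradual increase and the fallback described above can be illustrated with a minimal Python sketch. This is not how SURFconext implements rolling updates. The ramp percentages, the hash-based assignment of users to buckets, and the names `bucket`, `version_for`, `run_rollout` and `is_healthy` are all assumptions made for illustration.

```python
import hashlib

# Hypothetical ramp schedule: percentage of users routed to the new version.
RAMP_STEPS = [1, 5, 25, 50, 100]


def bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def version_for(user_id: str, percentage: int) -> str:
    """Route the user to "new" if their bucket falls below the percentage."""
    return "new" if bucket(user_id) < percentage else "old"


def run_rollout(is_healthy, steps=RAMP_STEPS) -> int:
    """Walk through the ramp steps and return the final percentage.

    `is_healthy(percentage)` is a stand-in for monitoring and user feedback.
    If it reports a problem, all users are routed back to the old version,
    which has been running alongside the new one the whole time.
    """
    for percentage in steps:
        if not is_healthy(percentage):
            return 0  # everyone back on the old version, nothing to restore
    return steps[-1]
```

Because the bucket depends only on the user ID, a user who receives the new version at one step keeps it at every later step. Falling back is only a routing change: the percentage drops to 0 and the old version, which never stopped running, serves everyone again.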
We started using rolling updates for a number of non-critical applications in June. This went well and the plan is to adopt this way of updating for the rest of the platform as well, after September. Before we do this, however, there are still a number of challenges that we have to tackle:
We have even more questions for which we would like to find solutions by working together with the member institutions. What is, for example, the best time to perform a rolling update? How long should a rolling update take? How do I ensure that my help desk is informed? Are you an IT manager, service manager or help desk team member who would like to collaborate with us on the rolling update process? If so, send an email to femke.morsch@surfnet.nl.