Continuous Risk Management

Revision r1.4 - 03 May 2004 - 14:33 GMT - SomikRaha

Description


The NASA rover Spirit developed problems soon after landing on Mars. It seemed to be draining itself of power and rebooting continually. It turned out that the system was designed to reboot and start fresh whenever it detected a serious anomaly. In this case, however, the anomaly was that the onboard flash memory was full, a condition that a reboot did not clear, so the rover went into continuous reboots.

Luckily for this project, the team had done some risk mitigation. They had built in a window of time (almost an hour) before the reboot, so that they would have time to communicate with the rover. This allowed them to tell the rover to stop using the flash memory and use the onboard RAM temporarily, which let it stay up longer to receive further commands. The rover was then sent commands to clean up the flash memory.

As we now know, the rover is a success story. Yet it came dangerously close to failure, and was saved only by the two risk mitigation steps that had been taken on this project:

  • Allowing a window of time before rebooting
  • Having fallback RAM in case the flash memory stopped working for some reason
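
The two mitigations above can be illustrated with a small sketch. This is not the rover's flight software. It is a toy model, and every name in it (FlashStore, Controller, grace_ticks, the "use_ram" and "clear_flash" commands) is hypothetical. It assumes that a full flash store raises an error, that only commands arriving inside the grace window are handled before a reboot, and that a reboot does not free any flash.

```python
from dataclasses import dataclass, field


class StorageFull(Exception):
    """Raised when the primary store has no room left."""


@dataclass
class FlashStore:
    capacity: int
    items: list = field(default_factory=list)

    def write(self, item):
        if len(self.items) >= self.capacity:
            raise StorageFull("flash is full")
        self.items.append(item)

    def clear(self):
        self.items.clear()


@dataclass
class Controller:
    flash: FlashStore
    grace_ticks: int              # mitigation 1: window before a reboot
    use_flash: bool = True
    ram: list = field(default_factory=list)  # mitigation 2: fallback store
    reboots: int = 0

    def store(self, item):
        if self.use_flash:
            self.flash.write(item)
        else:
            self.ram.append(item)

    def apply(self, command):
        if command == "use_ram":
            self.use_flash = False
        elif command == "clear_flash":
            self.flash.clear()
            self.use_flash = True

    def run_cycle(self, item, commands=()):
        """Try to store one item; on an anomaly, wait for commands, then retry."""
        try:
            self.store(item)
            return "ok"
        except StorageFull:
            # Only commands that arrive inside the grace window are handled.
            for command in list(commands)[: self.grace_ticks]:
                self.apply(command)
            try:
                self.store(item)
                return "recovered"
            except StorageFull:
                # A reboot frees nothing, so the next cycle hits the same anomaly.
                self.reboots += 1
                return "rebooted"
```

With grace_ticks set to 0 the controller reboots on every cycle however many commands are sent, which is the continuous-reboot failure. With a non-zero window, a single "use_ram" command is enough to keep it running until "clear_flash" can be issued.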

To see the rovers in action, visit http://marsrovers.jpl.nasa.gov/home/

-- SomikRaha - 26 Apr 2004

