Failure Recovery Unit-4
RECOVERY
• Recovery refers to restoring a system to its normal operational state.
• A process has memory allocated to it and may have locked shared resources.
• If a process fails:
  ➢ It is imperative that the resources allocated to the failed process be reclaimed so that they can be allocated to other processes.
  ➢ If the failed process has modified a database, it is important that all the modifications it made to the database are undone.
  ➢ If the process has executed for some time before failing, it is preferable to restart it from the point of its failure and resume its execution.
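The first two points above (reclaiming resources and undoing database modifications) can be sketched with a simple undo log. This is a minimal illustration, not a prescribed mechanism: the class `RecoverableProcess`, its methods, and the use of a `dict` as the database and a `set` as the lock table are all assumptions made for the example. Resuming from the point of failure is not shown.

```python
class RecoverableProcess:
    """Hypothetical toy process that records what it must clean up on failure."""

    def __init__(self, db, locks):
        self.db = db          # shared dict standing in for a database (assumption)
        self.locks = locks    # shared set of locked resource names (assumption)
        self.held = set()     # locks held by this process
        self.undo_log = []    # entries: (key, key_existed, old_value), newest last

    def lock(self, name):
        if name in self.locks:
            raise RuntimeError(f"{name} is already locked")
        self.locks.add(name)
        self.held.add(name)

    def write(self, key, value):
        # Record the old value before overwriting so the write can be undone.
        self.undo_log.append((key, key in self.db, self.db.get(key)))
        self.db[key] = value

    def recover(self):
        # Undo modifications in reverse order of application.
        while self.undo_log:
            key, key_existed, old_value = self.undo_log.pop()
            if key_existed:
                self.db[key] = old_value
            else:
                self.db.pop(key, None)
        # Reclaim the resources so other processes can use them.
        self.locks.difference_update(self.held)
        self.held.clear()


db = {"balance": 100}
locks = set()
p = RecoverableProcess(db, locks)
p.lock("account")
p.write("balance", 40)
p.write("audit", "debit 60")
p.recover()  # cleanup performed on behalf of the failed process
```

After `recover()` runs, the database holds its original contents and the lock on `"account"` is free again. Undoing in reverse order matters when the same key is written more than once, since the oldest log entry holds the value that must finally be restored.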
BASIC CONCEPTS
• A system consists of a set of hardware and software components and is designed to provide a specified service.
• Failure: occurs when the system does not perform its services in the manner specified.
• Erroneous state: a state that could lead to a system failure through a sequence of valid state transitions.
• Fault: an anomalous physical condition. Causes include design errors, manufacturing problems, damage, fatigue or other deterioration, and external disturbances (such as harsh environmental conditions, unanticipated inputs, or system misuse).
Note:
▪ An error is a symptom of a fault in a system, which could lead to system failure.
▪ Failure recovery is a process that involves restoring an erroneous state to an error-free state.
[Figure: causes of a fault (manufacturing problems, design errors, external disturbances, fatigue, deterioration) → Fault → Erroneous system state (error) → Process/system failure]
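The chain from fault to erroneous state to failure, and recovery as the return to an error-free state, can be illustrated with a toy example. Everything here is hypothetical: `ItemCounter`, its `fault` flag (standing in for a design error), and its specification are invented for illustration only.

```python
class ItemCounter:
    """Hypothetical service. Specification: value() returns the number of add() calls."""

    def __init__(self):
        self.items = []
        self.count = 0

    def add(self, item, fault=False):
        self.items.append(item)
        if not fault:
            self.count += 1  # a fault (e.g. a design error) skips this update

    def is_erroneous(self):
        # The state is erroneous when the counter disagrees with the stored items.
        return self.count != len(self.items)

    def value(self):
        return self.count

    def recover(self):
        # Failure recovery: restore the erroneous state to an error-free state.
        self.count = len(self.items)


c = ItemCounter()
c.add("a")
c.add("b", fault=True)       # the fault occurs here
erroneous = c.is_erroneous() # erroneous state, but no failure has been observed yet
failed = c.value() != 2      # failure: the service is not delivered as specified
c.recover()
```

The erroneous state exists as soon as the fault occurs, but the failure appears only when a valid operation (`value()`) exposes it. After `recover()`, `value()` again matches the specification.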