Dynamic Modeling of the Trade-off Between Productivity and Safety in Critical Engineering Systems

M.M. Cowing, M.E. Pate-Cornell, and P. W. Glynn

Reliability Engineering and System Safety, 269-284 (2004)

Short-term tradeoffs between productivity and safety often exist in the operation of critical facilities such as nuclear power plants, offshore oil platforms, or simply individual cars. For example, interruption of operations for maintenance on demand can decrease short-term productivity but may be needed to ensure safety. Operations are interrupted for several reasons: scheduled maintenance, maintenance on demand, response to warnings, subsystem failure, or a catastrophic accident. The choice of operational procedures (e.g. timing and extent of scheduled maintenance) generally affects the probabilities of both production interruptions and catastrophic failures. In this paper, we present and illustrate a dynamic probabilistic model designed to describe the long-term evolution of such a system through the different phases of operation, shutdown, and possibly accident. The model’s parameters represent explicitly the effects of different components’ performance on the system’s safety and reliability through an engineering probabilistic risk assessment (PRA). In addition to PRA, a Markov model is used to track the evolution of the system and its components through different performance phases. The model parameters are then linked to different operations strategies, to allow computation of the effects of each management strategy on the system’s long-term productivity and safety. Decision analysis is then used to support the management of the short-term trade-offs between productivity and safety in order to maximize long-term performance. The value function is that of plant managers, within the constraints set by local utility commissions and national (e.g. energy) agencies. This model is illustrated by the case of outages (planned and unplanned) in nuclear power plants to show how it can be used to guide policy decisions regarding outage frequency and plant lifetime, and more specifically, the choice of a reactor tripping policy as a function of the state of the emergency core cooling subsystem.