Today's IP backbones are adequately provisioned to meet service level agreements (SLAs) in terms of loss, delay, and port availability. However, performance is likely to degrade in the case of failures (e.g., fiber cuts, router crashes), routing instability, or router misconfigurations. This talk gives an overview of a real-time monitoring infrastructure deployed within an operational Tier-1 IP backbone to analyze the characteristics of failure events. The analysis is based on IS-IS routing updates from three different points in the backbone. We also investigate the potential impact of failures on emerging services such as Voice-over-IP (VoIP). In the second part of the talk, we present results from a typical link failure scenario in the backbone, including the re-convergence delay in response to link failures and its effect on service disruption. Our results offer insights into two basic components for constructing a realistic link failure model and for defining "service" availability, which we believe is the more appropriate basis for SLAs to support emerging applications such as VoIP and VPNs.