Friday, 24 September 2010 03:43

Facebook Gives A Post-Mortem On Worst Downtime In Four Years
Facebook’s had a rough day. In fact, it’s had its worst day performance-wise in over four years, with 2.5 hours of downtime that resulted in countless complaints from users. Perhaps more important, it also had a bevy of API problems, and its Like buttons, which are embedded on over 350,000 sites across the web, were apparently busted too. When Facebook goes down, it’s a big deal.
This evening, Facebook Director of Software Engineering Robert Johnson published a post-mortem of the outage, explaining what caused the site to fail.
According to Johnson’s post, the problem stemmed from an automated system Facebook had built to check for invalid configuration values in its cache. Unfortunately, that automated check backfired — to the point that Facebook had to turn off the site entirely to recover. Here’s a portion of the explanation:
“Today we made a change to the persistent copy of a configuration value that was interpreted as invalid. This meant that every single client saw the invalid value and attempted to fix it. Because the fix involves making a query to a cluster of databases, that cluster was quickly overwhelmed by hundreds of thousands of queries a second.
“To make matters worse, every time a client got an error attempting to query one of the databases it interpreted it as an invalid value, and deleted the corresponding cache key. This meant that even after the original problem had been fixed, the stream of queries continued. As long as the databases failed to service some of the requests, they were causing even more requests to themselves. We had entered a feedback loop that didn’t allow the databases to recover.
“The way to stop the feedback cycle was quite painful – we had to stop all traffic to this database cluster, which meant turning off the site. Once the databases had recovered and the root cause had been fixed, we slowly allowed more people back onto the site.”
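To see why deleting the cache key on every error keeps the loop alive, here is a toy simulation of the behavior Johnson describes. It is only a sketch under stated assumptions, not Facebook's code: the `Database` class, the `is_valid` rule, the `simulate` function, and the client and capacity numbers are all hypothetical and chosen for illustration.

```python
class OverloadedError(Exception):
    """Raised when the simulated database cluster cannot serve a query."""


class Database:
    """Toy database cluster that can serve only `capacity` queries per tick."""

    def __init__(self, value, capacity):
        self.value = value
        self.capacity = capacity
        self.seen = 0

    def new_tick(self):
        self.seen = 0

    def query(self):
        self.seen += 1
        if self.seen > self.capacity:
            raise OverloadedError("cluster overloaded")
        return self.value


def is_valid(value):
    # Hypothetical rule for illustration: valid values are positive integers.
    return isinstance(value, int) and value > 0


def simulate(delete_on_error, clients=100, capacity=10, ticks=5):
    """Return the number of database queries issued in each tick."""
    db = Database(value=42, capacity=capacity)  # persistent copy already fixed
    cache = {"config": -1}                      # cache still holds the bad value
    history = []
    for _ in range(ticks):
        db.new_tick()
        queries = 0
        # All clients read the cache at the same moment, so they all see
        # the same value and each one tries to "fix" it on its own.
        if not is_valid(cache.get("config")):
            for _ in range(clients):
                queries += 1
                try:
                    cache["config"] = db.query()
                except OverloadedError:
                    if delete_on_error:
                        # Error treated as "invalid value": drop the key.
                        cache.pop("config", None)
        history.append(queries)
    return history


naive = simulate(delete_on_error=True)
careful = simulate(delete_on_error=False)
print("delete key on error:", naive)    # [100, 100, 100, 100, 100]
print("keep key on error:  ", careful)  # [100, 0, 0, 0, 0]
```

In this sketch the database already holds a good value, yet the variant that deletes the key on error never settles: every tick, the failed queries wipe out what the successful ones just repaired. Leaving the key alone on error lets a single successful query end the stampede, which is one plausible way to break such a loop, though the post does not say what fix Facebook actually shipped.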
Facebook has generally had a good track record in terms of keeping its homepage alive, but I’ve heard repeated complaints about the integrity of its API. And given Facebook’s goal of becoming the social fabric of the web, which entails maintaining a presence on countless third-party sites, it’s imperative that it keep its various widgets and authentication buttons working properly.
Authors: Jason Kincaid
Published in: News Technologique-Tech News