DV Hardware bringing you the hottest news about processors, graphics cards, Intel, AMD, NVIDIA, hardware and technology!

   Home | News submit | News Archives | Reviews | Articles | Howto's | Advertise
 
DarkVision Hardware - Daily tech news
December 10, 2016 
Main Menu
Home
Info
News archives
Articles
Howto
Reviews
 

Who's Online
There are currently 68 people online.

 

Latest Reviews
Zowie P-TF Rough mousepad
Zowie FK mouse
BitFenix Ronin case
Ozone Rage ST headset
Lamptron FC-10 SE fan controller
ZOWIE G-TF Rough mousepad
ROCCAT Isku FX gaming keyboard
Prolimatech Magnetic Pin
 

Follow us
RSS
 

Google publishes DRAM failure study

Posted on Friday, October 23 2009 @ 07:00:17 CEST by


Google has published a paper about DRAM errors in the wild, this is one of the first large-scale field studies about this subject. You can read it over here (PDF).
Errors in dynamic random access memory (DRAM) are a common form of hardware failure in modern compute clusters. Failures are costly both in terms of hardware replacement costs and service disruption. While a large body of work exists on DRAM in labo- ratory conditions, little has been reported on real DRAM failures in large production clusters. In this paper, we analyze measure- ments of memory errors in a large fleet of commodity servers over a period of 2.5 years. The collected data covers multiple vendors, DRAM capacities and technologies, and comprises many millions of DIMM days.

The goal of this paper is to answer questions such as the follow- ing: How common are memory errors in practice? What are their statistical properties? How are they affected by external factors, such as temperature and utilization, and by chip-specific factors, such as chip density, memory technology and DIMM age?

We find that DRAM error behavior in the field differs in many key aspects from commonly held assumptions. For example, we observe DRAM error rates that are orders of magnitude higher than previously reported, with 25,000 to 70,000 errors per billion device hours per Mbit and more than 8% of DIMMs affected by errors per year. We provide strong evidence that memory errors are dominated by hard errors, rather than soft errors, which previous work suspects to be the dominant error mode. We find that temperature, known to strongly impact DIMM error rates in lab conditions, has a surprisingly small effect on error behavior in the field, when taking all other factors into account. Finally, unlike commonly feared, we don’t observe any indication that newer generations of DIMMs have worse error behavior.



 



 

DV Hardware - Privacy statement
All logos and trademarks are property of their respective owner.
The comments are property of their posters, all the rest © 2002-2016 DM Media Group bvba