NVIDIA's Titan V is the most powerful video card available to consumers, but computer scientists have discovered that it suffers from glitches that can cause problems with professional workloads.
The Register heard from engineers that repeated runs of a simulation of an interaction between a protein and an enzyme on the $2,999 Titan V did not produce identical output. Two of the four cards gave numerical errors about 10 percent of the time, whereas they should output exactly the same value on every run. The site says NVIDIA declined to comment on the reproducibility issue.
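For readers wondering what such a reproducibility test looks like, here is a minimal sketch in Python. It is not the engineers' actual test: `simulate` is a hypothetical stand-in workload that runs on the CPU, and the helper names are made up for illustration. The idea is simply to run the same deterministic calculation many times and flag any run whose output is not bit-for-bit identical to the first.

```python
import hashlib
import struct


def simulate(steps=1000, seed=12345):
    """Hypothetical stand-in workload, not the protein/enzyme simulation."""
    x = seed / 100000.0
    total = 0.0
    for _ in range(steps):
        # Logistic map: a tiny numerical error early on grows quickly.
        x = 3.9 * x * (1.0 - x)
        total += x
    return total


def fingerprint(value):
    """Hash the exact bit pattern of a float so any difference is caught."""
    return hashlib.sha256(struct.pack("<d", value)).hexdigest()


def count_mismatches(run, repeats=20):
    """Call run() `repeats` times; count results that differ from the first."""
    reference = fingerprint(run())
    return sum(fingerprint(run()) != reference for _ in range(repeats - 1))


mismatches = count_mismatches(simulate)
print(f"{mismatches} mismatching runs")
```

On healthy hardware a deterministic workload like this should never report a mismatch, which is why an error rate of around 10 percent stands out.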
An unnamed industry veteran speculates the problem is caused by clocking the memory too high, which results in read errors. That is not a big problem for gaming, but it makes the cards useless for data science.
An industry veteran, who alerted us to the issue, reckoned this is due to a memory issue. Chip companies normally push their high-end silicon to the limit to maximize performance. Nvidia may be overclocking or red-lining its Titan V in some way, causing read errors from memory. These mistakes are carried forward in calculations, resulting in numerical errors. Another cause could be a design blunder.
It is not down to random defects in the chipsets nor a bad batch of products, since Nvidia has encountered this type of cockup in the past, we are told. The moneybags biz released patches for some of its older GeForce and Titan models that exhibited similar problems to address these errors. There was no issue with its Titan X card based on its Pascal architecture, we're told.