What Happens When the Data Is Wrong?
Everybody's placing their faith in "data." The faithful even have a name: Datarists. Commentators are singing the praises of data and data-centric approaches to anything and everything. From the CEO to the CXO to Joe Schmo (thanks, Phil Regnault!), it's all about what the data's telling us. But it's been wrong before. So, exercise caution in this headlong rush to data. Beware the classic human behavior as we lurch from no data to all data – find the middle ground and realize that data alone (even "big" data) won't solve anything. So, take a walk with me and we'll discuss.
In 1962, as the US became embroiled in Vietnam, Robert McNamara was the Secretary of Defense. He brought a "data-centric" approach to the department, based on the notion that defense spending had to be better quantified. McNamara was looking for ROI. He tasked his people to collect data to assess whether they were winning or not. They did. They provided data on more than 100 separate indicators, including the number of enemy weapons recovered, the number of Viet Cong casualties, and the number of Viet Cong defections. Edward Lansdale, Head of Special Operations at the Pentagon, suggested that they needed to assess "the feelings of the Vietnamese people" to get a balanced view. He was ignored, at least until 1965. The data was gathered, and it gave them the news they wanted: the US and its South Vietnamese allies were winning the Vietnam War.
In 1964, the Defense Department retained the services of the RAND Corporation, a secretive think tank based in California. Leon Goure headed up a project that was tasked to take the approach suggested by Lansdale. It was called the Viet Cong Motivation and Morale Project and, as Malcolm Gladwell points out in his podcast Revisionist History (Season 1, Episode 2), it shows that despite gathering mountains of data and intelligence, you can still get it wrong. The project collected 62,000 pages (that's a lot) of interviews with North Vietnamese and Viet Cong defectors and prisoners to answer the question of whether the US was breaking the will of the Viet Cong with its long-range bombing of the North. Early findings were that 65% of the interviewees believed the North would win. After one year of the bombing strategy, this dropped to 20%. Leon Goure, therefore, concluded this plan would deliver victory. When victory had not appeared by 1966, Konrad Kellen replaced Goure, looked at the same data, and guess what? He concluded the opposite. So, was it the wrong data, was the data gathered in the wrong way, was the data "bad," or was it just poorly interpreted?
One could argue that it's different now, that mistakes like this couldn't happen again. We have access to technology, processing, storage, IoT, and AI that enable us to gather even more data. True, but as we have seen, at least in foreign policy, from Gulf War One to 9/11 to WMD, Gulf War Two, Afghanistan, and ISIS, the outcomes don't appear to reflect any better decision making. The central government, with its massive resources, can gather piles of data (in sometimes questionable ways) and still get it wrong. Why do we think that private companies with significantly fewer resources, trying to solve challenges around changing consumer behavior (for example), can get it right?
Maybe we do need to gather more data OR watch out for the influence of inherent biases in the interpretation of the data. If you listen to Gladwell's podcast (and I recommend you do), he explains that profound biases on the part of those interpreting data hugely impact the outcomes. The presence of these biases acts as an argument for "automating" the interpretation of such data. Pattern recognition through AI is growing and will continue to do so. Taking the human out of the loop would eliminate some of the biases that influence the conclusion.
But let's not throw the baby (or human) out with the bathwater just yet. The human sensory system remains years ahead of its robotic pursuers, especially in human interactions. A friend reminded me of a great example from a youth soccer match he was refereeing. We all love to second guess the refs in any sport these days. Now, with challenges, slow motion from 10 different angles, replay judges, and so on, do we even need referees anymore? The technology can handle all of it. Well, not so fast. As my friend reminded me, during his youth match he yellow carded a player for what, to the distant viewer, might have looked like an innocuous infringement. Being close to the action, however, my friend could bring a uniquely human perspective. He could sense the intent of the guilty player, based on their approach, speed, body language, and facial expressions. These are clues that technology can record but remains poor at both communicating and interpreting. The human brain, on the other hand, is unmatched at this type of interpretation.
The Verto Verdict
Don't get seduced by the wealth of data out there and our seemingly uninhibited ability to harvest it all. Think hard about where to look for the data you need. The advantage of Big Data is that it allows you to distill to more specific information while retaining the benefits of massive data sets. In his book “Everybody Lies,” Seth Stephens-Davidowitz recommends looking for data in unlikely places. Focus on the location of the data without getting seduced by the volume. He uses irresistible examples of how this has paid off. The role of bias in human decision making is as well known as it is, sometimes, well ignored. Different people viewing the same data can provide varying opinions, sometimes motivated by political or professional expediency. It’s difficult to maintain objectivity if you’re invested in a certain answer. Finally, despite the biases, the human ability to process "intangible" data (like sensing the physical intent of the soccer player in the above example) has no equal.
So, do we go with “There are three kinds of lies: lies, damned lies, and statistics” (or “data,” as we wordsmithed), as Mark Twain (might have) said? Or do we go with “In God we trust, all others bring data,” from W. Edwards Deming (Grandfather of the Lean Startup)? As usual, the answer lies in finding the balance. In “Everybody Lies,” Seth Stephens-Davidowitz says, “The solution is not always more Big Data. A special sauce is often necessary to help Big Data work best: the judgement of humans…And Big Data does not eliminate the need for all the other ways humans have developed over the millennia to understand the world.” So, there’s a symbiotic relationship between Big Data and the human mind. In the words of the great Forrest Gump, “they go together like peas and carrots.” Wise words, Forrest, wise words.