The death of Garbage In Garbage Out
Every IT professional knows the curse of GIGO: the Garbage In, Garbage Out principle of computer science. If your inputs are wrong, even the fastest computer with the best algorithm cannot give you the right results. We can build the fastest computers, the quickest storage and the greatest algorithms, but if the inputs are wrong, the output will be wrong too.
It may interest you to know that this GIGO curse was recognized more than 150 years ago, as early as 1864, in this passage from Passages from the Life of a Philosopher by Charles Babbage, who is widely credited with originating the concept of the programmable computer.
"On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question."
---- Charles Babbage, Passages from the Life of a Philosopher, 1864.
Just a few days ago, I saw this headline in Live Mint, dated Nov 17th, 2016:
GIGO a hurdle for fintech's progress
So the GIGO curse identified in 1864 still seems to exist in 2016! A 152-year-old curse.
We in the computer fraternity have been busy attacking this problem through various forms of data cleansing. Large enterprises were even asked to take a physical inventory once a year, counting the items in the warehouse by hand to correct the inventory data in their ERP systems! We developed data cleansing algorithms based on cross-checks, rules and so on, and struggled to fix the issue.
New technologies such as mobile and IoT have started to help in this area. Let us take a case-study workflow, something most Indians will relate to. An Electricity Board employee comes to your home every month and physically reads the meter. Then:
- The employee notes the reading on a piece of paper
- A data entry person in the office enters the reading into a system
- The billing system calculates usage and determines the monthly utility bill
Initially we thought of eliminating the second transmission loss, from paper to IT system, by developing a notebook- or mobile-based system so that the field staff can read the meter and enter the reading directly on the spot. This still does not solve the first transmission problem: a person has to go into the dingy, smelly, hot, humid, dark corner where meters are usually placed in India, read the meter correctly and enter the reading into his iPad or mobile. Lately we have solved this too, with IoT smart meters that send the usage data directly to the ERP system on the dot of the 1st of the month for correct billing. With this we thought we had overcome the curse of GIGO and moved into the new world of NOGI: NO Garbage Input!
No. I do not think we have solved the problem fully. Yes, I agree we have come a long way since 1864 and the 152-year curse is slowly being lifted, but not fully yet. Take the same example to a low-income neighborhood where the apartment is a one-room-plus-kitchen setup. Whatever electrical appliances are used in this tiny apartment, we know that the usage cannot go beyond a certain point. However, because of pilferage by a neighbor or a faulty meter, the readings could be very high. Even the smartest IoT meter will just spit out the data as it sees it, without thinking about the anomaly.
What have we done here? We have just shifted the GIGO from input coming from humans to input coming from smart machines, without really looking at the final use of the data, which is to bill the consumer correctly. With all the smart IoT meters, all we have done is bill the customer a very high amount, perhaps even cut off power for non-payment, and let the customer go through huge hassles to prove that some external anomaly exists and get the bill corrected.
Hence, I agree with Vlad and restate my hypothesis: we are still under the curse of GIGO, but AI/machine learning systems will be the fairy princess whose touch turns the cursed toad back into a prince once again!
How did I start thinking about this topic? Just like the famous apple-falling-from-the-tree moment, I had one of my own. I was on a personal trip from Hyderabad to Chennai. The SUV had some issues: the inbuilt GPS was on and we were unable to close the GPS app. I wanted to make some calls, and the GPS kept giving instructions every once in a while and disturbing them. I fiddled with the controls, and the best I could do was mute the GPS and go on with my calls. When I finished, I saw that the GPS was asking the driver to take a U-turn at every possible U-turn on the national highway to Chennai. The instruction came hundreds of times, and the expected time to reach the destination increased with every display. I looked at the GPS and realized that the driver had set the wrong destination, and the poor GPS was asking us to take a U-turn wherever possible on the national highway and travel towards Hyderabad, the opposite direction. Immediately my mind went to "Oh, a classic GIGO issue". You give the wrong destination, and the GPS just does its job and tells you how to get there.
At that moment, I asked myself whether an intelligent, machine-learning GPS could work out that the input destination might be wrong when so many data points suggest it: (a) the driver repeatedly declines to follow the GPS instructions even though there are no traffic obstructions at the U-turn junctions, and (b) the car is travelling fast on a highway in exactly the opposite direction. Once the GPS can start questioning its input, we are very close to getting out of the GIGO curse. The next generation of GPS systems with AI/ML could deduce that I am sitting in the car using my mobile GPS, look at my calendar, find that I have an appointment in Chennai, correlate this with the driver's unwillingness to turn the car around, and change the input destination from Hyderabad to Chennai!
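To make the idea concrete, here is a minimal sketch of how a GPS unit might question its own input using the signals described above. It is a hand-written rule and not a trained ML model, and every name and threshold in it (NavSample, destination_looks_wrong, five ignored U-turns, 60 km/h) is a hypothetical choice made for illustration, not part of any real GPS product.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NavSample:
    """One snapshot of the trip, as a hypothetical GPS unit might log it."""
    eta_minutes: float    # estimated time remaining to the entered destination
    uturn_ignored: bool   # True if a U-turn was advised and the driver kept going
    speed_kmph: float     # vehicle speed at the time of the snapshot


def destination_looks_wrong(samples: List[NavSample],
                            min_ignored_uturns: int = 5,
                            min_speed_kmph: float = 60.0) -> bool:
    """Flag the entered destination as suspicious (thresholds are assumptions)."""
    if len(samples) < 2:
        return False
    # Signal (a): the driver keeps ignoring the U-turn advice.
    ignored = sum(s.uturn_ignored for s in samples)
    # The ETA grows with every snapshot, so the car is moving away.
    eta_rising = all(b.eta_minutes > a.eta_minutes
                     for a, b in zip(samples, samples[1:]))
    # Signal (b): the car is moving fast, so this is not a traffic jam.
    avg_speed = sum(s.speed_kmph for s in samples) / len(samples)
    return (ignored >= min_ignored_uturns
            and eta_rising
            and avg_speed >= min_speed_kmph)


# Made-up trip log: six snapshots, each one further from the destination.
trip = [NavSample(eta_minutes=10 + 5 * i, uturn_ignored=True, speed_kmph=90.0)
        for i in range(6)]
if destination_looks_wrong(trip):
    print("You keep driving away from the destination. Is it entered correctly?")
```

A real system would presumably learn such thresholds from data and combine them with other context such as the calendar entry mentioned above. The sketch only shows that the raw signals are already available to the device.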
All the pollsters and all the great big data analytics software failed miserably in the recent US presidential election, and many attribute this to the GIGO problem. They say, "Our algorithms were the best, and our sample size and sample selection were good, but people did not tell the truth about their voting choices, and hence we were victims of the GIGO curse". Do we have an AI/ML solution to correct the input? Who knows? Maybe a video/audio recognition AI/ML system could analyze the body language and voice of the person answering the question, deduce whether the person is telling the truth, and correct the input!
We already see and benefit from these AI/ML input-correcting systems every day. Vlad mentioned autocorrect on our smartphones, and how real garbage input from us is corrected with a good contextual fit. We have seen Gmail remind us to attach a file we forgot, just by reading the mail text where we say that we are attaching a document.
Today Gmail reminds you to go to the airport to pick up your nephew just by reading your nephew's mail to you. Tomorrow, if you are supposed to pick him up at Newark Airport and you type JFK Airport into your GPS by mistake, the system will ask, "Are you sure you want to go to JFK? It is 10:00 am and your nephew's flight arrives at 11:30 am at Newark Airport, and there is no way you can go to JFK and then travel from JFK to Newark in time". GIGO curse broken! Otherwise, you will be waiting at JFK arrivals while your nephew waits at Newark arrivals with no US SIM card to contact you. I am sure your next call to your sister will be fun to watch.
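The check behind that imagined warning is simple arithmetic, and a rough sketch looks like this. The function name, the calendar date and both driving times (60 minutes to JFK, 75 minutes from JFK to Newark) are made-up values for illustration only; a real system would take them from a routing service and from the flight details in the mail.

```python
from datetime import datetime, timedelta


def can_reach_pickup_in_time(now: datetime,
                             flight_arrival: datetime,
                             drive_to_entered: timedelta,
                             entered_to_pickup: timedelta) -> bool:
    """True if going via the entered destination still reaches the real pickup in time."""
    earliest_at_pickup = now + drive_to_entered + entered_to_pickup
    return earliest_at_pickup <= flight_arrival


# Times from the example above; the date and the driving times are assumptions.
now = datetime(2016, 11, 20, 10, 0)        # it is 10:00 am
arrival = datetime(2016, 11, 20, 11, 30)   # the flight lands at Newark at 11:30 am
on_time = can_reach_pickup_in_time(
    now,
    arrival,
    drive_to_entered=timedelta(minutes=60),    # assumed drive to JFK
    entered_to_pickup=timedelta(minutes=75),   # assumed drive from JFK to Newark
)
if not on_time:
    print("Are you sure you want to go to JFK? Your nephew lands at Newark at 11:30 am.")
```

With these assumed driving times the earliest possible arrival at Newark is 12:15 pm, so the check fails and the system questions the input instead of silently routing to JFK.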
Coming back to my utility meter case study: Watson and other AI/ML systems have a concept called anomaly detection. You teach the AI/ML system a large number of typical, normal patterns, and the system can then detect anomalies automatically. This is used for fraud detection in credit card and other financial transactions. We can use the same capability to deduce that this low-income household could not possibly have consumed so much power, generate a bill based on the average of the last six months, and initiate an audit or review to determine whether there is a meter problem or power theft. This will improve the customer experience many times over compared with the old way of using IoT under the GIGO curse.
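As a rough illustration of that flow, the sketch below uses a fixed-ratio rule against the six-month average as a stand-in for a trained anomaly-detection model. It is not Watson's API or any vendor's method. The function name review_reading and the 3x threshold are assumptions chosen only to show the shape of the logic: bill the average, and flag the reading for audit.

```python
from statistics import mean
from typing import List, Tuple


def review_reading(history_kwh: List[float],
                   new_kwh: float,
                   max_ratio: float = 3.0) -> Tuple[float, bool]:
    """Return (units to bill, needs_audit) for a new monthly meter reading.

    A reading above max_ratio times the average of the last six months is
    treated as an anomaly: the customer is billed the average instead, and
    the reading is flagged for an audit of the meter or possible theft.
    The 3x ratio is an illustrative assumption, not a calibrated value.
    """
    recent = history_kwh[-6:]
    if not recent:
        return new_kwh, False    # no history yet, nothing to compare against
    baseline = mean(recent)
    if baseline > 0 and new_kwh > max_ratio * baseline:
        return baseline, True    # bill the six-month average and audit
    return new_kwh, False        # looks normal, bill as read


# Made-up monthly usage for a small one-room apartment, in kWh.
history = [80.0, 90.0, 85.0, 95.0, 100.0, 90.0]
billed, audit = review_reading(history, 600.0)
print(billed, audit)
```

A production system would more likely learn what "normal" looks like per household type, season and locality, but the outcome for the customer is the same: a sensible bill now and an investigation in the background, instead of a disconnection notice.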
Hence I conclude my hypothesis: the 152-year-old GIGO curse is at last being lifted by AI/ML systems, and we can truly get the benefit of the new technologies.