Debunking Myths – Rethinking the Approach to Data
Since NLP Logix was founded in 2011, clients have asked the team to solve an array of machine learning and automation challenges. In working through those challenges and delivering client solutions, the NLP Logix team has often relied on multiple lenses, meaning that sometimes the obvious path requires a second look from a different perspective. Recently, our team resolved a difficult issue by rethinking its approach to the data. Read on.
Case in point: while working on a computer vision model, the NLP Logix team was tasked with identifying malfunctioning pistons. A piston caught in the engaged position was considered malfunctioning. These faulty pistons were causing the client excess expenditures on gas and parts maintenance.
Using computer vision, NLP Logix sought to identify the engaged pistons for the client. The obvious tell of an engaged piston was the positioning of the piston mechanism as a whole. Annotated photos of engaged and disengaged pistons were collected to train the predictive model. Despite repeated iterations, the model's predictions of engaged pistons (model precision) were inconsistent approximately 50% of the time. The discrepancies were traced to the “shine” of the metallic piston mechanism, sometimes caused by environmental elements, such as rain, and sometimes by the metallic luster of the engaged piston itself.
Grabbing a sleuth hat, NLP Logix visited the client on-site to investigate the piston positioning. After studying the mechanics of the piston, the team decided it needed to take another approach. Instead of looking at the piston in its entirety, engaged versus disengaged, NLP Logix simplified the problem and looked at the individual components of the piston. Thus, the question changed from “Can we distinguish between an engaged and disengaged piston?” to “Can the distance between the collar and cone be measured?”
By looking at the individual components separately and allowing business rules to guide the final prediction, the mechanical section causing the glare and miscalculations was eliminated from the equation. Precision rose to above 95%, enough that NLP Logix no longer needed a manual audit of the results. Because a warning was triggered only when the model confidently identified both the cone and the collar and determined that the distance between them was large enough, false positives almost disappeared.
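To make the idea of a business rule sitting on top of component detections concrete, here is a minimal sketch in Python. It is an illustration only, not the NLP Logix implementation: the Detection class, the labels "collar" and "cone", the confidence cutoff, and the pixel distance threshold are all hypothetical, and the detections are assumed to come from some upstream object detector that the article does not describe.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Optional


@dataclass
class Detection:
    """One detected component: label, confidence, and box centre in pixels (hypothetical)."""
    label: str
    confidence: float
    cx: float
    cy: float


# Illustrative values only; real thresholds would be tuned on the client's data.
MIN_CONFIDENCE = 0.8
MAX_GAP_PX = 40.0


def best(detections: List[Detection], label: str) -> Optional[Detection]:
    """Return the highest-confidence detection with the given label, or None."""
    candidates = [d for d in detections if d.label == label]
    return max(candidates, key=lambda d: d.confidence, default=None)


def piston_warning(detections: List[Detection],
                   min_conf: float = MIN_CONFIDENCE,
                   max_gap: float = MAX_GAP_PX) -> bool:
    """Warn only if both parts are found confidently and are far enough apart."""
    collar = best(detections, "collar")
    cone = best(detections, "cone")
    if collar is None or cone is None:
        return False  # a missing component never triggers a warning
    if collar.confidence < min_conf or cone.confidence < min_conf:
        return False  # low-confidence detections are ignored
    gap = hypot(collar.cx - cone.cx, collar.cy - cone.cy)
    return gap > max_gap


if __name__ == "__main__":
    sample = [Detection("collar", 0.95, 100.0, 100.0),
              Detection("cone", 0.90, 100.0, 160.0)]
    print(piston_warning(sample))
```

Because the rule requires two confident detections plus a distance check before it raises a warning, a glare-affected frame in which either component is missed or uncertain simply produces no warning, which is one plausible reason a setup like this would cut false positives.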