Good Machine Learning Practice: The First Step Toward Transparency & Trust

Machine learning (ML) innovation in healthcare is growing, and oversight of its development should keep pace to ensure these tools are built in a scientifically sound, safe way. Just as the FDA requires detailed ingredients on the side of cereal boxes to help consumers make informed health decisions, the same transparency should apply to the ML technology our patients and clinicians use every day to make informed care decisions. Instead of nutrition facts, ML tools should come with proof that the model was built with quality data representative of its intended patient population, along with disclaimers on its use.

The good news is that federally issued Good Machine Learning Practice (GMLP) guiding principles have emerged for ML development. These ten principles run the gamut, from making sure training data sets are independent of test sets, to ensuring data sets are representative of the intended patient population. These guidelines are an important first step in encouraging adoption of proven, quality practices.
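To make the two principles mentioned above concrete, here is a minimal sketch of what checking them could look like in practice. It is an illustration only, not part of the GMLP text: the record layout, the field names (patient_id, age_band), the function names, and the intended population shares are all hypothetical assumptions.

```python
import random
from collections import Counter


def split_by_patient(records, test_fraction=0.2, seed=0):
    """Split records so that no patient contributes to both train and test.

    Assumes each record is a dict with a hypothetical "patient_id" key.
    """
    patient_ids = sorted({r["patient_id"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(patient_ids)
    n_test = max(1, round(len(patient_ids) * test_fraction))
    test_ids = set(patient_ids[:n_test])
    train = [r for r in records if r["patient_id"] not in test_ids]
    test = [r for r in records if r["patient_id"] in test_ids]
    return train, test


def shared_patients(train, test):
    """Return patient IDs present in both sets (should be empty)."""
    return {r["patient_id"] for r in train} & {r["patient_id"] for r in test}


def representation_gaps(records, field, intended_shares):
    """Observed share minus intended share for each group in `field`.

    `intended_shares` is an assumed description of the intended patient
    population, e.g. {"18-64": 0.5, "65+": 0.5}.
    """
    counts = Counter(r[field] for r in records)
    total = sum(counts.values())
    observed = {g: n / total for g, n in counts.items()}
    return {g: observed.get(g, 0.0) - share
            for g, share in intended_shares.items()}


# Hypothetical data: 10 patients with two records each.
records = [
    {"patient_id": pid, "age_band": "18-64" if pid < 6 else "65+"}
    for pid in range(10)
    for _ in range(2)
]
train, test = split_by_patient(records)
print("patients in both sets:", shared_patients(train, test))
print("gaps vs. intended population:",
      representation_gaps(records, "age_band", {"18-64": 0.5, "65+": 0.5}))
```

Splitting by patient rather than by record is the design choice that matters here: a record-level split can place the same patient on both sides and overstate how well a model performs. Output like this is also the kind of evidence a developer could publish alongside a model.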

However, it’s important to remember that truly responsible innovation requires more than an agreement on what developers should do. First, we need to go beyond recommendations to standards that we are all held to and that are built into all of the behind-the-scenes work that goes into an algorithm. Second, we must provide transparency around how the algorithm was built, along with proof that the developer followed these guidelines.

From Transparency to Trust

AI is too often viewed as a “black box”: a tool shrouded in mystery that somewhat magically delivers the insight one is looking for. Given today’s lack of visibility into how an algorithm reaches its conclusions and how exactly it uses data to produce its output, it is understandable that an industry so reliant on reporting and audit trails would hesitate to adopt ML solutions. To build trust in these solutions, we need to make them understandable and appealing to their end users, and hold digital diagnostics to the same standard as physical diagnostics.

This should be a reasonable ask for many of us in the pharmaceutical industry – we are very used to providing extensive documentation around the development of a drug and steadfast proof of its impact. Carrying this vigilant tracking and the idea of providing real-world evidence over to the technology we use will help reassure users that these models are safe and accurate.

Advancing Responsible Innovation

To date, the enforcement of scientifically sound, ethical practices when developing ML has largely relied on good faith in developers to do the right thing. As these tools are put into the hands of more and more clinicians and patients, those users need more evidence than good intentions alone. Regulations and federal oversight of this growing industry continue to evolve, but one thing is certain: peeling back the curtain and prioritizing traceability of a model’s development and output will be key not only to offering peace of mind, but also to spurring innovation in this budding industry.

We’ve discussed the promise of AI and ML in healthcare for over a decade, and we now find ourselves at a tipping point. Transparency and standardization are inevitable next steps to improve confidence and drive adoption of tools with the potential to improve patient care and accelerate drug development.