For decades, data scientists and AI developers have followed the belief that “bigger is better” – the bigger the data set, the more analytical freedom one has, and the more insights one gains. The AI industry still relies heavily on big data analytics, and for good reason – we need to feed AI large amounts of high-quality, diverse data to ensure it works effectively for its intended patient population. But as we strive for precision medicine and advance clinical research on rare diseases, there is a need for data approaches that allow for more targeted analysis. Lately, the conversation around data analytics has shifted from the volume of data to the applicability of that data. More experts are asking: what if you could use less data but garner more actionable insights, in a more specific context?
The need to leverage more diverse types of data is paving the way for the next generation of AI development – Gartner recently predicted that 70% of organizations will shift their focus from big data to “small and wide data” by 2025. By using big data as AI’s foundation and combining it with small and wide data approaches, we can take personalized patient care to the next level and drive more agile, affordable innovation.
Using big data as AI’s foundation
Big data’s volume and diversity offer the necessary building blocks for AI, helping to surface bigger-picture trends and general correlations, such as visualizing a particular market trend or teaching AI to recognize an apple versus a baseball. In the clinical research realm, imagine a computer vision algorithm built to monitor hand tremors in patients with Essential Tremor. Big data is needed to teach that algorithm to recognize a variety of hands – different colors, sizes, and ages – and to detect when those hands are moving. But to accurately measure where a patient falls on the Essential Tremor Rating Assessment Scale (TETRAS) with the sensitivity needed to inform treatment decisions, researchers need a more targeted data approach; collecting videos of thousands of patients with Essential Tremor for a big data approach is nearly impossible. Building on big data with small and wide data will not only make AI development in healthcare more attainable, but will also allow sponsors and clinicians to pinpoint the unique needs of their patient population, as the sketch below illustrates.
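To make that pattern concrete, here is a minimal Python sketch, assuming PyTorch and torchvision: a backbone pretrained on a large, general-purpose image corpus (the big data foundation) is frozen, and only a small classification head is fine-tuned on a small, targeted dataset. The folder layout, the five-level severity labels, and the hyperparameters are hypothetical placeholders, not a validated clinical pipeline.

```python
# A minimal sketch (not a validated pipeline): reuse a backbone pretrained on
# large, diverse image data -- the "big data" foundation -- and fine-tune only
# a small head on a targeted tremor dataset. Paths and the five-level severity
# labels are hypothetical placeholders.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Backbone already trained on a large general-purpose corpus (ImageNet).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the general-purpose features; only the new head will be trained.
for param in model.parameters():
    param.requires_grad = False

# Replace the classifier head for a hypothetical five-level severity scale.
model.fc = nn.Linear(model.fc.in_features, 5)

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Small, targeted dataset: one folder per severity level (hypothetical layout).
train_data = datasets.ImageFolder("tremor_frames/train", transform=preprocess)
loader = DataLoader(train_data, batch_size=16, shuffle=True)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

Because only the small head is trained, the approach needs far fewer labeled examples than training a vision model from scratch – which is the practical appeal of layering small data on a big data foundation.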
The next generation of data analytics
In contrast to big data’s holistic views and correlations, small and wide data pull specific information and distinct insights out of individual data components. Wide data lets data scientists and AI developers look across disparate data from a wide range of sources – small or large, structured or unstructured – to produce meaningful analysis. Small data, on the other hand, focuses on using small, individual sets of data to draw specific, more personalized insights. Unlike big data, small data is simple, manageable, and allows for fast or even real-time decision-making. Using the apple-versus-baseball example, wide data allows you to collectively examine green and red apples or new and worn-down baseballs, while small data can tell you if the baseball’s stitching is loose or if the apple is bruised. Together, small and wide data make more effective use of the data already at hand, both reducing the quantity of data needed and extracting more value from diverse sources, as the sketch below shows.
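As an illustration of the wide data idea, here is a minimal Python sketch, assuming pandas: two small, disparate sources – structured sensor readings and unstructured clinician notes – are joined into one per-visit view, from which a small-data style decision is drawn. The file names, column names, and keyword list are hypothetical placeholders.

```python
# A minimal "wide data" sketch: join small, disparate sources -- structured
# sensor readings and unstructured clinician notes -- into one per-visit view.
# File names, column names, and the keyword list are hypothetical placeholders.
import pandas as pd

# Structured source: per-visit tremor amplitude from a wearable sensor.
sensor = pd.read_csv("sensor_readings.csv")   # patient_id, visit, amplitude

# Unstructured source: free-text clinician notes for each visit.
notes = pd.read_csv("clinician_notes.csv")    # patient_id, visit, note_text

# Derive a simple structured signal from the unstructured text.
keywords = ["worsening", "spilling", "handwriting"]
notes["flagged"] = notes["note_text"].str.lower().apply(
    lambda text: any(word in text for word in keywords)
)

# Wide view: one row per patient visit, combining both sources.
wide = sensor.merge(
    notes[["patient_id", "visit", "flagged"]],
    on=["patient_id", "visit"],
    how="inner",
)

# Small-data style decision: surface visits where both signals agree.
review = wide[(wide["amplitude"] > wide["amplitude"].median()) & wide["flagged"]]
print(review)
```

The point is not the specific keywords or threshold, but that a handful of modest, heterogeneous sources can yield an actionable, per-patient signal without any massive aggregated database.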
Advancing precision medicine
Big data is not going away, but when we combine it with small and wide data, we open the door to more pragmatic AI development and more precise patient care. By making an algorithm more specialized, and by eliminating the need to aggregate and process massive databases at every stage of development, companies can be nimbler and more creative in the insights they derive from particular data points. A better understanding of the data at hand means we can capture the nuances of disease and help match the right patient to the right treatment at the right time.
While the AI industry relies on big data as a foundation, understanding what a dataset can actually deliver – and what a given problem actually requires – is essential to driving effective and efficient algorithm development. Small and wide data could ultimately deliver similar problem-solving power in a much more specific, manageable way, paving the way for meaningful AI innovation.