Ingesting Data the Right Way Is the Key to AI

The potential of AI is impossible to resist: the ability to predict a customer’s next question, assess financial fraud, determine the most successful treatment for a disease or even forecast when the next catastrophic climate event will take place.

In fact, many large organizations are creating AI research and development centres. Recently, Samsung announced the establishment of three state-of-the-art AI centres, in Toronto (Canada), Cambridge (UK) and Moscow (Russia). Canada has now become a global leader in AI research and development.

At the Big Data Summit this year in Toronto, Chetan Mathur, CEO of Next Pathway, gave his rationale for why AI is truly achievable now:

  1. The availability and reduced cost of compute power.
  2. AI algorithms have greatly improved.
  3. Greater availability of skilled talent.
  4. Businesses are being digitized.

Data is the raw material for AI. However, how you ingest, clean and prepare this data is critical to achieving the promise of AI.

Our firm, Next Pathway, is a global leader in the development of accelerators and products in the data space.
We strongly recommend that you build a data manufacturing pipeline that collects and standardizes the data once. This pipeline should allow for immediate and early analytics and data science. Further down the pipeline, more complex transformations that require more curated data can take place, for example risk, regulatory and financial reporting. We call this “2-Speed Data”, which we explained in a recent blog post, Two-Speed Data. Not all data needs to cost the same.

The key point is that the same data can be used for customer sentiment analysis, pricing, fraud, financial reporting or product mix analysis. It’s all the same data, the same raw material.

Collect it once, standardize it once, collect the lineage once and then transform it as many times as you like.
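
To make the "collect once, transform many times" idea concrete, here is a minimal Python sketch. It is an illustration only, not our product or any particular tool: the function names, field names and the "hypothetical_transactions_feed" source are all assumptions made up for this example. Raw rows are collected and standardized a single time, the lineage is recorded a single time, and two unrelated consumers (a channel report and a simple fraud-style filter) then read the same standardized records.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StandardizedDataset:
    """Hypothetical container: standardized records plus their lineage."""
    records: list[dict]
    lineage: list[str] = field(default_factory=list)


def ingest_once(raw_rows: list[dict], source_name: str) -> StandardizedDataset:
    """Collect and standardize raw rows a single time, recording lineage."""
    records = []
    for row in raw_rows:
        records.append({
            # Normalize names, types and formatting in one place only.
            "customer_id": str(row["CustomerID"]).strip(),
            "amount": round(float(row["Amount"]), 2),
            "channel": str(row["Channel"]).strip().lower(),
        })
    lineage = [
        f"collected from {source_name}",
        "standardized field names and types",
    ]
    return StandardizedDataset(records, lineage)


def total_by_channel(ds: StandardizedDataset) -> dict[str, float]:
    """Reporting-style transformation over the standardized data."""
    totals: dict[str, float] = {}
    for rec in ds.records:
        totals[rec["channel"]] = totals.get(rec["channel"], 0.0) + rec["amount"]
    return totals


def large_transactions(ds: StandardizedDataset, threshold: float) -> list[dict]:
    """Fraud-style transformation reusing the very same records."""
    return [rec for rec in ds.records if rec["amount"] >= threshold]


# Made-up sample rows with inconsistent formatting, as raw feeds often have.
raw = [
    {"CustomerID": " 101 ", "Amount": "250.00", "Channel": "Web"},
    {"CustomerID": 102, "Amount": "9800.50", "Channel": "BRANCH "},
    {"CustomerID": 101, "Amount": "40.25", "Channel": "web"},
]

dataset = ingest_once(raw, "hypothetical_transactions_feed")
channel_totals = total_by_channel(dataset)
flagged = large_transactions(dataset, 5000.0)
```

Neither consumer touches the raw feed or repeats the cleaning logic. Adding a third use case, such as pricing analysis, would mean writing one more function over the same dataset rather than another ingestion path.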

Multi-purpose data use cases don’t need to be complicated and can serve the enterprise at multiple levels and for multiple business lines. Ingesting, cleaning and standardizing the data correctly is paramount. Companies that get this right will be able to capitalize on the promise and potential of AI.