In my last AI blog, 7 Tips To Make Your SAP Data Ready For AI, I discussed how you can become data ready to leverage AI solutions. The TUSCANE (Timely, Usable, Structured, Complete, Accurate, Neutral, Enough) model was designed to ensure that an AI tool or model in use has the right data available on SAP for accurate insights.
In this blog, we will take a step back to understand how you can ensure your data meets each of the conditions laid out in the TUSCANE model. Put simply, we will talk about the importance of a data dictionary.
A robust data dictionary is the secret sauce to unleashing SAP’s power. It gives you a firm grasp of organizational data, which supports not only advanced AI insights but also the most pedestrian day-to-day insights that inform managerial strategy and its execution.
Let’s say you decide to write a book because you have a great story in mind. This story and other information supporting it are your data. However, only jotting them down on paper doesn’t create a book; that would be akin to simply taking notes.
That is the SAP equivalent of simply adding information to fields. A good book also needs you to plan the chapters, understand the link between different points of information, be clear on the core objective or message of your story, validate and verify the accuracy and availability of information, use the right sources for accurate information, plan the flow of the story in a way that would hook your target readers best, design a proper cover and get the right people to review it.
A data dictionary brings all this information together and puts it under your control. Without it, you may successfully add your story to the pages but you won’t really unleash its true potential.
To move beyond the book analogy, organizational data also needs a dictionary because data flow is dynamic: new data comes in, old data becomes redundant at times, and the very purpose or method of analyzing the data evolves with time.
Now, the TUSCANE model of data readiness aims to ensure that the data available to an AI model is timely, usable, structured, complete, accurate, not biased, and enough. A quick note on the “structure” requirement: while advanced models can also work with unstructured data, that does not mean we can feed the model garbage in the guise of an Excel file or database, with incorrect information in incorrect fields or spread across multiple fields without any logical structure.
Coming back, while TUSCANE explains the state of data that a model should be able to feed on, how do we get our data on SAP to meet these conditions? This is where a data dictionary can help. So, what does it look like?
A data dictionary contains your metadata. SAP is a good host for your database, but it can become severely complex with all the information we add to it unless we find a way to track and manage that information.
A data dictionary can be created in a simple Excel file for starters. It showcases the overall purpose of collecting data, the KPIs that are central to the business and for which data is being collected, the lower-level metrics down to the individual pieces of data feeding them, the sources of that data, the gaps therein, the relationships between the data points, and the stakeholders responsible.
In doing so, it reveals points of risk as well as rewards that have not yet been leveraged. Let’s look at how to construct it on a simple Excel file to inform your SAP instance.
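As a rough illustration of that layout, here is a minimal Python sketch that builds a two-row dictionary and writes it out as CSV text, which Excel can open. The column names, the KPI, the owners, and the gaps are all hypothetical, and the SAP field names are used only as illustrative placeholders; substitute whatever matters in your own instance.

```python
import csv
import io

# Hypothetical column layout for the dictionary; rename to suit your organization.
COLUMNS = ["kpi", "metric", "sap_field", "source", "owner", "related_to", "known_gaps"]

# Illustrative entries only: the KPI, owners, and gaps are made up for this sketch.
entries = [
    {"kpi": "Revenue growth", "metric": "Net order value", "sap_field": "VBAK-NETWR",
     "source": "Sales order entry", "owner": "Sales operations lead",
     "related_to": "VBAK-KUNNR", "known_gaps": ""},
    {"kpi": "Revenue growth", "metric": "Orders per customer", "sap_field": "VBAK-KUNNR",
     "source": "Sales order entry", "owner": "",
     "related_to": "VBAK-NETWR", "known_gaps": "Duplicate customer records"},
]


def to_csv(rows):
    """Serialize the dictionary rows as CSV text that Excel can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def find_risks(rows):
    """Flag entries with no responsible stakeholder or with a recorded gap."""
    risks = []
    for row in rows:
        if not row["owner"].strip():
            risks.append((row["sap_field"], "no owner assigned"))
        if row["known_gaps"].strip():
            risks.append((row["sap_field"], "known gap: " + row["known_gaps"]))
    return risks


if __name__ == "__main__":
    print(to_csv(entries))
    for field, issue in find_risks(entries):
        print(f"{field}: {issue}")
```

Even at this size, the second row shows the kind of risk a dictionary surfaces: a field that feeds a KPI but has no owner and a known data quality gap.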
A data dictionary can help SAP admins ensure that the individuals in the RACI (Responsible, Accountable, Consulted, Informed) framework understand their roles in making the data on SAP adhere to the TUSCANE criteria. That not only makes the life of SAP admins easier but also ensures that organizations walk the talk en route to data readiness and, by extension, AI readiness.
A crucial factor behind AI functioning well is data. A 2019 report mentioned that 96% of organizations ran into problems with AI and machine learning projects, primarily owing to data. In fact, a few ...