Data Governance

How Data Products Are Powering Decentralized Data Mesh Architectures

David Vavruska
May 26, 2023 | 12.5 min read
In our recent article and in Accurity Data Vibes, we discussed what a data mesh is and how it revolutionizes the way companies work with data. In this article, we look at a crucial element of the data mesh that makes the entire architecture possible – the concept of data products.

Data mesh recap

To summarize, data mesh is a decentralized data management architecture. Traditionally, a central data team fulfills data-related requests from every other department of the company. Data mesh starts from the observation that most companies now work with so much data that no single centralized team can keep up. To remedy that, it proposes that professionals from every department, technical and business-oriented alike, should learn to satisfy their own data needs to at least some extent. Moreover, a central data team is unlikely to fully understand the context of requests coming from, say, the Risk or Business Intelligence departments. People who understand the context deeply can resolve a request to greater satisfaction. And who understands the context better than the people experiencing the original issue first-hand?

What are data products, and how do they align with the data mesh concept?

Data products can be anything – a file, a database table, a video, a whole data warehouse, or an API. Anything qualifies, as long as it is self-service and brings clear data value to its stakeholders within the correct context. A data product aligns with the data mesh concept as long as it embodies the core principles and practices of data mesh architecture, such as decentralized ownership and domain-oriented thinking. Let us explore this in more detail…

Why data products can do literally anything

Because, as stated, they can quite literally be anything. Context is the crucial part that distinguishes a data product from something created by a centralized data team: a data product is created by the people experiencing a data problem firsthand, in order to fix that problem.

For instance:

  • An API that takes specific data from a multitude of data sources, applications, and files to consolidate it for reading specific information.

  • A set of dashboards that takes data to provide visualizations of key reporting indicators.

  • A group of Excel files that contain data relevant to a niche topic.

  • A video portal providing an internal databank of onboarding and employee education courses.

  • A database containing data for the use of a business unit.

  • A data stream that takes data about transactions and assigns it to individual customers in the CRM accordingly.

  • A machine learning model that automates data discovery.

Data products can really be anything as long as they fulfill these criteria: data products solve a problem, utilize data, and are defined by the person who uses them.

Technically speaking – what a data product achieves

It may come as a surprise, but the true value of the data products concept isn’t the value it provides to the people who create and use the products. Its true value comes to light when you see the efficiency the concept brings to an organization’s overall data management architecture.

Data products work so well in a data mesh because creating them also produces many of the things that bind the elements of a data mesh together.

The process of creating a data product, as outlined by the data mesh methodology, results in a treasure trove of metadata being created as a byproduct. Information about who owns the data, which business domain it belongs to, which rules govern access to and security of the product’s data, and how that data is classified – all this not only adds much-needed context to a data product, but also helps tremendously to place the new data product correctly within the data mesh of the entire organization.
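As a rough illustration of the metadata listed above, the following sketch records ownership, domain, access rules, and classification for one data product. The class name `DataProductDescriptor`, its fields, and the sample values are all hypothetical and chosen for this example only; they do not come from the data mesh methodology or from any particular tool.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataProductDescriptor:
    """Metadata captured when a data product is created (hypothetical schema)."""
    name: str
    owner: str            # accountable person or team
    domain: str           # business domain the product belongs to
    classification: str   # e.g. "public", "internal", "confidential"
    allowed_roles: frozenset = field(default_factory=frozenset)  # simple access rule

    def can_access(self, role: str) -> bool:
        """Return True if the given role may read the product's data."""
        return role in self.allowed_roles


# Illustrative values only.
descriptor = DataProductDescriptor(
    name="customer-transactions-stream",
    owner="crm-team",
    domain="Sales",
    classification="confidential",
    allowed_roles=frozenset({"sales-analyst", "risk-analyst"}),
)

print(descriptor.domain, descriptor.can_access("risk-analyst"))
```

A real descriptor would likely carry more detail (schemas, service levels, lineage), but even these few fields are enough to tell where a product sits in the organization and who may use it.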

Of course, in order to benefit from this metadata treasure chest, the organization needs to have a metadata management system in place. As we explained in our last article of this series, metadata management is a crucial part of managing a decentralized data architecture. Decentralized architectures such as data meshes are like machines made up of independently moving parts, and metadata are used to coordinate their movements without a need for a centralized data repository.
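To make the idea of coordinating through metadata more concrete, here is a minimal sketch of an in-memory catalog that stores only descriptions of data products and pointers to where they live, never the data itself. `MetadataCatalog`, its methods, and the `example.com` locations are assumptions made for illustration and do not represent the API of any real metadata management system.

```python
from collections import defaultdict


class MetadataCatalog:
    """Toy catalog: holds descriptions of data products, not their data."""

    def __init__(self):
        self._by_domain = defaultdict(list)

    def register(self, name: str, domain: str, owner: str, location: str) -> None:
        """Record where a product lives and who owns it."""
        self._by_domain[domain].append(
            {"name": name, "owner": owner, "location": location}
        )

    def find(self, domain: str) -> list:
        """Return the sorted names of all products registered under a domain."""
        return sorted(entry["name"] for entry in self._by_domain.get(domain, []))


# Hypothetical products and locations.
catalog = MetadataCatalog()
catalog.register("risk-dashboards", "Risk", "risk-team", "https://bi.example.com/risk")
catalog.register("exposure-api", "Risk", "risk-team", "https://api.example.com/exposure")
catalog.register("onboarding-videos", "HR", "hr-team", "https://video.example.com/onboarding")

print(catalog.find("Risk"))
```

Each team keeps its data where it is; colleagues in other departments discover it by querying the catalog for a domain.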

Combine the two concepts, and you start to see why data products and data mesh are made for each other. Data products produce metadata about who uses them, for what purpose, and what data is involved. And that is the exact information a data mesh needs to enable the organization to work without a specialized data team, instead focusing on educating its people on how they can work with data themselves.

Another thing a data product has conceptually built into its DNA is observability: The methodology encourages the data product’s creators to set up metrics to measure the quality of data and efficiency of use of the data product from the very start. This can lead to some major savings by avoiding having to fix costly regulatory and quality issues down the road.
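What such metrics might look like depends entirely on the product. As one possible sketch, the functions below compute a completeness ratio and a freshness check for a small set of records. The function names, the field names, and the sample rows are assumptions made for this example, not metrics prescribed by the methodology.

```python
from datetime import datetime, timedelta, timezone


def completeness(rows: list, required: list) -> float:
    """Share of rows in which every required field is present and non-empty."""
    if not rows:
        return 0.0
    complete = sum(
        all(row.get(name) not in (None, "") for name in required) for row in rows
    )
    return complete / len(rows)


def is_fresh(last_updated: datetime, now: datetime, max_age: timedelta) -> bool:
    """True if the data was refreshed within the allowed age."""
    return now - last_updated <= max_age


# Illustrative sample data: two of the four rows are missing a required value.
rows = [
    {"customer_id": "C1", "amount": 120.0},
    {"customer_id": "C2", "amount": None},
    {"customer_id": "", "amount": 35.5},
    {"customer_id": "C4", "amount": 80.0},
]

now = datetime(2023, 5, 26, 12, 0, tzinfo=timezone.utc)
last_updated = datetime(2023, 5, 26, 9, 0, tzinfo=timezone.utc)

print(completeness(rows, ["customer_id", "amount"]))
print(is_fresh(last_updated, now, max_age=timedelta(hours=6)))
```

Tracking numbers like these from the first day gives the product's owners an early warning when quality starts to slip.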

How important are context and use cases?

In short: When it comes to data products, context and use cases are everything.

The longer version: data products allow companies to skip a couple of steps that normally come between defining a problem and designing a solution.

Employees are utilizing data to generate some value for their company. These very employees are often the ones to notice areas for improvement in either the way they work with data or in the system they are using to work with data. Their suggestions will come from firsthand experience and from the context of their jobs and responsibilities.

These suggestions will not take into account the problems of the whole company, matters of system integration, or any other holistic issues. Therefore, when these suggestions reach the IT department or the central data team, they will either be scrapped as not beneficial for the entire company or be heavily reworked into something that is.

The end result, however, is that the original problem usually isn’t fixed. A lot of work was done to try and improve the situation, but the reason the situation needed improvement is most likely still there.

Not with data products! A data product is created not to benefit the entire company but to benefit only the people who suggested those improvements. Data products follow the maxim: “If you want to satisfy everyone universally, you will end up satisfying no one. If you satisfy one by one - case by case - individually, you end up satisfying everyone eventually.”

Sure, you will end up with a lot of systems with very niche user groups. But, boy, will you be efficient!

In simple terms – what can adopting data products bring your business?

Adopting data products means giving a key voice to employees in deciding how to best do their job. If they are to efficiently do that, they need to understand what can and cannot be achieved with data. Data products require a workforce that understands the role data play in their jobs, and often it will be their job to develop these data products themselves.

That means investing in the education of staff and data democratization. That is not a small investment, but it definitely pays off.

For a one-time big investment, your company ends up with a large data-literate workforce that can be used for development and other technical projects.

Adopting data products also means devolving data project initiatives down to lower-level employees. Business units will encourage their staff to solve their data-related problems by developing a solution on their own as long as it can function within the company data mesh.

That means exchanging some managerial control for greater overall efficiency and flexibility of the company as a whole.

Finally, adopting data products means introducing alignment between business and data as a fundamental concept of your company. Data products help business professionals understand data and help technical professionals understand the business purposes that data serves. In the end, they result in a company whose data architecture has transformed to mirror the business reasons for its existence.

That means data and business will no longer be separate worlds in your company. They will become entangled to a point where one cannot exist without the other, leading to better, quicker, and more reliable decisions in every aspect of your business and to the future-proofing of your organization.

Conclusion

Of course, going through all this alone can be an arduous journey of trial and error. That is why an entire market exists to help organizations like yours through the process of planning and implementing data products, the data mesh framework, and the data fabric architecture.

There are also many tools designed to help you manage this new style of data management. The Accurity platform, for instance, was explicitly designed as a tool to ingest metadata and help spread context and understanding about data across organizations of all kinds and sizes, from small e-commerce businesses to major systemically important banks. It also provides data quality observability and management to make sure the data products your colleagues are using in their daily work are reliable.

If you feel your organization is about to make its move towards implementing data products or one of the other data management frameworks on the rise, why not schedule a short talk with us? We can show you what such a setup would look like in practice and demonstrate its benefits in your specific use case.

David Vavruska
Product Analyst