Data Governance

Make Data Governance a Top Priority

Why it is important and how to start

William D'Souza
Feb 4, 2021

In development, a common practice is to identify a stage when a product is worthy enough to be pushed out to the world, even if the product is not as mature as expected. All products need to start somewhere, as their growth is an iterative process that takes time and patience. Releasing something at a predefined stage has its benefits: entering a market fast, getting customer feedback early on, and identifying bugs or gaps. No matter what the product is, there is always a conversation about the data you can harvest with it. That data may be collected for analysis, or it could be a mandated security requirement. It could be there for operational reasons on the user’s end, or it could be there for tracking requirements. There is usually a combination of reasons, and together they point to the importance of data governance.

A car has many components housed in different areas: under the hood, interior mechanics, and the undercarriage. The majority of these components are essential and connected in some fashion, even though they may be performing individual tasks. A single component missing can have consequences (with a car the consequences are more serious) and we do our due diligence in maintenance to ensure the vehicle’s longevity. When the car is in operation, we typically send a signal of some sort to obtain a reaction. This comes from many inputs: turning the wheel, hitting the gas, using a turn signal. With any requested input, the car receives a set of instructions and knows which components to activate. It would be devastating if the brake pedal worked similarly to the gas pedal.

Establishing good data governance is similar to building a car. It is not just about the frame or outer appearance; the design of the components matters more. A quality car can take a set of inputs and apply context in a safe, consistent, and repeatable manner. Good data governance procedures do the same for an organization, which is why they are so beneficial. It is critical to establish the fundamentals and build the process on sound designs. A car comes with a maintenance schedule, as things depreciate and components need servicing to function. Data governance is no different: audits are necessary as structures change and evolve.

You typically wouldn’t purchase a car with serious problems (unless the tradeoff with the cost is worth the benefits you see). So, why would you buy into an organization’s data when its data governance has problems?

How Can Poor Data Governance Affect You?

While the temptations of working with data in a flexible open environment can bring attitudes of excitement and enthusiasm, it isn’t always in your best interest to operate in this system. There is a time and place for it, as it surfaces innovative insights that can drive decisions. However, in an organization, it can affect the productivity and reliability of the system itself. These flexible methods are great when you are in the depths of understanding something, but they lack the standardization needed to fuel an organization for continuous success.

Knowledge Issues

The saying that one person’s garbage is another person’s treasure is easy to understand. A more accurate statement in data should be something like…

One person’s garbage is probably another person’s garbage

One of the major issues with implementing data governance incorrectly is that it creates a knowledge gap. Data in the system can be outright wrong, or the underlying context a data point entails can be flawed. The data may be inconsistent, or it can contain duplicates that lead to misinterpretation. Whatever the case is, this causes a huge issue in understanding what the data means. If you can’t understand what the data means, the only thing you can learn from it is how to better implement it!

Communication Issues

The United Nations assembly is made up of nations from around the world, all speaking different languages. It isn’t expected that everyone is fluent in one single language, so there are translators to fill in the gap when needed (we trust in good faith that they are doing their jobs as required). The system seems to work, but there could be some flaws, as translations in some languages can be misinterpreted. There are not always words in one language that carry the same intent in another. Wouldn’t it be ideal if there were one standardized language?

Implementing one standardized language in the United Nations may not be sensible, but it works differently for organizations. Organizations are composed of individuals who gather evidence to perform or prove the need for a task. Some form of data is usually the fuel for this, and communication issues arise when everyone is using different data to prove their points. When there are many sources of truth to acquire data from, the possibility of the data being used consistently is low, and the possibility that the data is misused is high. This causes massive communication errors, as you can have people sitting around a table with a common goal but using different data to convey a message. Everyone around that table is speaking a different language! This leads to a lack of productivity, arguments, and little confidence in decision making.

Business Planning and Decisions Suffer

There are plenty of decisions that are made without data, as some are just sensible or a standard. Not every single decision needs to be backed by data to be impactful. Although it’s always better to gather evidence as opposed to using your gut feeling, constraints are a real thing (missing data, tight deadlines) and we may just have to make a call. Still, the majority of decisions in the workplace will call for evidence, qualitative or quantitative.

When there is no sense of data governance in an organization and standards are lacking, business decisions and their planning suffer. We sometimes use inaccurate data and spend weeks planning based on it, only to be told later that the data was incorrect and now the whole planning process is flawed. The fault may appear to lie with the person curating the data and presenting the insights, but there’s a deeper problem that should be addressed. Without valid data, data teams are forced to rely on an increasing number of assumptions and manual cleaning processes, with little validation testing. This doesn’t do any good, as it just increases the probability of errors in the results. Added to that, the team will most likely suffer from increased stress levels, which directly affects their productivity.

Failure to Comply

Data privacy is becoming a hot topic these days. Companies have faced legal action and moral backlash from users over how they store, utilise, and protect user data. People are increasingly choosing alternative products that are more private. It shouldn’t be surprising to see a larger call (or mandate) for companies to behave differently when it comes to the data they store. There are already regulations that companies have to follow, and they are not limited to GDPR. Depending on the industry you operate in, information can be more sensitive, and greater protections are required. There may be government regulations that force you to comply, and you may be audited at any time.

If you don’t have a data governance policy that includes required compliance practices, you take on an unnecessary risk that can result in punitive measures. Why bother taking that risk if you don’t have to? It affects your reputation, finances, and time, and it brings unnecessary costs. More importantly, companies should be complying with all the regulations that apply to them, if only to protect themselves legally. Even though we may not agree with some of them, there are proper ways to lobby for our beliefs. Failure to comply (purposely or negligently) does you no good in the long term. It can only hurt you, even if the short-term benefits are appealing.

What Should You Start Thinking About to Establish Data Governance?

Before you embark on establishing data governance in your organization, you will first need to strategize how to implement it. Implementation takes careful thought, process, and policy development. A proper dissection is necessary to understand how data works or will work in your organization and what is needed for your goals. Working through the checklist below is a great start!

Data Pipelines? Lineage? How Does it Flow? ✅

  • Understand where your data sources are, where the data needs to go, and how it is going to get there.
  • Drawing out a holistic vision of the overall structure will help determine how to build it and what applications you need to create.
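
As a rough illustration of drawing out that flow, here is a minimal Python sketch that records lineage as a simple mapping and asks two questions of it: what are the raw sources, and what is affected downstream if one of them changes? The dataset names and the `LINEAGE`, `downstream`, and `sources` helpers are made up for this example; a real organization would more likely rely on a data catalog or lineage tool.

```python
from collections import deque

# Hypothetical pipeline: each key feeds the datasets listed in its value.
LINEAGE = {
    "crm_export": ["customers_clean"],
    "web_events": ["sessions"],
    "customers_clean": ["revenue_report"],
    "sessions": ["revenue_report", "engagement_dashboard"],
    "revenue_report": [],
    "engagement_dashboard": [],
}


def downstream(graph, start):
    """Return every dataset that directly or indirectly depends on `start`."""
    seen = set()
    queue = deque(graph.get(start, []))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(graph.get(node, []))
    return seen


def sources(graph):
    """Return datasets that nothing feeds into, i.e. the raw data sources."""
    fed = {child for children in graph.values() for child in children}
    return set(graph) - fed


if __name__ == "__main__":
    print("Raw sources:", sorted(sources(LINEAGE)))
    print("Affected by web_events:", sorted(downstream(LINEAGE, "web_events")))
```

Even a small map like this makes it easier to see which reports are touched when a single source changes.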

Data Storage? Databases? Data Warehouses? Data Lakes? ✅

  • How are you going to store your data? There are a few options out there, but you want to minimize your sources of truth as much as possible. This is so the data and logic can be controlled from higher levels.
  • It is not always possible to host all your data in just one place, as there may be constraints that force you to store it in different places. Even so, keeping the number of data sources as small as possible and having the proper ETL processes is essential.

Provisioning Requirements? Who Gets What? Compliance? ✅

  • Understanding who gets what type of data is critical. Certain information may only go to one department and some may go to the whole company. Understanding who has access to what can also reduce the risk of misusing it.
  • Understanding compliance with regulations will help your development process for your lineage.
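
To make "who gets what" concrete, here is a small Python sketch of a provisioning policy written down as data. The dataset names, department names, and the `ACCESS_POLICY`, `can_read`, and `readable_by` names are all hypothetical; in practice this would usually be enforced by your database or identity platform, not by application code like this.

```python
# Hypothetical policy: each dataset lists the departments allowed to read it.
# "*" means the whole company.
ACCESS_POLICY = {
    "company_announcements": {"*"},
    "sales_pipeline": {"sales", "finance"},
    "payroll": {"hr"},
}


def can_read(department, dataset):
    """Return True if the department may read the dataset. Unknown datasets are denied."""
    allowed = ACCESS_POLICY.get(dataset, set())
    return "*" in allowed or department in allowed


def readable_by(department):
    """List every dataset a department can read, sorted by name."""
    return sorted(name for name in ACCESS_POLICY if can_read(department, name))


if __name__ == "__main__":
    print(can_read("sales", "payroll"))
    print(readable_by("sales"))
```

Denying anything that is not listed is a deliberate choice here: it is usually safer for access to be granted explicitly than to be open by default.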

Processes? Who are the Right People? ✅

  • Set up a governance team or board. You want to get the people who understand data architecture well, but also the people who understand the context of the numbers.
  • It’s best that this isn’t just left to one person and is more of a committee. Having the necessary oversight that sets standards is one of the most important tasks in the framework.

Policies? Audits? ✅

  • So many forget that governance requires policies that enforce the intended function. After understanding how the system will work, it’s important to formalize this with policies to maintain the integrity of the process.
  • Audits are necessary. Things will always change and data may go stale. Things that were once important may not be important today. Setting audit procedures around the data and pipelines within the system is crucial for continuous success.
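
One audit check that is easy to automate is staleness. The Python sketch below assumes a hypothetical catalog where each dataset records when it was last refreshed and the maximum age the governance team has agreed on; the `CATALOG` entries, field names, and `stale_datasets` function are invented for illustration.

```python
from datetime import date, timedelta

# Hypothetical catalog entries: last refresh date and the agreed maximum age in days.
CATALOG = [
    {"name": "revenue_report", "last_refreshed": date(2021, 1, 30), "max_age_days": 7},
    {"name": "customer_segments", "last_refreshed": date(2020, 10, 1), "max_age_days": 90},
    {"name": "product_list", "last_refreshed": date(2021, 1, 1), "max_age_days": 365},
]


def stale_datasets(catalog, today):
    """Return names of datasets whose last refresh is older than their allowed age."""
    return [
        entry["name"]
        for entry in catalog
        if today - entry["last_refreshed"] > timedelta(days=entry["max_age_days"])
    ]


if __name__ == "__main__":
    for name in stale_datasets(CATALOG, date.today()):
        print("Needs review:", name)
```

A check like this only flags candidates. Deciding whether a stale dataset should be refreshed or retired is still a call for the governance team.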

Establishing data governance can’t be done overnight. It is a constant process that takes dedication, but the benefits that you will get far outweigh any reason to not do it. Data has become the primary fuel for successful organizations, and the ones not using it lack a competitive advantage. Keeping a constant focus on your policies and procedures is of utmost importance. Most importantly, without a functional framework that nurtures your data, the number of problems associated with it will consistently and significantly increase.