How to get data people to actually document their work, and why it is important

Noy Twerski

Dec 17, 2024

“Getting people to actually document their work, that’s the hardest part.”

I was talking yesterday to a Data Director who had tried to create an internal knowledge base for their SQL queries. After multiple attempts, they found us and shared this sentence with me.

I also recently came across Olga Berezovsky’s post, which dives deep into exactly this: documentation, one of the most underestimated yet most important topics in data.

Since Sherloq is all about making SQL documentation easy, it really got me thinking.

Olga states:

“I am convinced that the challenge of maintaining and adopting documentation in analytics is the hardest one to solve because analytics is a very cross-functional discipline. It requires fitting into technical, business, and product domains at the same time.”

And I definitely agree that maintaining and adopting documentation in analytics is the hardest challenge to solve.

But here’s a different angle as to why:

➡️ On the one hand:

Documentation, whether it’s in data, product, development, sales, etc., is actually quite boring and really relies on people remembering to do it.

That means you ultimately need someone, a Data Analyst in this case, to be willing to sit down and actually write down all this documentation.

And for what purpose?

Maybe for their future self, maybe for a future teammate. It doesn’t matter which one — we’re talking about the future.

And when it comes to actually prioritizing documentation over other tasks, well… It makes sense to do the other things first. And when will you go back to documenting? You probably know the answer to that :)

⬅️ On the other hand:

Data documentation is actually pretty important.

It can be the difference between having 200 active users and having 300. The difference between a 5-month sales cycle and a 10-month sales cycle. Or the difference between 100 signups last month and 150 signups this month.

Unlike other types of documentation, data documentation can actually affect decision making (Data-driven companies, right?).

When you don’t have data documentation, creating consistency in data results becomes really tricky. Many articles have been written about creating one source of truth for data — but what is that one source of truth?

It’s actually various types of well-defined and documented data: high-level metrics and definitions, metadata and architecture, and the SQL/Python code itself.

And when you don’t have those, what you end up with are a lot of frustrated people, wasting so many hours on:

  • trying to find the exact definition for Weekly Active User (WAU),

  • looking for the SQL that they’ve written for an analysis from a few months ago, or

  • trying to understand the difference between “users_table” and “users_tables_v2.”
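To make the idea of "well-defined and documented data" concrete, here is a minimal Python sketch of a metric record that keeps the business definition, the metadata, and the SQL side by side. Everything in it is an assumption for illustration: the `MetricDefinition` class, the `REGISTRY` dictionary, the `describe` helper, the owner name, the `events` table, and the particular WAU definition are hypothetical, not a standard and not how any specific tool stores metrics.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MetricDefinition:
    """One documented metric: business definition, metadata, and SQL together."""
    name: str
    definition: str                  # plain-language meaning agreed with stakeholders
    owner: str                       # who to ask when the definition is unclear
    source_tables: Tuple[str, ...]   # metadata: where the numbers come from
    sql: str                         # the query itself, stored next to its definition


# Hypothetical entry: the definition, owner, and table name are illustrative only.
WAU = MetricDefinition(
    name="WAU",
    definition="Distinct users with at least one event in the trailing 7 days.",
    owner="analytics-team",
    source_tables=("events",),
    sql=(
        "SELECT COUNT(DISTINCT user_id) AS wau\n"
        "FROM events\n"
        "WHERE event_date >= CURRENT_DATE - INTERVAL '7 days'"
    ),
)

# A tiny in-memory registry; a real team might keep this in a repo or a catalog.
REGISTRY = {WAU.name: WAU}


def describe(metric_name: str) -> str:
    """Return a short, human-readable summary of a documented metric."""
    metric = REGISTRY.get(metric_name.upper())
    if metric is None:
        return f"{metric_name}: not documented yet"
    tables = ", ".join(metric.source_tables)
    return f"{metric.name}: {metric.definition} (owner: {metric.owner}; tables: {tables})"


print(describe("wau"))
print(describe("MAU"))
```

The point of the sketch is only that the definition and the query live in one place, so looking up WAU does not require hunting through old analyses.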

So the issue with data documentation is that its importance goes hand in hand with how tedious it is: the more it matters, the less anyone wants to do it.


Then the main question would be:
If it’s that important, how do we get people to document anyway?

In my opinion, there are 4 main ways:

1. Personal Benefits -

  • Create immediate value: Start with documentation that’s actually useful to the team in their day-to-day work. In the SQL world, for instance, this could be common SQL errors, useful SQL functions, or best practices for optimizing queries.

  • Praise co-workers who actually do the work: Talk about them in weekly meetings, share their documentation with the team on Slack, and tell everyone when it helped you in your day-to-day work.

  • Gamify the process: For instance, anyone who documents x times gets cool company swag, and a team that keeps documenting over time gets a night out.

2. User Experience -

  • Stick to the current workflow: Enable documentation inside the workflow people already have, meaning no context switch and no excuses in the form of “I’ll do this later.”

  • Cross-platform: Data is cross-functional, and it lives everywhere. Documentation tools need to connect to the org’s database or data warehouse, its BI tool, its SQL editor, and its productivity tools. And if the org has a few of each kind? Then the tools should support them all as well.

3. Collaboration -

  • Create engagement: Add the ability to star, like, or love someone else’s code, function, description, etc.

  • Show users when someone has used their work: viewed the page, copied the description, or edited their query (this can be done simply with docs, Notion, Jira, Confluence, etc.). This creates a feeling of value over time.

4. Automation + AI -

  • Documentation that can be generated or automated, should be: Generate names, descriptions, tags, graphs, and summaries for metrics, queries, tables, fields, and analyses. Whatever makes sense and doesn’t become spammy.

  • And if it can’t: Create automatic triggers that remind users to document their work. For instance, notify users via email, Slack, Zapier, etc. when changes are made to the data, the DB, or the code. This can also be relevant for periodic events (like weekly stakeholder meetings or ongoing analyses).
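As a rough illustration of the two automation ideas above, the following Python sketch auto-generates one small piece of metadata (the tables a query references) and produces a reminder when a query has no description or its SQL has changed since it was last documented. All names here (`SavedQuery`, `fingerprint`, `referenced_tables`, `reminders`) are hypothetical, the table detection is a naive regular expression rather than a real SQL parser, and actually delivering the messages by email or Slack is left out.

```python
import hashlib
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SavedQuery:
    name: str
    sql: str
    description: str = ""                  # empty means "not documented yet"
    documented_hash: Optional[str] = None  # fingerprint of the SQL when it was documented


def fingerprint(sql: str) -> str:
    """Hash the SQL after collapsing whitespace, so pure reformatting is ignored."""
    normalized = " ".join(sql.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def referenced_tables(sql: str) -> List[str]:
    """Naive auto-generated metadata: identifiers that follow FROM or JOIN.

    This is an approximation; it will misfire on constructs such as
    EXTRACT(day FROM ts) and does not understand subqueries or CTEs.
    """
    names = re.findall(r"\b(?:from|join)\s+([A-Za-z_][\w.]*)", sql, flags=re.IGNORECASE)
    return sorted(set(names))


def reminders(queries: List[SavedQuery]) -> List[str]:
    """Build one reminder per query that is undocumented or changed since documented."""
    messages = []
    for q in queries:
        tables = ", ".join(referenced_tables(q.sql)) or "no tables detected"
        if not q.description.strip():
            messages.append(f"{q.name}: no description yet (tables: {tables})")
        elif q.documented_hash != fingerprint(q.sql):
            messages.append(f"{q.name}: SQL changed since it was documented (tables: {tables})")
    return messages


# Hypothetical saved queries, for illustration only.
signups_sql = "SELECT COUNT(*) FROM signups s JOIN users_table u ON u.id = s.user_id"
queries = [
    SavedQuery("monthly_signups", signups_sql, "Signups per month", fingerprint(signups_sql)),
    SavedQuery("wau", "SELECT COUNT(DISTINCT user_id) FROM events"),
]
for message in reminders(queries):
    print(message)
```

Comparing a stored fingerprint against the current SQL is one simple way to notice "the code changed but the description did not"; a scheduled job could run a check like this and forward the messages to whatever channel the team already uses.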

No matter what you choose, documentation is a matter of creating habits. So find something that works for your team and be sure to stick with it over time.

Last note — I couldn’t end this post without addressing what we’re doing at Sherloq. We’ve built the tool based on these 4 techniques, and have seen the impact they have on our users.

If you’re looking for a tool that (also) documents your SQL work, feel free to check out Sherloq for free.