In defense of the noble work of the Data Analyst
It's not the hottest role in the current data hype cycle, but it's a vital one
[Upfront note from Randall: displaying my keen understanding of timing, I’m sending this out three days before Christmas; I’m sure soooo many people will read it! If this is too long, I’ve made a short summarized version at the bottom. Have a wonderful Christmas (or Hanukkah) and New Year. I will be back in early 2023.]
On Saturday I read an interesting LinkedIn post by Bethany Lyons from Shape Data in London, and it’s worth quoting it in full:
Something that surprised me: many analysts are moving into data engineering because it's easier to deliver value as a data engineer, despite being further from the business. "I harmonised N data sources that can now be linked" has more value than "I created N charts that no one is using." A driving thesis in the data community seems to be "If no one is using my work, it's more valuable if it's highly reusable." 🤔
There’s a lot to unpack here, actually, since this brief post touches on many of the big themes in data over the last few years: the increasing focus on technical solutions in data, and the challenges of getting companies to actually use the output of data teams.
My own first thought was: shouldn't a chunk of the blame fall on the heads of analytics leaders in this case? If analysts feel under-valued vis a vis data engineers, that's not because data analysis is not a valid line of work; it's more likely that their talents are not being used effectively, and solving that problem is the responsibility of data leadership. Data analysis is generally the most outward-facing part of any data team; if the output isn't being properly appreciated, that's not a problem the analysts should shoulder alone.
In this post I would like to expand on that thought, as I think the moment is ripe for a fresh appreciation of the importance of the work that data analysts do. Data analysts play a crucial role in data teams, and this shouldn’t be forgotten.
I’m going to start by discussing the hype cycle of data, then look at some of the challenges and frustrations that data analysts can face, and then I’m going to draw on my own experience to present some ideas for how data leaders can help analysts to conquer these challenges.
The Hype Cycle
One thing that I’ve noticed in my time working in data is that this is a professional field that is quite prone to hype cycles; seemingly every year there is a New Exciting Technology That Will Solve All Data Problems and an Amazing Key New Role That You Need To Create Immediately. I’m sure this is also the case in the wider tech world, but data is my own field, so it’s where I’ve observed this most closely. There’s a variety of factors that drive the hype cycle: some of it is commercial, i.e. the needs of software vendors, consultants, and thought leaders to stand out from the pack, and some of it, I think, comes from the type of person who is attracted to data work. This is a field, I’ve observed, that tends to attract people with high curiosity levels and openness to new solutions - I’m certainly including myself in that group!
With this being the case, I think that there is a tendency to always look for new ways to solve problems, because, well, that’s exciting. This is good, but it also means, sometimes, downplaying existing solutions. In the data world over the last few years this has played out through a focus on new technical solutions, particularly with the rise of the so-called Modern Data Stack, as well as associated new roles like that of the Analytics Engineer.
Companies en masse have moved to implement these new tools in their data stacks, and this has coincided with a new appreciation for the importance of data to their business, as well as a fresh understanding of just how many challenges companies face when it comes to making their data aspirations a reality (as I discussed in my last newsletter).
Taken together, I have observed that much of the data world has turned inwards towards solving technical challenges: fixing pipeline processes, improving data quality, building new and improved data models, with the implicit promise that this will allow analysts to generate improved insights down the line, or even to do away with the analyst function and allow all stakeholders to generate their own insights directly from very clean, well-structured data sets.
This has meant a big increase in demand for more technically-minded data roles, particularly in data engineering; I can tell you from personal experience that data engineering has been the hardest data function to hire for in recent years. This is reflected in salaries - there is now a premium to be paid for senior data engineers vis a vis senior analysts, at least in Berlin (although I assume this is true elsewhere as well).
Consequently, the question of how to derive relevant business insights from data (which is our central task, in my opinion) has become, if not lost, then somewhat downgraded in importance relative to technical fixes.
Don’t get me wrong - there is important work being done here, and many interesting new technologies and techniques are being developed. However, there is certainly an element of hype afoot, and I think part of it, if I’m being cynical, is down to the fact that it’s easier to replicate technical solutions than it is to replicate insights.
I can read a data influencer’s post about new Snowflake functions and add them to my workflows immediately, but that influencer won’t be able to tell me how to prioritize between different analytical projects: business insights are company-specific, and what any company needs at a particular time in terms of analysis is not generalizable.
The challenges of being a Data Analyst
So one reason that data analysts might be moving to engineering roles would be the (current) high demand for the roles, and consequent salary bump for switching. The second reason, as Bethany Lyons alluded to in her post, is more related to job satisfaction, and to frustrations with analytical work not making the desired impact on the business.
First, let’s look at what a data analyst typically does inside a tech company (with the usual caveat that there is always variation).
My personal definition of a data analyst (and I was one before I moved into leadership) is someone who is responsible for working with data to provide information that will allow their business stakeholders to make better decisions.
At Penta we split our analytical work into the following categories:
Data dumps: pulling data from the data warehouse and dropping it into a G-Sheet or Excel file; depending on the request, the underlying query could be simple or very complex.
Monitoring dashboards: Automated Tableau dashboards for monitoring KPIs and other metrics; some were company-wide, some were for specific departments (like Marketing), and others were for individual teams (like the Cards team in Product).
Ad hoc dashboards: One-off Tableau dashboards used for answering specific business questions; we tried to avoid doing too many of these, as they required quite a lot of work and didn’t have much longevity.
Long-form analysis: Deeper analytical pieces that we published on our internal Confluence (usually with an accompanying slide presentation); these featured text, tables and charts and were designed to explore a topic in depth, for example comparing user behavior on the Penta website and on the mobile apps, or building a better understanding of our customers.
Most data analysts will use SQL to query relational databases, and some will use other programming languages like Python and/or R for exploratory analysis. Analysts’ output will typically amount to some combination of dashboards (often created with visualization software like Tableau or Looker), spreadsheets, and longer-form reports in slide decks or PDFs. Depending on the company culture, analysts might also model reusable data sets that make it easier to answer business questions - this has become a very hot topic with the growth of dbt, but the actual practice has been around for much longer (at my previous employer, Neugelb, we automated data models using Airflow).
So, what would be some of the frustrations that an analyst might face in the job?
Technical frustrations can be abundant in a data analyst role, and these are fundamentally about technical issues getting in the way of producing insights - spending your time wrangling data, working with bad or missing data, debugging SQL statements, and so on. All this stuff eats up time that you could otherwise spend trying to figure out what the data means. Solving these frustrations has been the focus of the technical revolution that has washed over the data world in the last few years, and, indeed, a lot of progress has been made.
The other major type of frustration is something that is less easily fixable via technology - it’s more to do with people and processes.
What kinds of non-technical things have I seen that would frustrate a data analyst?
Being a report monkey: Sometimes analysts are just expected to go from ticket to ticket, answering (often simple) stakeholder requests without an opportunity to go deeper.
Unreasonable demands for ad hoc reports: Sometimes stakeholders will demand that an analyst drop everything and turn around a report NOW, usually because the data points are needed for some upcoming meeting.
Tons of effort for little reward: Every analyst has experienced working really hard to create a dashboard, and had their heart sink when they saw that it had hardly been used by the intended stakeholders ... or anyone at all!
Indecisive stakeholders and zombie projects: Another common source of frustration is ‘zombie reports’ that keep coming back from the dead, as stakeholders request constant changes - these are frustrating because they require continuous work without the fulfilment of being finished.
Requests arriving from everywhere: If you don’t have some kind of process for managing data requests, it can be really tough to deal with requests arriving via email, messaging services (like Slack), in video calls, in person, etc.
Overwhelming demands: Most tech companies typically have a small number of analysts relative to the number of teams and business units, and those analysts therefore face a huge backlog of requests; it can be tough to feel like you are perpetually scrabbling up a mountain that you can never fully scale, with little time to really breathe and think deeply on a project.
Difficulty prioritizing: When you have a lot of potential projects and low business context, it can be hard to prioritize; often it means that whoever shouts the loudest or is the most senior gets their request looked at first.
Distance from the business: Every analyst wants to contribute to the success of the business, and it can be demotivating if it’s not clear how (or even if) their efforts are informing the decisions being made by company leadership.
I’m sure those of you who have worked as analysts will recognize stuff from this list!
One thing to say about these frustrations is that they aren’t really fixable through technical means. Tools can help with them, but ultimately they are only solvable (or at least ameliorable) via leadership, and leadership isn’t something that software vendors can offer. It’s worth remembering that data analysts are the most outward-facing part of a data team, and are therefore the most exposed to external opinions and frustrations. Data leaders have to be particularly sensitive to the challenges analysts face, and ready to take action to solve these issues.
My theory of managing Data teams
So here we get to the part where I share how I’ve tackled some of these issues while leading data teams. I don’t pretend to have all the answers, but I think I’ve done a pretty good job of managing analysts and making them feel that their work is valued and that their contribution is vital.
One thing that I’ve always done is to give each analyst their own areas of responsibility; topics that they own. So for example at Penta I gave Alex Lee, our most experienced analyst, responsibility for two of the toughest topics: Marketing and Onboarding. This meant that he owned the day-to-day relationships with the key stakeholders (in this case the Head of Marketing, the Performance Marketing Lead, and the Onboarding PM), and was responsible for working with them to prioritize analytical projects on these topics.
For the stakeholders, the advantage of such a setup is that they have a dedicated point of contact in the data team: someone to speak to directly, to bounce ideas off, and to work with iteratively to build up a set of resources. For the individual analyst, it gives them a sense of ownership, lets them develop domain expertise, and helps them build up soft skills. They are in an equal relationship with the stakeholder and can suggest projects and share ideas; they aren’t just report monkeys churning through tickets, but people who can help shape the overall marketing strategy (for example).
Of course, it’s not fun to only ever work on one topic, so we also made sure that every piece of work was checked by another analyst before being handed to the stakeholder - this meant checking the code as well as the output. The genesis of this idea was quality control, but it had the additional bonus of ensuring that even if you weren’t responsible for, say, finance, you would have at least some exposure to the topic, to the main metrics, and to the relevant tables and themes. This also made it smoother to ensure coverage when analysts were sick or on vacation.
The downside to such an approach is that it does put a lot of responsibility on each person’s shoulders to prioritize their projects and time accordingly; I mostly stayed out of the day-to-day work choices and focused more on the strategic big picture. This is not a setup that works for everyone; for people who prefer a very structured work environment, with only one task to focus on at a time, it’s not really a good fit.
Overall, though, I think that this approach helped to ameliorate some of the common issues that I discussed above. Because the analysts owned these stakeholder relationships, they weren’t just churning out reports blindly: they were connected to the business, and they had the opportunity to work with the stakeholders to refine requests and ensure that what we actually created was useful and worth the effort. Obviously some dashboards were more useful than others, but I think we had a good strike rate.
Where I came into the picture to help was in shaping the processes - I worked with our Data Ops Manager (Gaurav Punjabi) to create a centralized request process where people would fill out forms that would automatically create Jira tickets. This reduced a lot of the back-and-forth involved in handling requests, and created a digital trail that stakeholders could follow so that they wouldn’t feel like their request had disappeared into the void. (I might write a more detailed post about this process, if people are interested?)
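For anyone who wants to picture the form-to-ticket step, here is a minimal sketch in Python. To be clear, this is not the setup we actually ran: the form fields (`requester`, `team`, `question`, `needed_by`, `decision_supported`), the ticket layout and the `data-request` label are all hypothetical, and the final step of sending the ticket to Jira is deliberately left out. The idea is simply that a form forces every request to arrive with the same basic information, which is what cuts down on the back-and-forth.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical form fields; a real intake form would define its own.
REQUIRED_FIELDS = ("requester", "team", "question", "needed_by")


@dataclass
class DataRequest:
    requester: str
    team: str
    question: str
    needed_by: date
    decision_supported: str = ""


def parse_form(form: dict) -> DataRequest:
    """Validate a raw form submission and turn it into a DataRequest."""
    missing = [f for f in REQUIRED_FIELDS if not str(form.get(f, "")).strip()]
    if missing:
        raise ValueError(f"Missing required fields: {', '.join(missing)}")
    return DataRequest(
        requester=form["requester"].strip(),
        team=form["team"].strip(),
        question=form["question"].strip(),
        needed_by=date.fromisoformat(form["needed_by"]),
        decision_supported=form.get("decision_supported", "").strip(),
    )


def to_ticket(request: DataRequest) -> dict:
    """Build a generic ticket record (an assumed layout, not Jira's own schema)."""
    summary = f"[{request.team}] {request.question}"
    if len(summary) > 80:
        summary = summary[:77] + "..."
    description = "\n".join([
        f"Requested by: {request.requester}",
        f"Question: {request.question}",
        f"Decision this supports: {request.decision_supported or 'not stated'}",
    ])
    return {
        "summary": summary,
        "description": description,
        "due_date": request.needed_by.isoformat(),
        "labels": ["data-request", request.team.lower().replace(" ", "-")],
    }


# Example submission, as it might arrive from a form
form = {
    "requester": "Head of Marketing",
    "team": "Marketing",
    "question": "Which channels drive signups?",
    "needed_by": "2023-01-15",
}
ticket = to_ticket(parse_form(form))
print(ticket["summary"])
```

Asking for the decision that a request supports is my own addition to the sketch; it is one way of giving the analyst a bit of business context before the first conversation with the stakeholder.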
The other way I tried to help was by acting as a shield for the team: helping them to prioritize their work according to the overall business strategy, adjudicating between competing requests, and pushing back against unrealistic deadlines (and giving them permission to push back as well). We also set the rule that once we had agreed the details of a project with a stakeholder, that is what we would deliver; they could then make a new request if they wanted additions or changes after delivery. This helped to stop the phenomenon of zombie projects that lurched across weeks and months with endless small tweaks.
One thing that I didn’t do enough of was marketing our work. The analysts produced a steady stream of work that was greatly appreciated by the stakeholders who requested it, but I didn’t consistently push the insights out into the wider organization, and doing so would have been good for the analysts (to get the appropriate recognition) and good for the company (for the sharing of knowledge). I did it in fits and starts, but it would have been better to publish a weekly digest of key insights, and to make sure that we were regularly sharing the most important info with the whole company. If you’re in a similar position, make sure that you aren’t just doing the work, but relentlessly marketing it inside the company.
Taken together, these strategies helped to ensure strong morale among the team and overall high job satisfaction - the analysts had a diverse range of projects to work on, and the opportunity to learn new skills and develop new competencies. Not everything I’ve described will be appropriate in every team organization, but I firmly believe that many of these strategies will help you to build happy and productive analyst teams in your own organization.
Short Summary of this Post
There is a trend in the field of data towards a focus on technical solutions and new roles, such as analytics engineers, rather than on data analysis. As a result, there has been an increase in demand for technically-minded data roles, particularly in data engineering, and a premium on salaries for senior data engineers compared to senior analysts. One result is that the question of how to derive relevant business insights from data has become downgraded in importance compared to technical fixes. However, it’s worth acknowledging that important work is being done in the field, with new technologies and techniques being developed.
There are two reasons why data analysts might be moving into data engineering roles. The first reason is the high demand for these roles and the consequent salary bump. The second reason is related to job satisfaction and frustrations with data analysis not having the desired impact on the business. A data analyst is responsible for working with data to provide information that will allow stakeholders to make better decisions. Analysts typically use SQL to query databases and may also use programming languages like Python and R for exploratory analysis. Their output may include dashboards, spreadsheets, and longer-form reports. Frustrations for data analysts can include technical issues and non-technical issues such as unrealistic expectations for ad hoc reports, lack of recognition for their work, and difficulty getting their insights implemented.
As a data team leader, I have implemented a system where each data analyst is given their own areas of responsibility and topics to own. This allows them to develop domain expertise, build up soft skills, and feel a sense of ownership over their work. It also helps them to build relationships with stakeholders and have a say in shaping the overall strategy for their topic. In addition, I implemented a centralized request process using Jira tickets and acted as a shield for the team by helping to prioritize work and push back against unrealistic deadlines. Finally, I tried to ensure that the team had a diverse range of projects to work on and provided training and development opportunities.
One last (music) thing
As I mentioned before, I’ve been a DJ for 25 years, so I’ve decided to end each newsletter with one of my mixes from my (extremely extensive!) back catalogue.
Self-indulgent? Sure.
For this one, I’ve chosen my most recent mix, which I recorded last month to listen to while traveling to Vienna. It’s a selection of some of my favorite recent(ish) dubstep tunes, and as ever it’s mixed completely on vinyl (which I am still hopelessly addicted to - obviously digital is more praktisch, although way less fun).
If you like deep, dark, dangerous bass … well, this is for you.
(If you don’t, that’s ok too!)
Thanks again for reading, and see you in 2023.