Data Science Stakeholder Management

I previously wrote about two Data Scientist pain points: the thrash, when you’re not sure what to work on next, and the slightly toxic coworker. But my Slack messages and 1:1 meetings are full of complaints that are not covered by either of these scenarios.

My coworkers and I complain about challenging interactions with stakeholders. Requests and requirements that are difficult to accommodate or that just don’t make sense. Meetings. So many meetings. Awkward conversations. In meetings, via emails, or in instant messages. Incorrect interpretations of Data Science work. So this post is about all of that.

I’ll try to be helpful and add notes on the best ways I’ve found to deal with certain issues. I write from the perspective of a Data Scientist who has mainly worked on applications of Optimization (math programming), but will try to generalize.

Request Management

Probably the main complaint I have heard from coworkers past and present is that stakeholders are asking for too much in too little time.

“They ask for a different analysis every day and each time they want it done by the end of the day.”

“How long do they think it takes me to run the code?”

“They are asking for me to be in another recurring meeting. How will I have time to do the real work?”

“There is no time to understand the data, no time for research. We are accumulating massive amounts of tech debt.”

“This is not what I came here to work on. Why would the company want ME to work on THIS?”

“There are 12 business people asking for stuff at the same time. We have a 4 person Data Scientist team. And Liz just got here and Jim doesn’t know SQL. &?*#!”

Documentation and active management can help. Make a backlog for your team, or even just for yourself if no one else has made one. Refer stakeholders to the backlog when they make a request. Get your Data Science Manager or Project Manager involved if you feel overwhelmed. They can make and manage the backlog. They can be the guardian of your time, the filterer of requests. Have efficient meetings with a limited set of invitees and an agenda set ahead of time. Request changes and even turn down meeting requests if you find yourself in too many inefficient meetings. It is okay to say no.

In Production

There is often a particular crunch when Data Science models are put into production and users first start seeing the results. A small expert team can build and productionalize a model that can help a much larger number of end users. But the small team may not be able to respond to all the feedback generated during model use.

Train a larger team to respond to user feedback. This team doesn’t necessarily have to be experts in Machine Learning or Optimization. I would recommend that the team who built the original model develop a playbook for responding to ad-hoc requests.

I have found that the majority of interesting results generated by Optimization algorithms are generated by interesting features of input data. This is true in Machine Learning too but Optimization seems especially … not robust. Automated input and output data validation checks can help. Set these up as soon as possible.
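As a rough illustration of what such checks might look like, here is a minimal sketch in Python. The field names (item_id, demand, unit_cost), the capacity rule, and both function names are hypothetical, made up for this example. Real checks would depend on your own data and model.

```python
import math


def validate_inputs(rows):
    """Return a list of human-readable problems found in the input rows.

    Each row is a dict with the hypothetical keys item_id, demand, unit_cost.
    """
    problems = []
    seen_ids = set()
    for i, row in enumerate(rows):
        item_id = row.get("item_id")
        if item_id is None:
            problems.append(f"row {i}: missing item_id")
        elif item_id in seen_ids:
            problems.append(f"row {i}: duplicate item_id {item_id!r}")
        else:
            seen_ids.add(item_id)

        for name in ("demand", "unit_cost"):
            value = row.get(name)
            # Treat both None and NaN as missing.
            if value is None or (isinstance(value, float) and math.isnan(value)):
                problems.append(f"row {i}: missing {name}")
            elif value < 0:
                problems.append(f"row {i}: negative {name} ({value})")
    return problems


def validate_outputs(solution, capacity):
    """Sanity-check a solution (item_id -> quantity) before anyone sees it."""
    problems = []
    total = sum(solution.values())
    if total > capacity:
        problems.append(f"total quantity {total} exceeds capacity {capacity}")
    for item_id, qty in solution.items():
        if qty < 0:
            problems.append(f"negative quantity for {item_id!r}")
    return problems
```

The idea is to run the input check before the solver and the output check after it, and to route any non-empty list of problems to a person rather than to the end users.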

Also consider having a team dedicated to researching model enhancements, especially for models at the core of business operations. I would argue that the research-focused team DOES necessarily have to be experts. I’ve been a Data Scientist long enough to see models maimed or replaced by less elegant or even less functional models after the original project team turns over.

Requirements Management

Another of the main complaints I hear from other Data Scientists is that the requests or requirements defined by the product manager or business don’t make sense.

Typically the concern is technical in nature. They are asking for something that cannot be done in the way that they are talking about doing it. This is especially problematic when the requirements are not just high level guidance but include low level details about how the Data Science model should work.

Sometimes I also hear from Data Scientists with domain expertise who question stakeholder logic. This is me on transportation and logistics projects. Tech does a pretty poor job of recognizing the domain expertise of, ahem, more seasoned Data Scientists. Sometimes more seasoned Data Scientists have a difficult time adjusting to the role of a Data Scientist. Startup veterans can struggle with just how little they oversee at a large company.

One variation on this theme is when the high level requirements are missing. They are asking for a system to help them make this decision but I have no idea how they want me to make that decision.

In the worst cases, the high level requirements are missing but the technical details are specified. Welcome to government work! I kid.

“They want me to model price elasticity but we have only ever charged customers one set price.”

“They want me to apply my model from the other project here but we have millions of unique customers and my model can’t support that scale.”

“They want me to optimize our assortment in order to maximize sales while ignoring any operational constraints. The solution is to put all possible products in the assortment. Do I just build an algorithm for that?”
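The assortment complaint is easy to demonstrate on a toy example. The sketch below is my own illustration, not anyone’s production model. It assumes each product has a fixed, positive expected sales number that does not depend on what else is in the assortment (no cannibalization), and it brute-forces every subset, which only works for a handful of products. The product names, the shelf_space numbers, and the function name are all made up.

```python
from itertools import combinations


def best_assortment(expected_sales, shelf_space=None, capacity=None):
    """Brute-force the subset of products with the highest total expected sales.

    If capacity is given, subsets whose total shelf_space exceeds it are skipped.
    """
    products = list(expected_sales)
    best, best_sales = (), 0.0
    for k in range(len(products) + 1):
        for subset in combinations(products, k):
            if capacity is not None:
                used = sum(shelf_space[p] for p in subset)
                if used > capacity:
                    continue  # violates the operational constraint
            sales = sum(expected_sales[p] for p in subset)
            if sales > best_sales:
                best, best_sales = subset, sales
    return set(best), best_sales


sales = {"tea": 40.0, "coffee": 55.0, "juice": 30.0}
space = {"tea": 2, "coffee": 3, "juice": 2}

# No constraints: the "optimal" assortment is simply everything.
print(best_assortment(sales))

# One operational constraint turns it into a real decision.
print(best_assortment(sales, shelf_space=space, capacity=4))
```

Under these assumptions, the unconstrained call returns all three products, while the capacity-limited call has to make a trade-off. The constraints are what make the problem worth modeling.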

There are variations on this complaint that involve objectives which conflict with one another or with the objectives of another project or team.

The situation is more tense if the requirements change or these types of complaints surface midway through a project.

This is especially true on Optimization projects. Changes to objectives or constraints can break problem structure with dramatic implications for runtimes. Solution algorithms or data pipelines may need to be rebuilt.

It can help to get involved in planning discussions early and to be vocal. Let the broader team know if anything sounds especially challenging or unrealistic from a Data Science perspective. Make sure your position is known. Bring your concerns up any chance you get. Talk to other Data Scientists to make sure your concerns are valid. At the same time, commit to trying to make the project work regardless of what the broader team agrees on. Oh no, am I just paraphrasing crap Amazon leadership principles now?

It can also help to show literature or results that back up your positions. The vast majority of Data Science projects involve building models similar to models which have been built before, at the very least in academic papers on toy examples.

Research in the Real World

The challenges described above are complicated by the need to do research. It’s often not obvious which requirements will or will not pose particular problems. Business and operations staff are used to hashing out issues during relatively quick and intense meetings. But Data Science issues often cannot be solved, or even identified, like this.

Stakeholders will want to be involved at all times and to track progress continuously. But there are often lengthy periods where the team needs to run experiments that are only interesting from a Data Science perspective or which end up being a dead end. Or to build up the code structure. Or to collect and wrangle the data, etc. This is especially true towards the start of a project, roughly between the planning and productionalization phases.

My advice here is to get business results in front of stakeholders as soon as possible. Even if you aren’t confident in those results. Do not show stakeholders the mathematical programming formulation. Do not discuss the hyperparameters of your Machine Learning pipeline. The discussion will lose focus. You may be cajoled into making changes that do not make sense technically.

It’s easy for me to say this sort of thing but difficult to get those early results fast enough. If you are struggling, I’d go back to my advice above to lean on your Data Science manager. Communicate a lot during the planning phase before the research and infrastructure work. Ramp up communications as soon as the exclusively Data Science work is done and re-invest in stakeholder relationships.

Confidence and Misinterpretation

My advice about early results runs somewhat contrary to lots of advice you’ll see about Optimization projects specifically. If you show results to stakeholders too soon, they may misinterpret the results or see something off and lose confidence in your work.

These are totally valid concerns, fitting in nicely with the last of the common complaint archetypes.

“The business saw one result that they didn’t like and spent the whole meeting lecturing me about how the business operates.”

“Jan from the ops team kept referring to the output as probability of conversion but that’s not what I am modeling at all.”

I would say that the concerns will be worse if you show models in place of results or if you wait too long to show results. The results should be universally understood. If there are inconsistencies in how people view the results, it’s best to uncover these inconsistencies early. If there are troubling results, it’s best to uncover those troubling results early. A healthy Data Science environment will allow for, and even encourage, model revision.

It helps to be neither overly confident nor overly feeble when presenting results. Business and operations staff, in particular, will know details of how your company works better than you do. Don’t assume that their concerns are trivial. But also don’t agree to change everything because of one result that one stakeholder didn’t anticipate.

Talk with your tech lead or manager when unsure what to do next. Consider escalation. Your Director may not realize that your model has become famous on the Ops team and that they are attempting to use the results 12 different ways. But don’t escalate unless you’ve tried speaking with your stakeholders first and are seriously convinced that the project setup is wrong or concerned that a mistake is about to be made. Pick your battles.

Consulting taught me not to take comments and questions at face value. Try to understand why a stakeholder is making a particular comment. What experiences have led them to speak up here, now? What larger point are they trying to make? How can I address the larger point, moving forward?

Was that helpful? Obvious? It’s hard to be interesting without saying anything controversial. Especially on a dry, somewhat negative topic. I’ll try to have something more upbeat and more interesting next time. Special thanks to Ethan and Alice for helping me think through some of this.