The Data Scientist’s Hierarchy of Needs

I haven’t written in a while. I have a good excuse. My babies recently turned one. Hurray!

I remember worrying, in the before times, that I wouldn’t know what my babies needed when they cried. That hasn’t been an issue.

They need very few things at first. You change their diapers and feed them every 3 hours. That works most of the time, for most babies.

Taking care of my babies for those first few months was still the most difficult thing I’ve ever done. The main difficulty was providing care around the clock. When they eat slowly, cleaning and downtime disappear. There was also the trauma of not being able to help one baby because you’re busy taking care of another. RIP, parents of multiples. We had some extra anxiety because we were first-time parents. Medical conditions can make all of this much worse.

I’m mentally recovering from those first few months. The babies are much happier now, even though they need and want more things.

The babies want to crawl around the house most of the day in order to try to kill themselves in various ways. It’s obvious even though they can’t speak. It’s also not an emergency if they don’t get to do everything they want to do right away. Being hungry, tired, constipated, and under- or over-stimulated can make them cranky. Not an emergency at first, but slowly increasing in urgency. Maslow’s Hierarchy of Needs seems relevant.

Cranky Data Scientists

Most companies have a lot of cranky ~~babies~~ employees. Lots of companies have paused hiring and/or are laying people off. Decision making about layoffs and other policy changes often happens high up in the org chart and is opaque to Individual Contributors (ICs). Not great for morale. Lots of companies are mandating a Return To Office, tweaking compensation formulas or benefits, making moves to please the new administration, and/or pushing ICs to use Artificial Intelligence. All of these things can rub tech workers the wrong way.

When my work buddies and I talk, we complain about lower-level issues. My previous post about stakeholder management covers a lot of them. There are other potential sources of discomfort, too. More generally, Data Scientists complain that something or other is preventing them from doing the work they want to be doing, to their standards. Their managers don’t get it, aren’t helping, or aren’t helping fast enough. A lot of the time, the ICs aren’t communicating effectively with their managers or others. I don’t want to minimize ~~their~~ our concerns, but it does remind me of my babies.

The Basics, via ChatGPT

I asked ChatGPT to come up with a Data Scientist hierarchy of needs. The (manually tweaked) results are a good jumping off point.

🧱 1. Physiological Needs (Survival Level)

The bare minimum to function in the role.

  • Access to data (no data = no data science)
  • Basic tools: a working laptop, Python/R, Jupyter, SQL, spreadsheets
  • Coffee ☕ or whatever fuel keeps you going

🛡️ 2. Safety Needs (Job Security & Stability)

Feeling secure in the environment and role.

  • Stable employment
  • Clean and documented datasets
  • Version control (Git saves lives)
  • Reasonable deadlines

👥 3. Belongingness and Love Needs (Team & Collaboration)

Being part of something bigger than yourself.

  • Supportive teammates (engineers, analysts, product)
  • Mentorship and peer feedback
  • Cross-functional collaboration (not just “go crunch this”)
  • Company values data-driven decisions

🏆 4. Esteem Needs (Recognition & Mastery)

Respect, status, and skill mastery.

  • Recognition for impactful insights or models
  • Presenting findings to stakeholders (and not being ignored)
  • Mastery of advanced techniques: NLP, deep learning, optimization, etc.
  • Being the go-to person for tough problems

🌄 5. Self-Actualization (Fulfillment & Innovation)

Doing your best work and pushing boundaries.

  • Driving strategy, not just implementation
  • Publishing work (blog posts, research papers, talks)
  • Building reusable tools or frameworks
  • Exploring passion projects

I think Walmart measures up fairly well here, although there’s always room for improvement. The datasets aren’t always clean and well documented. This might be the biggest issue for Data Scientists at Walmart. It’s a function of the retailer’s relatively long history and immense scale. We are making progress.

I’ve heard mentorship and useful peer feedback can be hard to come by if you end up on the wrong team. That’s probably true everywhere, and you can find help within Walmart if you knock on the right doors. A few years ago, I was helping my org try to stand up a formal mentorship and buddy-matching program. Maybe it’s time to revive that.

We don’t drive strategy or publish nearly as much as we should. This is another major issue, especially for more senior Data Scientists. Times are tough. I’ve personally heard very senior people say the right things. Stay tuned. I would say Walmart does relatively well on all the other measures.

At least in my area, the basic Data Scientist needs are largely being met. But there’s more to the story.

Inventory Placement

My main project for the past two years, Inventory Placement, is an interesting case study. I don’t want to criticize any of my coworkers or paint the project in too bad a light. The project is increasing customer satisfaction, saving Walmart billions, and reducing environmental emissions. But I was happy to spend a good chunk of time on a side project recently that felt a little more … fulfilling on a personal level (?). A general malaise does seem to have crept in. On Placement, our primary stakeholder on the Business side noticed that our velocity seems to have slipped. Now he and I, and probably others, are trying to dig deeper and understand what’s happened. I started meeting regularly with all the Data Scientists on the project, trying to understand what each is dealing with.

I previously wrote about my first project at Walmart. We replaced a complex greedy algorithm for Assortment Selection with a short script calling an optimization solver. We had a working prototype operational within a few weeks. We replaced a bunch of hacks that shaped the algorithm’s outputs with optimization constraints and modifications to the objective function. Business liked the results. A team of 10 relatively junior new hires took over from 45+ experienced consultants. We abandoned a solid but burdensome agile project management paradigm. I owned the optimization formulation while a Business lead, two Machine Learning Engineers, and a Data Engineer did nearly everything else. Clarifying the requirements and data contracts early on helped immensely. Everyone got along. The impacts were quantifiable and obvious.
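The flavor of that formulation can be sketched with a toy example: pick an assortment that maximizes an objective subject to a capacity constraint. A real implementation would hand a formulation like this to a MIP solver (via PuLP, OR-Tools, or similar); brute force stands in here, and every item name and number below is made up for illustration.

```python
from itertools import combinations

# Toy assortment selection: choose items maximizing expected revenue
# subject to a shelf-space cap. A real version would express this as a
# MIP and call a solver; brute force only works for tiny instances.
# All data here is hypothetical.
items = {
    "A": {"revenue": 10, "space": 4},
    "B": {"revenue": 7, "space": 3},
    "C": {"revenue": 6, "space": 2},
    "D": {"revenue": 4, "space": 3},
}
CAPACITY = 7  # shelf-space budget

def best_assortment(items, capacity):
    best, best_value = (), 0
    names = list(items)
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            space = sum(items[n]["space"] for n in combo)
            value = sum(items[n]["revenue"] for n in combo)
            if space <= capacity and value > best_value:
                best, best_value = combo, value
    return set(best), best_value

chosen, value = best_assortment(items, CAPACITY)
# chosen == {"A", "B"}: the highest-revenue set that fits in 7 units
```

The appeal of the solver-based version is that "shaping hacks" become explicit constraints or objective terms you can read in one place, instead of post-processing scripts scattered around the code base.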

Inventory Placement at times last year felt like the opposite. We replaced my optimization with a pseudo-greedy algorithm. We wrote scripts to modify our input and output data in various ways to shape our results. We started building a solid but burdensome agile project management framework. Some features were put into production months or even years after they were proposed. Requirements were missing, vague, or changing. Our team of 10 became 45+. Everyone seems to have a fancy job title. At some point, we had to stop and reconsider our “Ways of Working.” The potential impacts of new features were difficult to estimate.

That all sounds worse than it was. The pseudo-greedy algorithm is neat. The decision to implement it was clearly correct based on feedback on our initial results. We desperately need the additional emphasis on project management. Requirements gathering for the features we are working on now is challenging, to put it mildly.

The existing code base, the new features we work on, and the team’s organizational structure have all become larger and more complex. Maybe this is the nature of tech projects. Start small, scrappy, and, if you’re lucky, successful. Grow to the point where things become harder to manage and morale begins to break down. That sounds a bit pessimistic, maybe even nihilistic. What are the tangible issues on Placement? Has it gotten more difficult for the average IC Data Scientist to get their needs met and, if so, why?

Back to ChatGPT

Let’s go back and analyze what ChatGPT wrote earlier. I highlighted 3 of the 19 Data Science needs where there is room for improvement at Walmart. These same 3 issues have become major concerns for Data Scientists at Walmart. The new features we are working on are relatively sophisticated, and we’ve ended up looking for new input data. In a handful of cases, we found datasets that other teams built that looked promising but didn’t pass our initial QA/QC checks. The Data Scientists I’ve spoken with aren’t surprised or annoyed, but they have noticed that data issues slow our velocity and make communication with stakeholders more challenging.

There are also just more new features we are working on, leading to people feeling more isolated and making it difficult to get effective peer feedback. Possibly just a curse of a successful, large and growing project.

Data Science not driving strategy is maybe the most concerning issue on Placement, and has been for some time. An optimization engine or two is at the heart of the project, but somehow most new features are being planned without much Data Science input. Our stakeholders like to divide and conquer, agreeing on a plan amongst themselves before scheduling meetings in which 4 or 5 conspirators convince 1 or 2 Data Scientists to develop code to modify our input or output data in some new way. There’s often little in writing for the Data Scientist to share with his or her team.

So these are all issues, but I’ve heard other concerns that haven’t come up yet. Let’s go back and ask ChatGPT for more specifics about a Data Scientist working on an optimization-based project in the Replenishment space. We can skip the Survival Level and Self-Actualization needs. Everyone is surviving, and the complaints I’m hearing aren’t about a failure to find deep meaning.

🛡️ 2. Clarity & Support (Security Layer)

  • Clearly defined objective function and constraints.
  • Context around the business domain, stakeholders, and expected outcomes.

It’s interesting how ChatGPT reframed this layer of the hierarchy. This short list covers a lot of the complaints I hear on Placement, including the ones I mentioned above about driving strategy. But I’m not sure these are really low-level needs of a Data Scientist. My coworkers and I expect that it is part of our job to suss out objective function(s), constraints, and Business context. But we still complain. Doing this well is pretty advanced work for an optimization Data Scientist in general, and it seems especially difficult on Placement. We could all use more help.

More emphasis on documentation would improve things. The whole project team has started to document a bit more, but it doesn’t feel like there is any quality control on that documentation to date. Some of the requirements documents I’ve seen have been shocking. The more conspiratorial among us think it’s on purpose. I hope I’m not offending anyone here. I think my position is well known. We are learning how to work together and what a well-crafted technical requirements document, for example, should look like. Writing these documents and meeting these needs isn’t as easy as it sounds, because the new features we are working on are complex and vague. What are the expected Business outcomes or acceptance criteria when the task is to choose new geographic units for use in an optimization engine? It can help for Data Science to produce sketch documents when we aren’t getting the documents, or appropriate versions of the documents, from the project team. One of my goals is to offer assistance to teammates having difficulties here.


🔁 3. Iteration & Experimentation (Belonging Layer)

  • Tools for profiling performance and diagnosing infeasibilities.
  • Logs, metrics, and test suites for monitoring solution quality.
  • Working with data engineers to ensure pipeline robustness.
  • Feedback from stakeholders on usability and interpretability of results.

ChatGPT threw me for a bit of a loop here. It’s not obvious what profiling tools, test suites, and available Data Engineers have to do with Belonging. Maybe they make Data Scientists feel like they are on a project where they are set up to succeed? The first 3 items aren’t things that come up frequently in discussions with my Data Science coworkers. But they do seem relevant.

I do feel like there is room for improvement here on Inventory Placement. I’ll be discussing it with my teammates. I have personally felt for a long time that a good Data Engineer and a Machine Learning Engineer who work horizontally across Placement project areas might be the missing pieces. We could build more robust code, logs, and tests. We are definitely struggling with tech debt now. The growing scale and complexity of the project and project team again make the situation worse. It’s also worth pointing out that a lot of our recent efforts have gone into defining the optimization problems we wish to solve rather than, say, improving solution algorithms. A lot of the bullet points listed above will almost certainly become more relevant later.

The last bullet point is particularly salient. A complaint I have heard a lot recently is that there is no great way to judge the quality of recent Data Science work. And/or stakeholders aren’t concerned about quality and just seem interested in ticking a box on a feature-tracking spreadsheet. The former I hear more from stakeholders, and the latter from the Data Scientists themselves. This is a complaint I have heard at other organizations as well, particularly when I worked with government agencies. The magnitude of the issue does seem correlated with organization, project, and team size, complexity, and hierarchy. I know that on Placement we actually do get a lot of feedback, both from senior leadership and from downstream teams impacted by our work. Making that feedback more visible would be a step in the right direction. It might be time to have a Data Science team dedicated to evaluating our results. We’ve started building a stochastic supply chain simulation model that could help. I feel like we are slowly moving in the right direction here. Maybe if I surface this concern more, we can make progress faster.
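For a flavor of how such a simulation could judge results, here is a minimal, entirely hypothetical sketch: score a placement decision by simulating random demand against the stocked quantity and measuring fill rate. The function name, demand model, and numbers are all my own stand-ins, not the actual model.

```python
import random

# Hypothetical sketch: evaluate an inventory placement by Monte Carlo
# simulation of demand, scoring it by fill rate (fraction of demand
# served from the assigned stock). A real model would handle lead
# times, transfers, and multi-echelon effects; this does not.
def simulate_fill_rate(stock, mean_demand, n_trials=10_000, seed=42):
    rng = random.Random(seed)
    served = demanded = 0
    for _ in range(n_trials):
        # Stand-in demand model: exponential draw rounded down.
        # A real simulation would use a fitted demand distribution.
        demand = int(rng.expovariate(1.0 / mean_demand))
        served += min(demand, stock)
        demanded += demand
    return served / demanded if demanded else 1.0

# Compare placing the same 100 units against two demand profiles:
# the understocked node serves a smaller fraction of its demand.
rate_a = simulate_fill_rate(stock=100, mean_demand=80)
rate_b = simulate_fill_rate(stock=100, mean_demand=120)
```

Even a toy score like this gives stakeholders something concrete to argue about, which is most of the battle when "did this feature help?" currently has no agreed-upon answer.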


🏆 4. Esteem Needs (Recognition & Mastery)

Respect, status, and skill mastery.

  • Acting as the “product owner” of the optimization engine — balancing performance, maintainability, and user needs.
  • You’re consulted for decisions beyond your immediate scope — from system design to business strategy.
  • Your contributions are acknowledged across org layers — not just by other data scientists.

This just adds more color to the issue I already mentioned about how Data Scientists don’t seem to be driving strategy. Part of the problem is that we’ve got a lot of very senior people working on this project, many of whom have a strong background in supply chain logistics. If there’s one thing everyone on the project can agree on, it’s that there are too many cooks in the kitchen.

I think the Engineering team is particularly sensitive on some of these points. I don’t feel informed enough to say much more.

I score better than many of my Data Science coworkers on the Esteem Needs. It helps that I’ve been on the project since its inception. I’ll be working to get my coworkers more recognition and ownership.

One of the tricky parts will involve giving everyone opportunities for ownership while also ensuring we have a cohesive team with effective peer review and manageable tech debt. Our Data Science team is already split across 5 or 6 islands of work. That’s the only way to get everything done. But it’s difficult to keep track of what is happening on other people’s islands. We planned to have deep-dive technical sessions at our weekly Business review meetings. These would help. It’s a shame people are a bit hesitant to volunteer to give demos. Maybe I can volunteer them.

Even though I believe that I score better than my coworkers on Esteem Needs, I have to admit that I am often frustrated. There’s a lot of praise that rings false or doesn’t translate into action. I’ll also be giving that feedback.

One of the awkward things is that different teams and people are competing for the same respect and status. Too many cooks in the kitchen. Redefining our Ways of Working helped in some respects but made things worse in others. It’s easy to point to the produced documents and complain that the other team isn’t living up to their end of the deal. A single threaded owner would help. Consequential sprint retrospectives could help.

Conclusion

So what went wrong on Inventory Placement? I’m not sure all that much has gone wrong but I will be meeting with my coworkers in the next few weeks to try to understand more.

It feels like we are having a hard time defining acceptance criteria for new tasks, finding acceptable input data, managing tech debt, and giving people sufficient ownership. The growing size and complexity of the project and project team org chart are largely to blame.

Just like with babies, it’s easier to list the needs than it is to keep them met around the clock. Just like with toddlers, the situation is complicated by issues that aren’t urgent until they blow up. Wish ~~me~~ us luck!
