Ask these 3 questions to make sure an employer is ready for you
Data science has been called “the sexiest job of the 21st century” — a sentiment I’d believe if I saw more business leaders hiring data scientists into environments where we can be effective. Instead, many of us feel misunderstood and invisible.
The sexiest job of next century
We are the people who help inspire new directions for your business, reduce your risk of setting important decisions on fire, and automate the ineffable through machine learning and AI. We make your data useful, yet you make us live in resource squalor. You ask us to make our peace with:
Unskilled leadership – If you don’t have personnel skilled at leading and managing the data science function, we’ll have a miserable time. Without decision-makers skilled at assigning work appropriately and making data-driven decisions, data scientists are effectively useless.
No data – If you hire data scientists before data engineers, it usually means we have no data to work with, so either we build the data engineering function for you first or we’re forced to tuck and roll. If we stay, we end up doing a job other than the one you claim you’re hiring us for. I’ve said it and I’ll keep saying it: you need high-quality data for data science to be effective. We’re not magical leprechauns, so we can’t make something out of nothing for you.
Nasty tools – Data science developer tools are a misery. The ecosystem is fragmented, especially when it comes to AI, and even the best options are far from perfect. There’s always something that makes the ride bumpy.
If you’re interviewing for a data science gig, make sure you grill your potential employer about their plan for all three of these points so you don’t end up in a sad spot. Don’t forget to ask about people: whose job is it to make sure you have data? Who gets fired if none of your insights are used for anything? Who picks the tools you use and makes sure they play nice with all the other infrastructure?
If the answer to all of these is YOU – perhaps because the company doesn’t know what data science is (but wants it anyway) – then set your expectations appropriately. You’re going to have to perform multiple jobs yourself, tackling them in approximately this order. Trained for #6 at school? You probably won’t get to those cool Bayesian networks for a few years (and you’re unlikely to do the actual lifting on them when the time comes because you’ll be too busy managing the team you built). It might be a perfect fit for you, but please do look (and think!) before you leap.
I’ve written plenty about leadership and data, so it’s high time I mentioned tools. Applied data scientists (including those working on ML/AI) don’t want to build our tools from scratch (that’s a different job – if we wanted it, we’d be in it already). For example, we’d much rather use an existing package to make a histogram than write the code that displays rectangles to a screen. Asking us to roll our own is like asking you to build your own microwave if you’re opening a restaurant. We’ll build them if we have to, but we’d prefer to jump straight into the cooking.
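To make the histogram point concrete, here is a minimal sketch of what "rolling your own" looks like even for a crude text-only version. The function name `text_histogram`, its parameters, and the sample data are made up for illustration, and it assumes integer values and an integer bin width. Existing packages cover the same ground in a single call (for example `numpy.histogram` for the counting and `matplotlib.pyplot.hist` for the drawing), which is the whole point.

```python
from collections import Counter


def text_histogram(values, bin_width, bar_char="#"):
    """Hand-rolled histogram: count values per bin, then draw one text bar per bin.

    Illustrative only; assumes integer values and an integer bin_width.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    if not values:
        return []
    # Map each value to the left edge of the bin it falls into.
    counts = Counter((v // bin_width) * bin_width for v in values)
    lines = []
    # Walk from the lowest bin to the highest so that empty bins still get a row.
    edge, last_edge = min(counts), max(counts)
    while edge <= last_edge:
        lines.append(f"{edge:>4} | {bar_char * counts[edge]}")
        edge += bin_width
    return lines


ages = [23, 25, 31, 34, 35, 38, 41, 42, 47, 58]  # made-up sample data
for line in text_histogram(ages, bin_width=10):
    print(line)
```

And that is before axes, labels, float-friendly bin edges, or actual rectangles on a screen, all of which a plotting package already handles.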
Working with Satan
Sometimes the proprietary tools that management foists on data scientists are even worse than the ones data scientists could cobble together themselves. I remember one that my friends had nicknamed “Satan” – as in, “Yeah, I know that takes one line in R but you should probably budget all day to get it working in Satan.” It’s hard to go through the day with a song in your heart when the tools at your disposal are horrible.
Take a designer’s perspective
Sometimes the trouble with available tools is in the eye of the beholder – perhaps the root of your frustration is that you’ve picked up a tool that wasn’t made for you. Let’s take a look at two tools of Google origin. Keras is not only a beautiful API, but it was built with the data scientist in mind. For example, Keras’s error messages are designed to guide the data scientist’s next move, so they’re concise and friendly-looking, while an equivalent mistake in TensorFlow spits out a text jumble of Dickensian proportions.
This shouldn’t surprise you if you’ve got your design thinking hat on; as the industrial lathe of AI, TensorFlow was not originally designed with data science users in mind. It was made for researchers breaking new ground at Google scale… and it’s good at what it’s built for. It’s also a tool you often feel like strangling.
The great news for us data science types is that even TensorFlow is getting more cuddly. The new 2.0 release is moving in our direction and it shows. Let’s cheer them on towards the day data scientists can actually say “I love TensorFlow” (as opposed to “I tolerate TensorFlow because it’s the only thing that handles my data at this scale”).
I’m delighted to be part of TensorFlow initiatives that explicitly identify data scientists as primary users. One example I’m looking forward to telling you all about in a future post is the What-If Tool, which makes model understanding, bias detection, and ML data exploration easy. The team included a user experience designer tasked with making data scientists happy… from Day 1! You can sneak a look at the results here.
If it’s not made for you, it probably won’t fit you
Before you invest in learning a tool, take a moment to think about where it came from and which communities its builders are courting as they steer the development of new versions.
Try before you buy
While we’re at it, if you’re calling the tooling shots for your organization, don’t commit to a tool before your data scientists have playtested it. You’d think this would go without saying, but Satan suggests otherwise. Thinking about picking up a tool built for analysts in the retail space and plugging it right into your healthcare company? Oh dear.
You might want to consider some tooling support from dedicated engineers so your data scientists aren’t miserable. Your current analysts probably didn’t sign up to deal with what will feel like a piece of junk to them, and they might not have been around the block enough times to know to ask about tools (and engineering support for these tools!) during their interviews.
If you’re a data scientist looking for a new gig, don’t forget to check that whoever you’re about to trust with your career understands your needs. Ask potential employers pointed questions about data, decision-makers, and tools. Make sure they have what our kind needs to be happy and effective. Don’t assume employers are ready for you this century. If you love the work, I’d hate to see you become yet another Director of Data Science in a company with no data!