Writing

Dos and Don'ts for ML Hiring

This is primarily for my future self. These are observations based on my own experience of 2 years at Verloop.io and helping a few companies hire for similar roles.

Do

  • Seniormost hire first: Start by hiring the most senior person you're going to hire, e.g. the ML Lead (assuming you already have a CTO)
  • Have a means to tell whether your investment in data science is working out
  • Closest to User first: Hire the person who will consume the data to build for the user first
  • Sourcing: Begin sourcing early and over-emphasise two channels: Referrals and Portfolios
    • Typically, in India - expect:
      • ~2 months to close a full-time role at early career (0-3 years),
      • ~3 months to close a mid-career role (3-7 years), and
      • 6+ months to close a senior hire
  • If a developer has open source code contributions in the last 2-3 years, consider waiving the coding or algorithmic challenge to speed up the interview process
  • Pay above market cash salaries
    • 12-18 months from now, when your ML Engineer has internalised all the requirements and company culture and built a bunch of important tooling, she will get an offer which is 2-3x of today's pay. If you're already paying above-market salaries, a 20-30% jump is quite often enough to retain many folks
  • Have at least 3 versions of your shipping timeline
  • Do hire full stack Data Science people/teams. If you're hiring early members for your team, this is a practical necessity. An example of T-shaped skills could look like

Don't

  • Rely on HR or your usual backend engineering hiring channels to work well for you, in general
  • Don't hire the person who builds the means to move data (i.e. data engineering) before hiring at least 1-2 stakeholders in ML. In other words, hire ML before data engineering
    • Why? This is because it's cheaper (and often faster) to change ML modeling approach than to make changes in data engineering pipelines
  • Don't start by hiring an intern to implement papers or take things to production before you've done them
  • Don't expect data science to deliver or ship at the same "user value" pace as Product Engineering
    • Why? Data Science suffers from the twin problems of being new and experiment-driven
  • Don't assume that because you have so much data, and all of it is queryable, it's all usable
  • Hire ultra-specialists e.g. post-docs and PhDs too early, barring products which require invention and not application

Why I Quit Data Science

Question from a friend: I am interested in knowing how did you come to this decision of moving to SWE from DS/MLE. Since I've been asked a variant of this question quite a few times, I thought it would be good to share my answer.

What kind of research did you do to get to this decision?

I spoke to a lot of people at both big companies and startups. I also spoke to folks across multiple markets: Singapore, India, US & Europe. I primarily spoke to people with more than 10-12 years of experience. This made a big difference to my perspective.

What were your considerations while making this decision?

Skills

This is how I understand the world today: There are 3 primary functions around data: data engineering, modeling (e.g. predictive) and data analytics.

I could keep going deeper into modeling e.g. learning more about CNNs and Transformers. Between writing the NLP Book and professional Machine Learning, I'd guess that I'm in the top 20-30% of the world doing this. The journey to get from here to being the best is hard and I'm not sure if I'm going to be able to do it.

The field also suffers a bit from the Red Queen effect on the applied side of things. I'm not sure if I want to keep doing it 5, 10, or 20 years from now. I started doing Machine Learning because I was interested in the field and I was curious about how it would work.

It's no longer about the thrill of solving a puzzle/problem. The roles I have access to come with the drudgery of making the same pipelines work in similar ways and then applying them to different problems.

I'd much rather add another skill and get to the top 25% in it -- and then quickly rise to the top at the intersection of the two. This will also be easier as I've tons of novelty and new ideas to learn.

Since analytics roles are neither well respected nor well paid, the process of elimination works. I'd rather be a platform engineer than a data/product analyst.

Competition

Within this, let's talk separately about pre-series D startups and big companies (e.g. FAANG/MAGA) for modeling roles. Startups are usually open to hiring folks without an MS/PhD degree, while big companies are more open to hiring folks with an MS/PhD.

For modeling roles at big companies you will be competing with folks with a PhD. For startups, I often see they end up hiring better trained folks as they scale up and relegate older, less 'specialised' folk to roles closer to engineering (e.g. API Design, uptime) and away from modeling.

It's also much harder for me, personally, to find truly exceptional Machine Learning mentors, but relatively easier to find proven, battle-tested, quite senior engineers. And as much as you might underestimate the role of coaching, I believe that in our craft it can save you 4-5 years of learning time.

Title/Impact

I've the least confidence in this being true over a longer duration. But I'm mentioning it here since a lot of senior people do think about this.

Growing within modeling-related roles is hard and you hit the ceiling as Head of Machine Learning. Notice that in most cases, you are not even Head of Data, you're Head of Research or ML or some function within Data. The Head of Data in turn reports to the most senior Engineering Leader e.g. the CTO.

This means your influence over things which shape your day: tooling, infrastructure, product direction, org structure, promotions etc. is limited. You can't even learn these things.

I'd like to keep open the option of becoming an Engineering Manager/VP Engineering/CTO in a few years. I'd much prefer that to Head of Data Science or Analytics. This option is so much more valuable to me that I'm happy to pay a price to "buy" it.

What was/is the goal of this particular switch? What were you trying to optimize?

I'm optimizing for being great (but not best) at the intersection of 3 things instead of 1 narrowly, clearly defined role. I'm trying to get to the top of the intersection.

I was also bored by the mundanity of problems you encounter in typical early-stage startups. The need to trade off personal-notion-of-quality for speed is sometimes a bit of a problem, but I usually enjoyed the challenge.

Why not Machine Learning Engineering at a Big Company?

I fear that this role combines the worst of two worlds. You've the skills of a backend engineer: you can design microservices, implement them, scale them, deploy them, and manage the infrastructure. But you also have the skills of a Data Scientist: you can build models, train them, deploy them, and manage the experimentation infrastructure.

You don't get paid or recognized for either of them. The backend developer thinks of you as an "ML guy" and the Data Scientist thinks of you as a "Backend guy". This is made worse at a Big Company because they tend to reward specialists via promotions. You're going to get underpaid for both roles.

Not to mention that a large fraction of this knowledge is getting outdated faster than I can learn. Of course, you might be a 10x faster, better learner than me - in which case this blog post is not meant for you.

Why not take a 1 year Research focussed Sabbatical?

Well, because companies which ask for skills acquired via an MS/PhD are often not willing to pay for a 1 year research sabbatical. It'd not be that much better than being endorsed by Dr. Andrew Ng, writing a book, mentoring folks for ACL papers and speaking at PyCon India.

What information did you find for and against this switch?

Stepping away from Machine Learning Lead roles can be a massive cash and title/designation downgrade. It definitely turned out to be true for me.

My alternate job offer was a Series B/C Machine Learning Lead, instead of a Platform Engineer. I would not be happy with the role, but I'd be very happy with the salary. It'd be 3-4x in cash, and 4-5x in total compensation terms. Another way to look at this: I took a 75% cut on my cash compensation.
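To make the arithmetic behind that cut concrete, here is a minimal sketch. It assumes the cut is measured relative to the offer I declined, so an offer worth N times my accepted pay means giving up 1 - 1/N of it. The helper name `pay_cut_fraction` is made up for illustration.

```python
def pay_cut_fraction(offer_multiple: float) -> float:
    """Fraction of the declined offer given up, when that offer
    is `offer_multiple` times the pay actually accepted."""
    if offer_multiple <= 0:
        raise ValueError("offer_multiple must be positive")
    return 1 - 1 / offer_multiple


# Hypothetical multiples spanning the cash and total-compensation ranges above.
for multiple in (3, 4, 5):
    print(f"{multiple}x offer -> {pay_cut_fraction(multiple):.0%} cut")
```

Under this assumption, the 75% figure corresponds to the upper end of the cash range (4x); at 3x the cut is closer to two-thirds.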

I'm betting that I'll have a lot more fun doing this, but I'm also betting that I'll be a little more successful - which will compensate for this over a 4-10 year chapter of the career.

In addition to the cash and title hit, there's a bit of social shaming: People might be inclined to assume that you were not quite good enough as a Data Scientist and that is why you moved to SWE. I don't care about that enough for it to influence my decision, but I do care about it, hence it's worth mentioning.

My Machine Learning skills will also atrophy with time. However, I expect to be able to get back to similar productivity fairly quickly in a few years, because the half-life of knowledge in the field is really short and tooling improvements make it easier to ship well.

Anti Skills

You learn a well-paying skill and years later - it comes back to hurt you in unexpected ways. That's an Anti Skill.

Consider this hypothetical: You start your software engineering career and build a reputation as someone who is good at iOS development. Each year, the money you make keeps improving as you keep getting better at it.

The downside? You'll find it hard to get job offers outside of iOS development [1]

Note that this increased pay might still be less than even starting pay in some other fields, say Data Science -- but you've Golden Handcuffs on you now, don't you?

Congratulations! You've hit a local maximum!

But what if you're someone who enjoys doing iOS development? I'm super happy for you! You'll most likely not just be good at it, but great at it, and enjoy it.

What is an Anti Skill?

The discussion gets more interesting when you consider that you're not just a software engineer anymore, but a mobile app developer. When you say "I'm good at iOS", the hiring market hears it as "I know only iOS development".

Unnoticed by you, market forces have limited you to being a mobile developer. {{< tweet user="ponnappa" id="1415323073444597761" >}}

But when you started, did you know that this would happen? That learning these skills will reduce your optionality in the future?

An "Anti Skill" is a skill, or set of skills, which when advertised takes away your future choices.

Anti Skills prevent you from adapting quickly to the changing environment around you. Anti Skills are skills that you don't want to tell people about. This has nothing to do with whether it's a good skill to learn or not. Some of these skills are actually good skills to learn e.g. React, SQL, Android/iOS off the top of my head.

What can we do?

To prevent this, keep your identity small. That includes not self-identifying by a technology, platform or worse, a JS Framework.

This will make it easier to see when/if you're stuck. Another way to say the same thing: if you think of yourself as a mobile developer, you might be better off thinking of yourself as a software developer specialising in iOS development.

This way, when something is not working, you can see what's going on.

If you're in a position where you're stuck at a local maximum -- resist the temptation to go up the ladder. Think of taking titles into your identity as taking on debt. Job titles are not your identity. They're prestige handcuffs, and you're not going to be able to get them off you easily.

It might be harder to get out of these impressions at your existing job, because they tend to be sticky. In such scenarios, moving to a different team within the same company or a different company altogether might be preferred. I know of at least one case where someone had to change jobs to get a 'soft' career reset.

This reset, speaking from personal experience, is quite uncomfortable and requires you to fall down a skill-cliff and climb up again. If you're considering something like this, feel free to hit me up with your thoughts. I'm here to help.

The worst of these scenarios is that in a few years from now, you are great at something you don't enjoy anymore. And now, it's too late to do something else because you're stuck with it. Act now!


[1] Changing to a different role and/or adding skill sets might still be possible within the organisation you already work at. Lateral shifts are usually harder.