Skills and Qualities of Top Tier ML Researchers

By Wojciech Gryc on December 8, 2020

 

For those aspiring to work in machine learning, the “Machine Learning Researcher” title is a coveted one. It’s an entry point into research, into machine learning, and into leading teams and strategies at AI-focused companies.
 

Yet many people make mistakes when presenting themselves for ML Research roles or other highly technical data science roles.
 

Let’s review the case of Felix (not his real name), a candidate I recently interviewed for a Senior Machine Learning Researcher role. Felix has a PhD in computer science and has built several end-to-end models in computer vision.
 

PhD? Check. Modeling experience? Check. Project experience? Check.
 

Passed the screening call? No. He didn’t make it.
 

Standing Out in ML Research: Good to Great to Excellent Applications
 

I tend to group applicants for ML Research jobs into three categories:
 

  1. Good: those who know how to use existing tools and frameworks to build a model.
     
  2. Great: those who can design solutions to a non-standard problem. This means implementing algorithms or designing an end-to-end solution to a problem that isn’t well documented (so not sentiment analysis of movie reviews, and not classifying cats versus dogs).
     
  3. Excellent: those who can invent completely new algorithms, build new architectures for business problems, or convert an ill-defined problem into a well-defined one.
     

Contrary to popular belief, having a PhD or published papers is less relevant here. Both signal that you are capable of being an excellent candidate, but neither is necessary or sufficient to get a job like this.
 

Felix was able to show he could be a good candidate, but wasn’t able to illustrate how he’d work in a novel problem space or use a new methodology.
 

It’s important to note that not every company requires someone at the excellent level. Most companies, in fact, don’t require their ML teams to build new algorithms or define previously ill-defined problems. As an applicant, you should clarify whether this is the case for the role you are applying to. Similarly, if you want to work on leading-edge research, then target companies that test for great or excellent candidates.
 

Some Examples of Excellent Projects
 

If you want to prepare for an ML Researcher role like the one I described above, there are a few strategies you can use.
 

Publishing papers at conferences, on arXiv, or elsewhere. This might be the hardest one to achieve and will likely take the longest. Inventing or publishing novel research makes you a researcher by definition, I believe!
 

Implementing algorithms from papers or other programming languages. As with my earlier critique, it’s often frustrating for hiring managers to meet ML Research candidates who say they know PyTorch, Keras, or TensorFlow but have only built models outlined in tutorials. Challenge yourself to implement an algorithm from a research paper and train it. This demonstrates your ability to understand and implement cutting-edge research.
 

One of my first jobs required me to read a fluid dynamics paper and implement the model in MATLAB; it took an entire summer. Don’t despair if it takes you a while, too.
 

Defining a new problem and building associated data sets. Many difficult problems are actually poorly defined; the reason we don’t make great progress against them is that we lack a shared view of how to measure performance. Many researchers spend time building benchmarks for these problems. For hiring managers, this shows that you are able to think about an abstract and difficult problem and break it down into a solvable approach. This is a sign of leadership potential.
 

There are many examples of researchers going out of their way to define a vague subject area and then make progress against their definition. Better still, some build communities around this. Examples include the ICDAR 2019 receipt OCR competition, the Fact Extraction and Verification data set and associated workshops, and the Open Graph Benchmark from NeurIPS 2020. These are all good examples of individuals defining a hard-to-define problem so that others can build models to push the field forward.
 

You can make progress and show leadership potential by picking an interesting problem, defining a benchmarking data set for it, and making progress against the problem you defined.
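As a rough sketch of what a small benchmark can look like in practice: the three pieces below are a fixed, reproducible train/test split, a single agreed-upon metric, and a simple baseline for others to beat. The classification task, the toy data, and every function name here are my own illustrative assumptions, not part of any of the benchmarks mentioned above.

```python
import random
from collections import Counter


def make_split(examples, test_fraction=0.2, seed=0):
    """Shuffle with a fixed seed so everyone evaluates on the same test set."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def accuracy(predictions, labels):
    """The single metric this hypothetical benchmark reports."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def majority_baseline(train):
    """A trivial baseline: always predict the most common training label."""
    most_common_label = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda example: most_common_label


def evaluate(model, test):
    """Score any callable model on the held-out test set."""
    predictions = [model(x) for x, _ in test]
    labels = [y for _, y in test]
    return accuracy(predictions, labels)


if __name__ == "__main__":
    # Toy data standing in for a real, carefully collected data set.
    data = [(f"example {i}", "positive" if i % 3 else "negative") for i in range(30)]
    train, test = make_split(data)
    baseline = majority_baseline(train)
    print(f"majority baseline accuracy: {evaluate(baseline, test):.2f}")
```

The value is less in the code than in the decisions it forces: which examples are held out, what counts as a correct answer, and what score a new model has to beat.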
 

And when you’re done?
 

Make sure to put the above into a beautiful portfolio. Lead with your most complex, research-heavy work in screening calls and on your resume, too. Many recruiters (and some hiring managers!) are not as familiar with the nuances of research and might not appreciate what you’re presenting, so make it easy for them to see this.
 

 

 
