Machine Learning Interviews
Machine Learning Systems Design
@chipro huyenchip.com
Introduction
In an hour-long interview, you might have time to go over only one or two questions about designing a machine learning system to solve a practical problem.
Interviewers generally agree that even if you can't get to a working solution, as long as you communicate your thinking process to show that you understand the different constraints, trade-offs, and concerns of your system, it's good enough.
Candidates often both love and hate these questions. They love them because they are fun, practical, flexible, and require the least amount of memorization.
They hate them because open-ended questions often lack evaluation guidelines yet expect only one right answer -- the answer that the interviewer is familiar with.
These questions are ambiguous. You drive the interview and choose what to focus on.
"Most candidates know the model classes (linear, decision trees, LSTM, convolutional neural networks) and memorize the relevant information, so for me the interesting bits in machine learning systems interviews are data cleaning, data preparation, logging, evaluation metrics, scalable inference, feature stores (recommenders/rankers)." -- Dmitry Kislyuk
Knowing the model classes does not set a candidate apart...
Looks for the ability to divide and conquer the problem.
"When I ask such questions, what I am looking for is the following. 1. Can the candidate break down the open ended problem into simple components (building blocks) 2. Can the candidate identify which blocks require machine learning and which do not." -- Ravi Ganti
Break the open-ended problem down into building blocks.
Identify which blocks require machine learning.
To sum up, as an ML systems engineer:
"I think this [the machine learning systems design] is the most important question. Can a person define the problem, identify relevant metrics, ideate on data sources and possible important features, understands deeply what machine learning can do. Machine learning methods change every year, solving problems stays the same." -- Illia Polosukhin
Define the problem
Identify relevant metrics
Ideate on data sources and features
Understand what machine learning can do
The book aims to provide a framework for approaching these questions.
Research vs production
The fundamental differences between machine learning in an academic setting and machine learning in production:
Academia cares more about training.
Production cares more about serving.
Candidates often make the mistake of focusing entirely on training without thinking about how the model would be used.
Performance requirements
Research -> state-of-the-art (SOTA) results on benchmarking tasks.
To edge out a small increase in performance, researchers often resort to techniques that make models too complex to be useful.
A technique .. is ensembling: combining "multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone."
Ensembling can give an increase of a few percentage points in performance, but it makes the system more complex, takes much more time to develop and train, and can cost more.
Research -> a few percentage points matter on a leaderboard, but not to users (e.g., 95% vs. 96%).
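A minimal sketch of ensembling by majority vote, to make the trade-off concrete. The three "models" and their predictions below are toy data made up for illustration, chosen so that each model errs on a different example; real ensembles rarely combine this cleanly.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine label predictions from several models by majority vote.

    predictions: list of lists, one inner list per model, each holding
    one predicted label per example (all inner lists the same length).
    """
    combined = []
    for labels in zip(*predictions):
        # most_common(1) returns [(label, count)] for the most frequent label
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined

def accuracy(predicted, truth):
    """Fraction of examples where the prediction matches the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)

# Hypothetical toy data: each model is wrong on a different example.
truth   = [1, 0, 1, 1]
model_a = [1, 0, 1, 0]
model_b = [1, 1, 1, 1]
model_c = [0, 0, 1, 1]

ensemble = majority_vote([model_a, model_b, model_c])
single_accuracies = [accuracy(m, truth) for m in (model_a, model_b, model_c)]
ensemble_accuracy = accuracy(ensemble, truth)
```

The ensemble outscores every single model here, but serving it means running three models per prediction instead of one, which is the production cost the notes point at.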
Compute requirements
Training requires exponentially more compute power and exponentially more data.
According to OpenAI, "the amount of compute used in the largest AI training runs has doubled every 3.5 months."
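A quick back-of-the-envelope check of what a 3.5-month doubling time implies. The function name and the choice of a 12-month horizon are assumptions made for illustration; only the 3.5-month figure comes from the quote.

```python
DOUBLING_MONTHS = 3.5  # doubling time taken from the OpenAI quote

def growth_factor(months, doubling_months=DOUBLING_MONTHS):
    """Multiplicative growth after `months`, assuming steady doubling."""
    return 2 ** (months / doubling_months)

# Growth over one year: 2 ** (12 / 3.5), a bit under 11x
per_year = growth_factor(12)
```

Under that assumption, compute for the largest runs grows by roughly an order of magnitude per year, which is far outside what most production budgets can follow.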
The goals of research are very different from the goals of production.
Design a machine learning system
Designing a machine learning system is an iterative process.
Project setup
Data pipeline
Modeling (selecting, training, debugging)
Serving (testing, deploying, maintaining)
Project setup
Goals
What do you want to achieve with the problem?
User experience
Step-by-step walkthrough of how end users are supposed to use the system.
Performance constraints
How fast/good does the prediction have to be?
What's more important: precision or recall?
What's more costly: false negatives (FN) or false positives (FP)?
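A small sketch of how the precision/recall and FN/FP questions connect. The confusion counts and per-error costs below are hypothetical numbers chosen for illustration, not figures from the source.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive,
    and false-negative counts (0.0 when a denominator is zero)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def total_error_cost(fp, fn, cost_fp, cost_fn):
    """Total cost of errors given a per-error cost for each error type."""
    return fp * cost_fp + fn * cost_fn

# Hypothetical operating point A: stricter threshold, fewer false positives.
precision_a, recall_a = precision_recall(tp=80, fp=20, fn=40)
# Hypothetical operating point B: looser threshold, fewer false negatives.
precision_b, recall_b = precision_recall(tp=110, fp=60, fn=10)

# Assume a missed positive costs 10x as much as a false alarm.
cost_a = total_error_cost(fp=20, fn=40, cost_fp=1, cost_fn=10)
cost_b = total_error_cost(fp=60, fn=10, cost_fp=1, cost_fn=10)
```

With these made-up costs, point B has lower precision but a much lower total cost, so the answer to "precision or recall?" depends on which error type the application can least afford.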
Evaluation
Training and inference
Users' reactions, etc.
Personalization
One model for all users, one per group of users, or one per individual user?
Train a base model then finetune it for target users.
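The base-model-then-finetune idea can be sketched with a deliberately tiny model: a one-parameter linear fit `y = w * x`. Everything here (the function names, the data, the learning rate and step count) is an assumption for illustration; a real system would fine-tune a far larger model with a proper training framework.

```python
def fit_slope(xs, ys):
    """Closed-form least-squares slope for y = w * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def finetune_slope(w, xs, ys, lr=0.01, steps=20):
    """Run a few gradient-descent steps on mean squared error,
    starting from an existing weight w rather than from scratch."""
    n = len(xs)
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w

# Hypothetical pooled data from all users: follows y = 2x.
pooled_x, pooled_y = [1, 2, 3, 4], [2, 4, 6, 8]
base_w = fit_slope(pooled_x, pooled_y)

# Hypothetical data from one target user: follows y = 3x.
user_x, user_y = [1, 2, 3], [3, 6, 9]
user_w = finetune_slope(base_w, user_x, user_y)
```

The fine-tuned weight starts at the shared base value and moves toward the target user's behavior after a handful of cheap updates, instead of training a separate model per user from scratch.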
Project constraints
Constraints you have to worry about in the real world but less so during interviews (time, compute power, available systems, talent, etc.).
Data pipeline
Data availability and collection
User data
Storage
Data preprocessing & representation
Challenges
Privacy
Biases
Modeling
Training
Debugging
Scaling
Serving