To help you in your data science career, I’ve prepared an overview of the main Python concepts tested in data science interviews. After that, I’ll discuss the two main types of interview questions that cover those concepts. I’ll also show you several example questions and give you solutions to push you in the right direction.
Technical Concepts of Python Interview Questions
This guide is not company-specific. So if you have some data science interviews lined up, I strongly advise you to use this guide as a starting point for what might come up in the interview. You should also try to find some company-specific questions and solve them too. Knowing general concepts and practicing them on real-life questions is a winning combination.

I won’t bother you with theoretical questions. They can come up in the interview, but they cover the same technical concepts found in the coding questions. After all, if you know how to use the concepts I’ll be talking about, you probably know how to explain them too.

The technical Python concepts tested in data science job interviews are:

1. Data types
2. Built-in data structures
3. User-defined data structures
4. Built-in functions
5. Loops and conditionals
6. External libraries (Pandas)
1. Data Types
Data types are a concept you should be familiar with. You should know the most commonly used data types in Python, the differences between them, and when and how to use them. These include integers (int), floats (float), complex numbers (complex), strings (str), booleans (bool), and the null value (None).
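As a quick illustration, here is a minimal sketch that prints the type of one sample value per data type. The sample values are arbitrary and chosen only for the example.

```python
# One example value for each data type named above (values are arbitrary).
samples = [42, 3.14, 2 + 3j, "pandas", True, None]

# type() returns the class of an object; __name__ is its short name.
type_names = [type(value).__name__ for value in samples]

for value, name in zip(samples, type_names):
    print(f"{value!r} is of type {name}")

# Two distinctions that tend to trip people up:
true_division = 7 / 2    # "/" always returns a float
floor_division = 7 // 2  # "//" on two ints returns an int
bool_is_int = isinstance(True, int)  # bool is a subclass of int
```

Knowing that `/` always produces a float and that `bool` is a subclass of `int` helps when you have to explain unexpected results in a calculation.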
2. Built-in Data Structures
These are lists, dictionaries, tuples, and sets. Knowing these four built-in data structures will help you organize and store data in a way that allows easier access and modification.
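The four structures differ mainly in ordering, mutability, and uniqueness. A short sketch with made-up values:

```python
# list: ordered, mutable, allows duplicates
scores = [88, 92, 75, 92]
scores.append(60)

# tuple: ordered, immutable (trying to assign to point[0] raises TypeError)
point = (3, 4)

# set: unordered, keeps only unique elements
unique_scores = set(scores)

# dict: maps keys to values, with fast lookup by key
salary_by_dept = {"marketing": 5000, "engineering": 7000}
salary_by_dept["sales"] = 4500

print(scores, point, unique_scores, salary_by_dept)
```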
3. User-defined Data Structures
On top of using the built-in data structures, you should also be able to define and use some user-defined data structures. These include arrays, stacks, queues, trees, linked lists, graphs, and hash maps.
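To give a feel for what "user-defined" means here, below is a minimal sketch of a stack, a queue, and a singly linked list. The class names `Stack`, `Node`, and `LinkedList` are illustrative choices for this example, and the queue simply reuses `collections.deque` from the standard library.

```python
from collections import deque


class Stack:
    """Last-in, first-out container backed by a list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __len__(self):
        return len(self._items)


class Node:
    """One element of a singly linked list."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


class LinkedList:
    """Singly linked list that inserts new values at the head."""

    def __init__(self):
        self.head = None

    def prepend(self, value):
        self.head = Node(value, self.head)

    def __iter__(self):
        node = self.head
        while node is not None:
            yield node.value
            node = node.next


stack = Stack()
for n in (1, 2, 3):
    stack.push(n)
top = stack.pop()          # the last value pushed comes out first

queue = deque([1, 2, 3])   # first-in, first-out
first = queue.popleft()    # the first value added comes out first

linked = LinkedList()
for n in (1, 2, 3):
    linked.prepend(n)
values = list(linked)      # values come back in reverse insertion order
```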
4. Built-in Functions
Python has over 60 built-in functions. You don’t need to know them all, although the more you know, the better. The built-in functions you can’t avoid are abs(), isinstance(), len(), list(), min(), max(), pow(), range(), round(), sorted(), and type(). You should also know split(), which is a string method rather than a built-in function.
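A short sketch of these functions in action, using an arbitrary list of temperatures as sample data:

```python
temps = [21.456, -3.2, 17.0, 30.71]  # arbitrary sample data

count = len(temps)              # number of elements
coldest = min(temps)            # smallest value
hottest = max(temps)            # largest value
ordered = sorted(temps)         # new sorted list, original is unchanged
magnitude = abs(coldest)        # absolute value
rounded = round(hottest, 1)     # round to one decimal place
squares = [pow(i, 2) for i in range(1, 4)]  # range(1, 4) yields 1, 2, 3
is_float = isinstance(hottest, float)
type_name = type(count).__name__

# split() is called on a string; list() turns any iterable into a list.
words = "data science interview".split()
letters = list("abc")
```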
5. Loops and Conditionals
Loops handle repetitive tasks by running the same piece of code over and over again. They keep going until a conditional (a true/false test) tells them to stop.
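Here is a minimal sketch of a for loop, a while loop, and an early exit with break. The numbers are arbitrary.

```python
numbers = [4, 7, 10, 13, 16]

# for loop with a conditional: collect the even numbers
evens = []
for n in numbers:
    if n % 2 == 0:
        evens.append(n)

# while loop: keep halving until the condition becomes false
value = 20
steps = 0
while value >= 1:
    value /= 2
    steps += 1

# break: stop at the first number greater than 9
first_large = None
for n in numbers:
    if n > 9:
        first_large = n
        break
```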
6. External Libraries (Pandas)
While there are several external libraries used, Pandas is probably the most popular. It is designed for practical data analysis in finance, social sciences, statistics, and engineering.
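As a small taste of Pandas, the sketch below builds a DataFrame from made-up data, filters it, and aggregates by group. The column names and values are invented for the example.

```python
import pandas as pd

# Made-up data: one row per employee.
df = pd.DataFrame(
    {
        "department": ["marketing", "engineering", "marketing", "engineering"],
        "salary": [4000, 7000, 5000, 6500],
    }
)

# Boolean indexing: keep rows where the condition is true.
high_earners = df[df["salary"] > 4500]

# groupby + aggregation: highest salary per department.
max_by_dept = df.groupby("department")["salary"].max()

print(high_earners)
print(max_by_dept)
```

Filtering with a boolean mask and aggregating with `groupby` cover a large share of what data manipulation questions ask for.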
Types of Python Interview Questions
All six technical concepts are mainly tested through just two types of interview questions:

1. Data manipulation and analysis
2. Algorithms

Let’s have a closer look at each of them.
1. Data Manipulation and Analysis
These questions are designed to test the above technical concepts by solving ETL (extract, transform, load) problems and performing some data analysis. Here’s one such example from Facebook:

QUESTION: Facebook sends SMS texts when users attempt to 2FA (2-factor authenticate) into the platform to log in. In order to successfully 2FA they must confirm they received the SMS text message. Confirmation texts are only valid on the date they were sent. Unfortunately, there was an ETL problem with the database where friend requests and invalid confirmation records were inserted into the logs, which are stored in the ‘fb_sms_sends’ table. These message types should not be in the table. Fortunately, the ‘fb_confirmers’ table contains valid confirmation records so you can use this table to identify SMS text messages that were confirmed by the user. Calculate the percentage of confirmed SMS texts for August 4, 2020.

ANSWER:

One of the questions asked to test your data analysis skills is this one from Dropbox:

QUESTION: Write a query that calculates the difference between the highest salaries found in the marketing and engineering departments. Output just the difference in salaries.

ANSWER:
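One possible way to approach both questions with Pandas is sketched below. The questions don't spell out the table schemas, so every column name here is an assumption: `ds`, `type`, and `phone_number` for `fb_sms_sends`; `date` and `phone_number` for `fb_confirmers`; and an employee table with `salary` and `department_id` joined to a department table with `id` and `department`. The `"message"` type value and the tiny sample tables are also made up for illustration.

```python
import pandas as pd


def confirmed_sms_percentage(sends, confirmers, day):
    """Share (in %) of valid SMS sent on `day` that were confirmed that day."""
    sends = sends.assign(ds=pd.to_datetime(sends["ds"]))
    # Assumption: real SMS texts are the rows with type == "message".
    valid = sends[(sends["type"] == "message") & (sends["ds"] == pd.Timestamp(day))]

    # Rename the confirmers' date column so both tables share the join keys.
    confirmed = confirmers.assign(ds=pd.to_datetime(confirmers["date"]))
    confirmed = confirmed[["phone_number", "ds"]].drop_duplicates()

    # Left join keeps unconfirmed texts; _merge marks rows found in both tables.
    merged = valid.merge(confirmed, on=["phone_number", "ds"], how="left", indicator=True)
    if merged.empty:
        return 0.0
    return float((merged["_merge"] == "both").mean() * 100)


def salary_gap(employees, departments):
    """Absolute difference between the top marketing and engineering salaries."""
    merged = employees.merge(
        departments, left_on="department_id", right_on="id", suffixes=("", "_dept")
    )
    top = merged.groupby("department")["salary"].max()
    return int(abs(top["marketing"] - top["engineering"]))


# Made-up sample data to exercise both functions.
fb_sms_sends = pd.DataFrame(
    {
        "ds": ["2020-08-04"] * 4 + ["2020-08-03"],
        "type": ["message", "message", "friend_request", "confirmation", "message"],
        "phone_number": [1, 2, 3, 4, 5],
    }
)
fb_confirmers = pd.DataFrame(
    {"date": ["2020-08-04", "2020-08-03", "2020-08-04"], "phone_number": [1, 2, 3]}
)
pct = confirmed_sms_percentage(fb_sms_sends, fb_confirmers, "2020-08-04")

employees = pd.DataFrame(
    {"id": [1, 2, 3, 4], "salary": [4000, 7000, 5000, 6500], "department_id": [10, 20, 10, 20]}
)
departments = pd.DataFrame({"id": [10, 20], "department": ["marketing", "engineering"]})
gap = salary_gap(employees, departments)
```

In the sample data, two valid messages were sent on August 4 and only one was confirmed on the same date, so the friend request and the wrong-date confirmation don't count toward the result.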
2. Algorithms
Python algorithm interview questions test your problem-solving with algorithms. Since algorithms are not limited to one programming language, these questions test your logic and thinking, as well as your coding in Python. For example, you could get this question:

QUESTION: Given a string containing digits from 2-9 inclusive, return all possible letter combinations that the number could represent. Return the answer in any order. A mapping of digit to letters (just like on the telephone buttons) is given below. Note that 1 does not map to any letters.

ANSWER:

Or it could get even more difficult with the following question:

QUESTION: “Write a program to solve a Sudoku puzzle by filling the empty cells. A sudoku solution must satisfy all of the following rules:

1. Each of the digits 1-9 must occur exactly once in each row.
2. Each of the digits 1-9 must occur exactly once in each column.
3. Each of the digits 1-9 must occur exactly once in each of the 9 3×3 sub-boxes of the grid.

The ‘.’ character indicates empty cells.”

ANSWER:

This is quite a complex algorithm, so good for you if you know how to solve it!
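Both questions above can be sketched in plain Python. The version below is one possible approach, not the only one: the letter combinations use `itertools.product` over the standard telephone keypad, and the Sudoku solver uses simple backtracking. The function names, the helper `_fits`, and the sample puzzle are choices made for this example.

```python
from itertools import product

# Standard telephone keypad mapping (1 maps to no letters).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def letter_combinations(digits):
    """Return every letter string the digit string could represent."""
    if not digits:
        return []
    groups = [KEYPAD[d] for d in digits]
    # product() picks one letter from each group in every possible way.
    return ["".join(combo) for combo in product(*groups)]


def _fits(board, r, c, digit):
    """Check that `digit` is not already in the row, column, or 3x3 box."""
    if digit in board[r]:
        return False
    if any(board[i][c] == digit for i in range(9)):
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)
    return all(
        board[i][j] != digit for i in range(br, br + 3) for j in range(bc, bc + 3)
    )


def solve_sudoku(board):
    """Fill a 9x9 board (lists of one-character strings) in place.

    Returns True if a solution was found, False otherwise.
    """
    for r in range(9):
        for c in range(9):
            if board[r][c] != ".":
                continue
            for digit in "123456789":
                if _fits(board, r, c, digit):
                    board[r][c] = digit
                    if solve_sudoku(board):
                        return True
                    board[r][c] = "."  # undo the guess and try the next digit
            return False  # no digit fits this cell, so backtrack
    return True  # no empty cells left


combos = letter_combinations("23")

puzzle = [list(row) for row in [
    "53..7....", "6..195...", ".98....6.",
    "8...6...3", "4..8.3..1", "7...2...6",
    ".6....28.", "...419..5", "....8..79",
]]
solved = solve_sudoku(puzzle)
```

Backtracking tries a digit, recurses, and undoes the choice if it leads to a dead end. It is not the fastest approach, but it is short and easy to explain in an interview.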
Conclusion
For a data science interview, the six technical concepts I’ve mentioned are a must. Of course, it’s recommended that you dive even deeper into Python and broaden your knowledge, not only in theory but also by solving as many data manipulation and analysis questions and algorithm questions as possible. For the former, there are plenty of examples on StrataScratch, where you can probably find questions from the company you applied to. LeetCode is a good choice for practicing Python algorithms before your interviews.