100 Python Interview questions with Answers

Python Interview questions updated daily

9/8/2023, 9 min read

What are space complexity and time complexity in Python?

In computer science, "space complexity" and "time complexity" are two fundamental concepts used to analyze the performance of algorithms.

Time Complexity:

Time complexity refers to the amount of time an algorithm takes to complete as a function of the size of the input. It quantifies the number of operations performed by an algorithm. Time complexity is usually expressed using Big O notation, which provides an upper bound on the number of operations.

For example, an algorithm with a time complexity of O(n) means that the number of operations grows linearly with the size of the input. An algorithm with O(n^2) means that the number of operations grows quadratically, and so on.
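
As a rough illustration, the sketch below counts basic steps instead of measuring wall-clock time. The function names count_linear_steps and count_quadratic_steps are hypothetical, chosen only for this example.

```python
def count_linear_steps(n):
    """Single loop: the step count grows in proportion to n, i.e. O(n)."""
    steps = 0
    for _ in range(n):
        steps += 1
    return steps


def count_quadratic_steps(n):
    """Nested loops: the step count grows with n * n, i.e. O(n^2)."""
    steps = 0
    for _ in range(n):
        for _ in range(n):
            steps += 1
    return steps


for n in (10, 100):
    print(n, count_linear_steps(n), count_quadratic_steps(n))
# 10 10 100
# 100 100 10000
```

Making the input ten times larger makes the linear count ten times larger, but the quadratic count a hundred times larger.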

Space Complexity:

Space complexity refers to the amount of memory (space) an algorithm uses as a function of the size of the input. It quantifies the amount of memory required for an algorithm to execute. Like time complexity, space complexity is also expressed using Big O notation.

For example, an algorithm with a space complexity of O(n) means that the amount of memory used grows linearly with the size of the input. An algorithm with O(n^2) space complexity means that the memory usage grows quadratically, and so on.
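
A small sketch of the same idea for memory: both functions below compute the same result, but one stores every intermediate value while the other keeps only a running total. The function names are hypothetical and exist only for this illustration.

```python
def sum_of_squares_list(n):
    """Builds a full list first, so extra memory grows with n: O(n) space."""
    squares = [i * i for i in range(n)]
    return sum(squares)


def sum_of_squares_streaming(n):
    """Keeps only a running total, so extra memory stays constant: O(1) space."""
    total = 0
    for i in range(n):
        total += i * i
    return total


print(sum_of_squares_list(5))       # 30
print(sum_of_squares_streaming(5))  # 30
```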

Example:

Let's consider an example to illustrate the difference between time and space complexity:

def example_algorithm(n):
    # Time complexity: O(n)
    for i in range(n):
        print(i)
    # Space complexity: O(1)
    x = 5
    return x

In this example, the time complexity of the loop is O(n) because the number of iterations is directly proportional to the input n. The space complexity for the variable x is O(1) because it takes a constant amount of memory, regardless of the size of n.

Choosing Algorithms:

Understanding time and space complexity is crucial for choosing the right algorithm for a specific problem. In practice, you often need to strike a balance between optimizing for time and space. Some algorithms are more time-efficient but may consume more memory, and vice versa. The choice depends on the specific requirements and constraints of the problem at hand.
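
One way to see this trade-off is duplicate detection. The sketch below shows two hypothetical functions that solve the same problem: one spends more time to save memory, the other spends memory to save time. The second version assumes the items are hashable.

```python
def has_duplicates_low_memory(items):
    """Compare every pair: O(n^2) time, O(1) extra space."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_fast(items):
    """Remember seen values in a set: O(n) average time, O(n) extra space."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False


print(has_duplicates_low_memory([3, 1, 4, 1, 5]))  # True
print(has_duplicates_fast([3, 1, 4, 1, 5]))        # True
print(has_duplicates_fast([1, 2, 3]))              # False
```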

What is a dictionary in Python?

In Python, a dictionary is a collection of key-value pairs. It is also known as an associative array, map, or hash table in other programming languages. Dictionaries are incredibly versatile and are used to store and retrieve data in a way that is efficient and easy to understand.

Here are some key characteristics of dictionaries in Python:

  1. Insertion-Ordered: Since Python 3.7, dictionaries preserve the order in which keys were inserted. Unlike lists, however, items are accessed by key rather than by numeric position.

  2. Mutable: Dictionaries can be modified after they are created. You can add, update, or remove key-value pairs.

  3. Keys: Keys are unique identifiers for the values in a dictionary. They must be hashable, which in practice means immutable types such as strings, numbers, or tuples containing only hashable values.

  4. Values: Values are associated with keys and can be of any data type, including lists, other dictionaries, or even custom objects.

  5. Fast Lookup: Dictionaries use a hash table to store the key-value pairs. This allows for very efficient lookup, retrieval, and assignment of values based on their keys.

  6. No Duplicate Keys: Each key in a dictionary must be unique. If you assign a value to an existing key, it will overwrite the previous value.

  7. Dictionary Syntax:

# Creating a dictionary
person = {"name": "John Doe", "age": 30, "city": "New York"}

# Accessing values using keys
print(person["name"])  # Output: John Doe

# Modifying values
person["age"] = 31

# Adding a new key-value pair
person["occupation"] = "Engineer"

# Removing a key-value pair
del person["city"]

# Checking if a key exists
if "city" in person:
    print("City exists in person dictionary")
else:
    print("City does not exist in person dictionary")

Dictionaries are widely used in Python for various tasks, including storing configurations, representing data records, and performing tasks like counting frequencies, caching, and more. They are a fundamental data structure in Python and a powerful tool for organizing and manipulating data.
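
As an illustration of the frequency-counting use case and of the rule about keys, here is a minimal sketch. The helper count_words and the sample data are hypothetical, made up for this example.

```python
def count_words(text):
    """Return a dict mapping each lowercase word to its number of occurrences."""
    counts = {}
    for word in text.lower().split():
        # get() returns 0 when the word has not been seen yet.
        counts[word] = counts.get(word, 0) + 1
    return counts


frequencies = count_words("the quick brown fox jumps over the lazy dog the end")
print(frequencies["the"])         # 3
print(frequencies.get("cat", 0))  # 0

# Tuples of immutable values are hashable, so they work as keys.
labels = {(0, 0): "origin", (1, 0): "east"}
print(labels[(0, 0)])  # origin

# Lists are mutable and unhashable, so using one as a key raises TypeError.
try:
    {[0, 0]: "origin"}
except TypeError as error:
    print("TypeError:", error)
```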

What is a linked list in Python, and how is it different from a normal list?

A linked list is a linear data structure where elements are stored in nodes, and each node points to the next one in the sequence. Python has no built-in linked list type, so you typically implement one yourself with a small node class. It differs from a normal list (the built-in list type, implemented as a dynamic array) in several key ways:

Linked List:

Node-Based Structure:

  • In a linked list, each element is stored in a node. Each node contains the data and a reference (or pointer) to the next node in the sequence.

Dynamic Size:

  • A linked list can dynamically grow or shrink in size during runtime. This is because nodes can be added or removed relatively easily.

Random Access:

  • Accessing an element in a linked list requires traversing the list from the head (start) node to the desired node. This means that random access (directly accessing an element by index) is not efficient in linked lists.

Insertion and Deletion:

  • Inserting or deleting an element in a linked list can be very efficient: it takes O(1) time at the head, or at any position for which you already hold a reference to the neighboring node. Reaching an arbitrary position in the middle still requires an O(n) traversal first.

Memory Usage:

  • Linked lists typically use more memory compared to normal lists. This is because each node has an overhead of storing a reference to the next node.

Normal List (Dynamic Array):

Array-Based Structure:

  • A normal list in Python (the list type) is implemented as a dynamic array. References to the elements are stored in a contiguous block of memory.

Dynamic Size:

  • Python lists can grow or shrink dynamically. Because they are backed by an array, the underlying storage has to be reallocated when the list outgrows its current capacity; extra capacity is reserved in advance so that appending stays cheap on average.

Random Access:

  • Normal lists provide efficient random access to elements. You can directly access an element by its index.

Insertion and Deletion:

  • Inserting or deleting an element in the middle of a normal list can be less efficient compared to a linked list. This is because elements may need to be shifted.

Memory Usage:

  • Normal lists generally use less memory per element than linked lists, because they store one reference per element instead of a separate node object that also holds a pointer to the next node.

Choosing Between Linked Lists and Normal Lists:

  • Use Linked Lists When:

    • You need efficient insertion or deletion of elements at the head, or at positions you already hold a reference to.

    • You have a dynamic or unknown size of data.

    • You can tolerate higher memory usage.

  • Use Normal Lists When:

    • You need efficient random access to elements.

    • You have a known and relatively fixed size of data.

    • Memory efficiency is a critical concern.

In Python, the built-in list type is implemented as a dynamic array, providing efficient random access. The standard library also provides collections.deque, which CPython implements as a doubly linked list of fixed-size blocks. It is useful when you need efficient insertion and deletion at both ends of the sequence, although indexing into its middle is slower than with a list.
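
To make the node-based structure concrete, here is a minimal sketch of a singly linked list. The class and method names (Node, SinglyLinkedList, push_front, get) are hypothetical choices for this example, not part of the standard library.

```python
class Node:
    """One element of the list: a value plus a reference to the next node."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


class SinglyLinkedList:
    def __init__(self):
        self.head = None

    def push_front(self, value):
        """Insert at the head in O(1): no existing elements are moved."""
        self.head = Node(value, self.head)

    def get(self, index):
        """Reach position `index` by walking from the head: O(n)."""
        if index < 0:
            raise IndexError("negative indexes are not supported")
        current = self.head
        for _ in range(index):
            if current is None:
                break
            current = current.next
        if current is None:
            raise IndexError("index out of range")
        return current.value

    def __iter__(self):
        current = self.head
        while current is not None:
            yield current.value
            current = current.next


linked = SinglyLinkedList()
for value in (3, 2, 1):
    linked.push_front(value)

print(list(linked))   # [1, 2, 3]
print(linked.get(2))  # 3
```

Note how get() has to follow the next references one by one, whereas a built-in list would jump straight to the requested index.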


What is data wrangling?

Data wrangling, also known as data munging, is the process of cleaning, transforming, and preparing raw data into a format suitable for analysis or further processing. It involves a series of steps to convert raw data into a structured and usable form. Here are the key steps involved in data wrangling:

1. Data Ingestion:

- Collecting and importing raw data from various sources such as databases, files, APIs, and streaming platforms.

2. Data Exploration and Profiling:

- Analyzing the data to understand its characteristics, including data types, distributions, missing values, outliers, and patterns.

3. Data Cleaning:

- Handling missing values, duplicates, and outliers. This may involve imputation, removal, or transformation of problematic data points.

4. Data Transformation:

- Converting and reformatting data to align with the desired structure. This can include tasks like aggregating, filtering, merging, and creating new variables.

5. Data Enrichment:

- Adding supplementary information or attributes to the dataset from external sources to provide more context and insights.

6. Data Validation and Quality Assurance:

- Checking for data integrity, consistency, and adherence to predefined rules or constraints.

7. Data Aggregation and Summarization:

- Combining data points into summary statistics or aggregated values for analysis.

8. Feature Engineering (for Machine Learning):

- Creating new features or variables from existing data to improve model performance.

9. Handling Time Series Data (if applicable):

- Managing timestamps, handling time zones, and performing temporal calculations.

10. Data Sampling and Subset Selection:

- Creating smaller subsets of data for exploratory analysis or modeling when dealing with large datasets.

11. Handling Categorical Variables:

- Encoding categorical variables into a format suitable for machine learning models.

12. Data Reduction (if applicable):

- Applying techniques like dimensionality reduction to reduce the complexity of high-dimensional data.

13. Data Normalization and Scaling (if applicable):

- Standardizing the scale of numerical features to ensure fair comparisons in modeling.

14. Data Imputation (if applicable):

- Filling in missing values using techniques like mean imputation, median imputation, or advanced imputation methods.

15. Data Formatting for Output:

- Preparing the final dataset in a format suitable for visualization, reporting, or further analysis.
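
A few of the steps above (cleaning, transformation, imputation, and aggregation) can be sketched with the standard library alone. The records and field names below are hypothetical sample data, and real projects often use a dedicated library for this work; the sketch only shows the shape of the process.

```python
from statistics import median

# Hypothetical raw records: one exact duplicate and one missing amount.
raw_rows = [
    {"city": "Paris", "amount": "10.5"},
    {"city": "Paris", "amount": "10.5"},
    {"city": "Lima", "amount": ""},
    {"city": "Lima", "amount": "4.0"},
    {"city": "Oslo", "amount": "7.5"},
]

# Cleaning: drop exact duplicates while keeping the first occurrence.
seen = set()
unique_rows = []
for row in raw_rows:
    key = (row["city"], row["amount"])
    if key not in seen:
        seen.add(key)
        unique_rows.append(dict(row))  # copy so raw_rows stays untouched

# Transformation: convert amounts to float, marking blanks as None.
for row in unique_rows:
    row["amount"] = float(row["amount"]) if row["amount"] else None

# Imputation: replace missing amounts with the median of the known ones.
known = [row["amount"] for row in unique_rows if row["amount"] is not None]
fill_value = median(known)
for row in unique_rows:
    if row["amount"] is None:
        row["amount"] = fill_value

# Aggregation: total amount per city.
totals = {}
for row in unique_rows:
    totals[row["city"]] = totals.get(row["city"], 0.0) + row["amount"]

print(totals)  # {'Paris': 10.5, 'Lima': 11.5, 'Oslo': 7.5}
```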

Data wrangling is a crucial step in the data analysis process, as it ensures that the data used for analysis is of high quality and well-prepared. It requires a combination of domain knowledge, data manipulation skills, and proficiency in using tools like SQL, Python, R, or specialized data wrangling platforms.

What is value unpacking? How does it work in Python?

Unpacking values in Python refers to the process of extracting individual elements from an iterable (like a tuple, list, or string) and assigning them to separate variables. This feature allows you to conveniently assign values to multiple variables in a single line of code.

Here are some examples of unpacking values in Python:

1. Unpacking a Tuple:

# Define a tuple
coordinates = (10, 20, 30)

# Unpack the tuple into individual variables
x, y, z = coordinates

print(x)  # Output: 10
print(y)  # Output: 20
print(z)  # Output: 30

In this example, the values (10, 20, 30) are unpacked into the variables x, y, and z.

2. Unpacking a List:

# Define a list
colors = ['red', 'green', 'blue']

# Unpack the list into individual variables
first_color, second_color, third_color = colors

print(first_color)   # Output: red
print(second_color)  # Output: green
print(third_color)   # Output: blue

Similarly, values from the list ['red', 'green', 'blue'] are unpacked into individual variables.

3. Unpacking Strings:

# Define a string
word = "hello"

# Unpack the first two characters and collect the rest in a list
first_char, second_char, *rest_of_chars = word

print(first_char)     # Output: h
print(second_char)    # Output: e
print(rest_of_chars)  # Output: ['l', 'l', 'o']

Here, the *rest_of_chars syntax is used to collect the remaining characters into a list.

4. Extended Unpacking with a Tuple:

# Define a tuple
coordinates = (10, 20, 30, 40, 50)

# Unpack the first two values and collect the rest in a variable
x, y, *remaining_values = coordinates

print(x)                 # Output: 10
print(y)                 # Output: 20
print(remaining_values)  # Output: [30, 40, 50]

In this example, the *remaining_values syntax collects the remaining values into a list.

Unpacking is a flexible feature in Python that simplifies the assignment of values to multiple variables, especially when working with iterables. It enhances readability and can be used in various contexts within the language.
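
A few of those other contexts are sketched below: swapping variables, loop targets, nested structures, and function calls. The function describe and the sample values are hypothetical, introduced only for this illustration.

```python
# Swapping two variables without a temporary.
a, b = 1, 2
a, b = b, a
print(a, b)  # 2 1

# Unpacking inside a for loop over pairs.
pairs = [("x", 10), ("y", 20)]
for name, value in pairs:
    print(name, value)

# Nested unpacking mirrors the shape of the data.
place, (latitude, longitude) = ("home", (48.85, 2.35))
print(place, latitude, longitude)  # home 48.85 2.35


# Unpacking arguments in a function call with * and **.
def describe(x, y, label="point"):
    return f"{label}: ({x}, {y})"


coordinates = (3, 4)
options = {"label": "corner"}
print(describe(*coordinates, **options))  # corner: (3, 4)

# Without a starred target, the number of targets must match the number of values.
try:
    first, second = (1, 2, 3)
except ValueError as error:
    print("ValueError:", error)
```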

Can you explain the enumerate function in Python? Please give a detailed explanation with some examples.

The enumerate() function is a Python built-in that wraps an iterable (like a list, tuple, or string) and yields pairs of a running index and the corresponding item. It provides a convenient way to access both the index and the element itself during iteration.

Here's the syntax of the enumerate() function:

enumerate(iterable, start=0)

  • iterable: The sequence or collection that you want to enumerate.

  • start: Optional. The value at which the index starts. Default is 0.

Let's go through an example to illustrate how enumerate() works:

fruits = ['apple', 'banana', 'cherry', 'date']

for index, fruit in enumerate(fruits):
    print(f'Index {index} corresponds to {fruit}')

Output:

Index 0 corresponds to apple

Index 1 corresponds to banana

Index 2 corresponds to cherry

Index 3 corresponds to date

In this example, we have a list called fruits containing four elements. The for loop iterates over enumerate(fruits), which returns an iterator that yields tuples; each tuple consists of an index and the corresponding element from fruits.

During each iteration, index receives the index value, and fruit receives the corresponding fruit name. These values are then used in the print() statement to display the index and the associated fruit.

You can also specify a start value:

for index, fruit in enumerate(fruits, start=1):
    print(f'Fruit {index}: {fruit}')

Output:

Fruit 1: apple

Fruit 2: banana

Fruit 3: cherry

Fruit 4: date

Here, enumerate(fruits, start=1) starts the index at 1 instead of the default 0.

Using enumerate() is especially useful when you need to track the position of elements in an iterable, which can be crucial in various programming tasks.
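
As one example of tracking positions, the sketch below collects every index at which a value occurs. The helper positions_of is a hypothetical name used only here.

```python
def positions_of(items, target):
    """Return every index at which `target` appears in `items`."""
    return [index for index, item in enumerate(items) if item == target]


letters = ["a", "b", "a", "c", "a"]
print(positions_of(letters, "a"))  # [0, 2, 4]

# enumerate() is lazy: wrap it in list() to see the (index, item) pairs.
print(list(enumerate(["x", "y"], start=1)))  # [(1, 'x'), (2, 'y')]

# The manual counter that enumerate() replaces.
index = 0
manual_pairs = []
for letter in letters:
    manual_pairs.append((index, letter))
    index += 1
print(manual_pairs == list(enumerate(letters)))  # True
```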

What are the different data structures in the collections module? Can you give examples of some of them?

The collections module in Python provides various specialized data structures. Here are some of them along with examples:

  1. namedtuple:

    • namedtuple is a factory function for creating tuple subclasses with named fields.

from collections import namedtuple

# Define a named tuple called Point with x and y fields
Point = namedtuple('Point', ['x', 'y'])

# Create an instance of Point
p = Point(x=1, y=2)
print(p.x, p.y)  # Output: 1 2

  2. deque:

    • deque is a double-ended queue that supports efficient appending and popping of elements at both ends.

from collections import deque

# Create a deque
d = deque([1, 2, 3])

d.append(4)      # deque([1, 2, 3, 4])
d.appendleft(0)  # deque([0, 1, 2, 3, 4])
d.pop()          # 4 (returns and removes the last element)
d.popleft()      # 0 (returns and removes the first element)

  3. Counter:

    • Counter is a dictionary subclass for counting hashable objects. It's useful for counting the frequency of elements in a collection.

from collections import Counter

# Create a Counter
c = Counter([1, 1, 2, 3, 3, 3])
print(c)  # Output: Counter({3: 3, 1: 2, 2: 1})

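  4. OrderedDict:

    • OrderedDict is a dictionary subclass that remembers insertion order. Since Python 3.7 the built-in dict also preserves insertion order, so OrderedDict is mainly useful for its extra features, such as move_to_end() and order-sensitive equality comparison.

from collections import OrderedDict

# Create an ordered dictionary
d = OrderedDict()
d['a'] = 1
d['b'] = 2
d['c'] = 3

print(d)  # Output: OrderedDict([('a', 1), ('b', 2), ('c', 3)])
          # (Python 3.12 and later print OrderedDict({'a': 1, 'b': 2, 'c': 3}))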

  5. defaultdict:

    • defaultdict is a dictionary subclass that calls a factory function (such as int or list) to supply a default value the first time a missing key is accessed.

from collections import defaultdict

# Create a defaultdict whose factory is int, so missing keys start at 0
d = defaultdict(int)
d['a'] += 1
d['b'] += 2

print(d)  # Output: defaultdict(<class 'int'>, {'a': 1, 'b': 2})

These are just a few examples of specialized data structures provided by the collections module. Each of these data structures serves a specific purpose and can be incredibly useful in different scenarios.
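
To give a feel for those scenarios, the sketch below shows one further feature of each of four structures. The sample data is hypothetical; the methods used (most_common, maxlen, _replace) are part of the standard collections module.

```python
from collections import Counter, defaultdict, deque, namedtuple

# Counter.most_common(n) returns the n most frequent items with their counts.
votes = Counter(["red", "blue", "red", "green", "red", "blue"])
print(votes.most_common(2))  # [('red', 3), ('blue', 2)]

# defaultdict(list) groups values without checking whether the key exists first.
words_by_length = defaultdict(list)
for word in ["cat", "dog", "bird", "fish"]:
    words_by_length[len(word)].append(word)
print(dict(words_by_length))  # {3: ['cat', 'dog'], 4: ['bird', 'fish']}

# deque(maxlen=n) keeps only the n most recent items.
recent = deque(maxlen=3)
for reading in [1, 2, 3, 4, 5]:
    recent.append(reading)
print(list(recent))  # [3, 4, 5]

# namedtuple instances are immutable; _replace returns a modified copy.
Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)
moved = p._replace(x=10)
print(p, moved)  # Point(x=1, y=2) Point(x=10, y=2)
```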