| 1 | +--- |
| 2 | +id: Drop-Missing-Data |
| 3 | +title: Drop Missing Data Solution |
| 4 | +sidebar_label: 2883 - Drop Missing Data |
| 5 | +tags: |
| 6 | + - LeetCode |
| 7 | + - Python |
| 8 | +description: "This is a solution to the Drop Missing Data problem on LeetCode." |
| 9 | +sidebar_position: 1 |
| 10 | +--- |
| 11 | + |
| 12 | +In this tutorial, we will solve the Drop Missing Data problem. We will provide the implementation of the solution in Python. |
| 13 | + |
| 14 | +## Problem Description |
| 15 | + |
| 16 | +```plaintext |
| 17 | +DataFrame students |
| 18 | ++-------------+--------+ |
| 19 | +| Column Name | Type | |
| 20 | ++-------------+--------+ |
| 21 | +| student_id | int | |
| 22 | +| name | object | |
| 23 | +| age | int | |
| 24 | ++-------------+--------+ |
| 25 | +``` |
| 26 | + |
| 27 | +Some rows have missing values in the `name` column. |
| 28 | + |
| 29 | +Write a solution to remove the rows with missing values. |
| 30 | + |
| 31 | +### Examples |
| 32 | + |
| 33 | +**Example 1:** |
| 34 | + |
| 35 | +```plaintext |
| 36 | +Input: |
| 37 | ++------------+---------+-----+ |
| 38 | +| student_id | name | age | |
| 39 | ++------------+---------+-----+ |
| 40 | +| 32 | Piper | 5 | |
| 41 | +| 217 | None | 19 | |
| 42 | +| 779 | Georgia | 20 | |
| 43 | +| 849 | Willow | 14 | |
| 44 | ++------------+---------+-----+ |
| 45 | + |
| 46 | +Output: |
| 47 | ++------------+---------+-----+ |
| 48 | +| student_id | name | age | |
| 49 | ++------------+---------+-----+ |
| 50 | +| 32 | Piper | 5 | |
| 51 | +| 779 | Georgia | 20 | |
| 52 | +| 849 | Willow | 14 | |
| 53 | ++------------+---------+-----+ |
| 54 | + |
| 55 | +Explanation: |
| 56 | +Student with id 217 has an empty value in the name column, so it will be removed. |
| 57 | +``` |
| 58 | + |
| 59 | +### Constraints |
| 60 | + |
| 61 | +- You must solve this using Python pandas only. |
| 62 | + |
| 63 | +--- |
| 64 | + |
| 65 | +## Solution for Drop Missing Data |
| 66 | + |
| 67 | +```py |
| 68 | +import pandas as pd |
| 69 | + |
| 70 | +def dropMissingData(students: pd.DataFrame) -> pd.DataFrame: |
| 71 | + return students[students['name'].notnull()] |
| 72 | +``` |
| 73 | + |
| 74 | +### Complexity Analysis |
| 75 | + |
| 76 | +- **Time Complexity:** $O(n)$ |
| 77 | +- **Space Complexity:** $O(n)$ |
| 78 | +- **Building the mask:** Calling `notnull()` on the `name` column checks every row once, which is $O(n)$, where $n$ is the number of rows. |
| 79 | +- **Boolean indexing:** Selecting rows with the boolean mask returned by `notnull()` builds a new DataFrame, which is also $O(n)$. |
| 80 | +- **New DataFrame:** The function returns a new DataFrame holding the filtered rows. Its size equals the number of rows with a non-missing `name`, which in the worst case is $n$. |
| 81 | + As a result, the space complexity is also $O(n)$. |
| 82 | + |
| 83 | +--- |
| 84 | + |
| 85 | +<h2>Authors:</h2> |
| 86 | + |
| 87 | +<div style={{display: 'flex', flexWrap: 'wrap', justifyContent: 'space-between', gap: '10px'}}> |
| 88 | +{['avdhut-pailwan'].map(username => ( |
| 89 | + <Author key={username} username={username} /> |
| 90 | +))} |
| 91 | +</div> |
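
A note on the solution in this diff: the two steps named in the complexity analysis (building the mask, then boolean indexing) can be sketched separately, which may make the $O(n)$ reasoning easier to follow. The helper name `drop_missing_names` and the sample data are illustrative assumptions based on Example 1, not part of the submitted solution. The sketch also compares the result with `DataFrame.dropna(subset=["name"])`, an alternative pandas call that should give the same rows here.

```python
import pandas as pd


def drop_missing_names(students: pd.DataFrame) -> pd.DataFrame:
    # Step 1: build a boolean mask, True where 'name' is present (one pass over the column).
    mask = students["name"].notnull()
    # Step 2: boolean indexing returns a new DataFrame containing only the kept rows.
    return students[mask]


# Sample input modeled on Example 1 (illustrative data).
students = pd.DataFrame(
    {
        "student_id": [32, 217, 779, 849],
        "name": ["Piper", None, "Georgia", "Willow"],
        "age": [5, 19, 20, 14],
    }
)

filtered = drop_missing_names(students)

# Alternative under the same assumptions: let pandas drop rows where 'name' is missing.
alternative = students.dropna(subset=["name"])
```

Both approaches leave the original `students` DataFrame unchanged and keep the original row index, so the row for student 217 is simply absent from the result.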