|
| 1 | +--- |
| 2 | +id: human-traffic-of-stadium |
| 3 | +title: Human-Traffic-Of-Stadium |
| 4 | +sidebar_label: Human Traffic Of Stadium |
| 5 | +tags: |
| 6 | + - Sql |
| 7 | + - Database |
| 8 | + - Pandas |
| 9 | +description: "A SQL problem on finding three or more rows with consecutive ids, each with at least 100 people." |
| 10 | +--- |
| 11 | + |
| 12 | +# Human Traffic of Stadium |
| 13 | + |
| 14 | +## 1. Problem Description |
| 15 | +``` |
| 16 | ++---------------+---------+ |
| 17 | +| Column Name | Type | |
| 18 | ++---------------+---------+ |
| 19 | +| id | int | |
| 20 | +| visit_date | date | |
| 21 | +| people | int | |
| 22 | ++---------------+---------+ |
| 23 | +visit_date is the column with unique values for this table. |
| 24 | +Each row of this table contains the visit date and visit id to the stadium with the number of people during the visit. |
| 25 | +As the id increases, the date increases as well. |
| 26 | +``` |
| 27 | +Write a solution to display the records that belong to a run of three or more rows with consecutive `id`s, where every row in the run has at least 100 people. |
| 28 | + |
| 29 | +Return the result table ordered by `visit_date` in ascending order. |
| 30 | + |
| 31 | +The result format is in the following example. |
| 32 | +## 2. Examples |
| 33 | + |
| 34 | +### Example 1: |
| 35 | +**Input:** |
| 36 | +``` |
| 37 | +Stadium table: |
| 38 | ++------+------------+-----------+ |
| 39 | +| id | visit_date | people | |
| 40 | ++------+------------+-----------+ |
| 41 | +| 1 | 2017-01-01 | 10 | |
| 42 | +| 2 | 2017-01-02 | 109 | |
| 43 | +| 3 | 2017-01-03 | 150 | |
| 44 | +| 4 | 2017-01-04 | 99 | |
| 45 | +| 5 | 2017-01-05 | 145 | |
| 46 | +| 6 | 2017-01-06 | 1455 | |
| 47 | +| 7 | 2017-01-07 | 199 | |
| 48 | +| 8 | 2017-01-09 | 188 | |
| 49 | ++------+------------+-----------+ |
| 50 | +``` |
| 51 | +**Output:** |
| 52 | +``` |
| 53 | ++------+------------+-----------+ |
| 54 | +| id | visit_date | people | |
| 55 | ++------+------------+-----------+ |
| 56 | +| 5 | 2017-01-05 | 145 | |
| 57 | +| 6 | 2017-01-06 | 1455 | |
| 58 | +| 7 | 2017-01-07 | 199 | |
| 59 | +| 8 | 2017-01-09 | 188 | |
| 60 | ++------+------------+-----------+ |
| 61 | +``` |
| 62 | +**Explanation:** |
| 63 | +The four rows with ids 5, 6, 7, and 8 have consecutive ids, and each of them had at least 100 people in attendance. Note that row 8 is included even though its visit_date is not the day after row 7's. |
| 64 | +The rows with ids 2 and 3 are not included because we need at least three consecutive ids. |
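
The rule in the explanation can be checked by hand with a short plain-Python sketch. This is only an illustration of the problem statement under stated assumptions, not the solution given further down: the function name `rows_in_busy_runs` and its parameters `threshold` and `min_run` are hypothetical, and rows are modeled as simple tuples.

```python
from typing import List, Tuple

Row = Tuple[int, str, int]  # (id, visit_date, people)

def rows_in_busy_runs(rows: List[Row], threshold: int = 100, min_run: int = 3) -> List[Row]:
    """Keep rows that sit in a run of >= min_run consecutive ids,
    where every row in the run has people >= threshold."""
    busy = sorted(r for r in rows if r[2] >= threshold)  # tuples sort by id first
    result: List[Row] = []
    run: List[Row] = []
    for row in busy:
        # A gap in the ids ends the current run.
        if run and row[0] != run[-1][0] + 1:
            if len(run) >= min_run:
                result.extend(run)
            run = []
        run.append(row)
    if len(run) >= min_run:  # flush the final run
        result.extend(run)
    # ISO-formatted dates sort correctly as strings.
    return sorted(result, key=lambda r: r[1])

# Stadium table from Example 1.
stadium = [
    (1, "2017-01-01", 10), (2, "2017-01-02", 109), (3, "2017-01-03", 150),
    (4, "2017-01-04", 99), (5, "2017-01-05", 145), (6, "2017-01-06", 1455),
    (7, "2017-01-07", 199), (8, "2017-01-09", 188),
]
print([r[0] for r in rows_in_busy_runs(stadium)])  # [5, 6, 7, 8]
```

Ids 2 and 3 form a run of only two rows, so they are dropped, while ids 5 through 8 form a run of four and are all kept.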
| 65 | + |
| 66 | +### Idea |
| 67 | +I've seen quite a few solutions that join three copies of the table or create temporary tables with `n^3` rows. From my 5 years of working experience in data analysis, I can tell you that this approach will run into "out of spool space" errors when you deal with a large table in a big data setting. |
| 68 | + |
| 69 | +I recommend learning window functions such as `LEAD` and `LAG` and using them wherever they fit. They read neighboring rows without a self-join, so they are typically much faster. Whenever you find yourself creating duplicate temp tables, ask yourself: can I solve this with window functions? |
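
To make the idea concrete, here is a minimal plain-Python sketch of what `LAG` and `LEAD` return for each row once the rows are ordered by `id`. The helpers `lag` and `lead` are hypothetical names used only for illustration; they are not library functions, and a database engine computes the same values internally.

```python
from typing import List, Optional

def lag(values: List[int], k: int) -> List[Optional[int]]:
    """Value k rows earlier, or None where no such row exists (like SQL LAG)."""
    return [values[i - k] if i - k >= 0 else None for i in range(len(values))]

def lead(values: List[int], k: int) -> List[Optional[int]]:
    """Value k rows later, or None where no such row exists (like SQL LEAD)."""
    n = len(values)
    return [values[i + k] if i + k < n else None for i in range(n)]

# The people column from Example 1, already ordered by id (ids 1..8).
people = [10, 109, 150, 99, 145, 1455, 199, 188]

print(lag(people, 1))   # [None, 10, 109, 150, 99, 145, 1455, 199]
print(lead(people, 2))  # [150, 99, 145, 1455, 199, 188, None, None]
```

Each row can see its neighbors through these shifted columns, so one pass over the sorted rows is enough and no copy of the table has to be joined to itself.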
| 70 | + |
| 71 | + |
| 72 | +## 5. Implementation (Code for 2 Languages) |
| 73 | + |
| 74 | +<Tabs> |
| 75 | + <TabItem value="Pandas" label="Pandas" default> |
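| 76 | + ```python |
| 77 | +import pandas as pd |
| 78 | + |
| 79 | +def human_traffic(stadium: pd.DataFrame) -> pd.DataFrame: |
| 80 | +    # Keep rows with at least 100 people; id order is also visit_date order. |
| 81 | +    stadium = stadium[stadium.people >= 100].sort_values(by='id') |
| 82 | +    # True on the third row of any three consecutive ids. |
| 83 | +    third = (stadium.id.diff() == 1) & (stadium.id.diff().shift(1) == 1) |
| 84 | +    # Keep that row and the two rows before it. |
| 85 | +    return stadium[third | third.shift(-1, fill_value=False) | third.shift(-2, fill_value=False)] |
| 86 | + ``` |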
| 87 | + </TabItem> |
| 88 | + |
| 89 | + <TabItem value="SQL" label="SQL"> |
| 90 | + ```sql |
| 91 | +-- Assumes ids have no gaps, as in the example: neighbors are taken in id order. |
| 92 | +SELECT id, visit_date, people |
| 93 | +FROM ( |
| 94 | +    SELECT id |
| 95 | +         , visit_date |
| 96 | +         , people |
| 97 | +         , LEAD(people, 1) OVER (ORDER BY id) nxt |
| 98 | +         , LEAD(people, 2) OVER (ORDER BY id) nxt2 |
| 99 | +         , LAG(people, 1) OVER (ORDER BY id) pre |
| 100 | +         , LAG(people, 2) OVER (ORDER BY id) pre2 |
| 101 | +    FROM Stadium |
| 102 | +) cte |
| 103 | +WHERE (cte.people >= 100 AND cte.nxt >= 100 AND cte.nxt2 >= 100) |
| 104 | +   OR (cte.people >= 100 AND cte.nxt >= 100 AND cte.pre >= 100) |
| 105 | +   OR (cte.people >= 100 AND cte.pre >= 100 AND cte.pre2 >= 100) |
| 106 | +ORDER BY visit_date; |
| 107 | + ``` |
| 108 | + </TabItem> |
| 109 | + |
| 110 | +</Tabs> |
| 111 | + |
| 112 | +### Complexity Analysis |
| 113 | +**Time Complexity:** $O(n \log n)$ |
| 114 | + |
| 115 | + |
| 116 | +**Space Complexity:** $O(n)$ |
| 117 | + |
| 118 | +## 10. References |
| 119 | + |
| 120 | +- [LeetCode - Human-Traffic-Of-Stadium](https://leetcode.com/problems/human-traffic-of-stadium/solutions/911779/mysql-use-window-function-for-big-data/) |
| 121 | + |
| 122 | + |