Commit b87b4b8

Merge pull request #1394 from agarwalhimanshugaya/main
add question-no-601
2 parents d4e81a9 + 79ce8db commit b87b4b8

Lines changed: 122 additions & 0 deletions
@@ -0,0 +1,122 @@
---
id: human-traffic-of-stadium
title: Human-Traffic-Of-Stadium
sidebar_label: Human Traffic Of Stadium
tags:
  - Sql
  - Database
  - Pandas
description: "This problem practices writing SQL queries and shows how window functions can find runs of consecutive rows."
---

# Human Traffic of Stadium

## 1. Problem Description
```
+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| id            | int     |
| visit_date    | date    |
| people        | int     |
+---------------+---------+
visit_date is the column with unique values for this table.
Each row of this table contains the visit date and visit id to the stadium with the number of people during the visit.
As the id increases, the date increases as well.
```
Write a solution that returns every row belonging to a run of three or more rows with consecutive `id`s, where each row in the run has `people` greater than or equal to 100.

Return the result table ordered by `visit_date` in ascending order.

The result format is shown in the following example.

## 2. Examples

### Example 1:
**Input:**
```
Stadium table:
+------+------------+-----------+
| id   | visit_date | people    |
+------+------------+-----------+
| 1    | 2017-01-01 | 10        |
| 2    | 2017-01-02 | 109       |
| 3    | 2017-01-03 | 150       |
| 4    | 2017-01-04 | 99        |
| 5    | 2017-01-05 | 145       |
| 6    | 2017-01-06 | 1455      |
| 7    | 2017-01-07 | 199       |
| 8    | 2017-01-09 | 188       |
+------+------------+-----------+
```
**Output:**
```
+------+------------+-----------+
| id   | visit_date | people    |
+------+------------+-----------+
| 5    | 2017-01-05 | 145       |
| 6    | 2017-01-06 | 1455      |
| 7    | 2017-01-07 | 199       |
| 8    | 2017-01-09 | 188       |
+------+------------+-----------+
```
**Explanation:**
The four rows with ids 5, 6, 7, and 8 have consecutive ids, and each of them has at least 100 people in attendance. Note that row 8 is included even though its visit_date is not the day after row 7's.

The rows with ids 2 and 3 are not included because we need at least three consecutive ids.

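To make the rule in the explanation concrete, here is a minimal plain-Python sketch that reproduces Example 1. The helper name `busy_runs` and its parameters are illustrative assumptions, not part of the original problem. It relies on the observation that, after filtering and sorting, `id` minus the row's position stays constant within a run of consecutive ids.

```python
from itertools import groupby

# Rows from Example 1 as (id, visit_date, people) tuples.
STADIUM = [
    (1, "2017-01-01", 10),
    (2, "2017-01-02", 109),
    (3, "2017-01-03", 150),
    (4, "2017-01-04", 99),
    (5, "2017-01-05", 145),
    (6, "2017-01-06", 1455),
    (7, "2017-01-07", 199),
    (8, "2017-01-09", 188),
]


def busy_runs(rows, threshold=100, min_len=3):
    """Return rows in runs of >= min_len consecutive ids with people >= threshold.

    Hypothetical helper for illustration only.
    """
    # Keep busy rows only; tuples sort by id because id is the first field.
    busy = sorted(r for r in rows if r[2] >= threshold)
    result = []
    # Within a run of consecutive ids, (id - position) is constant.
    for _, group in groupby(enumerate(busy), key=lambda p: p[1][0] - p[0]):
        run = [row for _, row in group]
        if len(run) >= min_len:
            result.extend(run)
    return result


result = busy_runs(STADIUM)
print([r[0] for r in result])
```

Rows 2 and 3 form a run of length two and are dropped, while rows 5 through 8 form a run of length four and are kept, matching the expected output table.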
### Idea
I've seen quite a few solutions that join the table to itself three times or build temporary tables with `n^3` rows. From my five years of experience in data analysis, I can tell you that this approach is likely to cause an "out of spool space" error when you work with a large table in a big-data setting.

I recommend learning window functions such as `LEAD` and `LAG` and using them wherever they fit. They are typically fast, and whenever you find yourself creating duplicate temp tables, ask yourself: can I solve this with window functions?


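As a rough illustration of what `LEAD` and `LAG` return, the sketch below loads the Example 1 rows into an in-memory SQLite database through Python's standard `sqlite3` module and prints each row next to its neighbours. Using SQLite here is an assumption made for a self-contained demo (window functions need SQLite 3.25 or newer); the column aliases `pre` and `nxt` are only illustrative.

```python
import sqlite3

# In-memory copy of the Example 1 table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Stadium (id INTEGER, visit_date TEXT, people INTEGER)")
conn.executemany(
    "INSERT INTO Stadium VALUES (?, ?, ?)",
    [
        (1, "2017-01-01", 10),
        (2, "2017-01-02", 109),
        (3, "2017-01-03", 150),
        (4, "2017-01-04", 99),
        (5, "2017-01-05", 145),
        (6, "2017-01-06", 1455),
        (7, "2017-01-07", 199),
        (8, "2017-01-09", 188),
    ],
)

# LAG looks one row back and LEAD one row ahead, both in id order.
# Rows without a neighbour get NULL (None in Python).
neighbours = conn.execute(
    """
    SELECT id,
           LAG(people, 1)  OVER (ORDER BY id) AS pre,
           people,
           LEAD(people, 1) OVER (ORDER BY id) AS nxt
    FROM Stadium
    ORDER BY id
    """
).fetchall()

for row in neighbours:
    print(row)
```

Each row sees its neighbours in a single pass over the table, with no self-join. The row for id 5, for instance, carries 99 from the row before it and 1455 from the row after it.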
## 5. Implementation (Code for 2 Languages)

<Tabs>
<TabItem value="Pandas" label="Pandas" default>
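```python
import pandas as pd

def human_traffic(stadium: pd.DataFrame) -> pd.DataFrame:
    # Keep only rows with at least 100 people, ordered by id.
    stadium = stadium[stadium.people >= 100].sort_values(by='id')
    # step is True where the id is exactly one more than the previous row's id.
    step = stadium.id.diff() == 1
    # third marks the third row of a run of three consecutive ids.
    third = step & step.shift(1, fill_value=False)
    # A row qualifies if it is the first, second, or third row of such a run.
    return stadium[third
                   | third.shift(-1, fill_value=False)
                   | third.shift(-2, fill_value=False)]
```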
</TabItem>

<TabItem value="SQL" label="SQL">
```sql
-- Compares each row with its neighbours in id order, so it assumes ids have no gaps.
SELECT id
    , visit_date
    , people
FROM (
    SELECT id
        , visit_date
        , people
        , LEAD(people, 1) OVER (ORDER BY id) nxt
        , LEAD(people, 2) OVER (ORDER BY id) nxt2
        , LAG(people, 1) OVER (ORDER BY id) pre
        , LAG(people, 2) OVER (ORDER BY id) pre2
    FROM Stadium
) cte
WHERE (cte.people >= 100 AND cte.nxt >= 100 AND cte.nxt2 >= 100)
    OR (cte.people >= 100 AND cte.nxt >= 100 AND cte.pre >= 100)
    OR (cte.people >= 100 AND cte.pre >= 100 AND cte.pre2 >= 100)
ORDER BY visit_date
```
</TabItem>

</Tabs>

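The `LEAD`/`LAG` query compares neighbouring rows in `id` order, so it appears to rely on the ids having no gaps. If gaps were possible, one alternative (not taken from the referenced solution) is the "gaps and islands" pattern: subtract `ROW_NUMBER()` from `id` so that each run of consecutive ids shares one group key. The sketch below tries this on hypothetical data with missing ids, again using Python's `sqlite3` module as an assumed test bed (SQLite 3.25 or newer).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Stadium (id INTEGER, visit_date TEXT, people INTEGER)")
# Hypothetical data: ids 4 and 6 are missing, so 2, 3, 5 are NOT consecutive.
conn.executemany(
    "INSERT INTO Stadium VALUES (?, ?, ?)",
    [
        (2, "2017-01-02", 109),
        (3, "2017-01-03", 150),
        (5, "2017-01-05", 145),
        (7, "2017-01-07", 199),
        (8, "2017-01-09", 188),
        (9, "2017-01-10", 120),
    ],
)

ISLANDS_QUERY = """
WITH busy AS (
    -- id minus row number is constant inside a run of consecutive ids.
    SELECT id, visit_date, people,
           id - ROW_NUMBER() OVER (ORDER BY id) AS grp
    FROM Stadium
    WHERE people >= 100
),
sized AS (
    -- Attach the length of its run to every row.
    SELECT id, visit_date, people,
           COUNT(*) OVER (PARTITION BY grp) AS run_len
    FROM busy
)
SELECT id, visit_date, people
FROM sized
WHERE run_len >= 3
ORDER BY visit_date
"""

island_ids = [row[0] for row in conn.execute(ISLANDS_QUERY)]
print(island_ids)
```

On this data only ids 7, 8, and 9 form a run of three consecutive ids. The trade-off is one extra window pass for the run length in exchange for not depending on contiguous ids.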
### Complexity Analysis
**Time Complexity:** $O(n \log n)$, dominated by ordering the rows by `id`.

**Space Complexity:** $O(n)$

## 10. References

- [LeetCode - Human-Traffic-Of-Stadium](https://leetcode.com/problems/human-traffic-of-stadium/solutions/911779/mysql-use-window-function-for-big-data/)
