Commit 346ab27

Merge pull request #13 from Subham1999/Subham1999-patch-1
Create distributed_systems.md
2 parents ce10a83 + ccc97fb commit 346ab27

1 file changed: +79 -0 lines changed

Paper-Shelf/distributed_systems.md

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
Here’s a curated list of **seminal papers** in distributed systems that come up frequently in interviews and system design discussions. They span foundational theory, production systems, and widely adopted designs:

---

### **Foundational Theory & Concepts**
1. **[Time, Clocks, and the Ordering of Events in a Distributed System (Lamport, 1978)](https://lamport.azurewebsites.net/pubs/time-clocks.pdf)**
   - Introduces **Lamport clocks** (logical timestamps) and the happened-before relation, which defines causality without synchronized physical clocks. A must-read for understanding event ordering.

2. **[Impossibility of Distributed Consensus with One Faulty Process (FLP Impossibility, 1985)](https://groups.csail.mit.edu/tds/papers/Lynch/jacm85.pdf)**
   - Proves that no deterministic protocol can guarantee that consensus terminates in a fully asynchronous system if even one process may crash. Critical for understanding why practical consensus algorithms rely on timeouts or randomization.

3. **[CAP Theorem (Brewer, 2000)](https://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed/)**
   - Brewer presented CAP as a conjecture in a 2000 keynote; the link is his 2012 retrospective, *CAP Twelve Years Later: How the "Rules" Have Changed*. Explains the trade-offs between consistency, availability, and partition tolerance.
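
4. **[Consistency Tradeoffs in Modern Distributed Database System Design (PACELC, Abadi, 2012)](https://cs-www.cs.yale.edu/homes/dna/papers/abadi-pacelc.pdf)**
   - Extends CAP with **latency**: if there is a partition (P), a system trades availability (A) against consistency (C); else (E), in normal operation, it trades latency (L) against consistency (C).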
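
The Lamport clock from the first paper above can be sketched in a few lines. This is a minimal illustration, not code from the paper: the class and method names (`LamportClock`, `tick`, `send`, `receive`) are hypothetical, and message passing is simulated by handing a timestamp from one object to another.

```python
from dataclasses import dataclass


@dataclass
class LamportClock:
    """Logical clock for one process, following Lamport's two update rules."""

    time: int = 0

    def tick(self) -> int:
        """Rule 1: advance the clock for each local event."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is an event: advance the clock and use it as the timestamp."""
        return self.tick()

    def receive(self, msg_time: int) -> int:
        """Rule 2: move past the sender's timestamp, counting the receive event."""
        self.time = max(self.time, msg_time) + 1
        return self.time


# Two hypothetical processes, p and q.
p, q = LamportClock(), LamportClock()
p.tick()                        # local event on p, p.time becomes 1
stamp = p.send()                # p sends a message stamped 2
q.tick()                        # unrelated local event on q, q.time becomes 1
received_at = q.receive(stamp)  # q jumps to max(1, 2) + 1 = 3
```

The receive timestamp is always larger than the send timestamp, which is the property the paper relies on: if one event happened before another, its timestamp is smaller. The converse does not hold, which is one motivation for the vector clocks used in later systems.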

---

### **Consensus & Coordination**
5. **[The Part-Time Parliament (Paxos, Lamport, 1998)](https://lamport.azurewebsites.net/pubs/lamport-paxos.pdf)**
   - The original Paxos paper. Required reading for understanding consensus in fault-tolerant systems.

6. **[In Search of an Understandable Consensus Algorithm (Raft, 2014)](https://raft.github.io/raft.pdf)**
   - Introduces **Raft**, a consensus algorithm designed to be easier to understand than Paxos. Used by etcd, the key-value store that backs Kubernetes.

7. **[Viewstamped Replication (VR, 1988)](https://pmg.csail.mit.edu/papers/vr-revisited.pdf)**
   - A protocol for state machine replication that predates Raft and was developed independently of Paxos. The link is to the 2012 update, *Viewstamped Replication Revisited*.
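
The protocols above share a majority-quorum core. As a rough illustration, here is a sketch of single-decree Paxos with in-process acceptors. It assumes no message loss, no crashes, and one proposer at a time, so it shows the safety rule (a later proposer must adopt an already accepted value) rather than a usable implementation. The names `Acceptor` and `propose` are hypothetical.

```python
from typing import Any, List, Optional, Tuple


class Acceptor:
    """Holds the three pieces of state a Paxos acceptor must remember."""

    def __init__(self) -> None:
        self.promised_n = -1          # highest proposal number promised
        self.accepted_n = -1          # number of the last accepted proposal
        self.accepted_value: Any = None

    def prepare(self, n: int) -> Optional[Tuple[int, Any]]:
        """Phase 1: promise to ignore proposals numbered below n."""
        if n > self.promised_n:
            self.promised_n = n
            return self.accepted_n, self.accepted_value
        return None  # rejected: already promised a higher number

    def accept(self, n: int, value: Any) -> bool:
        """Phase 2: accept unless a higher-numbered promise was made."""
        if n >= self.promised_n:
            self.promised_n = n
            self.accepted_n = n
            self.accepted_value = value
            return True
        return False


def propose(acceptors: List[Acceptor], n: int, value: Any) -> Optional[Any]:
    """Run both phases; return the chosen value, or None if no quorum."""
    quorum = len(acceptors) // 2 + 1
    promises = [p for p in (a.prepare(n) for a in acceptors) if p is not None]
    if len(promises) < quorum:
        return None
    # Safety rule: adopt the value of the highest-numbered accepted proposal.
    prev_n, prev_value = max(promises, key=lambda p: p[0])
    if prev_n >= 0:
        value = prev_value
    acks = sum(1 for a in acceptors if a.accept(n, value))
    return value if acks >= quorum else None


acceptors = [Acceptor() for _ in range(3)]
first = propose(acceptors, n=1, value="A")   # "A" is chosen
second = propose(acceptors, n=2, value="B")  # must adopt "A", not "B"
stale = propose(acceptors, n=1, value="C")   # rejected: number too low
```

The second proposer asks for "B" but ends up re-proposing "A", which is how Paxos keeps a chosen value stable. Raft and Viewstamped Replication reach the same guarantee through a leader and a replicated log instead of per-value proposal numbers.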

---

### **Distributed Storage Systems**
8. **[Dynamo: Amazon’s Highly Available Key-Value Store (2007)](https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf)**
   - Combines eventual consistency, vector clocks, consistent hashing, and sloppy quorums in a decentralized architecture. Inspired Cassandra, Riak, and others.

9. **[Bigtable: A Distributed Storage System for Structured Data (2006)](https://static.googleusercontent.com/media/research.google.com/en//archive/bigtable-osdi06.pdf)**
   - Google’s paper on Bigtable, a foundational system for wide-column NoSQL databases (e.g., HBase, Cassandra).

10. **[Spanner: Google’s Globally Distributed Database (2012)](https://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf)**
    - Introduces **TrueTime**, a clock API that exposes bounded uncertainty, and uses it to provide externally consistent transactions across datacenters. Critical for understanding distributed transactions.

11. **[The Google File System (2003)](https://static.googleusercontent.com/media/research.google.com/en//archive/gfs-sosp2003.pdf)**
    - Describes a distributed file system built for large-scale data processing. Inspired Hadoop HDFS.
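
Vector clocks, which Dynamo uses to detect conflicting versions of an object, can be sketched as plain dictionaries. This is an illustrative sketch under simplifying assumptions: the function names are hypothetical, node IDs are strings, and the clock truncation Dynamo applies in practice is left out.

```python
from typing import Dict

VectorClock = Dict[str, int]  # node ID -> number of updates seen from that node


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of the clock with this node's counter advanced."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Element-wise maximum, used when reconciling two versions."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify a relative to b: 'before', 'after', 'equal', or 'concurrent'."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # neither dominates: a conflict to reconcile


# One object written through three hypothetical nodes: sx, sy, sz.
v1 = increment({}, "sx")   # {"sx": 1}
v2 = increment(v1, "sx")   # {"sx": 2}
v3 = increment(v2, "sy")   # {"sx": 2, "sy": 1}
v4 = increment(v2, "sz")   # {"sx": 2, "sz": 1}, written without seeing v3
merged = merge(v3, v4)     # {"sx": 2, "sy": 1, "sz": 1}
```

Here `v3` and `v4` are concurrent, so a Dynamo-style store would return both to the client, which reconciles them and writes back a version whose clock descends from `merged`.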

---

### **Distributed Computing Models**
12. **[MapReduce: Simplified Data Processing on Large Clusters (2004)](https://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf)**
    - Google’s paper on MapReduce, the programming model behind Hadoop MapReduce and many batch processing systems.

13. **[Resilient Distributed Datasets (Spark, 2012)](https://www.usenix.org/system/files/conference/nsdi12/nsdi12-final138.pdf)**
    - Introduces resilient distributed datasets (RDDs), the fault-tolerant in-memory abstraction at the core of Apache Spark. Key for modern data processing.
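
The MapReduce programming model can be illustrated with a single-process sketch of word count, the example the paper itself uses. Everything here runs in one Python process; the distribution, fault tolerance, and partitioning that the paper is really about are omitted, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_reduce(
    inputs: Iterable[str],
    mapper: Callable[[str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    """Run map, shuffle (group by key), and reduce in a single process."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for record in inputs:
        for key, value in mapper(record):   # map phase
            groups[key].append(value)       # shuffle: group values by key
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line: str) -> Iterator[Tuple[str, int]]:
    """Emit (word, 1) for every word in the line."""
    for word in line.lower().split():
        yield word, 1


def sum_reducer(word: str, counts: List[int]) -> int:
    """Add up the partial counts for one word."""
    return sum(counts)


counts = map_reduce(["the quick fox", "the lazy dog"], word_mapper, sum_reducer)
# counts == {"the": 2, "quick": 1, "fox": 1, "lazy": 1, "dog": 1}
```

Because the mapper and reducer only see one record or one key at a time, a real framework is free to run them on different machines and to re-run them after a failure, which is the design point the paper emphasizes.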

---

### **Other Notable Papers**
14. **[Kafka: A Distributed Messaging System for Log Processing (2011)](https://www.kai-waehner.de/blog/2021/04/20/apache-kafka-10-years-later-linkedin-original-paper-kafka-connect-turbine/)**
    - LinkedIn’s original design for Kafka.
15. **[The Chubby Lock Service for Loosely-Coupled Distributed Systems (2006)](https://static.googleusercontent.com/media/research.google.com/en//archive/chubby-osdi06.pdf)**
    - Google’s lock service for loosely coupled distributed systems.
16. **[Cassandra: A Decentralized Structured Storage System (2009)](https://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf)**
17. **[Bitcoin: A Peer-to-Peer Electronic Cash System (Nakamoto, 2008)](https://bitcoin.org/bitcoin.pdf)**
    - Introduces the blockchain and uses proof-of-work to reach consensus among mutually untrusting nodes.
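
The proof-of-work idea from the Bitcoin paper can be sketched with Python's standard `hashlib`. This is a simplified illustration, not Bitcoin's actual scheme: it hashes arbitrary bytes plus an 8-byte nonce with a single SHA-256, whereas Bitcoin hashes a structured block header with double SHA-256. The names `mine`, `verify`, and `difficulty_bits` are hypothetical.

```python
import hashlib


def pow_hash(data: bytes, nonce: int) -> int:
    """Hash the data together with the nonce and read the digest as an integer."""
    digest = hashlib.sha256(data + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def mine(data: bytes, difficulty_bits: int) -> int:
    """Search for a nonce whose hash has `difficulty_bits` leading zero bits."""
    target = 1 << (256 - difficulty_bits)  # hashes below this value qualify
    nonce = 0
    while pow_hash(data, nonce) >= target:
        nonce += 1
    return nonce


def verify(data: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Checking a solution costs one hash, however long the search took."""
    return pow_hash(data, nonce) < (1 << (256 - difficulty_bits))


block = b"example block contents"
nonce = mine(block, difficulty_bits=12)  # about 4096 hashes on average
valid = verify(block, nonce, difficulty_bits=12)
```

The asymmetry is the point: finding a nonce takes many hash attempts, and each extra difficulty bit doubles the expected work, while verification stays a single hash.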

---

### **Where to Find More Papers**
- **[MIT’s Distributed Systems Reading List](https://pdos.csail.mit.edu/6.824/schedule.html)**
- **[University of Cambridge’s Systems Reading Group](https://www.cst.cam.ac.uk/systems-seminars)**
- **[Papers We Love (Distributed Systems)](https://github.com/papers-we-love/papers-we-love/tree/master/distributed_systems)**

---

### **Bonus Resources**
- **Books**:
  - [*Designing Data-Intensive Applications* by Martin Kleppmann](https://dataintensive.net/) (cites many of these papers).
- **Courses**:
  - [MIT 6.824: Distributed Systems](https://pdos.csail.mit.edu/6.824/) (labs implement MapReduce and Raft; Spanner is among the assigned readings).

These papers will help you internalize the principles behind systems like DynamoDB, Kubernetes, and Kafka. For interviews, focus on understanding the **trade-offs** and **design motivations** in these papers.
