|
@@ -13,6 +13,9 @@ There are extra items I added at the bottom that may come up in the interview or
|
|
|
Steve Yegge's "[Get that job at Google](http://steve-yegge.blogspot.com/2008/03/get-that-job-at-google.html)" and are reflected
|
|
|
sometimes word-for-word in Google's coaching notes.
|
|
|
|
|
|
+I've pared Yegge's list down to what new software engineers need to know. If you have many years of experience, expect a harder interview.
|
|
|
+[Read more here](https://googleyasheck.com/what-you-need-to-know-for-your-google-interview-and-what-you-dont/).
|
|
|
+
|
|
|
---
|
|
|
|
|
|
## Table of Contents
|
|
@@ -46,11 +49,20 @@ sometimes word-for-word in Google's coaching notes.
|
|
|
- [Trees - Notes & Background](#trees---notes--background)
|
|
|
- [Binary search trees: BSTs](#binary-search-trees-bsts)
|
|
|
- [Heap / Priority Queue / Binary Heap](#heap--priority-queue--binary-heap)
|
|
|
- - [Tries](#tries)
|
|
|
- - [Balanced search trees](#balanced-search-trees)
|
|
|
- - [N-ary (K-ary, M-ary) trees](#n-ary-k-ary-m-ary-trees)
|
|
|
+ - balanced search trees (general concept, not details)
|
|
|
+ - traversals: preorder, inorder, postorder, BFS, DFS
|
|
|
- [Sorting](#sorting)
|
|
|
+ - selection
|
|
|
+ - insertion
|
|
|
+ - heapsort
|
|
|
+ - quicksort
|
|
|
+ - merge sort
|
|
|
- [Graphs](#graphs)
|
|
|
+ - directed
|
|
|
+ - undirected
|
|
|
+ - adjacency matrix
|
|
|
+ - adjacency list
|
|
|
+ - traversals: BFS, DFS
|
|
|
- [Even More Knowledge](#even-more-knowledge)
|
|
|
- [Recursion](#recursion)
|
|
|
- [Dynamic Programming](#dynamic-programming)
|
|
@@ -58,12 +70,12 @@ sometimes word-for-word in Google's coaching notes.
|
|
|
- [NP, NP-Complete and Approximation Algorithms](#np-np-complete-and-approximation-algorithms)
|
|
|
- [Caches](#caches)
|
|
|
- [Processes and Threads](#processes-and-threads)
|
|
|
- - [System Design, Scalability, Data Handling](#system-design-scalability-data-handling)
|
|
|
- [Papers](#papers)
|
|
|
- [Testing](#testing)
|
|
|
- [Scheduling](#scheduling)
|
|
|
- [Implement system routines](#implement-system-routines)
|
|
|
- [String searching & manipulations](#string-searching--manipulations)
|
|
|
+- [System Design, Scalability, Data Handling](#system-design-scalability-data-handling) (if you have 4+ years of experience)
|
|
|
- [Final Review](#final-review)
|
|
|
- [Coding Question Practice](#coding-question-practice)
|
|
|
- [Coding exercises/challenges](#coding-exerciseschallenges)
|
|
@@ -88,7 +100,7 @@ sometimes word-for-word in Google's coaching notes.
|
|
|
- [Entropy](#entropy)
|
|
|
- [Cryptography](#cryptography)
|
|
|
- [Compression](#compression)
|
|
|
- - [Networking](#networking)
|
|
|
+ - [Networking](#networking) (if you have networking experience, expect questions)
|
|
|
- [Computer Security](#computer-security)
|
|
|
- [Garbage collection](#garbage-collection)
|
|
|
- [Parallel Programming](#parallel-programming)
|
|
@@ -100,6 +112,16 @@ sometimes word-for-word in Google's coaching notes.
|
|
|
- [Locality-Sensitive Hashing](#locality-sensitive-hashing)
|
|
|
- [van Emde Boas Trees](#van-emde-boas-trees)
|
|
|
- [Augmented Data Structures](#augmented-data-structures)
|
|
|
+ - [Tries](#tries)
|
|
|
+ - [N-ary (K-ary, M-ary) trees](#n-ary-k-ary-m-ary-trees)
|
|
|
+ - [Balanced search trees](#balanced-search-trees)
|
|
|
+ - AVL trees
|
|
|
+ - Splay trees
|
|
|
+ - Red/black trees
|
|
|
+ - 2-3 search trees
|
|
|
+ - 2-3-4 Trees (aka 2-4 trees)
|
|
|
+ - N-ary (K-ary, M-ary) trees
|
|
|
+ - B-Trees
|
|
|
- [k-D Trees](#k-d-trees)
|
|
|
- [Skip lists](#skip-lists)
|
|
|
- [Network Flows](#network-flows)
|
|
@@ -645,113 +667,6 @@ Write code on a whiteboard or paper, not a computer. Test with some sample input
|
|
|
- [ ] heap_sort() - take an unsorted array and turn it into a sorted array in-place using a max heap
|
|
|
- note: using a min heap instead would save operations, but double the space needed (cannot do in-place).
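As a rough illustration of the `heap_sort()` exercise above, here is a minimal Python sketch of an in-place heap sort built on a max heap. The helper name `sift_down` and the choice to return the same list are assumptions made for this example, not part of the original checklist.

```python
def sift_down(a, root, end):
    """Restore the max-heap property for the subtree at `root`,
    considering only indices strictly below `end`."""
    while True:
        child = 2 * root + 1              # left child index
        if child >= end:
            return                        # root is a leaf
        if child + 1 < end and a[child + 1] > a[child]:
            child += 1                    # pick the larger child
        if a[root] >= a[child]:
            return                        # heap property already holds
        a[root], a[child] = a[child], a[root]
        root = child


def heap_sort(a):
    """Sort list `a` in place in ascending order and return it."""
    n = len(a)
    # Build a max heap bottom-up, starting from the last parent node.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)
    # Repeatedly move the current maximum to the end of the unsorted part.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)
    return a
```

The max heap is what makes the in-place version work: each extracted maximum lands in the slot that the shrinking heap has just vacated at the end of the array.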
|
|
|
|
|
|
-- ### Tries
|
|
|
- - Note there are different kinds of tries. Some have prefixes, some don't, and some use strings instead of bits
|
|
|
- to track the path.
|
|
|
- - I read through code, but will not implement.
|
|
|
- - [ ] [Sedgewick - Tries (3 videos)](https://www.youtube.com/playlist?list=PLe-ggMe31CTe9IyG9MB8vt5xUJeYgOYRQ)
|
|
|
- - [ ] [1. R Way Tries](https://www.youtube.com/watch?v=buq2bn8x3Vo&index=3&list=PLe-ggMe31CTe9IyG9MB8vt5xUJeYgOYRQ)
|
|
|
- - [ ] [2. Ternary Search Tries](https://www.youtube.com/watch?v=LelV-kkYMIg&index=2&list=PLe-ggMe31CTe9IyG9MB8vt5xUJeYgOYRQ)
|
|
|
- - [ ] [3. Character Based Operations](https://www.youtube.com/watch?v=00YaFPcC65g&list=PLe-ggMe31CTe9IyG9MB8vt5xUJeYgOYRQ&index=1)
|
|
|
- - [ ] [Notes on Data Structures and Programming Techniques](http://www.cs.yale.edu/homes/aspnes/classes/223/notes.html#Tries)
|
|
|
- - [ ] Short course videos:
|
|
|
- - [ ] [Introduction To Tries (video)](https://www.coursera.org/learn/data-structures-optimizing-performance/lecture/08Xyf/core-introduction-to-tries)
|
|
|
- - [ ] [Performance Of Tries (video)](https://www.coursera.org/learn/data-structures-optimizing-performance/lecture/PvlZW/core-performance-of-tries)
|
|
|
- - [ ] [Implementing A Trie (video)](https://www.coursera.org/learn/data-structures-optimizing-performance/lecture/DFvd3/core-implementing-a-trie)
|
|
|
- - [ ] [The Trie: A Neglected Data Structure](https://www.toptal.com/java/the-trie-a-neglected-data-structure)
|
|
|
- - [ ] [TopCoder - Using Tries](https://www.topcoder.com/community/data-science/data-science-tutorials/using-tries/)
|
|
|
- - [ ] [Stanford Lecture (real world use case) (video)](https://www.youtube.com/watch?v=TJ8SkcUSdbU)
|
|
|
- - [ ] [MIT, Advanced Data Structures, Strings (can get pretty obscure about halfway through)](https://www.youtube.com/watch?v=NinWEPPrkDQ&index=16&list=PLUl4u3cNGP61hsJNdULdudlRL493b-XZf)
|
|
|
-
|
|
|
-- ### Balanced search trees
|
|
|
- - Know at least one type of balanced binary tree (and know how it's implemented):
|
|
|
- - "Among balanced search trees, AVL and 2/3 trees are now passé, and red-black trees seem to be more popular.
|
|
|
- A particularly interesting self-organizing data structure is the splay tree, which uses rotations
|
|
|
- to move any accessed key to the root." - Skiena
|
|
|
- - Of these, I chose to implement a splay tree. From what I've read, you won't implement a
|
|
|
- balanced search tree in your interview. But I wanted exposure to coding one up
|
|
|
- and let's face it, splay trees are the bee's knees. I did read a lot of red-black tree code.
|
|
|
- - splay tree: insert, search, delete functions
|
|
|
- If you end up implementing a red/black tree, try just these:
|
|
|
- - search and insertion functions, skipping delete
|
|
|
- - I want to learn more about B-Trees since they're used so widely with very large data sets.
|
|
|
- - [ ] [Self-balancing binary search tree](https://en.wikipedia.org/wiki/Self-balancing_binary_search_tree)
|
|
|
-
|
|
|
- - [ ] **AVL trees**
|
|
|
- - In practice:
|
|
|
- From what I can tell, these aren't used much in practice, but I could see where they would be:
|
|
|
- The AVL tree is another structure supporting O(log n) search, insertion, and removal. It is more rigidly
|
|
|
- balanced than red–black trees, leading to slower insertion and removal but faster retrieval. This makes it
|
|
|
- attractive for data structures that may be built once and loaded without reconstruction, such as language
|
|
|
- dictionaries (or program dictionaries, such as the opcodes of an assembler or interpreter).
|
|
|
- - [ ] [MIT AVL Trees / AVL Sort (video)](https://www.youtube.com/watch?v=FNeL18KsWPc&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb&index=6)
|
|
|
- - [ ] [AVL Trees (video)](https://www.coursera.org/learn/data-structures/lecture/Qq5E0/avl-trees)
|
|
|
- - [ ] [AVL Tree Implementation (video)](https://www.coursera.org/learn/data-structures/lecture/PKEBC/avl-tree-implementation)
|
|
|
- - [ ] [Split And Merge](https://www.coursera.org/learn/data-structures/lecture/22BgE/split-and-merge)
|
|
|
-
|
|
|
- - [ ] **Splay trees**
|
|
|
- - In practice:
|
|
|
- Splay trees are typically used in the implementation of caches, memory allocators, routers, garbage collectors,
|
|
|
- data compression, ropes (replacement of string used for long text strings), in Windows NT (in the virtual memory,
|
|
|
- networking, and file system code) etc.
|
|
|
- - [ ] [CS 61B: Splay Trees (video)](https://www.youtube.com/watch?v=Najzh1rYQTo&index=23&list=PL-XXv-cvA_iAlnI-BQr9hjqADPBtujFJd)
|
|
|
- - [ ] MIT Lecture: Splay Trees:
|
|
|
- - Gets very mathy, but watch the last 10 minutes for sure.
|
|
|
- - [Video](https://www.youtube.com/watch?v=QnPl_Y6EqMo)
|
|
|
-
|
|
|
- - [ ] **2-3 search trees**
|
|
|
- - In practice:
|
|
|
- 2-3 trees have faster inserts at the expense of slower searches (since height is more compared to AVL trees).
|
|
|
- - You would rarely use a 2-3 tree because its implementation involves different types of nodes. Instead, people use red-black trees.
|
|
|
- - [ ] [23-Tree Intuition and Definition (video)](https://www.youtube.com/watch?v=C3SsdUqasD4&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6&index=2)
|
|
|
- - [ ] [Binary View of 23-Tree](https://www.youtube.com/watch?v=iYvBtGKsqSg&index=3&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6)
|
|
|
- - [ ] [2-3 Trees (student recitation) (video)](https://www.youtube.com/watch?v=TOb1tuEZ2X4&index=5&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp)
|
|
|
-
|
|
|
- - [ ] **2-3-4 Trees (aka 2-4 trees)**
|
|
|
- - In practice:
|
|
|
- For every 2-4 tree, there are corresponding red–black trees with data elements in the same order. The insertion and deletion
|
|
|
- operations on 2-4 trees are also equivalent to color-flipping and rotations in red–black trees. This makes 2-4 trees an
|
|
|
- important tool for understanding the logic behind red–black trees, and this is why many introductory algorithm texts introduce
|
|
|
- 2-4 trees just before red–black trees, even though **2-4 trees are not often used in practice**.
|
|
|
- - [ ] [CS 61B Lecture 26: Balanced Search Trees (video)](https://www.youtube.com/watch?v=zqrqYXkth6Q&index=26&list=PL4BBB74C7D2A1049C)
|
|
|
- - [ ] [Bottom Up 234-Trees (video)](https://www.youtube.com/watch?v=DQdMYevEyE4&index=4&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6)
|
|
|
- - [ ] [Top Down 234-Trees (video)](https://www.youtube.com/watch?v=2679VQ26Fp4&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6&index=5)
|
|
|
-
|
|
|
- - [ ] **B-Trees**
|
|
|
- - fun fact: it's a mystery, but the B could stand for Boeing, Balanced, or Bayer (co-inventor)
|
|
|
- - In Practice:
|
|
|
- B-Trees are widely used in databases. Most modern filesystems use B-trees (or variants). In addition to
|
|
|
- its use in databases, the B-tree is also used in filesystems to allow quick random access to an arbitrary
|
|
|
- block in a particular file. The basic problem is turning the file block i address into a disk block
|
|
|
- (or perhaps to a cylinder-head-sector) address.
|
|
|
- - [ ] [B-Tree](https://en.wikipedia.org/wiki/B-tree)
|
|
|
- - [ ] [Introduction to B-Trees (video)](https://www.youtube.com/watch?v=I22wEC1tTGo&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6&index=6)
|
|
|
- - [ ] [B-Tree Definition and Insertion (video)](https://www.youtube.com/watch?v=s3bCdZGrgpA&index=7&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6)
|
|
|
- - [ ] [B-Tree Deletion (video)](https://www.youtube.com/watch?v=svfnVhJOfMc&index=8&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6)
|
|
|
- - [ ] [MIT 6.851 - Memory Hierarchy Models (video)](https://www.youtube.com/watch?v=V3omVLzI0WE&index=7&list=PLUl4u3cNGP61hsJNdULdudlRL493b-XZf)
|
|
|
- - covers cache-oblivious B-Trees, very interesting data structures
|
|
|
- - the first 37 minutes are very technical, may be skipped (B is block size, cache line size)
|
|
|
-
|
|
|
- - [ ] **Red/black trees**
|
|
|
- - In practice:
|
|
|
- Red–black trees offer worst-case guarantees for insertion time, deletion time, and search time.
|
|
|
- Not only does this make them valuable in time-sensitive applications such as real-time applications,
|
|
|
- but it makes them valuable building blocks in other data structures which provide worst-case guarantees;
|
|
|
- for example, many data structures used in computational geometry can be based on red–black trees, and
|
|
|
- the Completely Fair Scheduler used in current Linux kernels uses red–black trees. In Java 8,
|
|
|
- HashMap was modified so that, instead of using a linked list to store different elements with colliding
|
|
|
- hashcodes, a red-black tree is used.
|
|
|
- - [ ] [Aduni - Algorithms - Lecture 4 (link jumps to starting point) (video)](https://youtu.be/1W3x0f_RmUo?list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm&t=3871)
|
|
|
- - [ ] [Aduni - Algorithms - Lecture 5 (video)](https://www.youtube.com/watch?v=hm2GHwyKF1o&list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm&index=5)
|
|
|
- - [ ] [Red-Black Tree](https://en.wikipedia.org/wiki/Red%E2%80%93black_tree)
|
|
|
- - [ ] [An Introduction To Binary Search And Red Black Tree](https://www.topcoder.com/community/data-science/data-science-tutorials/an-introduction-to-binary-search-and-red-black-trees/)
|
|
|
-
|
|
|
-- ### N-ary (K-ary, M-ary) trees
|
|
|
- - note: the N or K is the branching factor (max branches)
|
|
|
- - a binary tree is a 2-ary tree, with branching factor = 2
|
|
|
- - 2-3 trees are 3-ary
|
|
|
- - [ ] [K-Ary Tree](https://en.wikipedia.org/wiki/K-ary_tree)
|
|
|
-
|
|
|
## Sorting
|
|
|
|
|
|
- [ ] Notes:
|
|
@@ -1002,9 +917,78 @@ You'll get more graph practice in Skiena's book (see Books section below) and th
|
|
|
- [ ] [Keynote David Beazley - Topics of Interest (Python Asyncio)](https://www.youtube.com/watch?v=ZzfHjytDceU)
|
|
|
- [ ] [Mutex in Python](https://www.youtube.com/watch?v=0zaPs8OtyKY)
|
|
|
|
|
|
+- ### Papers
|
|
|
+ - These are Google papers and well-known papers.
|
|
|
+ - Reading all from end to end with full comprehension will likely take more time than you have. I recommend being selective on papers and their sections.
|
|
|
+ - [ ] [1978: Communicating Sequential Processes](http://spinroot.com/courses/summer/Papers/hoare_1978.pdf)
|
|
|
+ - [implemented in Go](https://godoc.org/github.com/thomas11/csp)
|
|
|
+ - [Love classic papers?](https://www.cs.cmu.edu/~crary/819-f09/)
|
|
|
+ - [ ] [2003: The Google File System](http://static.googleusercontent.com/media/research.google.com/en//archive/gfs-sosp2003.pdf)
|
|
|
+ - replaced by Colossus in 2012
|
|
|
+ - [ ] [2004: MapReduce: Simplified Data Processing on Large Clusters](http://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf)
|
|
|
+ - mostly replaced by Cloud Dataflow?
|
|
|
+ - [ ] [2007: What Every Programmer Should Know About Memory (very long, and the author encourages skipping of some sections)](https://www.akkadia.org/drepper/cpumemory.pdf)
|
|
|
+ - [ ] [2012: Google's Colossus](https://www.wired.com/2012/07/google-colossus/)
|
|
|
+ - paper not available
|
|
|
+ - [ ] 2012: AddressSanitizer: A Fast Address Sanity Checker:
|
|
|
+ - [paper](http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/37752.pdf)
|
|
|
+ - [video](https://www.usenix.org/conference/atc12/technical-sessions/presentation/serebryany)
|
|
|
+ - [ ] 2013: Spanner: Google’s Globally-Distributed Database:
|
|
|
+ - [paper](http://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf)
|
|
|
+ - [video](https://www.usenix.org/node/170855)
|
|
|
+ - [ ] [2014: Machine Learning: The High-Interest Credit Card of Technical Debt](http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43146.pdf)
|
|
|
+ - [ ] [2015: Continuous Pipelines at Google](http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43790.pdf)
|
|
|
+ - [ ] [2015: High-Availability at Massive Scale: Building Google’s Data Infrastructure for Ads](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44686.pdf)
|
|
|
+ - [ ] [2015: TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems](http://download.tensorflow.org/paper/whitepaper2015.pdf)
|
|
|
+ - [ ] [2015: How Developers Search for Code: A Case Study](http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43835.pdf)
|
|
|
+ - [ ] [2016: Borg, Omega, and Kubernetes](http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44843.pdf)
|
|
|
+
|
|
|
+- ### Testing
|
|
|
+ - To cover:
|
|
|
+ - how unit testing works
|
|
|
+ - what are mock objects
|
|
|
+ - what is integration testing
|
|
|
+ - what is dependency injection
|
|
|
+ - [ ] [Agile Software Testing with James Bach (video)](https://www.youtube.com/watch?v=SAhJf36_u5U)
|
|
|
+ - [ ] [Open Lecture by James Bach on Software Testing (video)](https://www.youtube.com/watch?v=ILkT_HV9DVU)
|
|
|
+ - [ ] [Steve Freeman - Test-Driven Development (that’s not what we meant) (video)](https://vimeo.com/83960706)
|
|
|
+ - [slides](http://gotocon.com/dl/goto-berlin-2013/slides/SteveFreeman_TestDrivenDevelopmentThatsNotWhatWeMeant.pdf)
|
|
|
+ - [ ] [TDD is dead. Long live testing.](http://david.heinemeierhansson.com/2014/tdd-is-dead-long-live-testing.html)
|
|
|
+ - [ ] [Is TDD dead? (video)](https://www.youtube.com/watch?v=z9quxZsLcfo)
|
|
|
+ - [ ] [Video series (152 videos) - not all are needed (video)](https://www.youtube.com/watch?v=nzJapzxH_rE&list=PLAwxTw4SYaPkWVHeC_8aSIbSxE_NXI76g)
|
|
|
+ - [ ] [Test-Driven Web Development with Python](http://www.obeythetestinggoat.com/pages/book.html#toc)
|
|
|
+ - [ ] Dependency injection:
|
|
|
+ - [ ] [video](https://www.youtube.com/watch?v=IKD2-MAkXyQ)
|
|
|
+ - [ ] [Tao Of Testing](http://jasonpolites.github.io/tao-of-testing/ch3-1.1.html)
|
|
|
+ - [ ] [How to write tests](http://jasonpolites.github.io/tao-of-testing/ch4-1.1.html)
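To make the "mock objects" and "dependency injection" items above concrete, here is a small hedged sketch in Python. The `Greeter` class and its `clock` collaborator are hypothetical names invented for this example; only `unittest.mock.Mock` comes from the standard library.

```python
from unittest.mock import Mock


class Greeter:
    """Receives its clock from the caller (dependency injection)
    instead of constructing one internally."""

    def __init__(self, clock):
        self.clock = clock

    def greeting(self):
        return "Good morning" if self.clock.hour() < 12 else "Good afternoon"


# In a unit test, a mock object stands in for the real clock,
# so the test controls the time and stays deterministic.
fake_clock = Mock()
fake_clock.hour.return_value = 9

greeter = Greeter(fake_clock)
assert greeter.greeting() == "Good morning"
fake_clock.hour.assert_called_once()   # verify the interaction took place
```

Because the dependency is passed in, the unit test never touches the system clock. An integration test would instead wire `Greeter` to a real clock implementation.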
|
|
|
+
|
|
|
+- ### Scheduling
|
|
|
+ - in an OS, how it works
|
|
|
+ - can be gleaned from Operating System videos
|
|
|
+
|
|
|
+- ### Implement system routines
|
|
|
+ - understand what lies beneath the programming APIs you use
|
|
|
+ - can you implement them?
|
|
|
+
|
|
|
+- ### String searching & manipulations
|
|
|
+ - [ ] [Sedgewick - Suffix Arrays (video)](https://www.youtube.com/watch?v=HKPrVm5FWvg)
|
|
|
+ - [ ] [Sedgewick - Substring Search (videos)](https://www.youtube.com/watch?v=2LvvVFCEIv8&list=PLe-ggMe31CTdAdjXB3lIuf2maubzo9t66&index=5)
|
|
|
+ - [ ] [1. Introduction to Substring Search](https://www.youtube.com/watch?v=2LvvVFCEIv8&list=PLe-ggMe31CTdAdjXB3lIuf2maubzo9t66&index=5)
|
|
|
+ - [ ] [2. Brute-Force Substring Search](https://www.youtube.com/watch?v=CcDXwIGEXYU&list=PLe-ggMe31CTdAdjXB3lIuf2maubzo9t66&index=4)
|
|
|
+ - [ ] [3. Knuth-Morris Pratt](https://www.youtube.com/watch?v=n-7n-FDEWzc&index=3&list=PLe-ggMe31CTdAdjXB3lIuf2maubzo9t66)
|
|
|
+ - [ ] [4. Boyer-Moore](https://www.youtube.com/watch?v=fI7Ch6pZXfM&list=PLe-ggMe31CTdAdjXB3lIuf2maubzo9t66&index=2)
|
|
|
+ - [ ] [5. Rabin-Karp](https://www.youtube.com/watch?v=QzI0p6zDjK4&index=1&list=PLe-ggMe31CTdAdjXB3lIuf2maubzo9t66)
|
|
|
+ - [ ] [Search pattern in text (video)](https://www.coursera.org/learn/data-structures/lecture/tAfHI/search-pattern-in-text)
|
|
|
+
|
|
|
+ If you need more detail on this subject, see the "String Matching" section in [Additional Detail on Some Subjects](#additional-detail-on-some-subjects).
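As one example of the substring-search algorithms covered in the videos above, here is a minimal Python sketch of Rabin-Karp using a rolling hash. The function name and the `base` and `mod` defaults are assumptions chosen for illustration; the lectures may present the algorithm with different constants.

```python
def rabin_karp(text, pattern, base=256, mod=1_000_000_007):
    """Return the index of the first occurrence of `pattern` in `text`, or -1."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    if m > n:
        return -1

    high = pow(base, m - 1, mod)   # weight of the window's leading character
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod

    for i in range(n - m + 1):
        # Equal hashes can still be a collision, so confirm with a real comparison.
        if p_hash == t_hash and text[i:i + m] == pattern:
            return i
        if i < n - m:
            # Roll the window: drop text[i], append text[i + m].
            t_hash = ((t_hash - ord(text[i]) * high) * base + ord(text[i + m])) % mod
    return -1
```

The rolling update is the point of the technique: each shift of the window costs constant time instead of rehashing all `m` characters.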
|
|
|
+
|
|
|
+---
|
|
|
|
|
|
Scalability and System Design are very large topics with many subtopics and resources, since there is a lot to consider
|
|
|
when designing a software/hardware system that can scale. Expect to spend quite a bit of time on this.
|
|
|
+
|
|
|
+ You can expect system design questions if you have 4+ years of experience.
|
|
|
+
|
|
|
|
|
|
- ### System Design, Scalability, Data Handling
|
|
|
- Considerations from Yegge:
|
|
@@ -1147,71 +1131,6 @@ You'll get more graph practice in Skiena's book (see Books section below) and th
|
|
|
- [Design a URL-shortener system: copied from above](http://www.hiredintech.com/system-design/the-system-design-process/)
|
|
|
- [Design a cache system](https://www.adayinthelifeof.nl/2011/02/06/memcache-internals/)
|
|
|
|
|
|
-- ### Papers
|
|
|
- - These are Google papers and well-known papers.
|
|
|
- - Reading all from end to end with full comprehension will likely take more time than you have. I recommend being selective on papers and their sections.
|
|
|
- - [ ] [1978: Communicating Sequential Processes](http://spinroot.com/courses/summer/Papers/hoare_1978.pdf)
|
|
|
- - [implemented in Go](https://godoc.org/github.com/thomas11/csp)
|
|
|
- - [Love classic papers?](https://www.cs.cmu.edu/~crary/819-f09/)
|
|
|
- - [ ] [2003: The Google File System](http://static.googleusercontent.com/media/research.google.com/en//archive/gfs-sosp2003.pdf)
|
|
|
- - replaced by Colossus in 2012
|
|
|
- - [ ] [2004: MapReduce: Simplified Data Processing on Large Clusters](http://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf)
|
|
|
- - mostly replaced by Cloud Dataflow?
|
|
|
- - [ ] [2007: What Every Programmer Should Know About Memory (very long, and the author encourages skipping of some sections)](https://www.akkadia.org/drepper/cpumemory.pdf)
|
|
|
- - [ ] [2012: Google's Colossus](https://www.wired.com/2012/07/google-colossus/)
|
|
|
- - paper not available
|
|
|
- - [ ] 2012: AddressSanitizer: A Fast Address Sanity Checker:
|
|
|
- - [paper](http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/37752.pdf)
|
|
|
- - [video](https://www.usenix.org/conference/atc12/technical-sessions/presentation/serebryany)
|
|
|
- - [ ] 2013: Spanner: Google’s Globally-Distributed Database:
|
|
|
- - [paper](http://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf)
|
|
|
- - [video](https://www.usenix.org/node/170855)
|
|
|
- - [ ] [2014: Machine Learning: The High-Interest Credit Card of Technical Debt](http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43146.pdf)
|
|
|
- - [ ] [2015: Continuous Pipelines at Google](http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43790.pdf)
|
|
|
- - [ ] [2015: High-Availability at Massive Scale: Building Google’s Data Infrastructure for Ads](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44686.pdf)
|
|
|
- - [ ] [2015: TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems](http://download.tensorflow.org/paper/whitepaper2015.pdf)
|
|
|
- - [ ] [2015: How Developers Search for Code: A Case Study](http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43835.pdf)
|
|
|
- - [ ] [2016: Borg, Omega, and Kubernetes](http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44843.pdf)
|
|
|
-
|
|
|
-- ### Testing
|
|
|
- - To cover:
|
|
|
- - how unit testing works
|
|
|
- - what are mock objects
|
|
|
- - what is integration testing
|
|
|
- - what is dependency injection
|
|
|
- - [ ] [Agile Software Testing with James Bach (video)](https://www.youtube.com/watch?v=SAhJf36_u5U)
|
|
|
- - [ ] [Open Lecture by James Bach on Software Testing (video)](https://www.youtube.com/watch?v=ILkT_HV9DVU)
|
|
|
- - [ ] [Steve Freeman - Test-Driven Development (that’s not what we meant) (video)](https://vimeo.com/83960706)
|
|
|
- - [slides](http://gotocon.com/dl/goto-berlin-2013/slides/SteveFreeman_TestDrivenDevelopmentThatsNotWhatWeMeant.pdf)
|
|
|
- - [ ] [TDD is dead. Long live testing.](http://david.heinemeierhansson.com/2014/tdd-is-dead-long-live-testing.html)
|
|
|
- - [ ] [Is TDD dead? (video)](https://www.youtube.com/watch?v=z9quxZsLcfo)
|
|
|
- - [ ] [Video series (152 videos) - not all are needed (video)](https://www.youtube.com/watch?v=nzJapzxH_rE&list=PLAwxTw4SYaPkWVHeC_8aSIbSxE_NXI76g)
|
|
|
- - [ ] [Test-Driven Web Development with Python](http://www.obeythetestinggoat.com/pages/book.html#toc)
|
|
|
- - [ ] Dependency injection:
|
|
|
- - [ ] [video](https://www.youtube.com/watch?v=IKD2-MAkXyQ)
|
|
|
- - [ ] [Tao Of Testing](http://jasonpolites.github.io/tao-of-testing/ch3-1.1.html)
|
|
|
- - [ ] [How to write tests](http://jasonpolites.github.io/tao-of-testing/ch4-1.1.html)
|
|
|
-
|
|
|
-- ### Scheduling
|
|
|
- - in an OS, how it works
|
|
|
- - can be gleaned from Operating System videos
|
|
|
-
|
|
|
-- ### Implement system routines
|
|
|
- - understand what lies beneath the programming APIs you use
|
|
|
- - can you implement them?
|
|
|
-
|
|
|
-- ### String searching & manipulations
|
|
|
- - [ ] [Sedgewick - Suffix Arrays (video)](https://www.youtube.com/watch?v=HKPrVm5FWvg)
|
|
|
- - [ ] [Sedgewick - Substring Search (videos)](https://www.youtube.com/watch?v=2LvvVFCEIv8&list=PLe-ggMe31CTdAdjXB3lIuf2maubzo9t66&index=5)
|
|
|
- - [ ] [1. Introduction to Substring Search](https://www.youtube.com/watch?v=2LvvVFCEIv8&list=PLe-ggMe31CTdAdjXB3lIuf2maubzo9t66&index=5)
|
|
|
- - [ ] [2. Brute-Force Substring Search](https://www.youtube.com/watch?v=CcDXwIGEXYU&list=PLe-ggMe31CTdAdjXB3lIuf2maubzo9t66&index=4)
|
|
|
- - [ ] [3. Knuth-Morris Pratt](https://www.youtube.com/watch?v=n-7n-FDEWzc&index=3&list=PLe-ggMe31CTdAdjXB3lIuf2maubzo9t66)
|
|
|
- - [ ] [4. Boyer-Moore](https://www.youtube.com/watch?v=fI7Ch6pZXfM&list=PLe-ggMe31CTdAdjXB3lIuf2maubzo9t66&index=2)
|
|
|
- - [ ] [5. Rabin-Karp](https://www.youtube.com/watch?v=QzI0p6zDjK4&index=1&list=PLe-ggMe31CTdAdjXB3lIuf2maubzo9t66)
|
|
|
- - [ ] [Search pattern in text (video)](https://www.coursera.org/learn/data-structures/lecture/tAfHI/search-pattern-in-text)
|
|
|
-
|
|
|
- If you need more detail on this subject, see the "String Matching" section in [Additional Detail on Some Subjects](#additional-detail-on-some-subjects).
|
|
|
-
|
|
|
---
|
|
|
|
|
|
## Final Review
|
|
@@ -1678,6 +1597,115 @@ You're never really done.
|
|
|
- ### Augmented Data Structures
|
|
|
- [ ] [CS 61B Lecture 39: Augmenting Data Structures](https://youtu.be/zksIj9O8_jc?list=PL4BBB74C7D2A1049C&t=950)
|
|
|
|
|
|
+- ### Tries
|
|
|
+ - Note there are different kinds of tries. Some have prefixes, some don't, and some use strings instead of bits
|
|
|
+ to track the path.
|
|
|
+ - I read through code, but will not implement.
|
|
|
+ - [ ] [Sedgewick - Tries (3 videos)](https://www.youtube.com/playlist?list=PLe-ggMe31CTe9IyG9MB8vt5xUJeYgOYRQ)
|
|
|
+ - [ ] [1. R Way Tries](https://www.youtube.com/watch?v=buq2bn8x3Vo&index=3&list=PLe-ggMe31CTe9IyG9MB8vt5xUJeYgOYRQ)
|
|
|
+ - [ ] [2. Ternary Search Tries](https://www.youtube.com/watch?v=LelV-kkYMIg&index=2&list=PLe-ggMe31CTe9IyG9MB8vt5xUJeYgOYRQ)
|
|
|
+ - [ ] [3. Character Based Operations](https://www.youtube.com/watch?v=00YaFPcC65g&list=PLe-ggMe31CTe9IyG9MB8vt5xUJeYgOYRQ&index=1)
|
|
|
+ - [ ] [Notes on Data Structures and Programming Techniques](http://www.cs.yale.edu/homes/aspnes/classes/223/notes.html#Tries)
|
|
|
+ - [ ] Short course videos:
|
|
|
+ - [ ] [Introduction To Tries (video)](https://www.coursera.org/learn/data-structures-optimizing-performance/lecture/08Xyf/core-introduction-to-tries)
|
|
|
+ - [ ] [Performance Of Tries (video)](https://www.coursera.org/learn/data-structures-optimizing-performance/lecture/PvlZW/core-performance-of-tries)
|
|
|
+ - [ ] [Implementing A Trie (video)](https://www.coursera.org/learn/data-structures-optimizing-performance/lecture/DFvd3/core-implementing-a-trie)
|
|
|
+ - [ ] [The Trie: A Neglected Data Structure](https://www.toptal.com/java/the-trie-a-neglected-data-structure)
|
|
|
+ - [ ] [TopCoder - Using Tries](https://www.topcoder.com/community/data-science/data-science-tutorials/using-tries/)
|
|
|
+ - [ ] [Stanford Lecture (real world use case) (video)](https://www.youtube.com/watch?v=TJ8SkcUSdbU)
|
|
|
+ - [ ] [MIT, Advanced Data Structures, Strings (can get pretty obscure about halfway through)](https://www.youtube.com/watch?v=NinWEPPrkDQ&index=16&list=PLUl4u3cNGP61hsJNdULdudlRL493b-XZf)
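For a feel of what the resources above describe, here is a minimal sketch of a character-keyed trie in Python. It is an illustration only: the class and method names (`TrieNode`, `Trie`, `contains`, `starts_with`) are assumptions, and it uses a dictionary per node rather than the fixed R-way array from the Sedgewick lectures.

```python
class TrieNode:
    def __init__(self):
        self.children = {}     # maps a character to the child TrieNode
        self.is_word = False   # True if a stored word ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_word = True

    def _walk(self, s):
        """Follow `s` from the root; return the final node or None."""
        node = self.root
        for ch in s:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, word):
        node = self._walk(word)
        return node is not None and node.is_word

    def starts_with(self, prefix):
        return self._walk(prefix) is not None
```

Lookups take time proportional to the length of the key, regardless of how many words are stored. That property is why tries suit prefix queries such as autocomplete.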
|
|
|
+
|
|
|
+- ### Balanced search trees
|
|
|
+ - Know at least one type of balanced binary tree (and know how it's implemented):
|
|
|
+ - "Among balanced search trees, AVL and 2/3 trees are now passé, and red-black trees seem to be more popular.
|
|
|
+ A particularly interesting self-organizing data structure is the splay tree, which uses rotations
|
|
|
+ to move any accessed key to the root." - Skiena
|
|
|
+ - Of these, I chose to implement a splay tree. From what I've read, you won't implement a
|
|
|
+ balanced search tree in your interview. But I wanted exposure to coding one up
|
|
|
+ and let's face it, splay trees are the bee's knees. I did read a lot of red-black tree code.
|
|
|
+ - splay tree: insert, search, delete functions
|
|
|
+ If you end up implementing a red/black tree, try just these:
|
|
|
+ - search and insertion functions, skipping delete
|
|
|
+ - I want to learn more about B-Trees since they're used so widely with very large data sets.
|
|
|
+ - [ ] [Self-balancing binary search tree](https://en.wikipedia.org/wiki/Self-balancing_binary_search_tree)
|
|
|
+
|
|
|
+ - [ ] **AVL trees**
|
|
|
+ - In practice:
|
|
|
+ From what I can tell, these aren't used much in practice, but I could see where they would be:
|
|
|
+ The AVL tree is another structure supporting O(log n) search, insertion, and removal. It is more rigidly
|
|
|
+ balanced than red–black trees, leading to slower insertion and removal but faster retrieval. This makes it
|
|
|
+ attractive for data structures that may be built once and loaded without reconstruction, such as language
|
|
|
+ dictionaries (or program dictionaries, such as the opcodes of an assembler or interpreter).
|
|
|
+ - [ ] [MIT AVL Trees / AVL Sort (video)](https://www.youtube.com/watch?v=FNeL18KsWPc&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb&index=6)
|
|
|
+ - [ ] [AVL Trees (video)](https://www.coursera.org/learn/data-structures/lecture/Qq5E0/avl-trees)
|
|
|
+ - [ ] [AVL Tree Implementation (video)](https://www.coursera.org/learn/data-structures/lecture/PKEBC/avl-tree-implementation)
|
|
|
+ - [ ] [Split And Merge](https://www.coursera.org/learn/data-structures/lecture/22BgE/split-and-merge)
|
|
|
+
|
|
|
+ - [ ] **Splay trees**
|
|
|
+ - In practice:
|
|
|
+ Splay trees are typically used in the implementation of caches, memory allocators, routers, garbage collectors,
|
|
|
+ data compression, ropes (replacement of string used for long text strings), in Windows NT (in the virtual memory,
|
|
|
+ networking, and file system code) etc.
|
|
|
+ - [ ] [CS 61B: Splay Trees (video)](https://www.youtube.com/watch?v=Najzh1rYQTo&index=23&list=PL-XXv-cvA_iAlnI-BQr9hjqADPBtujFJd)
|
|
|
+ - [ ] MIT Lecture: Splay Trees:
|
|
|
+ - Gets very mathy, but watch the last 10 minutes for sure.
|
|
|
+ - [Video](https://www.youtube.com/watch?v=QnPl_Y6EqMo)
|
|
|
+
|
|
|
+ - [ ] **Red/black trees**
|
|
|
+ - these are a translation of a 2-3 tree (see below)
|
|
|
+ - In practice:
|
|
|
+ Red–black trees offer worst-case guarantees for insertion time, deletion time, and search time.
|
|
|
+ Not only does this make them valuable in time-sensitive applications such as real-time applications,
|
|
|
+ but it makes them valuable building blocks in other data structures which provide worst-case guarantees;
|
|
|
+ for example, many data structures used in computational geometry can be based on red–black trees, and
|
|
|
+ the Completely Fair Scheduler used in current Linux kernels uses red–black trees. In Java 8,
|
|
|
+ HashMap was modified so that, instead of using a linked list to store different elements with colliding
|
|
|
+ hashcodes, a red-black tree is used.
|
|
|
+ - [ ] [Aduni - Algorithms - Lecture 4 (link jumps to starting point) (video)](https://youtu.be/1W3x0f_RmUo?list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm&t=3871)
|
|
|
+ - [ ] [Aduni - Algorithms - Lecture 5 (video)](https://www.youtube.com/watch?v=hm2GHwyKF1o&list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm&index=5)
|
|
|
+ - [ ] [Red-Black Tree](https://en.wikipedia.org/wiki/Red%E2%80%93black_tree)
|
|
|
+ - [ ] [An Introduction To Binary Search And Red Black Tree](https://www.topcoder.com/community/data-science/data-science-tutorials/an-introduction-to-binary-search-and-red-black-trees/)
|
|
|
+
|
|
|
+ - [ ] **2-3 search trees**
|
|
|
+ - In practice:
|
|
|
+ 2-3 trees have faster inserts at the expense of slower searches (since height is more compared to AVL trees).
|
|
|
+ - You would rarely use a 2-3 tree because its implementation involves different types of nodes. Instead, people use red-black trees.
|
|
|
+ - [ ] [23-Tree Intuition and Definition (video)](https://www.youtube.com/watch?v=C3SsdUqasD4&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6&index=2)
|
|
|
+ - [ ] [Binary View of 23-Tree](https://www.youtube.com/watch?v=iYvBtGKsqSg&index=3&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6)
|
|
|
+ - [ ] [2-3 Trees (student recitation) (video)](https://www.youtube.com/watch?v=TOb1tuEZ2X4&index=5&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp)
|
|
|
+
|
|
|
+ - [ ] **2-3-4 Trees (aka 2-4 trees)**
|
|
|
+ - In practice:
|
|
|
+ For every 2-4 tree, there are corresponding red–black trees with data elements in the same order. The insertion and deletion
|
|
|
+ operations on 2-4 trees are also equivalent to color-flipping and rotations in red–black trees. This makes 2-4 trees an
|
|
|
+ important tool for understanding the logic behind red–black trees, and this is why many introductory algorithm texts introduce
|
|
|
+ 2-4 trees just before red–black trees, even though **2-4 trees are not often used in practice**.
|
|
|
+ - [ ] [CS 61B Lecture 26: Balanced Search Trees (video)](https://www.youtube.com/watch?v=zqrqYXkth6Q&index=26&list=PL4BBB74C7D2A1049C)
|
|
|
+ - [ ] [Bottom Up 234-Trees (video)](https://www.youtube.com/watch?v=DQdMYevEyE4&index=4&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6)
|
|
|
+ - [ ] [Top Down 234-Trees (video)](https://www.youtube.com/watch?v=2679VQ26Fp4&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6&index=5)
|
|
|
+
|
|
|
+ - [ ] **N-ary (K-ary, M-ary) trees**
|
|
|
+ - note: the N or K is the branching factor (max branches)
|
|
|
+ - a binary tree is a 2-ary tree, with branching factor = 2
|
|
|
+ - 2-3 trees are 3-ary
|
|
|
+ - [ ] [K-Ary Tree](https://en.wikipedia.org/wiki/K-ary_tree)
|
|
|
+
|
|
|
+ - [ ] **B-Trees**
|
|
|
+ - fun fact: it's a mystery, but the B could stand for Boeing, Balanced, or Bayer (co-inventor)
|
|
|
+ - In Practice:
|
|
|
+ B-Trees are widely used in databases. Most modern filesystems use B-trees (or variants). In addition to
|
|
|
+ its use in databases, the B-tree is also used in filesystems to allow quick random access to an arbitrary
|
|
|
+ block in a particular file. The basic problem is turning the file block i address into a disk block
|
|
|
+ (or perhaps to a cylinder-head-sector) address.
|
|
|
+ - [ ] [B-Tree](https://en.wikipedia.org/wiki/B-tree)
|
|
|
+ - [ ] [Introduction to B-Trees (video)](https://www.youtube.com/watch?v=I22wEC1tTGo&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6&index=6)
|
|
|
+ - [ ] [B-Tree Definition and Insertion (video)](https://www.youtube.com/watch?v=s3bCdZGrgpA&index=7&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6)
|
|
|
+ - [ ] [B-Tree Deletion (video)](https://www.youtube.com/watch?v=svfnVhJOfMc&index=8&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6)
|
|
|
+ - [ ] [MIT 6.851 - Memory Hierarchy Models (video)](https://www.youtube.com/watch?v=V3omVLzI0WE&index=7&list=PLUl4u3cNGP61hsJNdULdudlRL493b-XZf)
|
|
|
+ - covers cache-oblivious B-Trees, very interesting data structures
|
|
|
+ - the first 37 minutes are very technical, may be skipped (B is block size, cache line size)
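Several of the binary trees in this section (AVL, splay, red-black) restructure themselves with rotations. The following Python sketch shows only the rotation primitive, not a full balanced tree. The `Node` class and function names are assumptions made for this example.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def rotate_left(x):
    """Rotate the subtree rooted at x to the left; x.right must exist.
    Returns the new subtree root."""
    y = x.right
    x.right = y.left   # y's left subtree moves under x
    y.left = x
    return y


def rotate_right(y):
    """Mirror image of rotate_left; y.left must exist."""
    x = y.left
    y.left = x.right   # x's right subtree moves under y
    x.right = y
    return x


def inorder(node):
    """Return the keys in sorted order; rotations must not change this."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)
```

A rotation changes the shape of the tree but preserves the in-order sequence of keys, so the search-tree invariant survives every rebalancing step.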
|
|
|
+
|
|
|
+
|
|
|
- ### k-D Trees
|
|
|
- great for finding the number of points in a rectangle or higher-dimensional object
|
|
|
- a good fit for k-nearest neighbors
|