I originally created this as a short to-do list of study topics for becoming a software engineer,
but it grew to the large list you see today. After going through this study plan, I got hired
as a Software Development Engineer at Amazon!
You probably won't have to study as much as I did. Anyway, everything you need is here.
Please Note: I wasted a lot of time on things I didn't need to know. More info about that is below. I'll help you get there without wasting your precious time.
The items listed here will prepare you well for a technical interview at just about any software company,
including the giants: Amazon, Facebook, Google, and Microsoft.
This is my multi-month study plan for becoming a software engineer for a large company.
Required:
A little experience with coding (variables, loops, methods/functions, etc)
Patience
Time
Note this is a study plan for software engineering, not frontend engineering or full-stack development. There are really
super roadmaps and coursework for those career paths elsewhere (see https://roadmap.sh/ for more info).
There is a lot to learn in a university Computer Science program, but only knowing about 75% is good enough for an interview, so that's what I cover here.
For a complete CS self-taught program, the resources for my study plan have been included in Kamran Ahmed's Computer Science Roadmap: https://roadmap.sh/computer-science
If you want to work as a software engineer for a large company, these are the things you have to know.
If you missed out on getting a degree in computer science, like I did, this will catch you up and save four years of your life.
When I started this project, I didn't know a stack from a heap, didn't know Big-O anything, or anything about trees, or how to
traverse a graph. If I had to code a sorting algorithm, I can tell ya it would have been terrible.
Every data structure I had ever used was built into the language, and I didn't know how they worked
under the hood at all. I never had to manage memory unless a process I was running would give an "out of
memory" error, and then I'd have to find a workaround. I used a few multidimensional arrays in my life and
thousands of associative arrays, but I never created data structures from scratch.
It's a long plan. It may take you months. If you are familiar with a lot of this already, it will take you a lot less time.
On this page, click the Code button near the top, then click "Download ZIP". Unzip the file and you can work with the text files.
If you open the files in a code editor that understands markdown, you'll see everything formatted nicely.
If you're comfortable with git
Create a new branch so you can check items like this, just put an x in the brackets: [x]
Fork the GitHub repo: https://github.com/jwasham/coding-interview-university by clicking on the Fork button.
Clone to your local repo:
git clone https://github.com/<YOUR_GITHUB_USERNAME>/coding-interview-university.git
cd coding-interview-university
git remote add upstream https://github.com/jwasham/coding-interview-university.git
git remote set-url --push upstream DISABLE # so that you don't push your personal progress back to the original repo
Mark boxes with an x as you complete items, then commit your changes:
git commit -am "Marked personal progress"
git pull upstream main # keep your fork up-to-date with changes from the original repo
git push # just pushes to your fork
Some videos are available only by enrolling in a Coursera or EdX class. These are called MOOCs.
Sometimes the classes are not in session, so you may have to wait a couple of months before you can access them.
It would be great to replace the online course resources with free and always-available public sources,
such as YouTube videos (preferably university lectures), so that people can study these anytime,
not just when a specific online course is in session.
You'll need to choose a programming language for the coding interviews you do,
but you'll also need to find a language that you can use to study computer science concepts.
Preferably the language would be the same, so that you only need to be proficient in one.
For this Study Plan
When I did the study plan, I used 2 languages for most of it: C and Python
C: Very low level. Allows you to deal with pointers and memory allocation/deallocation, so you feel the data structures
and algorithms in your bones. In higher-level languages like Python or Java, these are hidden from you. In day-to-day work, that's terrific,
but when you're learning how these low-level data structures are built, it's great to feel close to the metal.
C is everywhere. You'll see examples in books, lectures, videos, everywhere while you're studying.
This is a short book, but it will give you a great handle on the C language and if you practice it a little
you'll quickly get proficient. Understanding C helps you understand how programs and memory work.
You don't need to go super deep in the book (or even finish it). Just get to where you're comfortable reading and writing in C.
Python: Modern and very expressive. I learned it because it's just super useful and also allows me to write less code in an interview.
This is my preference. You do what you like, of course.
You may not need it, but here are some sites for learning a new language:
This list grew over many months, and yes, it got out of hand.
Here are some mistakes I made so you'll have a better experience. And you'll save months of time.
1. You Won't Remember it All
I watched hours of videos and took copious notes, and months later there was much I didn't remember. I spent 3 days going
through my notes and making flashcards, so I could review. I didn't need all of that knowledge.
2. Use Flashcards
To solve the problem, I made a little flashcard site where I could add flashcards of 2 types: general and code.
Each type has different formatting. I made a mobile-first website, so I could review on my phone or tablet, wherever I am.
Keep in mind I went overboard and have cards covering everything from assembly language and Python trivia to machine learning and statistics.
It's way too much for what's required.
Note on flashcards: The first time you recognize you know the answer, don't mark it as known. You have to see the
same card and answer it several times correctly before you really know it. Repetition will put that knowledge deeper in
your brain.
An alternative to using my flashcard site is Anki, which has been recommended to me numerous times.
It uses a repetition system to help you remember. It's user-friendly, available on all platforms, and has a cloud sync system.
It costs $25 on iOS but is free on other platforms.
Some students have mentioned formatting issues with white space that can be fixed by doing the following: open the deck, edit the card, click cards, select the "styling" radio button, and add the declaration "white-space: pre;" to the card class.
3. Do Coding Interview Questions While You're Learning
THIS IS VERY IMPORTANT.
Start doing coding interview questions while you're learning data structures and algorithms.
You need to apply what you're learning to solve problems, or you'll forget. I made this mistake.
Once you've learned a topic, and feel somewhat comfortable with it, for example, linked lists:
Later, go back and do another 2 or 3 linked list problems.
Do this with each new topic you learn.
Keep doing problems while you're learning all this stuff, not after.
You're not being hired for knowledge, but how you apply the knowledge.
There are many resources for this, listed below. Keep going.
4. Focus
There are a lot of distractions that can take up valuable time. Focus and concentration are hard. Turn on some music
without lyrics and you'll be able to focus pretty well.
This course goes over a lot of subjects. Each will probably take you a few days, or maybe even a week or more. It depends on your schedule.
Each day, take the next subject in the list, watch some videos about that subject, and then write an implementation
of that data structure or algorithm in the language you chose for this course.
Why you need to practice doing programming problems:
Problem recognition, and where the right data structures and algorithms fit in
Gathering requirements for the problem
Talking your way through the problem like you will in the interview
Coding on a whiteboard or paper, not a computer
Coming up with time and space complexity for your solutions (see Big-O below)
Testing your solutions
There is a great intro for methodical, communicative problem-solving in an interview. You'll get this from the programming
interview books, too, but I found this outstanding:
Algorithm design canvas
Write code on a whiteboard or paper, not a computer. Test with some sample inputs. Then type it and test it out on a computer.
If you don't have a whiteboard at home, pick up a large drawing pad from an art store. You can sit on the couch and practice.
This is my "sofa whiteboard". I added the pen in the photo just for scale. If you use a pen, you'll wish you could erase.
Gets messy quickly. I use a pencil and eraser.
Coding question practice is not about memorizing answers to programming problems.
When you go through "Cracking the Coding Interview", there is a chapter on this, and at the end there is a quiz to see
if you can identify the runtime complexity of different algorithms. It's a super review and test.
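As a small illustration of the kind of runtime analysis that quiz asks for, here is a sketch in Python that compares two ways of checking a list for duplicates. The function names are made up for this example and are not taken from the book.

```python
def has_duplicates_quadratic(items):
    """O(n^2) time, O(1) extra space: compare every pair of elements."""
    n = len(items)
    for i in range(n):
        for j in range(i + 1, n):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_linear(items):
    """O(n) average time, O(n) extra space: remember seen values in a set."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

Both functions return the same answers. The second trades extra memory for speed, which is the kind of trade-off worth saying out loud in an interview.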
Gotcha: you need pointer to pointer knowledge:
(for when you pass a pointer to a function that may change the address where that pointer points)
This page is just to get a grasp on ptr to ptr. I don't recommend this list traversal style. Readability and maintainability suffer due to cleverness.
enqueue(value) - adds value at a position at the tail
dequeue() - returns value and removes least recently added element (front)
empty()
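A minimal sketch of these three operations, assuming a singly linked list that keeps both a head and a tail reference. The class and attribute names are hypothetical, and raising IndexError on an empty queue is one possible design choice.

```python
class _Node:
    """One link in the list: a value plus a reference to the next node."""

    def __init__(self, value):
        self.value = value
        self.next = None


class LinkedQueue:
    """FIFO queue: enqueue at the tail, dequeue at the head, both O(1)."""

    def __init__(self):
        self.head = None  # least recently added element
        self.tail = None  # most recently added element

    def empty(self):
        return self.head is None

    def enqueue(self, value):
        node = _Node(value)
        if self.tail is None:
            # First element: head and tail are the same node.
            self.head = node
        else:
            self.tail.next = node
        self.tail = node

    def dequeue(self):
        if self.head is None:
            raise IndexError("dequeue from empty queue")
        node = self.head
        self.head = node.next
        if self.head is None:
            # The queue is now empty, so the tail must be cleared too.
            self.tail = None
        return node.value
```

Keeping the tail reference is what avoids walking the whole list on every enqueue.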
Implement using a fixed-sized array:
enqueue(value) - adds item at end of available storage
dequeue() - returns value and removes least recently added element
empty()
full()
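One common way to meet this interface is a circular buffer, sketched below in Python. The class name, the internal field names, and the choice to raise exceptions when the queue is full or empty are all assumptions made for illustration.

```python
class ArrayQueue:
    """FIFO queue backed by a fixed-size list used as a circular buffer."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._data = [None] * capacity
        self._read = 0   # index of the least recently added element
        self._size = 0   # number of stored elements

    def empty(self):
        return self._size == 0

    def full(self):
        return self._size == len(self._data)

    def enqueue(self, value):
        if self.full():
            raise OverflowError("queue is full")
        # The write position wraps around the end of the array.
        write = (self._read + self._size) % len(self._data)
        self._data[write] = value
        self._size += 1

    def dequeue(self):
        if self.empty():
            raise IndexError("dequeue from empty queue")
        value = self._data[self._read]
        self._data[self._read] = None  # drop the reference
        self._read = (self._read + 1) % len(self._data)
        self._size -= 1
        return value
```

Tracking a size counter instead of separate read and write indexes makes empty() and full() easy to tell apart without sacrificing a slot.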
Cost:
a bad implementation using a linked list where you enqueue at the head and dequeue at the tail would be O(n)
because you'd need the next to last element, causing a full traversal on each dequeue
enqueue: O(1) (amortized, linked list and array [probing])
You probably won't see any dynamic programming problems in your interview, but it's worth being able to recognize a
problem as being a candidate for dynamic programming.
This subject can be pretty difficult, as each DP-solvable problem must be defined as a recurrence relation, and coming up with it can be tricky.
I suggest looking at many examples of DP problems until you have a solid understanding of the pattern involved.
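To make the idea of a recurrence relation concrete, here is a sketch of one classic example, the minimum-coin change problem, chosen for illustration only. It assumes positive integer coin values, and the function name is hypothetical. The recurrence is best[a] = 1 + min(best[a - c]) over every coin c that is no larger than a, with best[0] = 0.

```python
def min_coins(coins, amount):
    """Fewest coins needed to make `amount`, or -1 if it cannot be made.

    Bottom-up DP over amounts 0..amount, assuming positive integer coins.
    """
    unreachable = float("inf")
    best = [0] + [unreachable] * amount  # best[0] = 0 is the base case
    for a in range(1, amount + 1):
        for c in coins:
            # Using coin c last leaves the smaller subproblem a - c.
            if c <= a and best[a - c] + 1 < best[a]:
                best[a] = best[a - c] + 1
    return best[amount] if best[amount] != unreachable else -1
```

The table is filled from small amounts to large ones, so every subproblem is solved once and then reused. That reuse is the pattern to look for when deciding whether a problem is a DP candidate.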
Know about the most famous classes of NP-complete problems, such as the traveling salesman and the knapsack problem,
and be able to recognize them when an interviewer asks you about them in disguise.
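As a rough illustration of why these problems are considered hard, the sketch below solves a tiny traveling salesman instance by brute force. The function name and the distance-matrix input format are assumptions made for this example. It tries every ordering of the cities, so the work grows as (n-1)! and it is only usable for very small n.

```python
from itertools import permutations


def shortest_tour(dist):
    """Return (length, tour) for the shortest round trip from city 0.

    `dist` is a square matrix where dist[a][b] is the cost from a to b.
    """
    n = len(dist)
    best_length = float("inf")
    best_tour = None
    for order in permutations(range(1, n)):
        tour = (0,) + order + (0,)
        length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if length < best_length:
            best_length, best_tour = length, tour
    return best_length, best_tour
```

A question about visiting every location once at minimum total cost is often this problem in disguise.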
This section will have shorter videos that you can watch pretty quickly to review most of the important concepts.
It's nice if you want a refresher often.
Series of short 2-3 minute subject videos (23 videos)
Note by the author: "This is for a US-focused resume. CVs for India and other countries have different expectations, although many of the points will be the same."
Get hands-on practice with over 100 data structures and algorithm exercises and guidance from a dedicated mentor to help prepare you for interviews and on-the-job scenarios.
Think of about 20 interview questions you'll get, along the lines of the items below. Have at least one answer for each.
Have a story, not just data, about something you accomplished.
Why do you want this job?
What's a tough problem you've solved?
Biggest challenges faced?
Best/worst designs seen?
Ideas for improving an existing product
How do you work best, as an individual and as part of a team?
Which of your skills or experiences would be assets in the role and why?
What did you most enjoy at [job x / project y]?
What was the biggest challenge you faced at [job x / project y]?
What was the hardest bug you faced at [job x / project y]?
What did you learn at [job x / project y]?
What would you have done better at [job x / project y]?
*****************************************************************************************************
*****************************************************************************************************
Everything below this point is optional. It is NOT needed for an entry-level interview.
However, by studying these, you'll get greater exposure to more CS concepts and will be better prepared for
any software engineering job. You'll be a much more well-rounded software engineer.
*****************************************************************************************************
*****************************************************************************************************
Important: Reading this book will only have limited value. This book is a great review of algorithms and data structures, but won't teach you how to write good code. You have to be able to code a decent solution efficiently.
AKA CLR, sometimes CLRS, because Stein was late to the game
You can expect system design questions if you have 4+ years of experience.
Scalability and System Design are very large subjects with many topics and resources, since
there is a lot to consider when designing a software/hardware system that can scale.
Expect to spend quite a bit of time on this
For even more, see the "Mining Massive Datasets" video series in the Video Series section
Practicing the system design process: Here are some ideas to try working through on paper, each with some documentation on how it was handled in the real world:
I added them to help you become a well-rounded software engineer and to be aware of certain
technologies and algorithms, so you'll have a bigger toolbox.
Know at least one type of balanced binary tree (and know how it's implemented):
"Among balanced search trees, AVL and 2/3 trees are now passé and red-black trees seem to be more popular.
A particularly interesting self-organizing data structure is the splay tree, which uses rotations
to move any accessed key to the root." - Skiena
Of these, I chose to implement a splay tree. From what I've read, you won't implement a
balanced search tree in your interview. But I wanted exposure to coding one up
and let's face it, splay trees are the bee's knees. I did read a lot of red-black tree code
Splay tree: insert, search, delete functions
If you end up implementing a red/black tree try just these:
Search and insertion functions, skipping delete
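Rotations are the building block shared by splay trees and red-black trees, so it may help to see one in isolation before reading a full implementation. The sketch below is not a complete balanced tree. It only shows left and right rotations on a plain binary search tree node, with hypothetical class and function names.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def rotate_right(root):
    """Lift root.left above root and return the new subtree root."""
    pivot = root.left
    root.left = pivot.right   # pivot's right subtree moves under root
    pivot.right = root
    return pivot


def rotate_left(root):
    """Mirror image: lift root.right above root."""
    pivot = root.right
    root.right = pivot.left   # pivot's left subtree moves under root
    pivot.left = root
    return pivot


def in_order(node):
    """Keys in sorted order; rotations must never change this sequence."""
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)
```

A rotation changes the shape of the tree but not the in-order sequence of keys, which is why balanced trees can restructure themselves without breaking search.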
I want to learn more about B-Trees since they're used so widely with very large data sets
In practice:
From what I can tell, these aren't used much in practice, but I could see where they would be:
The AVL tree is another structure supporting O(log n) search, insertion, and removal. It is more rigidly
balanced than red–black trees, leading to slower insertion and removal but faster retrieval. This makes it
attractive for data structures that may be built once and loaded without reconstruction, such as language
dictionaries (or program dictionaries, such as the opcodes of an assembler or interpreter)
In practice:
Splay trees are typically used in the implementation of caches, memory allocators, routers, garbage collectors,
data compression, ropes (replacement of string used for long text strings), in Windows NT (in the virtual memory,
networking and file system code) etc
These are a translation of a 2-3 tree (see below).
In practice:
Red–black trees offer worst-case guarantees for insertion time, deletion time, and search time.
Not only does this make them valuable in time-sensitive applications such as real-time applications,
but it makes them valuable building blocks in other data structures that provide worst-case guarantees;
for example, many data structures used in computational geometry can be based on red-black trees, and
the Completely Fair Scheduler used in current Linux kernels uses red–black trees. In version 8 of Java,
the Collection HashMap has been modified such that instead of using a LinkedList to store different elements with colliding
hashcodes, a Red-Black tree is used
In practice:
For every 2-4 tree, there are corresponding red–black trees with data elements in the same order. The insertion and deletion
operations on 2-4 trees are also equivalent to color-flipping and rotations in red–black trees. This makes 2-4 trees an
important tool for understanding the logic behind red-black trees, and this is why many introductory algorithm texts introduce
2-4 trees just before red–black trees, even though 2-4 trees are not often used in practice.
Fun fact: it's a mystery, but the B could stand for Boeing, Balanced, or Bayer (co-inventor).
In Practice:
B-trees are widely used in databases. Most modern filesystems use B-trees (or variants). In
filesystems, the B-tree is used to allow quick random access to an arbitrary
block in a particular file. The basic problem is turning the file block address into a disk block
(or perhaps a cylinder-head-sector) address.
MIT 6.851 - Memory Hierarchy Models (video)
- covers cache-oblivious B-Trees, very interesting data structures
- the first 37 minutes are very technical, and may be skipped (B is block size, cache line size)
I added these to reinforce some ideas already presented above, but didn't want to include them
above because it's just too much. It's easy to overdo it on a subject.
You want to get hired in this century, right?