System Design general concepts
A. Requirement specification:
This is one of the most important parts: it is where you clarify which features need to be implemented. Ask the interviewer plenty of questions about what “exactly” needs to be built. List some candidate features and find out which “exact” features the interviewer wants you to work on.
B. Capacity estimation: Figure out how many requests per second your system will have to handle.
This should be the request rate your system needs to handle at peak load.
Usually this number is derived in the following way.
1. Figure out (ask the interviewer) how many active users per month your system will have, e.g. 500M active users per month.
2. From the number in step 1, figure out how many active users you will have in a day. Let's say 70% of the users do some activity daily, so that is 500M * 0.7 = 350M daily users.
3. Figure out the load during a peak hour. Let's say that out of the 350M users, 70% are clustered in the same geographic region (time zone), so most of them will log in at roughly the same time (say, within the same hour). Hence you have 350M * 0.7 = 245M, or roughly 250M, users in the peak hour.
4. From the above, assuming each of those users makes one request during that hour, the per-second request load is
(250 * 1,000,000) / 3600 ≈ 69K requests/sec
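The steps above can be sketched as a small calculation. This is only a back-of-the-envelope helper; the function name and the `requests_per_user` parameter (one request per user in the peak hour) are my assumptions, not something the interviewer gives you. It uses the unrounded 245M peak-hour users, so it lands at roughly 68K instead of 69K requests/sec.

```python
def peak_requests_per_second(monthly_active_users, daily_active_ratio,
                             peak_hour_ratio, requests_per_user=1):
    """Rough peak request rate, following steps 1 to 4 above."""
    # Step 2: users active on a given day.
    daily_users = monthly_active_users * daily_active_ratio
    # Step 3: users clustered into the same peak hour.
    peak_hour_users = daily_users * peak_hour_ratio
    # Step 4: spread the peak-hour requests over 3600 seconds.
    return peak_hour_users * requests_per_user / 3600


# 500M monthly users, 70% active daily, 70% of those in the peak hour.
peak_rps = peak_requests_per_second(500_000_000, 0.7, 0.7)
print(f"{peak_rps:,.0f} requests/sec")  # roughly 68K
```

Changing `requests_per_user` is the quickest way to see how sensitive the estimate is to how chatty each user is.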
How many servers do you need? Let's say one server can handle 1K requests per second. This 1K figure is a modest goal for a single server, and many servers today can handle up to 10K requests per second (of course this depends heavily on the implementation and the service type: a video streaming server, for example, can handle fewer connections than a text-only service).
Hence we need about 70 servers to handle our peak load.
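The server count is just the peak load divided by the per-server capacity, rounded up. A minimal sketch (the function name is hypothetical, and it adds no headroom for failures or traffic spikes, which you would normally want on top):

```python
import math


def servers_needed(peak_rps, rps_per_server):
    """Smallest number of servers whose combined capacity covers peak_rps."""
    return math.ceil(peak_rps / rps_per_server)


# 250M requests spread over the peak hour, as estimated earlier.
peak_rps = 250_000_000 / 3600

modest = servers_needed(peak_rps, 1_000)       # modest 1K req/sec per server
optimistic = servers_needed(peak_rps, 10_000)  # optimistic 10K req/sec per server
print(modest, optimistic)
```

With the modest 1K figure this gives the 70 servers mentioned above. With 10K per server it drops to 7.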
Can we estimate the network bandwidth capacity? Of course, we need to assume a per-request size!
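As a rough sketch of the bandwidth estimate: multiply the peak request rate by the bytes moved per request. The 10 KB per request below is purely an assumed number for illustration; the real value depends on the service, so ask the interviewer.

```python
def peak_bandwidth_gbps(requests_per_second, bytes_per_request):
    """Peak bandwidth in gigabits per second (1 Gbps = 1e9 bits/sec)."""
    bits_per_second = requests_per_second * bytes_per_request * 8
    return bits_per_second / 1e9


# Assumption: about 69K requests/sec at peak, 10 KB (10,000 bytes) per request.
bandwidth = peak_bandwidth_gbps(69_000, 10_000)
print(f"{bandwidth:.2f} Gbps")
```

Under these assumptions the system needs about 5.5 Gbps at peak.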
Can we estimate the database capacity needed for 1 year?
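A minimal sketch of the one-year storage estimate. Every input here is an assumption made up for illustration: one write per daily user, 1 KB per stored record, and an optional replication factor. Indexes and metadata overhead are ignored.

```python
def yearly_storage_tb(writes_per_day, bytes_per_write, days=365,
                      replication_factor=1):
    """Raw storage needed for one year, in terabytes (1 TB = 1e12 bytes)."""
    total_bytes = writes_per_day * bytes_per_write * days * replication_factor
    return total_bytes / 1e12


# Assumption: 350M daily users, one 1 KB write per user per day.
storage = yearly_storage_tb(350_000_000, 1_000)
# Same data kept in three copies.
replicated = yearly_storage_tb(350_000_000, 1_000, replication_factor=3)
print(f"{storage:.2f} TB, {replicated:.2f} TB with 3 replicas")
```

That works out to roughly 128 TB of raw data per year, or about 383 TB if every record is stored three times.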
Is our service read-heavy or write-heavy? In other words, what is the read-to-write request ratio (10:1, 1:1, etc.)? Ask the interviewer if you are in doubt!
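Once you have a ratio, you can split the peak load into read and write traffic, which matters because the two are usually scaled differently (caches and replicas for reads, partitioning for writes). A small sketch, with a hypothetical function name and the roughly 69K requests/sec peak taken as given:

```python
def split_read_write(total_rps, read_parts, write_parts):
    """Split a total request rate by a read:write ratio such as 10:1."""
    total_parts = read_parts + write_parts
    reads = total_rps * read_parts / total_parts
    writes = total_rps * write_parts / total_parts
    return reads, writes


# Read-heavy service: 10 reads for every write.
reads, writes = split_read_write(69_000, 10, 1)
print(f"{reads:,.0f} reads/sec, {writes:,.0f} writes/sec")
```

At 10:1 that is about 62.7K reads/sec and 6.3K writes/sec. At 1:1 it would be 34.5K of each.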
The following video is a much more elaborate description (and rant) of what I discussed in brief above.
C. Generate some basic data model
Get comfortable with data modelling (SQL and NoSQL). Know which to choose when.
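To make the SQL vs. NoSQL contrast concrete, here is a minimal sketch that models the same data both ways. The `users`/`posts` schema is entirely hypothetical. The relational side uses Python's built-in `sqlite3` with an in-memory database, and the document side is just a nested dictionary serialized as JSON, standing in for what a document store would hold.

```python
import json
import sqlite3

# Relational (SQL) model: normalized tables linked by a foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        body TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO posts (user_id, body) VALUES (1, 'hello')")

# Reading a user's posts requires a join across the two tables.
rows = conn.execute(
    "SELECT users.name, posts.body "
    "FROM posts JOIN users ON users.id = posts.user_id"
).fetchall()

# Document (NoSQL-style) model: denormalized, posts embedded in the user.
user_doc = {"_id": 1, "name": "alice", "posts": [{"body": "hello"}]}
serialized = json.dumps(user_doc)

print(rows)
print(serialized)
```

The relational version keeps one copy of each fact and pays for it with joins. The document version reads everything in one lookup but duplicates data if posts need to appear elsewhere.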