System Design Interview and Beyond Note2 - Hardware resources

Published on February 19, 2024

How to achieve certain system qualities with the help of hardware

Regions, availability zones, data centers, racks, servers

An example:
1. A US startup builds a web application and deploys it on a single general-purpose server in a garage. The server accepts HTTP requests from all over the world and stores the data in a database on the same machine.
2. As the app grows and the data and requests increase, a single server is no longer enough (it becomes CPU-bound, memory-bound, and IO-bound). A more powerful machine is pricey, so you buy another machine dedicated to the database.
3. One day your dog accidentally turns off the application server.
4. To make the system more reliable, you lock the servers in a rack.
5. You also buy another application server and a small server to work as a load balancer (it distributes requests among the application servers).
6. You add a second database server to work as a backup and keep it in sync with the main database.
7. One day a power outage happens and all the servers go down.
8. To improve availability, you set up the same set of servers in your friend's garage (at additional cost for load balancing and so on).
9. One day users in India complain that performance is bad, so you build the same infrastructure in India as in the US. Indian users are now served by the local infrastructure.
10. The infrastructure can still be improved: the database is a single point of failure. If the main database dies, the backup database server should somehow take over and continue serving.
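The load balancer from steps 3 to 5 can be sketched roughly as follows. This is a minimal round-robin sketch, not a production design; the class `RoundRobinBalancer` and the server names `app-1` and `app-2` are hypothetical, and health checking is reduced to manual `mark_down` / `mark_up` calls.

```python
class RoundRobinBalancer:
    """Hands out healthy servers in rotation (hypothetical helper)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0  # index of the server to try first on the next pick

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        # Try each server at most once, starting from the rotation pointer.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers")


lb = RoundRobinBalancer(["app-1", "app-2"])
first_two = [lb.pick(), lb.pick()]      # requests alternate between servers
lb.mark_down("app-1")                   # the dog turns off one server
after_failure = [lb.pick(), lb.pick()]  # all traffic goes to the survivor
```

With two application servers, one being switched off no longer takes the whole application down; the balancer itself, however, is still a single machine.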
Servers: the key resources are CPU, memory size, disk size, and network bandwidth. We choose servers based on which resource the workload needs most; this simplifies future scaling and reduces costs.
Server racks: many servers are physically placed in the same rack
1. Servers are easy to reach, examine, and manipulate
2. Simplifies cooling and increases security
3. Each rack has its own network and power source
Sometimes we want servers in different racks to improve availability (a rack failure then takes down only some of them);
other times we place servers in the same rack to reduce the latency between them;
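The "different racks for availability" idea can be illustrated with a small placement sketch. The function `spread_across_racks` and the rack and server names are hypothetical; the sketch simply takes one server per rack in turn, so replicas land on as many distinct racks as possible.

```python
def spread_across_racks(servers_by_rack, replicas):
    """Pick `replicas` servers, using as many different racks as possible."""
    # Copy the lists so the caller's data is not modified.
    pools = {rack: list(servers) for rack, servers in servers_by_rack.items()}
    chosen = []
    while len(chosen) < replicas:
        progressed = False
        # One pass takes at most one server from each rack.
        for rack in sorted(pools):
            if pools[rack] and len(chosen) < replicas:
                chosen.append((rack, pools[rack].pop(0)))
                progressed = True
        if not progressed:
            raise ValueError("not enough servers")
    return chosen


racks = {"rack-a": ["a1", "a2"], "rack-b": ["b1"], "rack-c": ["c1"]}
placement = spread_across_racks(racks, 3)
```

For the opposite goal (low latency), a placement routine would do the reverse and fill one rack before moving to the next.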
Data center: has lots of racks, with independent power, cooling, and security, but it can still become unavailable due to a natural disaster
Availability zone (AZ): one or more discrete data centers
1. Increases availability, as hardware is distributed across multiple data centers;
2. Increases scalability, as there are multiple places to allocate hardware from;
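A rough way to see why several data centers increase availability is the arithmetic below. It assumes that data centers fail independently of each other and that each one is available 99% of the time; both are illustrative assumptions, not figures from the notes.

```python
HOURS_PER_YEAR = 365 * 24


def combined_availability(single, copies):
    """Chance that at least one of `copies` independent sites is up."""
    return 1 - (1 - single) ** copies


def downtime_hours_per_year(availability):
    return (1 - availability) * HOURS_PER_YEAR


one_dc = 0.99  # assumed availability of a single data center
two_dcs = combined_availability(one_dc, 2)
```

Under these assumptions a single data center is down roughly 87.6 hours a year, while two together are down for less than an hour. Real failures are often correlated (shared power grid, same natural disaster), so the actual gain is smaller.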
Region: a group of isolated AZs located in one geographic area (the US and India in the example above). A region has at least 2 AZs, interconnected by a high-bandwidth, low-latency network (less than 2 ms).
Deploying a copy of the system to every region worldwide:
1. Improves performance -> the system is physically close to users;
2. Increases availability -> if one region fails, requests are forwarded to other regions;
3. Improves scalability -> more hardware is available for allocation.
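The first two points can be sketched as a tiny routing rule: send the user to the closest region that is currently healthy. The function `choose_region`, the region names, and the latency numbers are made up for illustration.

```python
def choose_region(latency_ms, healthy):
    """Return the healthy region with the lowest latency for this user."""
    candidates = {r: ms for r, ms in latency_ms.items() if r in healthy}
    if not candidates:
        raise RuntimeError("no healthy region")
    return min(candidates, key=candidates.get)


# Hypothetical latencies measured from a user in India.
latency_from_india = {"us": 220, "india": 15}

normal = choose_region(latency_from_india, healthy={"us", "india"})
failover = choose_region(latency_from_india, healthy={"us"})
```

In normal operation the Indian user is served locally; if the India region is down, the same rule forwards the request to the US at the cost of higher latency.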
Deploying the system to servers in each availability zone:
1. Increases durability -> data is replicated quickly between zones;
2. Within a data center, servers are spread among racks;
3. The server type is chosen based on the workload expected to run
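Choosing a server type by workload can be pictured as "find the resource the workload stresses most". The sketch below is only an illustration: the `SERVER_TYPES` mapping and its type names are hypothetical, and the utilization numbers are invented.

```python
# Hypothetical catalog: bottleneck resource -> server type to buy or rent.
SERVER_TYPES = {
    "cpu": "compute-optimized",
    "memory": "memory-optimized",
    "disk": "storage-optimized",
    "network": "network-optimized",
}


def pick_server_type(utilization):
    """`utilization` maps a resource name to the fraction of it in use."""
    bottleneck = max(utilization, key=utilization.get)
    return SERVER_TYPES[bottleneck]


# An in-memory cache, for example, is mostly limited by memory.
cache_workload = {"cpu": 0.30, "memory": 0.90, "disk": 0.20, "network": 0.40}
choice = pick_server_type(cache_workload)
```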

Physical servers, virtual machines, containers, serverless

Physical machine
- Pros:
  - Gives complete control of the software stack and hardware;
  - Powerful CPU (processing power);
  - Provides isolation between tenants (dedicated server)
  - Eliminates the noisy neighbor phenomenon
  - High security
- Cons
  - Expensive
  - Hard to manage
  - Hard to scale
  - Hard to port
  - Slow to provision and boot
Physical machines are good when we need highly productive hardware to run the application (data-intensive workloads), i.e. when we have high demands on the hardware and need control over it,
or when we need bare-metal (dedicated) servers due to high security, regulation, or compliance requirements

Virtual machines
- Pros
  - Cheaper than physical machines
  - Easy to maintain, scale and port
  - Faster to provision and boot
- Cons
  - Exposed to the noisy neighbor problem due to shared hardware
  - Less secure due to potential vulnerabilities in the hypervisor
Hypervisor: a virtual machine monitor that creates and runs VMs
There are many types of machines on the market, and we can create different types of VMs according to our needs, so VMs can also handle heavy workloads.
Containers: a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, libraries, and settings.
- Pros
  - Lightweight and require less hardware resources
  - More scalable and portable (a single server can host more containers)
  - Easier to deploy and maintain at scale
  - Faster to start
- Cons
  - Less secure (malicious code can access the host system)
  - Less flexible when multiple operating systems are needed
Containers are good for any workload.
Serverless (cloud): engineers only write code and deploy it to the cloud (some dependencies are even built in)
- Pros:
  - No servers to manage, provision, upgrade ...
  - Pay as you go;
  - Automated scaling based on the load, without the do-it-yourself hassle (fast iteration);
- Cons
  - Limitations (cold start, invocation duration, memory size)
  - Expensive when scaling
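The tension between "pay as you go" and "expensive when scaling" can be seen with a back-of-the-envelope break-even calculation. The prices below are invented for illustration and the model is deliberately simplified (a flat price per million requests, ignoring duration and memory charges); real provider pricing differs.

```python
def monthly_serverless_cost(requests, price_per_million):
    """Cost of serving `requests` in a month at a flat per-request price."""
    return requests / 1_000_000 * price_per_million


def break_even_requests(server_monthly_cost, price_per_million):
    """Monthly request count at which serverless costs as much as a server."""
    return server_monthly_cost / price_per_million * 1_000_000


PRICE_PER_MILLION = 5.0   # made-up serverless price, in dollars
SERVER_MONTHLY = 100.0    # made-up cost of one always-on server, in dollars

small_app = monthly_serverless_cost(1_000_000, PRICE_PER_MILLION)
threshold = break_even_requests(SERVER_MONTHLY, PRICE_PER_MILLION)
```

With these made-up numbers, a small app pays a few dollars a month, but past about 20 million requests a month the always-on server becomes the cheaper option.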
Good for (unpredictable load, where autoscaling helps):
1. Handling API requests
2. Real-time file/stream processing
3. Small tasks performed in response to an event
  1. File upload to S3
  2. Message to a queue
  3. Request to an endpoint ...
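As an illustration of the "file upload to S3" case, here is a minimal sketch of a function in the style of an AWS Lambda Python handler. The `Records[...]["s3"]` layout follows the S3 event notification format; the bucket name, object key, and the "processing" itself (just collecting the paths) are placeholders.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Runs once per event; there is no server for us to manage."""
    uploaded = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 events.
        key = unquote_plus(record["s3"]["object"]["key"])
        uploaded.append(f"{bucket}/{key}")
    return {"processed": uploaded}


# Local dry run with a hand-written event (no cloud account needed).
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "photos"},
                "object": {"key": "cats/my+cat.png"}}}
    ]
}
result = handler(sample_event, None)
```

The platform decides when and where this function runs and how many copies run in parallel, which is where both the autoscaling benefit and the cold-start limitation come from.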