How to achieve certain system qualities with the help of hardware
Regions, availability zones, data centers, racks, servers
- An example:
- A US startup builds a web application and deploys it on a single general-purpose server in a garage. The server accepts HTTP requests from all over the world and stores the data in a database on the same machine.
- As the app grows and the data and request volumes increase, a single server is no longer enough (it becomes CPU-bound, memory-bound, and I/O-bound). A more powerful machine is pricey, so you buy a second machine dedicated to the database.
- One day your dog accidentally turns off the application server.
- To make the system more reliable, you lock the servers in a rack.
- You also buy another application server and a small server that works as a load balancer (it distributes requests among the application servers).
- You add a second database server that works as a backup and stays in sync with the main database.
- One day a power outage takes down every server.
- To improve availability, you set up an identical set of servers in your friend's garage (at additional cost for load balancing and so on).
- One day users in India complain that performance is bad, so you build the same infrastructure in India as in the US. Users in India are now served by the local infrastructure.
- The infrastructure can still be improved: the database is a single point of failure. If the main database dies, the backup database server should take over and continue serving.
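The load balancer in the story above can be sketched as a round-robin picker that skips servers known to be down. This is only a minimal illustration: the class name, method names, and server labels are hypothetical, and a real load balancer would also run health checks itself instead of being told which servers are down.

```python
from itertools import count


class RoundRobinBalancer:
    """Hands out application servers in turn, skipping unhealthy ones."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._counter = count()  # ever-increasing request counter

    def mark_down(self, server):
        # Called when a server stops responding (for example, the dog incident).
        self.healthy.discard(server)

    def mark_up(self, server):
        # Only servers that were registered at construction can come back.
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        # Try each registered server at most once per call.
        for _ in range(len(self.servers)):
            candidate = self.servers[next(self._counter) % len(self.servers)]
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy application servers")
```

With two servers registered, calls alternate between them; after `mark_down` on one, every call returns the other, and once both are down the balancer raises an error instead of routing to a dead machine.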
- Servers: characterized by CPU, memory size, disk size, and network bandwidth. We choose servers based on which resources the workload needs most; this simplifies future scaling and reduces costs.
- Server racks: many servers are physically placed in the same rack.
- Makes servers easy to reach, examine, and manipulate
- Simplifies cooling and increases security
- Has its own network and power source
- Sometimes we place servers in different racks to improve availability;
- We can place servers in the same rack to reduce latency between them.
- Data center: houses many racks with independent power, cooling, and security, but can still become unavailable due to a natural disaster.
- Availability zone (AZ): one or more discrete data centers grouped together
- Increases availability, as hardware is distributed across multiple data centers;
- Increases scalability, as there are multiple places to allocate hardware from;
- Region: a group of isolated AZs geographically located in one area (the US and India in the example above). A region has at least 2 AZs, interconnected by a high-bandwidth, low-latency network (less than 2 ms).
- Deploy a copy of the system to every region worldwide:
- Improves performance -> the system is physically close to users;
- Increases availability -> requests can be forwarded to other regions;
- Improves scalability -> more hardware is available for allocation.
- Deploy the system to servers in each availability zone:
- Increases durability -> data is replicated quickly between zones;
- Within a data center, servers are deployed across racks;
- The server type is chosen based on the workload expected to run.
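The idea of spreading a system across availability zones and racks can be illustrated with a small placement sketch. It greedily picks servers so that replicas share as few zones, and then as few racks, as possible. The `Server` tuple, the `place_replicas` function, and all zone and rack names are hypothetical; real schedulers weigh many more factors (capacity, latency, cost).

```python
from collections import namedtuple

# Hypothetical description of a server's physical location.
Server = namedtuple("Server", ["name", "zone", "rack"])


def place_replicas(servers, replica_count):
    """Greedily choose servers so replicas overlap as little as possible."""
    chosen = []
    remaining = list(servers)
    for _ in range(min(replica_count, len(remaining))):
        used_zones = [s.zone for s in chosen]
        used_racks = [(s.zone, s.rack) for s in chosen]
        # Prefer the least-used zone first, then the least-used rack in it.
        best = min(
            remaining,
            key=lambda s: (
                used_zones.count(s.zone),
                used_racks.count((s.zone, s.rack)),
            ),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

Given three servers in one zone (two of them in the same rack) and one server in a second zone, asking for three replicas uses both zones and avoids putting two replicas in the same rack.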
Physical servers, virtual machines, containers, serverless
Physical machine
- Pros:
- Gives complete control of the software stack and hardware;
- Powerful CPU (processing power);
- Provides isolation between tenants (dedicated server)
- Eliminates the noisy-neighbor phenomenon
- High security
- Cons
- Expensive
- Hard to manage
- Hard to scale
- Hard to port
- Slow to provision and boot
Physical machines are a good fit when we need high-performance hardware to run the application (data-intensive workloads), that is, when we have high demands on the hardware and need control over it
- Or when we need bare metal (dedicated servers) due to strict security, regulatory, or compliance requirements
Virtual machines
- Pros
- Cheaper than physical machines
- Easy to maintain, scale and port
- Faster to provision and boot
- Cons
- Exposed to the noisy-neighbor problem due to shared hardware
- Less secure due to potential vulnerabilities in the hypervisor
Hypervisor (virtual machine monitor): software that creates and runs VMs
There are many types of machines on the market, and we can create different types of VMs according to our needs, so VMs can also handle heavy workloads.
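Choosing a VM type according to needs can be sketched as picking the cheapest type that satisfies the workload's resource requirements. The catalog below is entirely hypothetical (names, sizes, and prices are made up for illustration), and the sketch considers only CPU and memory, although disk and network bandwidth matter too.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmType:
    name: str
    vcpus: int
    memory_gib: int
    hourly_cost: float


# Hypothetical catalog; real offerings and prices differ per provider.
CATALOG = [
    VmType("small-general", vcpus=2, memory_gib=8, hourly_cost=0.10),
    VmType("compute-heavy", vcpus=16, memory_gib=32, hourly_cost=0.70),
    VmType("memory-heavy", vcpus=8, memory_gib=128, hourly_cost=0.90),
]


def cheapest_fit(catalog, vcpus_needed, memory_gib_needed):
    """Return the cheapest VM type that meets both requirements, or None."""
    candidates = [
        t
        for t in catalog
        if t.vcpus >= vcpus_needed and t.memory_gib >= memory_gib_needed
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda t: t.hourly_cost)
```

A memory-bound workload needing 4 vCPUs and 64 GiB lands on the memory-oriented type, while a CPU-bound one needing 12 vCPUs and 16 GiB lands on the compute-oriented type; a request that nothing in the catalog satisfies returns `None`.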
Containers: lightweight, standalone, executable packages of software that include everything needed to run an application: code, runtime, system tools, libraries, and settings.
- Pros
- Lightweight and require fewer hardware resources
- More scalable and portable (a single server can host more containers)
- Easier to deploy and maintain at scale
- Faster to start
- Cons
- Less secure (malicious code in a container may gain access to the host system)
- Less flexible when multiple operating systems are needed (containers share the host OS kernel)
Containers are a good fit for almost any workload.
Serverless (cloud): engineers only write code and deploy it to the cloud (some dependencies are even built in)
- Pros:
- No servers to manage, provision, or upgrade ...
- Pay as you go;
- Automated scaling based on load, without the do-it-yourself hassle (fast iteration);
- Cons
- Limitations (cold start, invocation duration, memory size)
- Can become expensive at scale
Good for unpredictable load (where autoscaling helps):
- Handling API requests
- Real-time file/stream processing
- Small tasks performed in response to an event:
- A file upload to S3
- A message on a queue
- A request to an endpoint ...
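A small task triggered by a file upload could look roughly like the function below. It is modeled on the common `handler(event, context)` convention for Python functions on AWS Lambda and on the usual layout of S3 upload notifications (`Records` -> `s3` -> `bucket` / `object`). Treat the field names as assumptions to verify against the provider's documentation; the returned summary format is purely illustrative.

```python
def handler(event, context):
    """Collect the bucket and key of every uploaded object in the event."""
    uploaded = []
    for record in event.get("Records", []):
        s3_info = record.get("s3")
        if s3_info is None:
            # Not an S3 record (assumed layout); skip it.
            continue
        uploaded.append(
            {
                "bucket": s3_info["bucket"]["name"],
                "key": s3_info["object"]["key"],
            }
        )
    # A real function would process each file here (resize, parse, index ...).
    return {"processed": len(uploaded), "objects": uploaded}
```

The platform invokes the function once per event and scales the number of concurrent invocations with the load, so the code itself contains no server or scaling logic.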