- Modularize application to functional units.
- Each module
- Handles portion of application's overall functionality
- Represents set of related concerns.
- Why?
- Easier to design both current & future iterations of your application.
- Modules can also be tested & distributed and otherwise verified in isolation.
- Application traffic or load is distributed among various endpoints by using algorithms.
- Allows
- Multiple instances of the website can be created
- They can behave in a predictable manner
- Flexibility to grow or shrink the number of instances in application without changing the expected behavior
- Load balancing strategy considerations
- Physical vs Virtual Load balancers
- Use Virtual Load Balancers (hosted in VMs) if company requires a very specific configuration.
- Load balancing algorithm
- round robin => Selects next instance for each request based on a predetermined order that includes all of the instances.
- random choice
- Configurations
- Affinity/stickiness: If subsequent requests from the same client machine should be routed to the same service instance.
- Required when application has state.
- Physical vs Virtual Load balancers
- Leads to more resilient applications.
- Implemented in .NET libraries (Entity Framework, Azure SDK etc)
- Transient errors=> occur due to temporary interruptions in the service or to excess latency.
- Many are self-healing and can be resolved with retry policy
- Retry policy
- Retry when a temporary failure occurs.
- A break in the circuit => abort retries if it's a serious issue.
- Provides a degree of consistency regardless of the behavior of the modules.
- Direct method invocation
- Connection is severed on transient errors
- Use 3rd party queue to persist the requests beyond a temporary failure.
- Allows you to audit failing requests independently.
- Cloud applications must be sensitive to transient faults.
- E.g. loss of network connectivity, the temporary unavailability of a service, timeouts that arise when a service is busy.
- They're typically self-correcting, if the action that triggered a fault is repeated after a suitable delay, it's likely to be successful.
- DB with too many concurrent requests can have throttling (fails until workload is eased). Fixes itself after some delay.
- Solution: Retry for temporarily fails.
- Remote service => retry after short wait.
- Fails again => Limit attempts to avoid brute forcing retry again until maximum tries are reached.
- to spread requests from multiple instances of the application as evenly as possible.
- Fails again => Limit attempts to avoid brute forcing retry again until maximum tries are reached.
- Remote service => retry after short wait.
- Sudden large number of requests may cause unpredictable workload.
- Single consumer => risk of being flooded, or messaging system being overloaded.
- Solution: asynchronous messaging with variable quantities of message producers and consumers
- Business logic in the application is not blocked while the requests are being processed.
- Handle fluctuating workloads => system can run multiple instances of the consumer service.
- Problem: Cached data consistency
- A strategy is needed to ensure that the data is up-to-date & handle situations where the data in cache has become stale.
- Solution: read-through and write-through caching
- Cache-aside => Effectively loads data into the cache on demand if it's not already available in the cache.
- Not in cache? Fetch & add it to cache, modifications on cache=> write to data store.
- Cache-aside => Effectively loads data into the cache on demand if it's not already available in the cache.
- Problem: hosting large volumes of data in a traditional singe-instance store
- Some limitations
- Storage space: Upgrading disks is not easy.
- Computing resources: It's not possible to always increase more
- Network bandwidth: Network traffic might exceed
- Geography: Reduce latency of data access for different across regions.
- Scaling up can postpone affects but only temporary solution
- Solution: partitioning data horizontally across many nodes
- Divide data store into horizontal partitions, or shards.
- Shard: Same schema but distinct subset of data.
- Sharding can be in data access code, or storage system with transparent sharding
- Abstracting physical location => High level of control over which shard contain which data.
- Easier to migrate between shard without touching application logic.
- Tradeoff => Additional data access overhead to determine the location of each data item as it's retrieved
- Abstracting physical location => High level of control over which shard contain which data.
- For optimal performance & scalability
- Split data in a way that's appropriate for the types of queries the application performs.
- Sharding schema will exactly match requirements of every query.
- E.g.:
- In multi-tenant system => You lookup with tenant id + e.g. tenant's name, Tenant's name = sharding key