## THIS IS STILL WIP
## Background
In this project, I investigate the performance characteristics of running web applications in a distributed manner. The core issue is that physical distance introduces latency: in this context, the time taken to transmit data to or receive data from a web server. That web server, in many cases, lives in North America, and more specifically in the USA. This poses a problem for users on the other side of the world. For instance, a user in New Zealand will have a significantly worse experience than a user in Canada when accessing services hosted in common locations such as US-WEST-1 on AWS. As a concrete example, it can take nearly a second to load my [personal website](https://atri.dad) from western Canada, since it is hosted in Germany. This inequality of the web is what I set out to investigate. Globally distributed systems are not new in practice; the goal here was to thoroughly test the performance characteristics of such a system as you scale out each of its moving pieces.
## Methodology
Before proceeding, we need a model of a "traditional" web application. The architecture I propose here is common and quite simple. There will be three layers:
1. A web server to process requests
2. A database to persist data
3. A cache to avoid repeated trips to the database
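To make this model concrete, here is a minimal sketch of how a single request might flow through the three layers, using the cache-aside pattern. This is not the code used in the experiment: the handler, the `items` table, and the Redis and SQLite wiring are all illustrative assumptions.

```go
package main

import (
	"database/sql"
	"errors"
	"net/http"
	"time"

	_ "github.com/mattn/go-sqlite3"
	"github.com/redis/go-redis/v9"
)

var (
	db    *sql.DB       // layer 2: the database
	cache *redis.Client // layer 3: the cache
)

// getItem tries the cache first, falls back to the database on a miss,
// and backfills the cache so the next read is cheap (cache-aside).
func getItem(w http.ResponseWriter, r *http.Request) {
	ctx := r.Context()
	id := r.URL.Query().Get("id")

	// Cache hit: respond without touching the database at all.
	if val, err := cache.Get(ctx, id).Result(); err == nil {
		w.Write([]byte(val))
		return
	} else if !errors.Is(err, redis.Nil) {
		http.Error(w, "cache error", http.StatusInternalServerError)
		return
	}

	// Cache miss: read from the database...
	var val string
	err := db.QueryRowContext(ctx, "SELECT body FROM items WHERE id = ?", id).Scan(&val)
	if err != nil {
		http.Error(w, "not found", http.StatusNotFound)
		return
	}

	// ...then backfill the cache with a short TTL.
	cache.Set(ctx, id, val, 5*time.Minute)
	w.Write([]byte(val))
}

func main() {
	var err error
	db, err = sql.Open("sqlite3", "app.db") // layer 2
	if err != nil {
		panic(err)
	}
	cache = redis.NewClient(&redis.Options{Addr: "localhost:6379"}) // layer 3

	// Layer 1: the web server that processes requests.
	http.HandleFunc("/item", getItem)
	http.ListenAndServe(":8080", nil)
}
```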
The strategy I chose was one of incremental distribution. Instead of scaling everything at once, I scaled the application out one tier at a time and measured the impact at each step.
These tests were done with a load testing tool that runs a pattern of 1 POST followed by 5 GET requests. The tool, called Loadr, is one I built myself, and it can be found [here](https://git.atri.dad/atridad/loadr). Loadr targeted 50 requests per second and ran for up to 10,000 requests per test. All tests were run from microVMs acting as "clients" in the same three regions.
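For context, here is a rough sketch of that request pattern. This is not Loadr's actual implementation; the endpoint paths, payload, and pacing strategy are illustrative assumptions.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"time"
)

const (
	targetRPS     = 50    // target request rate
	totalRequests = 10000 // upper bound per test
	base          = "http://localhost:8080"
)

// timedRequest fires one request and reports its client-side latency.
func timedRequest(method, url string, body []byte) {
	start := time.Now()
	var resp *http.Response
	var err error
	if method == http.MethodPost {
		resp, err = http.Post(url, "application/json", bytes.NewReader(body))
	} else {
		resp, err = http.Get(url)
	}
	if err != nil {
		fmt.Printf("%s %s error: %v\n", method, url, err)
		return
	}
	resp.Body.Close()
	fmt.Printf("%s %s %v\n", method, url, time.Since(start))
}

func main() {
	// Pace requests so that roughly targetRPS of them leave per second.
	tick := time.NewTicker(time.Second / targetRPS)
	defer tick.Stop()

	for sent := 0; sent < totalRequests; {
		// One POST (a write, which must reach the primary region)...
		<-tick.C
		timedRequest(http.MethodPost, base+"/item", []byte(`{"body":"hello"}`))
		sent++

		// ...followed by five GETs (reads, servable by a nearby replica).
		for i := 0; i < 5 && sent < totalRequests; i++ {
			<-tick.C
			timedRequest(http.MethodGet, base+"/item?id=1", nil)
			sent++
		}
	}
}
```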
## Results
There are more detailed images in this repository, but here are the final results for P50, P90, and P99 latencies from the client's perspective:


There are a few interesting things here. One: Singapore performed poorly in the second test (scaled app server with centralized DB and cache), which is expected, as any request that requires dynamic data incurs a large round trip between Singapore and Chicago. This is significantly worse on a cache miss, since you then end up with three round trips between Singapore and Chicago per request. Another item of note: the single-region test did not have the lowest latency numbers. That honour goes to the full-scale tests. The less intuitive part is that while scaling every tier of the application produced the best minimum latencies, the best average latencies belong to the single-region tests. Looking at the detailed measurements, the lowest values tend to come from GET requests. This is because both Turso (DB) and Upstash (cache) forward all writes to the primary region in Chicago, while reads are served from the closest replica. This is a common strategy for maintaining data consistency, but it makes writes expensive, which hurts overall performance even in a read-heavy scenario like this one.
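To put rough numbers on the cache-miss penalty: assuming, purely for illustration, a Singapore to Chicago round-trip time of about 200 ms (an assumption, not a measurement from these tests), a cache miss costs one round trip for the failed cache lookup, one for the database read, and one for the cache backfill. That is roughly 3 × 200 ms = 600 ms of added latency on a single request before any other processing happens.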
This mirrors my assumptions going into the experiment: for highly dynamic applications, single-region deployments will, on average, be more effective than multi-region deployments. One strategy often employed to get the best of both worlds is to deliver all static content on the page via content delivery networks (CDNs), while dynamic content is served from the single origin server. Another solution is to scale all three components without replication, giving users multiple independent regions to choose from. This only works if there is no requirement for collaboration across regions.
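As a minimal sketch of the CDN strategy (the routes, directory, and max-age value below are illustrative assumptions, not part of the experiment): static responses are marked as cacheable by shared caches such as CDN edges, while dynamic responses opt out and always hit the origin.

```go
package main

import "net/http"

// withCache marks responses as cacheable by shared caches (CDNs) for a day.
func withCache(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Cache-Control", "public, max-age=86400")
		next.ServeHTTP(w, r)
	})
}

func main() {
	// Static assets: long-lived, publicly cacheable. A CDN edge node can
	// serve these without ever contacting the origin server.
	static := http.FileServer(http.Dir("./public"))
	http.Handle("/static/", http.StripPrefix("/static/", withCache(static)))

	// Dynamic content: explicitly uncacheable, always served by the origin.
	http.HandleFunc("/api/feed", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Cache-Control", "no-store")
		w.Write([]byte(`{"items":[]}`))
	})

	http.ListenAndServe(":8080", nil)
}
```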
## Limitations
There are a number of limitations I do not account for in my research. In no particular order:
1. Anycast DNS routing overhead
2. Shared CPU noisy neighbour interference
3. Only one client sending requests at a time
4. Limitations of Turso and Upstash scaling (primary vs replica node behaviour)
5. Only a single workload, due to time constraints: Request -> Cache Miss -> DB Access -> Response, or Request -> Cache Hit -> Response
The shared-CPU issue could have had a real impact on performance. I carefully monitored my instance and did not notice anything out of the ordinary during testing, but it is still worth noting as a risk. More importantly, time constraints limited how extensive the tests could be. Notably, the use of a single workload and of third-party scaling services means the results are not representative of all possible workloads and architectures for international scaling. All of these limitations provide important context for interpreting the results.
## Conclusion
While the results imply that it is almost always better to deploy to a single region, it is important to take the context of the results into account. The way I chose to scale is representative of startups using off-the-shelf tools to scale without concerning themselves with the complexities of distributed infrastructure. In that context, I stand by the results. Other solutions include using CDNs to speed up the delivery of static content, which can often be enough to make a web application _feel_ significantly faster. More aggressive caching schemes, local-first writes, and fully detached regions can all be valid strategies depending on your needs. All of this also ignores the cost of running this sort of infrastructure, which is non-trivial for high-throughput applications. Ultimately, this research showed that globally distributed computing is a complex field. Choosing distributed infrastructure is not a guaranteed win; it requires careful consideration of your common workloads, your users' distribution, and the overall requirements of your application.