Load Balancing *

Build a simple HTTP/1.1 system with multiple worker servers behind a proxy that load-balances incoming requests. Each worker listens on a configurable port, responds with 200 OK on /health, and performs simulated CPU work on /work by generating a random integer N and computing the sum of 1..N. Package the workers in Docker. Run a separate proxy that accepts external traffic and forwards it to the workers using a load balancing policy.

HTTP API

  • GET /health returns 200 OK.
  • GET /work generates a random integer N, computes the sum of 1..N, and returns 200 OK.
  • Any other path returns 404 Not Found.

You can read about how a GET request is structured at https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Methods/GET.
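For reference, a minimal request/response exchange with a worker might look like the following (the response body is illustrative; the spec does not fix what /work returns beyond 200 OK):

    GET /work HTTP/1.1
    Host: localhost:8081

    HTTP/1.1 200 OK
    Content-Length: 14

    50000005000000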

Implementation Guidelines

  1. Worker server
    • Accept valid HTTP/1.1 GET requests.
    • Implement /health and /work as specified. Ensure /work blocks the request until the sum completes to simulate CPU load.
    • Requests should be handled in parallel.
    • Limit each worker server to 3 threads, so that every server still leaves some resources available to the others on the same machine. A minimal worker sketch follows this list.
  2. Proxy load balancer
    • Listens on PROXY_PORT. For /health, return 200 if at least one backend is healthy, or 503 if none are. For /work, forward the request to a backend server (see the proxy sketch after this list).
  3. Balancing policies
    • round_robin: rotate through the backend list.
    • least_conn: choose the backend with the fewest in-flight requests.
    • random: uniform choice on each request.
  4. Health checks: poll each backend's /health every second to track which workers are available.
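
Here is a minimal worker sketch, assuming Python 3 with only the standard library; the WORKER_PORT variable name and the range of N are illustrative choices, not fixed by the spec:

    import os
    import random
    from concurrent.futures import ThreadPoolExecutor
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class PoolHTTPServer(HTTPServer):
        """HTTPServer variant that serves connections on a pool of at most 3 threads."""

        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self.pool = ThreadPoolExecutor(max_workers=3)  # the 3-thread limit

        def process_request(self, request, client_address):
            # Hand the connection to the pool instead of handling it inline,
            # so up to 3 requests are processed in parallel.
            self.pool.submit(self._handle, request, client_address)

        def _handle(self, request, client_address):
            try:
                self.finish_request(request, client_address)
            finally:
                self.shutdown_request(request)

    class WorkerHandler(BaseHTTPRequestHandler):
        protocol_version = "HTTP/1.1"

        def do_GET(self):
            if self.path == "/health":
                self._reply(200, b"OK")
            elif self.path == "/work":
                # Blocks until the sum completes, simulating CPU load.
                # The range of N here is an arbitrary illustrative choice.
                n = random.randint(1, 10_000_000)
                total = sum(range(1, n + 1))
                self._reply(200, str(total).encode())
            else:
                self._reply(404, b"Not Found")

        def _reply(self, status, body):
            self.send_response(status)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        # WORKER_PORT is an assumed variable name; any configuration mechanism works.
        port = int(os.environ.get("WORKER_PORT", "8081"))
        PoolHTTPServer(("0.0.0.0", port), WorkerHandler).serve_forever()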
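A matching proxy sketch covering points 2-4, under the same assumptions; BACKENDS and POLICY are hypothetical environment variable names for the backend list and the balancing policy:

    import itertools
    import os
    import random
    import threading
    import time
    import urllib.error
    import urllib.request
    from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

    BACKENDS = os.environ.get("BACKENDS", "localhost:8081,localhost:8082").split(",")
    POLICY = os.environ.get("POLICY", "round_robin")

    healthy = set()                       # backends that answered the last health poll
    in_flight = {b: 0 for b in BACKENDS}  # per-backend in-flight request counts
    lock = threading.Lock()
    rr = itertools.cycle(BACKENDS)

    def health_loop():
        # Poll every backend's /health once per second.
        while True:
            for b in BACKENDS:
                try:
                    with urllib.request.urlopen(f"http://{b}/health", timeout=1):
                        healthy.add(b)
                except OSError:
                    healthy.discard(b)
            time.sleep(1)

    def pick_backend():
        with lock:
            live = [b for b in BACKENDS if b in healthy]
            if not live:
                return None
            if POLICY == "least_conn":
                return min(live, key=lambda b: in_flight[b])
            if POLICY == "random":
                return random.choice(live)
            # round_robin: advance the cycle until a healthy backend turns up.
            while True:
                b = next(rr)
                if b in live:
                    return b

    class ProxyHandler(BaseHTTPRequestHandler):
        protocol_version = "HTTP/1.1"

        def do_GET(self):
            if self.path == "/health":
                self._reply(200 if healthy else 503, b"")
                return
            backend = pick_backend()
            if backend is None:
                self._reply(503, b"no healthy backends")
                return
            with lock:
                in_flight[backend] += 1
            try:
                with urllib.request.urlopen(f"http://{backend}{self.path}") as resp:
                    self._reply(resp.status, resp.read())
            except urllib.error.HTTPError as e:
                self._reply(e.code, e.read())
            except OSError:
                self._reply(502, b"backend unreachable")
            finally:
                with lock:
                    in_flight[backend] -= 1

        def _reply(self, status, body):
            self.send_response(status)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        threading.Thread(target=health_loop, daemon=True).start()
        port = int(os.environ.get("PROXY_PORT", "8080"))
        ThreadingHTTPServer(("0.0.0.0", port), ProxyHandler).serve_forever()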

Testing

  • Generate 1k requests one by one (sequentially) and record the request throughput. Then generate 2k requests with a concurrency of 2 (two clients generating requests at the same time, each sending 1k). Then scale the concurrency to 4, 6, 8, 10, ... (as far as your machine allows, stopping once a single test takes more than 10 minutes to complete).
  • Scale the number of workers through 1, 2, 4, 6, 8, 10 and measure the request throughput at each step, repeating the multi-client runs from the bullet above. A small load-generation sketch follows this list.
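
A simple load-generation sketch, assuming the proxy is reachable at http://localhost:8080; the URL, request counts, and concurrency levels are placeholders to adjust:

    import time
    import urllib.request
    from concurrent.futures import ThreadPoolExecutor

    URL = "http://localhost:8080/work"  # assumed proxy address

    def run_client(n):
        # One client sending n requests back to back.
        for _ in range(n):
            urllib.request.urlopen(URL).read()

    def measure(clients, requests_per_client):
        start = time.monotonic()
        with ThreadPoolExecutor(max_workers=clients) as pool:
            futures = [pool.submit(run_client, requests_per_client)
                       for _ in range(clients)]
            for f in futures:
                f.result()  # propagate any errors from the clients
        elapsed = time.monotonic() - start
        total = clients * requests_per_client
        print(f"{clients} client(s): {total / elapsed:.1f} req/s")

    if __name__ == "__main__":
        for c in (1, 2, 4, 6, 8, 10):
            measure(c, 1000)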

Since the networking part of the project is inherently concurrent, and the system as a whole is distributed, a working implementation covers all three parts. For running multiple instances, Docker is recommended (a compose sketch follows).
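
One possible docker-compose layout, assuming a single image that contains worker.py and proxy.py and the environment variable names from the sketches above:

    services:
      worker1:
        build: .
        command: python worker.py
        environment:
          - WORKER_PORT=8081
      worker2:
        build: .
        command: python worker.py
        environment:
          - WORKER_PORT=8082
      proxy:
        build: .
        command: python proxy.py
        environment:
          - PROXY_PORT=8080
          - BACKENDS=worker1:8081,worker2:8082
          - POLICY=round_robin
        ports:
          - "8080:8080"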