Latency Measurement in Distributed Systems
Measuring latency is crucial for understanding the performance of a system, identifying bottlenecks, and ensuring a good user experience. Latency here means the round-trip time seen by the client: from sending a request until the response arrives, which includes both network transit and server processing time.
1. Sending HTTP Requests and Measuring Response Times
To measure response times, we can send multiple HTTP requests to a mock server and record the latency for each request. Here’s a Python script using the requests and matplotlib libraries:
Python Script for Measuring Latency
import math
import time

import matplotlib.pyplot as plt
import requests

# Mock server endpoint
URL = "https://httpbin.org/delay/1"  # A mock endpoint with a 1-second delay


# Function to measure latency
def measure_latency(url, num_requests=100):
    latencies = []
    for i in range(num_requests):
        # perf_counter() is monotonic, so system clock adjustments cannot skew the result
        start_time = time.perf_counter()
        try:
            response = requests.get(url, timeout=5)  # Send request
            response.raise_for_status()
        except requests.exceptions.RequestException as e:
            print(f"Request {i + 1} failed: {e}")
            latencies.append(None)  # Mark failed request
        else:
            latencies.append(time.perf_counter() - start_time)
    return latencies


# Measure latencies (requests run sequentially, so 100 requests take over 100 seconds)
latency_results = measure_latency(URL, num_requests=100)

# Filter out failed requests (None values)
valid_latencies = [lat for lat in latency_results if lat is not None]

# Output basic metrics
print(f"Number of requests: {len(latency_results)}")
print(f"Successful requests: {len(valid_latencies)}")
if valid_latencies:
    average = sum(valid_latencies) / len(valid_latencies)
    # Nearest-rank P99: smallest sample with at least 99% of samples at or below it
    p99_index = math.ceil(len(valid_latencies) * 99 / 100) - 1
    print(f"Average latency: {average:.3f} seconds")
    print(f"P99 latency: {sorted(valid_latencies)[p99_index]:.3f} seconds")
else:
    print("No successful requests; skipping latency metrics.")

# Save raw latencies to a file (optional)
with open("latency_data.txt", "w") as f:
    for lat in valid_latencies:
        f.write(f"{lat}\n")
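The saved file can be read back later to recompute metrics without re-sending the requests. The following is a minimal sketch, assuming the one-value-per-line format written by the script; load_latencies is a hypothetical helper name, not part of the original script:

```python
from pathlib import Path


def load_latencies(path):
    """Read one latency (in seconds) per line, skipping blank lines."""
    latencies = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line:
            latencies.append(float(line))
    return latencies


if __name__ == "__main__":
    data_file = Path("latency_data.txt")
    # Only load if a previous run actually produced the file
    if data_file.exists():
        saved = load_latencies(data_file)
        print(f"Loaded {len(saved)} latencies")
```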
2. Visualizing Latency Distribution
Once we have the latency data, we can visualize the distribution using a histogram or percentiles to understand the system’s performance under varying conditions.
Plotting the Latency Distribution
# Plot histogram of latencies
plt.figure(figsize=(10, 6))
plt.hist(valid_latencies, bins=20, color='blue', alpha=0.7, edgecolor='black')
plt.title("Latency Distribution")
plt.xlabel("Latency (seconds)")
plt.ylabel("Frequency")
plt.grid(axis="y", linestyle="--", alpha=0.7)
plt.show()
Percentiles for Performance Metrics
Percentiles (e.g., P50, P90, P99) describe how latency is distributed across requests, which an average alone cannot show:
- P50: Median latency (50% of requests are at or below this value).
- P90: The latency that 90% of requests stay at or below; the slowest 10% exceed it.
- P99: Tail latency; only the slowest 1% of requests exceed this value.
Example Code for Percentiles:
import math


def calculate_percentiles(latencies, percentiles=(50, 90, 99)):
    if not latencies:
        raise ValueError("latencies must not be empty")
    sorted_latencies = sorted(latencies)
    results = {}
    for p in percentiles:
        # Nearest-rank method; max() keeps the index valid for very small p
        index = max(math.ceil(len(sorted_latencies) * p / 100) - 1, 0)
        results[f"P{p}"] = sorted_latencies[index]
    return results


percentile_results = calculate_percentiles(valid_latencies)
print("Latency Percentiles:", percentile_results)
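As a cross-check, the standard library's statistics.quantiles can compute the same cut points. It interpolates between neighboring samples, so its values may differ slightly from an index-based calculation. A small sketch follows; sample_latencies is made-up data used only for illustration:

```python
import statistics

# Hypothetical latencies in seconds, for illustration only
sample_latencies = [1.02, 1.03, 1.04, 1.05, 1.05, 1.06, 1.08, 1.10, 1.15, 1.21]

# n=100 yields 99 cut points; cut_points[p - 1] is the p-th percentile
cut_points = statistics.quantiles(sample_latencies, n=100, method="inclusive")

for p in (50, 90, 99):
    print(f"P{p}: {cut_points[p - 1]:.3f} seconds")
```

The "inclusive" method treats the samples as the whole population, so every percentile stays between the smallest and largest observed value.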
3. Observations and Insights
- Average Latency: Gives a general idea of typical performance, but a few slow requests can skew it, so read it alongside the percentiles.
- P99 Latency: Describes the tail of the distribution (the slowest 1% of requests); the true worst case is the maximum.
- Histogram: Helps identify anomalies or irregular patterns (e.g., a spike in high latencies).
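To see why these metrics are read together, the sketch below compares them on a small made-up data set with two slow outliers. The numbers are hypothetical and chosen only to make the effect visible:

```python
import math
import statistics

# Hypothetical data: 98 requests at 1.0 s plus two slow outliers
latencies = [1.0] * 98 + [5.0, 9.0]

mean = statistics.mean(latencies)
median = statistics.median(latencies)
ordered = sorted(latencies)
p99 = ordered[math.ceil(len(ordered) * 99 / 100) - 1]  # nearest-rank P99

print(f"Mean:   {mean:.2f} s")
print(f"Median: {median:.2f} s")
print(f"P99:    {p99:.2f} s")
print(f"Max:    {max(latencies):.2f} s")
```

Here the median stays at 1.0 s while the two outliers pull the mean up to 1.12 s; P99 lands on 5.0 s and the maximum is 9.0 s.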
Sample Output and Visualization
After running the script:
- Average latency: 1.05 seconds
- P99 latency: 1.21 seconds
Histogram Visualization: The histogram shows most requests complete within 1-1.2 seconds, with occasional outliers.
Use Cases of Latency Measurement
- Performance Testing: Identify bottlenecks before deployment.
- Capacity Planning: Prepare for expected traffic patterns.
- SLAs and SLOs: Ensure compliance with latency-based agreements.
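For the SLO case, a compliance check can be as simple as counting how many requests finished within a threshold. This is only a sketch: meets_slo, the 1.5-second threshold, the 99% target, and the sample values are all assumptions for illustration, not figures from a real agreement:

```python
def meets_slo(latencies, threshold_seconds, target_fraction):
    """Return True if at least target_fraction of requests finish within the threshold."""
    if not latencies:
        raise ValueError("no latency samples to evaluate")
    within = sum(1 for lat in latencies if lat <= threshold_seconds)
    return within / len(latencies) >= target_fraction


# Hypothetical samples and SLO: 99% of requests within 1.5 seconds
samples = [1.02, 1.05, 1.04, 1.21, 1.07, 1.03, 1.06, 1.05, 1.10, 2.40]
print("SLO met:", meets_slo(samples, threshold_seconds=1.5, target_fraction=0.99))
```

With these samples only 9 of 10 requests are within the threshold, so the check reports that the 99% target is missed.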
This process allows teams to proactively identify performance issues and optimize system reliability.