PACELC Exploration: Latency vs Consistency Trade-offs

The PACELC theorem extends the CAP theorem by considering latency trade-offs in distributed systems during normal operations. It states that even when there is no network partition, a system must choose between consistency and latency. This exercise demonstrates the latency-consistency trade-offs in a NoSQL database like DynamoDB or MongoDB using both strongly consistent and eventually consistent reads.

1. Setting Up the Environment

Using DynamoDB

Step 1: Install AWS CLI and SDK

Install AWS CLI:

sudo apt update
sudo apt install awscli -y

Configure AWS credentials:
```
aws configure
```
- Provide your access key, secret key, region, and output format.
Install the AWS SDK for Python (Boto3):
```
pip install boto3
```

Step 2: Create a DynamoDB Table

Create a table with a primary key (id):

aws dynamodb create-table \
    --table-name TestTable \
    --attribute-definitions AttributeName=id,AttributeType=S \
    --key-schema AttributeName=id,KeyType=HASH \
    --billing-mode PAY_PER_REQUEST

Insert test data:

aws dynamodb put-item \
    --table-name TestTable \
    --item '{"id": {"S": "1"}, "value": {"S": "Hello, World!"}}'

2. Demonstrate Latency vs Consistency Trade-offs

Testing Strongly Consistent Reads

A strongly consistent read ensures that the data returned reflects all previous write operations, guaranteeing the most up-to-date value.

Python Code for Strongly Consistent Read

import boto3
import time

dynamodb = boto3.client('dynamodb')

def strong_consistent_read():
    start_time = time.time()
    response = dynamodb.get_item(
        TableName='TestTable',
        Key={'id': {'S': '1'}},
        ConsistentRead=True  # Strong consistency
    )
    latency = time.time() - start_time
    print(f"Strong Read Latency: {latency:.4f}s")
    print("Data:", response.get('Item'))

strong_consistent_read()

Testing Eventually Consistent Reads

An eventually consistent read may return stale data, as it doesn’t guarantee the latest write is reflected.

Python Code for Eventually Consistent Read

def eventual_consistent_read():
    start_time = time.time()
    response = dynamodb.get_item(
        TableName='TestTable',
        Key={'id': {'S': '1'}},
        ConsistentRead=False  # Eventual consistency
    )
    latency = time.time() - start_time
    print(f"Eventual Read Latency: {latency:.4f}s")
    print("Data:", response.get('Item'))

eventual_consistent_read()

3. Experiment: Measuring Latency

Simulating Traffic

To test latency under load, use a tool like Apache JMeter or Python’s concurrent.futures to send multiple read requests concurrently.

Python Code for Concurrent Reads

from concurrent.futures import ThreadPoolExecutor

def measure_latency(consistency_type):
    def read_request():
        response = dynamodb.get_item(
            TableName='TestTable',
            Key={'id': {'S': '1'}},
            ConsistentRead=consistency_type == 'strong'
        )
        return time.time()

    with ThreadPoolExecutor(max_workers=10) as executor:
        start_times = [time.time() for _ in range(10)]
        results = executor.map(read_request, start_times)
        latencies = [end - start for start, end in zip(start_times, results)]
        print(f"{consistency_type.capitalize()} Consistency Avg Latency: {sum(latencies)/len(latencies):.4f}s")

measure_latency('strong')
measure_latency('eventual')

4. Reporting the Results

Latency Observations

5. Visualizing Results

Latency Histogram

Visualize the latency distribution using Python’s matplotlib:

import matplotlib.pyplot as plt

strong_latencies = [30, 32, 35, 50, 45]  # Replace with actual data
eventual_latencies = [10, 12, 15, 18, 20]

plt.hist(strong_latencies, bins=5, alpha=0.7, label='Strong Consistency')
plt.hist(eventual_latencies, bins=5, alpha=0.7, label='Eventual Consistency')
plt.xlabel('Latency (ms)')
plt.ylabel('Frequency')
plt.title('Latency Comparison: Strong vs Eventual Consistency')
plt.legend()
plt.show()

6. MongoDB Alternative

For MongoDB, similar tests can be conducted with read concern levels:

Strong consistency: Use majority read concern.
Eventual consistency: Use local or default read concern.

7. Insights from the Experiment

Strong Consistency:
- Guarantees the latest data but incurs higher latency due to synchronization overhead.
- Use Cases: Banking, inventory systems, and any application requiring real-time accuracy.
Eventual Consistency:
- Reduces latency by allowing stale reads.
- Use Cases: Social media, recommendation systems, and analytics.