146 changes: 146 additions & 0 deletions pip/pip-487.md
@@ -0,0 +1,146 @@
# PIP-487: Add event count metrics for InflightReadsLimiter acquire and release operations

# Background knowledge

The **InflightReadsLimiter** is a component in the managed ledger cache layer that limits the total amount of
in-flight data read from storage (BookKeeper) or cache. It works as a semaphore over bytes: each read
operation acquires a certain number of byte "permits" before proceeding, and releases them once the data has
been delivered to the client. This prevents excessive memory pressure when many consumers request large
amounts of data simultaneously.

Key concepts:
- `maxReadsInFlightSize`: the total byte capacity of the limiter.
- `remainingBytes`: the current number of free bytes available for new read requests.
- A single request may ask for more bytes than `maxReadsInFlightSize`; in that case the request is capped at the
maximum capacity.
- When no permits are available, acquire requests are queued and fulfilled as permits become available (via
release operations).
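
The concepts above can be illustrated with a minimal sketch of a byte semaphore. This is not the actual
`InflightReadsLimiter` code: the class `ByteSemaphoreSketch`, its method signatures, and the strict
first-in-first-out queueing are assumptions made for illustration only.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.LongConsumer;

/** Hypothetical, simplified byte semaphore; not the actual InflightReadsLimiter code. */
public class ByteSemaphoreSketch {
    /** A queued request: the (already capped) byte count and the callback to invoke once granted. */
    private static final class Pending {
        final long permits;
        final LongConsumer onAcquired;

        Pending(long permits, LongConsumer onAcquired) {
            this.permits = permits;
            this.onAcquired = onAcquired;
        }
    }

    private final long maxBytes;      // plays the role of maxReadsInFlightSize
    private long remainingBytes;      // free bytes available for new reads
    private final Queue<Pending> queue = new ArrayDeque<>();

    public ByteSemaphoreSketch(long maxBytes) {
        this.maxBytes = maxBytes;
        this.remainingBytes = maxBytes;
    }

    /** Requests permits; the callback receives the number of bytes actually granted. */
    public synchronized void acquire(long permits, LongConsumer onAcquired) {
        // A request larger than the total capacity is capped at the capacity.
        long capped = Math.min(permits, maxBytes);
        if (queue.isEmpty() && remainingBytes >= capped) {
            remainingBytes -= capped;
            onAcquired.accept(capped);
        } else {
            // Not enough free bytes (or others are already waiting): queue the request.
            queue.add(new Pending(capped, onAcquired));
        }
    }

    /** Returns permits and serves queued requests in arrival order while capacity allows. */
    public synchronized void release(long permits) {
        remainingBytes += permits;
        while (!queue.isEmpty() && remainingBytes >= queue.peek().permits) {
            Pending next = queue.poll();
            remainingBytes -= next.permits;
            next.onAcquired.accept(next.permits);
        }
    }

    public synchronized long remainingBytes() {
        return remainingBytes;
    }
}
```

In this sketch, a caller that never invokes `release` with the granted byte count leaves `remainingBytes`
permanently reduced, which is the kind of leak this proposal aims to make detectable.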

Existing metrics:
- `pulsar.broker.managed_ledger.inflight.read.limit` — the configured maximum capacity (bytes).
- `pulsar.broker.managed_ledger.inflight.read.usage` — the current used and free bytes.

The code resides in `InflightReadsLimiter.java` within the `managed-ledger` module.

# Motivation

The existing usage metric (`pulsar.broker.managed_ledger.inflight.read.usage`) reports the instantaneous
free/used bytes via an OTEL observable counter (callback-based). While this is useful for understanding
current utilization, it is not sufficient for alerting on a **permits leak** — a scenario where a bug
prevents acquired permits from ever being released, causing `remainingBytes` to stay at zero permanently.

**Why the existing metrics are insufficient for alerting:**

Consider a metrics scrape interval of 30 seconds. If `remainingBytes` is observed as 0 at multiple scrape
points, this could be explained by:

1. **A permits leak (bug):** permits were acquired but never released, so `remainingBytes` is truly stuck
at 0.
2. **High but legitimate read pressure:** many read requests are continuously acquiring and releasing
permits, and by chance the scrapes always happen to catch `remainingBytes` at 0.

With only instantaneous usage data, operators cannot distinguish between these two scenarios. However, with an
**event count metric** that increments on every release, combined with the existing usage metric (which
reflects `remainingBytes`), an operator can tell the two apart:

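- If `remainingBytes` is 0 **and** `release.count` has not increased for an extended period, a permits
leak is likely: permits were acquired but are never being returned.
- If `remainingBytes` is 0 **but** `release.count` keeps increasing, permits are still being returned, so the
limiter is saturated by legitimate read pressure rather than stuck.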
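
The distinction between the two scenarios can be sketched as a check over a window of scrapes. The class
`LeakHeuristicSketch`, the `Sample` type, and the numbers in the usage note are hypothetical; the sketch only
illustrates the reasoning and is not part of the proposed change.

```java
import java.util.List;

/** Hypothetical helper illustrating the leak heuristic; not part of Pulsar. */
public class LeakHeuristicSketch {

    /** One scrape of the two metrics: free bytes and the cumulative release counter. */
    public static final class Sample {
        final long freeBytes;
        final long releaseCount;

        public Sample(long freeBytes, long releaseCount) {
            this.freeBytes = freeBytes;
            this.releaseCount = releaseCount;
        }
    }

    /**
     * Returns true when every sample in the window shows zero free bytes and the
     * cumulative release counter has the same value in all samples.
     */
    public static boolean leakSuspected(List<Sample> window) {
        if (window.size() < 2) {
            return false; // at least two scrapes are needed to observe a lack of progress
        }
        long first = window.get(0).releaseCount;
        for (Sample s : window) {
            if (s.freeBytes != 0 || s.releaseCount != first) {
                return false; // capacity was free at some point, or releases are still happening
            }
        }
        return true;
    }
}
```

For example, three scrapes with zero free bytes and release counts of 500, 500, 500 would be flagged, while
zero free bytes with counts of 500, 620, 790 would be treated as legitimate read pressure.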

# Goals

## In Scope

- Add a cumulative counter metric that increments each time `remainingBytes` decreases (i.e., permits are
acquired).
- Add a cumulative counter metric that increments each time `remainingBytes` increases (i.e., permits are
released).
- Enable operators to combine these event counters with the existing usage metric to accurately detect
permits leaks.

# High Level Design

Add two new OTEL `LongCounter` metrics to `InflightReadsLimiter`:

| Metric | Trigger | Type |
|--------|---------|------|
| `acquire.count` | Incremented each time `remainingBytes` is decreased | Cumulative counter |
| `release.count` | Incremented each time `remainingBytes` is increased | Cumulative counter |

These are **event counters** — each individual acquire or release event increments the counter by 1,
regardless of the number of bytes involved. This allows operators to compare the rate of acquire vs.
release events (e.g., via `rate()` in Prometheus) to detect imbalances.

When the limiter is disabled (`maxReadsInFlightSize <= 0`), the counters are still registered but never
incremented, since the `acquire()` method short-circuits and `release()` becomes a no-op in the disabled
state.
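
The counting behaviour described above might look roughly like the following sketch. To keep it runnable with
the JDK alone, `java.util.concurrent.atomic.LongAdder` stands in for the OTEL `LongCounter`; the class
`EventCountingLimiterSketch`, the `tryAcquire` method, and its return convention are hypothetical and omit the
queueing of the real limiter.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch; LongAdder stands in for an OTEL LongCounter. */
public class EventCountingLimiterSketch {
    private final long maxBytes;
    private long remainingBytes;
    final LongAdder acquireCount = new LongAdder(); // stand-in for acquire.count
    final LongAdder releaseCount = new LongAdder(); // stand-in for release.count

    public EventCountingLimiterSketch(long maxBytes) {
        this.maxBytes = maxBytes;
        this.remainingBytes = maxBytes;
    }

    public boolean isDisabled() {
        return maxBytes <= 0;
    }

    /** Tries to take permits; returns the bytes granted, or -1 if capacity is insufficient. */
    public synchronized long tryAcquire(long permits) {
        if (isDisabled()) {
            return permits; // short-circuit: nothing is tracked, so nothing is counted
        }
        long capped = Math.min(permits, maxBytes);
        if (remainingBytes < capped) {
            return -1; // remainingBytes did not change, so no event is recorded
        }
        remainingBytes -= capped;
        acquireCount.increment(); // one event per acquire, regardless of the byte count
        return capped;
    }

    /** Returns permits; a no-op when the limiter is disabled. */
    public synchronized void release(long permits) {
        if (isDisabled()) {
            return;
        }
        remainingBytes += permits;
        releaseCount.increment(); // one event per release, regardless of the byte count
    }
}
```

Counting events rather than bytes keeps the two counters directly comparable: an acquire of 10 bytes and an
acquire of 900 bytes each add exactly 1.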

## Public-facing Changes

### Public API

No changes.

### Binary protocol

No changes.

### Configuration

No changes.

### CLI

No changes.

### Metrics

| Full name | Description | Attributes | Unit |
|-----------|-------------|------------|------|
| `pulsar.broker.managed_ledger.inflight.read.acquire.count` | The number of times inflight read permits were acquired, decreasing the remaining bytes. | _(none)_ | `{event}` |
| `pulsar.broker.managed_ledger.inflight.read.release.count` | The number of times inflight read permits were released, increasing the remaining bytes. | _(none)_ | `{event}` |

# Monitoring

**Alerting on a permits leak:**

Operators can set up an alert that fires when:

1. `pulsar.broker.managed_ledger.inflight.read.usage{state="free"} == 0` (no free capacity), **AND**
2. `pulsar.broker.managed_ledger.inflight.read.release.count` has not increased for an extended period.

An example Prometheus alert expression:

```promql
pulsar_broker_managed_ledger_inflight_read_usage_free == 0
and
rate(pulsar_broker_managed_ledger_inflight_read_release_count_total[5m]) == 0
```

This fires when free bytes are stuck at zero and no releases have occurred in the last 5 minutes.
Adjust the time window based on expected workload and scraping interval.
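
When choosing the window, note that a rate over a cumulative counter needs at least two samples inside the
window, and that a counter restarts from zero when the broker restarts. The sketch below is a simplified model
of how a Prometheus-style increase treats such samples (it ignores extrapolation at the window boundaries);
the class `CounterIncreaseSketch` is hypothetical and is not Prometheus or Pulsar code.

```java
import java.util.OptionalLong;

/** Hypothetical model of a counter increase over a window; not Prometheus code. */
public class CounterIncreaseSketch {

    /**
     * Sums the increases between consecutive samples of a cumulative counter.
     * A decrease is treated as a counter reset: the post-reset value counts as the increase.
     * Returns empty when fewer than two samples fall inside the window.
     */
    public static OptionalLong increase(long[] samples) {
        if (samples.length < 2) {
            return OptionalLong.empty(); // no delta can be computed from a single scrape
        }
        long total = 0;
        for (int i = 1; i < samples.length; i++) {
            long delta = samples[i] - samples[i - 1];
            total += (delta >= 0) ? delta : samples[i];
        }
        return OptionalLong.of(total);
    }
}
```

With a 30-second scrape interval, a 5-minute window covers about ten samples, which leaves room for a missed
scrape; a window shorter than two scrape intervals may yield no result at all, in which case the alert cannot
fire.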

# Security Considerations

No security implications. These are read-only metrics exposed via the existing OpenTelemetry metrics
infrastructure, which follows the same authentication and authorization as all other broker metrics.

# Backward & Forward Compatibility

## Upgrade

No special upgrade steps required. The new metrics will become available immediately upon broker restart
with the new version.

## Downgrade / Rollback

Rolling back to a previous version is safe. The new metrics simply disappear; no existing metric names are
changed or removed.

## Pulsar Geo-Replication Upgrade & Downgrade/Rollback Considerations

No geo-replication impact. Metrics are per-broker and local.

# General Notes

# Links

* Mailing List discussion thread: https://lists.apache.org/thread/nl1ropc9zd2ttxj06f2s0oxjdcg59sqk
* Mailing List voting thread: