Domains by siddhsuresh · Pull Request #1 · flightcontrolhq/modules

siddhsuresh · 2026-05-17T09:13:28Z

Greptile Summary

This PR introduces opt-in Ravion-managed domain/cert provisioning for ECS clusters and services, adds a new ravion_domain Terraform resource type backed by a custom provider, and refactors load_balancer_attachment from a typed Terraform variable to a JSON-decoded string to support control-plane ARN templating.

ecs_cluster: new ravion_domains.tf allocates a wildcard ravion_domain for the cluster, binds the cert to the public ALB HTTPS listener as an SNI extra, and exposes domain outputs; module.yml adds UI inputs, dashboard metrics, and output declarations.
ecs_service: new ravion_domains.tf implements Mode A (auto-FQDN under the cluster wildcard, with a two-phase cutover via ravion_auto_domain_status) and Mode B (per-service cert covering customer FQDNs); load_balancer_attachment variable replaced by load_balancer_attachment_json string with local decoding; module.yml added from scratch.
Broad try(..., true) change: every variable validation in autoscaling, ecs_cluster, ecs_service, alb, nlb, and vpc is wrapped to suppress throws from complex-object attribute access.

Confidence Score: 2/5

Multiple blocking gaps exist between the module.yml control-plane definitions and the actual Terraform code — mismatched output names, mismatched input variable names, and conflicting provider version pins — that will cause apply failures or silent no-ops on every stack that uses the new Ravion domain wiring.

The module.yml files declare Ravion output names (ravion_managed_domains_enabled, ravion_public_alb_default_*) that simply do not exist in outputs.tf, and input names (ravion_parent_app_domain_id, ravion_auto_domain_listener_arn, etc.) that don't match the TF variables the resources actually read. Any downstream module referencing these outputs will receive nothing, and any upstream value the control plane passes using the module.yml keys will be silently ignored. Separately, the Ravion provider is pinned at = 0.4.3 in ecs_cluster and = 0.4.5 in ecs_service, which Terraform cannot satisfy in a shared workspace. Finally, when certificate_arns = [] (the new default set in module.yml), the ALB HTTPS listener is created with certificate_arn = null, which AWS rejects. Together these issues mean the primary new feature cannot be exercised in production without code fixes.

compute/ecs_cluster/module.yml and compute/ecs_service/module.yml (output/input name mismatches), compute/ecs_cluster/versions.tf and compute/ecs_service/versions.tf (conflicting provider version pins), networking/alb/locals.tf (null certificate_arn on HTTPS listener)

Important Files Changed

Filename	Overview
compute/ecs_cluster/versions.tf	Ravion provider pinned to = 0.4.3, conflicting with ecs_service pin of = 0.4.5; breaks any workspace using both modules
compute/ecs_service/versions.tf	Ravion provider pinned to = 0.4.5, conflicting with ecs_cluster pin of = 0.4.3
compute/ecs_cluster/module.yml	Declared Ravion output names don't match actual TF outputs; also declares private ALB Ravion outputs that have no corresponding TF resources
compute/ecs_service/module.yml	New file; UI input names (ravion_parent_app_domain_id, ravion_auto_domain_listener_arn, etc.) don't match TF variables; output names also diverge from outputs.tf
networking/alb/locals.tf	https_default_cert_arn can be null when certificate_arns is empty (now the default), causing HTTPS listener creation to fail at the AWS API level
networking/alb/listeners.tf	certificate_arn now references local.https_default_cert_arn (null-safe); SNI extras guard improved with length check; depends on ALB locals fix
compute/ecs_cluster/ravion_domains.tf	New file: allocates ravion_domain.cluster (wildcard cert) and binds it as SNI extra on public ALB HTTPS listener; preconditions guard required inputs
compute/ecs_service/ravion_domains.tf	New file implementing Mode A (auto-FQDN under cluster wildcard) and Mode B (custom cert) with a two-phase cutover mechanism via ravion_auto_domain_status data source
compute/ecs_service/locals.tf	Switches load_balancer_attachment from typed variable to JSON-decoded local; applies defaults post-decode; enable_auto_scaling guard simplified with try()
compute/ecs_service/variables.tf	load_balancer_attachment replaced with load_balancer_attachment_json string; wait_for_steady_state default changed from true to false; Ravion variables added
compute/autoscaling/variables.tf	All variable validations wrapped in try(..., true), silently swallowing errors for complex object inputs
compute/ecs_cluster/outputs.tf	Adds seven Ravion domain outputs (cluster domain id/fqdn/url, cert arn/status, aws account/region); names diverge from module.yml declarations

Sequence Diagram

sequenceDiagram
    participant CP as Control Plane
    participant Cluster as ecs_cluster module
    participant RavionAPI as Ravion Provider
    participant AWS as AWS (ACM + ALB)
    participant Service as ecs_service module

    CP->>Cluster: "use_ravion_managed_domains=true, ravion_aws_account_id"
    Cluster->>RavionAPI: "ravion_domain.cluster (wildcard cert, target=public ALB)"
    RavionAPI-->>AWS: "Issue ACM cert (*.cluster-fqdn + cluster-fqdn)"
    RavionAPI-->>Cluster: cert_arn, fqdn, id
    Cluster->>AWS: aws_lb_listener_certificate (SNI extra on HTTPS listener)
    Cluster-->>CP: ravion_cluster_domain_id, ravion_cluster_cert_arn, ...

    CP->>Service: cluster_parent_domain_id, cluster_https_listener_arn
    Service->>RavionAPI: data.ravion_auto_domain_status (check retirement)
    RavionAPI-->>Service: "retired=false"
    Service->>RavionAPI: ravion_domain.auto (child under cluster domain, Mode A)
    RavionAPI-->>Service: auto fqdn (svc-hash.cluster-fqdn)
    Service->>AWS: aws_lb_listener_rule.ravion (host_header to target group)

    Note over Service,AWS: Mode B (custom domains)
    CP->>Service: "domains=[api.example.com]"
    Service->>RavionAPI: ravion_domain.custom (cert covering customer FQDNs)
    RavionAPI-->>AWS: Attach cert to ALB listener via resource_arn
    Note over RavionAPI,AWS: Customer adds DNS CNAME records
    RavionAPI-->>Service: "retired=true (after DNS resolves)"
    Service->>RavionAPI: ravion_domain.auto count to 0 (destroy)

Comments Outside Diff (1)

compute/autoscaling/variables.tf, line 9-10 (link)

try(..., true) silences all validation errors

Wrapping every validation condition in try(..., true) means any expression that throws (e.g., a type mismatch or an unexpected attribute access on an incorrectly-shaped value) silently passes as valid instead of surfacing an error message. For the simple scalar checks like var.desired_capacity == null || var.desired_capacity >= 0, the null-guard short-circuit already prevented the throw without try(), so the wrapper is no-op but harmless there. The concern is the object-typed variables (mixed_instances_policy, instance_refresh, warm_pool, scaling_policies) where a malformed input object that causes an attribute-access error will now be accepted with no user feedback. This same pattern is applied throughout compute/ecs_cluster/variables.tf, compute/ecs_service/variables.tf, networking/alb/variables.tf, and networking/nlb/variables.tf.

Prompt To Fix With AI

This is a comment left during a code review.
Path: compute/autoscaling/variables.tf
Line: 9-10

Comment:
**`try(..., true)` silences all validation errors**

Wrapping every validation condition in `try(..., true)` means any expression that throws (e.g., a type mismatch or an unexpected attribute access on an incorrectly-shaped value) silently passes as valid instead of surfacing an error message. For the simple scalar checks like `var.desired_capacity == null || var.desired_capacity >= 0`, the null-guard short-circuit already prevented the throw without `try()`, so the wrapper is no-op but harmless there. The concern is the object-typed variables (`mixed_instances_policy`, `instance_refresh`, `warm_pool`, `scaling_policies`) where a malformed input object that causes an attribute-access error will now be accepted with no user feedback. This same pattern is applied throughout `compute/ecs_cluster/variables.tf`, `compute/ecs_service/variables.tf`, `networking/alb/variables.tf`, and `networking/nlb/variables.tf`.

How can I resolve this? If you propose a fix, please make it concise.

Prompt To Fix All With AI

Fix the following 5 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 5
compute/ecs_cluster/versions.tf:17-20
**Conflicting provider version pins across modules**

`ecs_cluster` pins the Ravion provider at `= 0.4.3` while `ecs_service/versions.tf` pins it at `= 0.4.5`. Terraform/OpenTofu resolves a single provider version per workspace. Any stack that instantiates both modules (the typical pattern shown in the README) will immediately fail with `Error: Failed to query available provider packages — conflicting provider version constraints`. Both modules must use the same pinned version (or a shared `>= x.y` range).

### Issue 2 of 5
compute/ecs_cluster/module.yml:622-629
**module.yml output declarations don't match actual Terraform outputs**

The seven outputs declared here (`ravion_managed_domains_enabled`, `ravion_public_alb_default_url`, `ravion_public_alb_default_fqdn`, `ravion_public_alb_default_cert_arn`, `ravion_private_alb_default_url`, `ravion_private_alb_default_fqdn`, `ravion_private_alb_default_cert_arn`) do not exist in `compute/ecs_cluster/outputs.tf`. The actual TF outputs are `ravion_cluster_domain_id`, `ravion_cluster_domain_fqdn`, `ravion_cluster_domain_url`, `ravion_cluster_cert_arn`, `ravion_cluster_cert_status`, `ravion_aws_account_id`, and `ravion_aws_region`. Any control-plane reference to these declared-but-absent outputs will resolve to nothing, silently breaking downstream wiring. There are also no private-ALB Ravion domain resources in the TF code, so the three `ravion_private_alb_*` entries can never be populated.

### Issue 3 of 5
compute/ecs_service/module.yml:101-127
**module.yml input names don't match Terraform variable names**

Several UI inputs defined here map to non-existent TF variables:

- `ravion_listener_arn` → no matching TF variable (the TF variable is `cluster_https_listener_arn`, which serves both Mode A and Mode B).
- `ravion_parent_app_domain_id` → TF variable is `cluster_parent_domain_id`.
- `ravion_auto_domain_listener_arn` → TF variable is `cluster_https_listener_arn`.
- `ravion_auto_domain_alb_dns_name` → TF variable is `cluster_alb_dns_name`.
- `ravion_auto_domain_alb_zone_id` → TF variable is `cluster_alb_zone_id`.

When the control plane passes these UI input values to Terraform, the misnamed keys will be silently ignored, leaving the actual TF variables unset (defaulting to `null`) and breaking the Ravion domain wiring at plan/apply time. The output section has the same issue — `ravion_domains`, `ravion_cert_id`, `ravion_cert_arn`, `ravion_cert_status`, `ravion_auto_domain_enabled`, `ravion_auto_domain_fqdn`, `ravion_auto_domain_url` are declared but the actual TF outputs are `ravion_domain_id`, `ravion_domain_fqdn`, `ravion_domain_url`, `ravion_custom_domain_id`, and `ravion_custom_domain_cert_arn`.

### Issue 4 of 5
networking/alb/locals.tf:32
**HTTPS listener created with `null` certificate_arn when `certificate_arns = []`**

`https_default_cert_arn` is `null` when `certificate_arns` is empty. `compute/ecs_cluster/module.yml` now sets `default: []` for `public_alb_certificate_arns`, so a user who sets `use_ravion_managed_domains = true` and leaves `public_alb_certificate_arns` at its default will trigger this path. The `aws_lb_listener.https` resource then passes `certificate_arn = null` to AWS, which requires a certificate for HTTPS listeners and will reject the call with a validation error. The Ravion wildcard cert is wired only as an SNI extra via `aws_lb_listener_certificate` (after the listener exists), so it cannot fill the listener's mandatory default-cert slot. The HTTPS listener needs at least one cert at creation time.

### Issue 5 of 5
compute/autoscaling/variables.tf:9-10
**`try(..., true)` silences all validation errors**

Wrapping every validation condition in `try(..., true)` means any expression that throws (e.g., a type mismatch or an unexpected attribute access on an incorrectly-shaped value) silently passes as valid instead of surfacing an error message. For the simple scalar checks like `var.desired_capacity == null || var.desired_capacity >= 0`, the null-guard short-circuit already prevented the throw without `try()`, so the wrapper is no-op but harmless there. The concern is the object-typed variables (`mixed_instances_policy`, `instance_refresh`, `warm_pool`, `scaling_policies`) where a malformed input object that causes an attribute-access error will now be accepted with no user feedback. This same pattern is applied throughout `compute/ecs_cluster/variables.tf`, `compute/ecs_service/variables.tf`, `networking/alb/variables.tf`, and `networking/nlb/variables.tf`.

_{Reviews (1): Last reviewed commit: "ecs_service: bring auto-FQDN back when u..." | Re-trigger Greptile}

Greptile also left 4 inline comments on this PR.

…ster + compute/ecs_service networking/alb - New use_ravion_managed_domains toggle (default false, non-breaking). - When on, declares one domains_alb_attachment that allocates an FQDN, issues the cluster wildcard ACM cert, writes the A-ALIAS, and exposes default_cert_arn. - HTTPS listener is force-created and its default cert comes from the attachment; var.certificate_arns[0] is ignored, additional ARNs still attach as SNI. New outputs: ravion_default_url / _fqdn / _cert_arn / _alb_attachment_id. compute/ecs_cluster - Single use_ravion_managed_domains toggle that forwards down to both the public and private alb child modules (distinct slots so they coexist). - Forces HTTPS + HTTP->HTTPS redirect when on. Per-ALB cert ARN inputs are ignored at index 0 when on (additional ARNs still attach as SNI). - module.yml gets a new "Ravion-Managed Domains" section; cert-ARN inputs are hidden when the toggle is on. New outputs cover both ALBs. compute/ecs_service - New domains list(string) input. When non-empty, declares one domains_module_certificate that hands the FQDN list + cluster's HTTPS listener ARN to Ravion's api. The api requests the ACM cert, persists validation CNAMEs (visible in Domains tab), polls until ISSUED, and attaches as SNI out-of-band — terraform apply never blocks on customer DNS. On destroy the api detaches + deletes the cert. - New outputs surface cert id / arn / status for the UI. The Ravion provider source (ravion.com/ravion/domains) only resolves when the runner's terraform mirror is configured (Ravion pipeline runners inject this via ~/.terraformrc). Required-provider declarations are added so any caller using these modules without Ravion-managed mode still works against a local mirror or skipping the providers in tooling.

When the cluster module has Ravion-managed domains on, it exposes `ravion_public_alb_default_app_domain_id` (the app_domain_id from the cluster's alb_attachment). A service module wires that as `ravion_parent_app_domain_id`, plus the cluster ALB's HTTPS listener ARN and DNS name. The new resources in compute/ecs_service/ravion_domains.tf then activate iff: - var.domains is empty (no custom-domain takeover yet), AND - ravion_parent_app_domain_id is non-empty and install: 1. domains_app_domain.auto with parent_id = cluster's app_domain. Allocates `<svc>-<hash>.<cluster-fqdn>` server-side (allocator now supports a ParentApex). The cluster wildcard cert covers it — zero per-service ACM work, zero per-service validation records. 2. domains_dns_record.auto_alias writes the A-ALIAS in our Route53. 3. aws_lb_listener_rule.ravion_auto_domain installs a host_header rule on the cluster HTTPS listener, forwarding to the service's TG. When the user adds a custom domain, count flips to 0 → terraform destroys all three → server-side cascade removes the ManagedDomain row, the Route53 record, and AWS deletes the listener rule. The user's domains_module_certificate takes over. networking/alb gains a `ravion_default_app_domain_id` output exposing the alb_attachment's app_domain_id (was already in the underlying provider resource, just not surfaced). ecs_cluster forwards the public + private flavors.

- networking/vpc/module.yml + compute/ecs_cluster/module.yml: add readme, ui.metrics (CloudWatch via Ravion AWS account), and ui.links. - compute/ecs_service/module.yml: new file mirroring the input schema (incl. the new ravion-managed domain inputs) with metrics + readme. - networking/vpc/variables.tf: wrap validation conditions in try() so a null default doesn't blow up the iteration/length checks. Terraform 1.10 does not short-circuit `||` over `for`/`length` in variable validation. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Same TF 1.10 quirk as the variables.tf fix in the previous commit: `||` does not short-circuit length() over a null value inside a precondition. Wrap each length-check in try() so the null default for *_subnet_cidrs passes the precondition without erroring. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Same TF 1.10 quirk hitting child modules pulled in by ecs_cluster: - compute/autoscaling/variables.tf: instance_maintenance_policy attribute access fails when the policy itself is null. Wrap min/max condition in try() so the null default doesn't blow up the validation. - networking/nlb/variables.tf: contains() barfs on null. Wrap both dns_record_client_routing_policy and enforce_security_group_inbound_rules conditions in try(). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Bulk-applied transform via /tmp/wrap_null_conditions.py over all the modules ecs_cluster + ecs_service pull in. TF 1.10's variable validation does not short-circuit `||` when the right side touches a null value, so the existing `var.X == null || EXPR(var.X)` guards blow up at plan time. try(..., true) preserves the original semantics for non-null inputs (the expression evaluates and returns its bool result) and silently passes when the expression errors on null — same intent the `== null ||` guards had originally. Touches: autoscaling, ecs_cluster, ecs_service, alb, nlb, vpc. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

… is on When use_ravion_managed_domains = true, the listener default cert is supplied via the domains_alb_attachment resource (Ravion provisions and attaches the cluster wildcard cert). var.certificate_arns is intentionally empty in that case — the precondition was rejecting the valid config. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

…s is on Without this gate the per-service auto-domain modules can't reference public_alb_https_listener_arn even though the HTTPS listener is created by the Ravion-managed-domains code path on the cluster ALB.

The control plane stores load_balancer_attachment_json as a string in the module input so it can template-substitute listener/ALB ARNs from upstream cluster outputs. The Terraform variable now matches that shape (string) and the module decodes it once in locals.tf, merging the decoded value with the optional() defaults the typed variable used to provide. All downstream resources read from local.load_balancer_attachment unchanged.

TF 1.10 doesn't short-circuit `var.X != null && var.X.attr` when X is null — it still evaluates the right side and errors on attribute access. Same pattern fix as the validation cleanup commits.

Server-side DNS record validation rejects ALIAS values whose zone_id is empty. The ecs_service module was hardcoding zone_id="" — now takes ravion_auto_domain_alb_zone_id (mapped from the cluster's public_alb_zone_id output) and writes it into the ALIAS value.

Locked to ravion.com/ravion/domains 0.1.99, a local-dev throwaway version whose checksum no longer matches the published binary. TF regenerates a fresh lock on the next init under the ~> 0.1 constraint.

Initial TF apply runs against the placeholder hello-world task definition (no real app image yet) which never stabilises and trips ECS's deployment circuit breaker. The deploy workflow that pushes the actual image is the one that should wait for steady state, not the infra apply.

Replaces ravion_app_domain + dns_record + module_certificate + the auto/custom split with a single `ravion_domain` resource that nests under a cluster-level parent (customer's top-level TF). The parent owns the wildcard cert; the service rides it via SNI — no per-service ACM. Drops `use_ravion_managed_domains` and all related Ravion plumbing from ecs_cluster and networking/alb: the cluster module is now pure-AWS, and the customer wires `ravion_domain` themselves at the top level. Provider pinned to ~> 0.3. Pin ravion provider to 0.1.99 for local-mirror testing The local mirror at port 8095 has 0.1.99 republished with the latest provider code (Mode A/B, certificate.domains). Higher versions exist in the mirror but aren't in sync with the runner's lock file. Exact-pin lets the runner pick the binary we just published without touching .terraform.lock.hcl. Bump ravion provider constraint to ~> 0.4 Domain handler rewrite (Mode A/B split, `domains` field replaces `custom_domains`) ships in 0.4.0. Per-service custom domains: Mode A vs Mode B Custom domains move OUT of the cluster module and onto each service: add `domains = ["api.example.com"]` to `ecs_service` to opt into Mode B. - ecs_cluster: drop ravion_custom_domains var (cluster cert covers only the wildcard pair); add ravion_aws_account_id + ravion_aws_region as passthrough outputs so services can issue their own certs in the same account. - ecs_service: new `domains` input (max 10). New cluster_alb_dns_name + cluster_alb_zone_id + ravion_aws_account_id + ravion_aws_region inputs (Mode B only — pipe from cluster outputs). ravion_domain.this switches between Mode A (rides wildcard) and Mode B (own cert covering only customer FQDNs) based on whether `domains` is set. - ecs_service: aws_lb_listener_certificate.ravion added (Mode B only, attaches the per-service cert as SNI extra). aws_lb_listener_rule host_header values switch between auto-FQDN (Mode A) and custom domains (Mode B). Move cluster ravion_domain into ecs_cluster module The cluster-level `ravion_domain` (wildcard cert + listener binding) now lives inside `compute/ecs_cluster` rather than being hand-rolled in the customer's top-level TF. Opt in with `use_ravion_managed_domains = true` and pipe `module.cluster.ravion_cluster_domain_id` + `module.cluster.public_alb_https_listener_arn` into each `ecs_service`. - ecs_cluster: new ravion_domains.tf allocates cluster ravion_domain (wildcard) and binds the cert as SNI extra on the public ALB HTTPS listener (does NOT replace the listener's default cert) - ecs_cluster: new vars `use_ravion_managed_domains`, `ravion_cluster_name`, `ravion_aws_account_id`, `ravion_aws_region`, `ravion_custom_domains` - ecs_cluster: new outputs `ravion_cluster_domain_id`, `ravion_cluster_domain_fqdn`, `ravion_cluster_domain_url`, `ravion_cluster_cert_arn`, `ravion_cluster_cert_status` - ecs_cluster: require ravion provider ~> 0.3 - ecs_cluster: README section documenting the wildcard pattern + the `ravion_custom_domains` DNS-01 validation flow - ecs_service: update comments to reference cluster module outputs Adopt domains provider 0.2.0 (domains_module_certificate → domains_certificate) The provider dropped the platform-internal naming (no more "module" or "cluster" in resource types). domains_module_certificate is now just domains_certificate; domains_cluster_certificate folded into the same resource since the only difference was whether the caller passes explicit domains. Bump version constraint to ~> 0.2. Wire per-service custom-domain certs via aws_lb_listener_certificate domains_module_certificate no longer accepts listener_arn (provider 0.1.x). Add explicit aws_lb_listener_certificate.ravion_custom to bind the cert as SNI on the cluster's HTTPS listener. Refresh doc strings across both modules to match the new ownership model. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

This reverts commit 32f6012.

This reverts commit 1cbbab9.

0.4.3 in the local mirror has the new ravion_domain resource (and certificate.domains field). 0.1.99 was an old binary that still exposed domains_module_certificate / domains_app_domain etc. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

The alb module knows nothing about Ravion-managed domains anymore — the ecs_cluster module attaches the Ravion wildcard cert as an SNI extra on the listener after the alb is created. The precondition was referencing a variable that was never declared in alb's variables.tf. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Mode B previously replaced the auto-FQDN with the customer domains in the listener rule's host_header, dropping all traffic to the auto-FQDN the moment `domains` was set. Now Mode A and Mode B are two separate `ravion_domain` resources living side-by-side; the listener rule matches BOTH the auto-FQDN and every customer FQDN, giving the customer a no-downtime window to flip their DNS over. The Ravion control plane retires the auto-FQDN's Route53 record once at least one customer routing record is MATCHED. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Reads cutover state from the new data source so the auto resource's count flips to 0 once Ravion retires the slot. Next plan after retirement shows a clean destroy of ravion_domain.auto[0] and the listener rule narrows to only customer FQDNs. Provider bumped to 0.4.4 for the new data source. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Following the Mode A/B split, outputs.tf still referenced the dead `ravion_domain.this` symbol. Repoint to `ravion_domain.auto` and add two new outputs for the Mode B cert (custom domain id + cert_arn). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Drops aws_lb_listener_certificate.ravion. The new ravion_domain.custom takes `resource_arn = var.cluster_https_listener_arn` and Ravion attaches the cert to the listener server-side after ACM validates it. TF apply no longer fails with `UnsupportedCertificate` for the between-states-window when ACM has not yet flipped the cert ISSUED. The whole post-issue + pre-delete cert→listener lifecycle lives in Ravion now. Provider pinned to 0.4.5 for the new resource_arn attribute. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Two changes both targeting the same edge case (Mode B → Mode A): 1. `ravion_auto_retired` is only honoured while `ravion_mode_b` is true. When the user removes every entry from `var.domains`, the auto-FQDN must come back — otherwise the listener rule has zero host_header values and AWS rejects the apply. 2. Defensive: `aws_lb_listener_rule.ravion` count guards on `length(local.ravion_host_header_values) > 0`, so any future transient zero-headers state declines to declare the rule instead of failing AWS schema validation at plan time. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

…r to 0.5.0 Multiple services share one cluster HTTPS listener, and AWS rejects duplicate listener-rule priorities. Default ravion_listener_rule_priority was 50000 (collision-prone if anyone overrode to a fixed number); now defaults to 0 = auto-derive a stable priority from sha256(var.name), giving every service a unique [1000, 49999] slot without hand-picking. Explicit values (1-50000) still win. Bumps both compute/ecs_service and compute/ecs_cluster provider pin from 0.4.x to 0.5.0, which is the rotation-aware terraform-provider-domains build: in-place SAN edits no longer destroy+create the cert (the prior flow cascade-deleted ManagedDomain rows and left the Domains tab empty while the new cert sat in PENDING_VALIDATION). 0.5.0 lands the PATCH /domains/{id} path and tower-go's RotateManagedCertWorkflow. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

When cluster_parent_domain_id is set, aws_lb_listener_rule.ravion owns ALB routing scoped by host_header to this service's FQDNs. The caller- supplied listener_rules from the control plane became redundant — and actively harmful: priorities collide across services on the shared listener, and any path-only rule (path-pattern /*) catches traffic destined for sibling services. Short-circuit them entirely in that mode. Also add `ignore_changes = [action]` to the ravion rule so blue/green deploy controllers can flip target groups without TF undoing the swap. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

0.5.1 ships the schema fix that drops UseStateForUnknown on the rotation pending_* attributes — without it, in-place SAN rotations trigger "Provider produced inconsistent result after apply" because the planner expected the prior (empty) state values but the server returns the new rotation target. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

…ssary try() calls for improved readability and performance. Update ECS cluster and service modules to support Ravion-managed domains, including new inputs for cluster domain ID and HTTPS listener ARN. Adjust ALB and NLB configurations to handle Ravion-specific requirements, ensuring proper certificate management and listener settings. Enhance documentation in module.yml files to clarify usage of Ravion-managed domains and associated configurations.

Restores the validation conditions in compute/{autoscaling,ecs_cluster,ecs_service} and networking/{alb,nlb} variables.tf to their pre-03f87c3 state. The other changes in 03f87c3 (Ravion module.yml + listener wiring fixes) remain. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

`public_alb_certificate_arns` is a pre-Ravion input. ARNs in it are frequently stale once the customer flips to Ravion mode, and silently trying to SNI-attach them on every apply turns dead certs into recurring "CertificateNotFound" failures. The Ravion wildcard already serves as the listener default, so the SNI-extras resource was strictly additive and never load-bearing. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

The alb sub-module's security_group wires HTTPS ingress (443) only when it owns the HTTPS listener (gated on local.create_https_listener). My prior refactor flipped that off so Ravion could own the listener with the wildcard cert as default — but I forgot to re-add the SG rule. Symptom: dig resolves, port 443 TCP connect times out (SG drop), demo HTTPS broken end-to-end. Mirrors the rules the alb module would have emitted: one per IPv4 cidr in var.public_alb_ingress_cidr_blocks, one per IPv6 cidr (hardcoded ::/0 to match the alb module's default since the cluster module doesn't yet expose a public_alb_ingress_ipv6_cidr_blocks pass-through). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

siddhsuresh and others added 16 commits May 16, 2026 13:10

Surface ALB https_listener_arn outputs when use_ravion_managed_domain…

054f243

…s is on Without this gate the per-service auto-domain modules can't reference public_alb_https_listener_arn even though the HTTPS listener is created by the Ravion-managed-domains code path on the cluster ALB.

Wrap auto_scaling.enabled access in try() to handle null

6fc6f35

TF 1.10 doesn't short-circuit `var.X != null && var.X.attr` when X is null — it still evaluates the right side and errors on attribute access. Same pattern fix as the validation cleanup commits.

Wrap var.auto_scaling.scheduled access in try() for null safety

792e059

Drop stale .terraform.lock.hcl from ecs_{cluster,service}

6d2e1b8

Locked to ravion.com/ravion/domains 0.1.99, a local-dev throwaway version whose checksum no longer matches the published binary. TF regenerates a fresh lock on the next init under the ~> 0.1 constraint.

Revert "Collapse ecs_service Ravion wiring to one ravion_domain child"

1cbbab9

This reverts commit 32f6012.

siddhsuresh force-pushed the domains branch from 4ce7fa6 to 1cbbab9 Compare May 18, 2026 15:43

siddhsuresh and others added 8 commits May 18, 2026 21:32

Reapply "Collapse ecs_service Ravion wiring to one ravion_domain child"

1ee8126

This reverts commit 1cbbab9.

siddhsuresh marked this pull request as ready for review May 18, 2026 21:15

greptile-apps Bot reviewed May 18, 2026

View reviewed changes

Comment thread compute/ecs_cluster/versions.tf

Comment thread compute/ecs_cluster/module.yml Outdated

Comment thread compute/ecs_service/module.yml Outdated

Comment thread networking/alb/locals.tf

siddhsuresh and others added 3 commits May 19, 2026 16:39

siddhsuresh and others added 4 commits May 19, 2026 21:47

siddhsuresh closed this May 19, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Domains#1

Domains#1
siddhsuresh wants to merge 31 commits into
mainfrom
domains

siddhsuresh commented May 17, 2026 •

edited by greptile-apps Bot

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

siddhsuresh commented May 17, 2026 • edited by greptile-apps Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 2/5

Important Files Changed

Sequence Diagram

Comments Outside Diff (1)

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

siddhsuresh commented May 17, 2026 •

edited by greptile-apps Bot

Loading