AutoPIL operates as a runtime enforcement layer — every time an AI agent requests access to a data source, the request goes through AutoPIL before any data is returned. The agent's role, the source being requested, and the task being performed are evaluated against a policy. The decision (allow or deny) is written to a tamper-evident audit log. All of this happens in-line, before data touches the agent's context window.
In the SaaS deployment, AutoPIL runs on managed infrastructure and enforcement calls go over HTTPS to a hosted endpoint. For many enterprise teams — particularly in financial services, healthcare, and regulated industries — that is not acceptable. The enforcement engine must be in their account. The audit trail must be in their database. Policy files must be in their version control. And enforcement calls must never leave their network perimeter.
The self-hosted deployment on AWS satisfies all of that. This post covers the full architecture: how AutoPIL runs on ECS, how it connects to RDS Postgres for the audit trail, and — the part that requires the most non-obvious infrastructure work — how to make it reachable from Databricks Serverless clusters, which run in Databricks's own managed VPC with no direct path into yours.
The deployment architecture
The self-hosted AutoPIL stack has three components: the enforcement API running on ECS Fargate, a Postgres database on RDS for the audit log and policy store, and a PrivateLink-based connectivity layer for Databricks Serverless access.
The Databricks Serverless agent calls POST /v1/context/evaluate on AutoPIL before accessing any data source. AutoPIL evaluates the request against the governing policy, writes the audit event to RDS with a cryptographic chain hash, and returns an allow or deny decision. The enforcement call, the audit write, and the policy files never leave your AWS account.
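As a rough illustration of that call from notebook code, here is a minimal sketch using only the Python standard library. The hostname, the payload field names (agent_id, source, task), and the bearer-token header are assumptions made for this example; the real request schema is defined by the AutoPIL API reference.

```python
import json
import urllib.request

# Assumption: the private domain you configure for the endpoint.
AUTOPIL_URL = "https://autopil-api.internal/v1/context/evaluate"


def build_evaluate_request(api_key, agent_id, source, task):
    """Build (but do not send) the enforcement request for one data access."""
    # Hypothetical payload fields; check the AutoPIL API reference for the real schema.
    payload = {"agent_id": agent_id, "source": source, "task": task}
    return urllib.request.Request(
        AUTOPIL_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme for this sketch.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def evaluate(api_key, agent_id, source, task, timeout=2.0):
    """Send the request and return the parsed JSON decision."""
    req = build_evaluate_request(api_key, agent_id, source, task)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Splitting request construction from the network call keeps the payload easy to unit test without a live endpoint.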
Why Databricks Serverless needs special networking
The connectivity challenge is specific to Databricks Serverless. With Classic cluster mode, nodes deploy into your VPC: they have private IPs in your address space and can reach internal ALBs and other services via normal VPC routing. Serverless is different: the compute pool runs in a VPC that Databricks owns and manages. You have no visibility into its IP ranges, no control over its security groups, and no way to add VPC peering from that side.
The practical consequence: Serverless clusters cannot reach internal ALB hostnames (which resolve to 10.x.x.x IPs outside their network), and their internal DNS resolver does not serve Route 53 private hosted zones from your account. Several approaches that seem reasonable fail silently:
- Pointing the notebook at an internal ALB — the DNS resolves, but the IP is in your VPC's RFC-1918 space. The Serverless cluster has no route to it.
- Using a public endpoint with IP allowlisting — Serverless cluster egress IPs are not stable, so allowlisting is impractical. And most enterprise security teams won't approve a public enforcement API endpoint for regulated workloads.
- Falling back to Classic clusters — this works, but Serverless starts in seconds versus minutes, costs less per compute hour, and is the direction Databricks is investing in. Designing around it is the wrong long-term answer.
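The first failure mode above is easy to diagnose from a notebook cell. The sketch below resolves a hostname and reports whether the answers fall in RFC 1918 space; the function names are illustrative, not part of any Databricks or AutoPIL API. A private address returned from a Serverless cell suggests the hostname points straight at an internal load balancer that the cluster has no route to.

```python
import ipaddress
import socket

# The three RFC 1918 private ranges.
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(ip):
    """Return True if the address falls in one of the RFC 1918 ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in RFC1918)


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses for a hostname, or [] if resolution fails."""
    try:
        infos = socket.getaddrinfo(
            hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
        )
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def describe_target(hostname):
    """Map each resolved address to True (private) or False (public)."""
    return {ip: is_rfc1918(ip) for ip in resolve_ipv4(hostname)}
```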
AWS PrivateLink is the correct solution. It creates a private network path from Databricks's managed VPC into yours over the AWS backbone, with no public internet traversal. Databricks has first-class support for this through their Network Connectivity Configuration (NCC) feature — it's the intended architecture for Serverless-to-private-API connectivity.
Key architectural decisions
NLB in front of the existing ALB
PrivateLink VPC Endpoint Services only accept Network Load Balancers as the backend — not ALBs. Since AutoPIL's ECS service is already fronted by an internal ALB (handling health checks, path routing, and target group registration), the right pattern is to put an NLB in front of it rather than replace it.
AWS supports target_type = "alb" on NLB target groups for exactly this case. The NLB terminates TLS and forwards plain HTTP to the ALB, which handles all HTTP-layer semantics downstream. Replacing the ALB with an NLB would require porting all existing routing configuration and gives up HTTP routing capability — that's rarely worth it for a service that already has an ALB in place.
Access control on the endpoint service
The VPC Endpoint Service's allowed_principals list is your perimeter gate. Set it to the specific AWS account ID of the Databricks Serverless network plane for your workspace — not a wildcard. With acceptance_required = false and an open principal list, any AWS entity that discovers your endpoint service name can connect to AutoPIL. That is not the security posture you want for an enforcement API.
Databricks does not publish the Serverless AWS account ID — contact Databricks support to obtain it for your workspace before locking down allowed_principals.
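A pre-apply check can catch an accidentally open principal list before it reaches production. The following is a small sketch, assuming you only ever want account-root ARNs in allowed_principals; the function name and the rule set are illustrative, and you would feed it the values from your own Terraform variables or plan output.

```python
import re

# Matches an account-root principal such as arn:aws:iam::123456789012:root
ACCOUNT_ROOT_ARN = re.compile(r"^arn:aws:iam::\d{12}:root$")


def validate_allowed_principals(principals):
    """Return a list of problems; an empty list means the principals look locked down."""
    problems = []
    if not principals:
        problems.append("allowed_principals is empty: no consumer can connect")
    for principal in principals:
        if "*" in principal:
            problems.append(f"wildcard principal: {principal}")
        elif not ACCOUNT_ROOT_ARN.match(principal):
            problems.append(f"not an account-root ARN: {principal}")
    return problems
```

Running this in CI against the rendered variables turns the "not a wildcard" rule into something that fails a build rather than a review comment.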
The Terraform
resource "aws_lb" "nlb" {
  name                             = "autopil-nlb"
  internal                         = true
  load_balancer_type               = "network"
  subnets                          = var.subnet_ids
  enable_cross_zone_load_balancing = true
}

resource "aws_lb_target_group" "nlb_to_alb" {
  name        = "autopil-nlb-tg"
  port        = 80
  protocol    = "TCP"
  vpc_id      = var.vpc_id
  target_type = "alb" # forward to ALB, not ECS directly

  health_check {
    path     = "/health"
    protocol = "HTTP"
  }
}

resource "aws_lb_target_group_attachment" "nlb_to_alb" {
  target_group_arn = aws_lb_target_group.nlb_to_alb.arn
  target_id        = aws_lb.autopil_alb.arn
  port             = 80
}

resource "aws_lb_listener" "nlb_443" {
  load_balancer_arn = aws_lb.nlb.arn
  port              = 443
  protocol          = "TLS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = aws_acm_certificate.autopil.arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.nlb_to_alb.arn
  }
}
resource "aws_vpc_endpoint_service" "autopil" {
  acceptance_required        = false
  network_load_balancer_arns = [aws_lb.nlb.arn]

  # Databricks Serverless AWS account ID for your workspace
  # Obtain from Databricks support; varies by workspace and region
  allowed_principals = ["arn:aws:iam::<DATABRICKS_SERVERLESS_ACCOUNT_ID>:root"]
}

output "endpoint_service_name" {
  value = aws_vpc_endpoint_service.autopil.service_name
  # com.amazonaws.vpce.us-east-1.vpce-svc-0xxxxx
  # This goes into the Databricks NCC private endpoint rule
}
Configuring the Databricks NCC
After terraform apply you have an endpoint service name. The Databricks side is configured in the Account Console — account admin access required, not workspace admin.
- Go to Cloud Resources → Network Connectivity Configurations and create a new NCC in the same region
- Add a private endpoint rule: paste the endpoint service name, set a private domain (e.g. autopil-api.internal)
- Go to Workspaces → your workspace → Network and attach the NCC
Databricks provisions the VPC endpoint and the status moves from PENDING to ACTIVE within a few minutes. From that point, every Serverless cluster in the workspace resolves autopil-api.internal to the PrivateLink endpoint and can reach AutoPIL at https://autopil-api.internal.
The NCC applies at the workspace level — all Serverless clusters in that workspace get the private endpoint. Network-layer access is workspace-scoped; per-agent access control is handled by AutoPIL's policy engine using the API key and agent identity on each enforcement call.
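Once the rule shows ACTIVE, a quick TCP probe from a Serverless notebook cell confirms the path works before any agent depends on it. This is a minimal sketch using the standard library; the function name is illustrative, and the hostname in the usage line assumes the private domain from the example above.

```python
import socket


def can_connect(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

From a notebook you would call can_connect("autopil-api.internal"). A False result right after attaching the NCC usually points at DNS or endpoint status rather than at AutoPIL itself, since the probe never reaches the HTTP layer.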
Deploying updates to ECS
When deploying a new AutoPIL image version, calling aws ecs update-service --force-new-deployment without first registering a new task definition revision will redeploy the old image. ECS uses whatever image URI is in the current task definition — the new tag in ECR is irrelevant until a new revision references it explicitly. Always register the task definition first:
# Push new AutoPIL image to ECR
docker push "${ECR_REPO}:${VERSION}"

# Fetch the current task definition as JSON
CURRENT=$(aws ecs describe-task-definition \
  --task-definition autopil --query 'taskDefinition' --output json)

# Swap in the new image and drop the read-only fields that
# register-task-definition rejects. NEW_IMAGE is placed before python3
# so it is set in the script's environment rather than passed as an argument.
NEW=$(echo "$CURRENT" | NEW_IMAGE="${ECR_REPO}:${VERSION}" python3 -c "
import json, sys, os
td = json.load(sys.stdin)
td['containerDefinitions'][0]['image'] = os.environ['NEW_IMAGE']
for k in ['taskDefinitionArn','revision','status','requiresAttributes',
          'compatibilities','registeredAt','registeredBy']:
    td.pop(k, None)
print(json.dumps(td))
")

# Register the new revision and capture its number
NEW_REV=$(aws ecs register-task-definition \
  --cli-input-json "$NEW" \
  --query 'taskDefinition.revision' --output text)

# Point the service at the new revision
aws ecs update-service \
  --cluster autopil \
  --service autopil \
  --task-definition "autopil:${NEW_REV}" \
  --force-new-deployment
Key environment variables for the ECS task
AutoPIL reads its configuration from environment variables set on the ECS task definition. The critical ones for a self-hosted enterprise deployment:
DATABASE_URL = postgresql://user:pass@your-rds-host:5432/autopil
AUTOPIL_POLICY = /policies        # path to your mounted policy YAML directory
AUTOPIL_SECRET_KEY = <signing key for API key generation>
AUTOPIL_RESET_ADMIN_KEY = 1       # set on first deploy to bootstrap superadmin key; remove after
AUTOPIL_REQUIRE_AGENT_ID = 1      # deny calls without a registered agent_id (recommended for prod)
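A small pre-deploy check can catch the two easiest mistakes with these variables: a missing value, and a bootstrap flag left on after first deploy. The sketch below is not part of AutoPIL. It assumes the first three variables are mandatory and that the flags are enabled with the literal value "1", as in the listing above.

```python
# Assumption: these three are required for the service to start.
REQUIRED = ("DATABASE_URL", "AUTOPIL_POLICY", "AUTOPIL_SECRET_KEY")


def check_task_environment(env):
    """Return (errors, warnings) for a dict of ECS task environment variables."""
    errors = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    warnings = []
    if env.get("AUTOPIL_RESET_ADMIN_KEY") == "1":
        warnings.append("AUTOPIL_RESET_ADMIN_KEY is still set; remove it after first deploy")
    if env.get("AUTOPIL_REQUIRE_AGENT_ID") != "1":
        warnings.append("AUTOPIL_REQUIRE_AGENT_ID is off; calls without agent_id are not denied")
    return errors, warnings
```

In a deploy pipeline you might build the dict from the environment list in the task definition JSON and fail the run when errors is non-empty.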
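Policy YAML files are typically either baked into the container image at build time or loaded from S3 into a task volume at startup. Fargate tasks do not support Docker volume drivers, so on Fargate the S3 route generally means syncing the files into a bind-mount or EFS volume rather than mounting the bucket directly. For enterprises with an existing policy-as-code workflow, sourcing policies from S3 or an internal artifact store lets you update them without rebuilding the container image.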
Observability
Three monitoring points matter most for a production AutoPIL deployment behind PrivateLink:
NLB HealthyHostCount: alarm when this drops to zero. It means the ALB is unreachable from the NLB and all enforcement calls from Databricks are failing. With no healthy hosts, your agents are either all denied or bypassing enforcement, depending on your fail-open/fail-closed configuration.
AutoPIL's own deny rate metrics: the dashboard's Overview tab shows policy decision distribution across tenants in real time. A sudden spike in deny rate on a specific agent or source can indicate a misconfigured policy, a new data source that hasn't been registered, or an actual policy violation. This is where AutoPIL's alerting engine adds value, because denial spikes trigger alerts before they become incidents.
VPC Flow Logs on NLB subnets: source IPs for PrivateLink traffic appear as NLB ENI IPs, not the originating Databricks cluster IPs (those are NATed at the PrivateLink boundary). This is worth documenting for your security team: you cannot trace individual Serverless connections from flow logs alone. AutoPIL's audit log is the authoritative per-call record. It captures agent ID, session ID, source, task, decision, and the cryptographic chain hash for every evaluation.
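For the first of these monitoring points, the alarm definition can be sketched as the keyword arguments to CloudWatch's put_metric_alarm call. The alarm name, period, and evaluation window below are illustrative choices, not recommendations from AutoPIL; the dimension values are the load balancer and target group ARN suffixes (Terraform exposes them as arn_suffix).

```python
def healthy_host_alarm(nlb_arn_suffix, target_group_arn_suffix, sns_topic_arn):
    """Build put_metric_alarm kwargs that fire when the NLB has no healthy targets."""
    return {
        "AlarmName": "autopil-nlb-no-healthy-hosts",  # illustrative name
        "Namespace": "AWS/NetworkELB",
        "MetricName": "HealthyHostCount",
        "Dimensions": [
            {"Name": "LoadBalancer", "Value": nlb_arn_suffix},
            {"Name": "TargetGroup", "Value": target_group_arn_suffix},
        ],
        "Statistic": "Minimum",
        "Period": 60,
        "EvaluationPeriods": 2,
        "Threshold": 1,
        "ComparisonOperator": "LessThanThreshold",
        # Missing datapoints are treated as a failure, not as healthy.
        "TreatMissingData": "breaching",
        "AlarmActions": [sns_topic_arn],
    }
```

With boto3 installed, the result would be passed as boto3.client("cloudwatch").put_metric_alarm(**healthy_host_alarm(...)). Treating missing data as breaching is a deliberate choice here: if the metric stops reporting, that is handled as an outage.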
What the enforcement model looks like end-to-end
Once the stack is deployed and the NCC is attached, enforcement is transparent to the Databricks notebook code. Each agent tool call hits POST /v1/context/evaluate before accessing any data source. AutoPIL evaluates the request against the agent's governing policy, based on the agent's registered role and the source/task combination, and returns a decision with minimal added latency: the policy evaluation itself typically takes low single-digit milliseconds.
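One way to wire this into agent tool code is a guard that runs the evaluation first and fails closed if AutoPIL is unreachable. The sketch below assumes the evaluation callable returns a dict with a "decision" key set to "allow" or "deny"; that response shape and the function names are assumptions for illustration, not the documented AutoPIL client.

```python
def guarded_read(evaluate_fn, read_fn, agent_id, source, task):
    """Run read_fn only if evaluate_fn returns an allow decision; deny on any error."""
    try:
        decision = evaluate_fn(agent_id=agent_id, source=source, task=task)
        # Assumed response shape: {"decision": "allow"} or {"decision": "deny"}.
        allowed = decision.get("decision") == "allow"
    except Exception:
        # Fail closed: an unreachable or misbehaving enforcement API means no data.
        allowed = False
    if not allowed:
        raise PermissionError(f"access to {source} denied for agent {agent_id}")
    return read_fn()
```

Whether to fail open or closed is a policy decision for your team. This sketch fails closed because that is the conservative default for regulated workloads.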
The audit trail in RDS captures every evaluation with a cryptographic chain hash linking each event to the previous one. Tampering with any event in the sequence breaks the chain and is detectable via the GET /v1/audit/verify endpoint. For regulated industries where audit trail integrity is a compliance requirement — not just a nice-to-have — this is the property that makes the audit log defensible to an examiner.
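To make the chain property concrete, here is a generic hash-chain sketch. It is not AutoPIL's actual record format or hashing scheme: the SHA-256 choice, the all-zero genesis value, and the event fields are assumptions chosen only to show why editing one event invalidates everything from that event onward.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first event


def event_hash(prev_hash, event):
    """Hash the previous hash together with a canonical JSON form of the event."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(log, event):
    """Append an event, linking it to the hash of the previous record."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "hash": event_hash(prev, event)})


def first_broken_index(log):
    """Return the index of the first record whose hash does not verify, else None."""
    prev = GENESIS
    for i, record in enumerate(log):
        if record["hash"] != event_hash(prev, record["event"]):
            return i
        prev = record["hash"]
    return None
```

Because each hash covers the previous one, changing a stored decision from deny to allow makes verification fail at that record unless every later hash is also recomputed.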
This is the same deployment pattern we use for the AutoPIL Databricks reference implementation. If you're evaluating a self-hosted deployment, start with the SaaS trial to validate your policies and data taxonomy, then reach out to discuss bringing enforcement inside your perimeter.