Welcome to AWS DNS Services
Welcome to our comprehensive deep dive into AWS DNS services. Today we're going to explore the complete ecosystem of DNS solutions that AWS offers, and trust me, it's much more extensive than you might initially think.
We'll start with a high-level overview of the entire DNS landscape in AWS, then drill down into each service with practical examples and real-world use cases. By the end of this presentation, you'll understand not just what each service does, but when and how to use them together.
This isn't just theory - we'll cover actual command-line examples, configuration patterns, and the gotchas I've learned from implementing these services in production environments.
The Complete DNS Ecosystem
Let's start by understanding the big picture. AWS doesn't just have one DNS service - they have four major components that work together to provide a complete DNS solution.
Route 53 is your primary authoritative DNS service - think of it as your main DNS control center. Cloud Map handles service discovery for microservices. CloudFront provides edge DNS resolution for global content delivery. And App Mesh offers DNS integration for service mesh architectures.
What's really powerful is how these services integrate. For example, your microservices might use Cloud Map for internal discovery, while Route 53 handles external traffic routing, and CloudFront accelerates global delivery.
Notice the traffic flow - external users hit Route 53 first, which can route to CloudFront for cached content, or directly to your infrastructure. Meanwhile, your internal services use Cloud Map and App Mesh for service-to-service communication.
Route 53 Deep Architecture
Now let's dive deep into Route 53's architecture. This is where the real magic happens for DNS routing and traffic management.
The diagram shows the complete DNS resolution flow - from when a user types your domain name, through the recursive DNS lookup process, all the way to Route 53 returning the appropriate IP address based on your routing policies.
What makes Route 53 powerful isn't just basic DNS - it's the sophisticated routing policies. You can distribute traffic by weight for blue-green deployments, route by latency for performance, implement failover for high availability, or route by geography for compliance.
The health checks are crucial here - they're continuously monitoring your endpoints from multiple AWS regions, feeding that health data into the routing decisions. This isn't just passive DNS; it's intelligent traffic management.
Route 53 Setup Command Flow
Let me walk you through the exact sequence for setting up Route 53. Order matters here - you can't just jump to advanced routing policies without laying the foundation properly.
Phase 1 is all about establishing DNS authority. You create the hosted zone, then update your domain registrar with the name servers Route 53 gives you. This is where many people get stuck - DNS propagation can take up to 48 hours.
Phase 2 sets up health monitoring before you create DNS records that depend on it. This is critical - always create your health checks first, then reference them in your routing policies.
Phase 3 builds up your DNS records from simple to complex. Start with basic A records, then add weighted routing, latency-based routing, and failover as needed.
The advanced features in phase 4 - Traffic Flow policies and query logging - are where Route 53 really shines for enterprise use cases.
Creating Your First Hosted Zone
Let's start with the foundation - creating a hosted zone. This is your DNS control center for a domain. The hosted zone contains all the DNS records for your domain.
The caller-reference parameter is crucial - it prevents duplicate creation. I always use a timestamp to ensure uniqueness. The comment helps with documentation, especially in larger organizations.
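The same pattern can be sketched in Python. This dict matches the shape boto3's `create_hosted_zone` expects; the domain and comment are placeholders:

```python
import time

def hosted_zone_params(domain: str, comment: str, private: bool = False) -> dict:
    """Build create-hosted-zone parameters with a timestamp-based
    CallerReference so repeated runs don't collide on the same reference."""
    return {
        "Name": domain,
        "CallerReference": f"hz-{int(time.time())}",  # timestamp keeps references unique across runs
        "HostedZoneConfig": {"Comment": comment, "PrivateZone": private},
    }

params = hosted_zone_params("example.com", "Production zone for example.com")
```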
When you create a hosted zone, Route 53 gives you four name servers. You must configure these at your domain registrar - this is how the global DNS system knows that Route 53 is authoritative for your domain.
Pro tip: For private hosted zones, set PrivateZone to true and specify the VPC. This creates DNS that only works within your VPC - perfect for internal service names.
Health Checks - The Foundation of Reliability
Health checks are absolutely critical for production DNS. They're continuously monitoring your endpoints from multiple AWS regions, providing the health data that drives intelligent routing decisions.
I always create health checks before creating the DNS records that reference them. The configuration shown here monitors an HTTP endpoint every 30 seconds, marking it unhealthy after 3 consecutive failures.
The MeasureLatency option is really valuable - it feeds latency data to CloudWatch, helping you understand performance patterns. The Regions parameter controls which AWS regions perform the health checks - I recommend using at least 3 regions to avoid false positives.
String matching health checks can verify that your application is returning expected content, not just responding with an HTTP 200. This catches partial outages that basic health checks might miss.
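A sketch of what such a config looks like - the Type switches to HTTP_STR_MATCH and a SearchString is added. The field names follow Route 53's HealthCheckConfig shape; the endpoint and expected string are invented for illustration:

```python
# String-matching health check config; endpoint and search string are illustrative.
string_match_check = {
    "Type": "HTTP_STR_MATCH",          # or HTTPS_STR_MATCH for TLS endpoints
    "ResourcePath": "/health",
    "FullyQualifiedDomainName": "api.example.com",
    "Port": 80,
    "SearchString": '"status": "ok"',  # must appear in the response body
    "RequestInterval": 30,
    "FailureThreshold": 3,
}
```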
Simple DNS Records
Let's start with the basics - simple DNS records. These are your standard A records that map a domain name to an IP address. Every DNS setup starts here.
The Action parameter can be CREATE, DELETE, or UPSERT. I usually use UPSERT because it creates the record if it doesn't exist or updates it if it does - very handy for automation.
TTL is important here - 300 seconds (5 minutes) is a good default for most applications. Lower values mean faster updates but more DNS queries and cost. Higher values mean better caching but slower updates.
You can have multiple ResourceRecords in the array for the same name - Route 53 returns all of them in each response, and most clients simply use the first answer. This provides basic redundancy without routing policies.
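For automation, a small helper can build the UPSERT change batch - boto3's `change_resource_record_sets` accepts this same structure. The record name and IPs are placeholders:

```python
import json

def upsert_a_record(name: str, ips: list, ttl: int = 300) -> dict:
    """Build a change batch that UPSERTs one A record. Passing several IPs
    yields one record set whose multiple answers give basic redundancy."""
    return {
        "Changes": [{
            "Action": "UPSERT",  # create if missing, update if present
            "ResourceRecordSet": {
                "Name": name,
                "Type": "A",
                "TTL": ttl,
                "ResourceRecords": [{"Value": ip} for ip in ips],
            },
        }]
    }

batch = upsert_a_record("www.example.com", ["192.0.2.1", "192.0.2.2"])
print(json.dumps(batch, indent=2))
```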
Weighted Routing for Traffic Distribution
Now we're getting into the more advanced features. Weighted routing lets you distribute traffic across multiple endpoints based on assigned weights. This is perfect for blue-green deployments or canary releases.
The SetIdentifier is required and must be unique across all weighted records for the same name. I like to include the purpose in the identifier - like "primary-us-east-1" - for clarity.
With weights of 80 and 20, you're sending 80% of traffic to the primary endpoint and 20% to the secondary. Route 53 automatically removes unhealthy endpoints from the rotation based on the health checks.
Lower TTL values are important here - 60 seconds means clients will pick up routing changes quickly. This is crucial for deployments where you want to gradually shift traffic.
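To build intuition for the 80/20 split, it can be simulated; this sketch uses Python's `random.choices` as a stand-in for Route 53's weighted selection (probability proportional to Weight / sum of Weights):

```python
import random

def weighted_pick(endpoints: dict, rng: random.Random) -> str:
    """Choose one endpoint with probability Weight / sum(Weights), the way
    weighted routing distributes answers across record sets."""
    names = list(endpoints)
    return rng.choices(names, weights=[endpoints[n] for n in names], k=1)[0]

rng = random.Random(42)  # seeded so the demo is repeatable
picks = [weighted_pick({"192.0.2.1": 80, "192.0.2.2": 20}, rng)
         for _ in range(10_000)]
share = picks.count("192.0.2.1") / len(picks)
print(f"primary share: {share:.2f}")  # lands close to 0.80
```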
Latency-Based Routing for Performance
Latency-based routing is one of my favorite Route 53 features. It automatically routes users to the AWS region with the lowest latency, based on real network measurements that Route 53 continuously collects.
The Region parameter tells Route 53 where your resource is located. Route 53 maintains latency data between different user locations and AWS regions, automatically routing to the fastest option.
This is incredibly powerful for global applications. A user in Singapore might get routed to ap-southeast-1, while a user in Germany gets routed to eu-west-1, all automatically based on current network conditions.
Remember to always include health checks - you don't want users routed to the "fastest" region if it's actually down. Route 53 will automatically fall back to the next-fastest healthy region.
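As a toy model of that behavior - pick the lowest-latency region, but only among healthy ones (the latency figures here are invented):

```python
def lowest_latency_region(latencies_ms: dict, healthy: set) -> str:
    """Pick the healthy region with the lowest measured latency, mirroring
    latency-based routing with health checks attached."""
    candidates = {r: ms for r, ms in latencies_ms.items() if r in healthy}
    return min(candidates, key=candidates.get)

latency = {"us-east-1": 180.0, "eu-west-1": 25.0, "ap-southeast-1": 210.0}
print(lowest_latency_region(latency, {"us-east-1", "eu-west-1"}))       # eu-west-1
print(lowest_latency_region(latency, {"us-east-1", "ap-southeast-1"}))  # us-east-1 (next-fastest healthy)
```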
Failover Routing for High Availability
Failover routing provides active-passive disaster recovery. All traffic normally goes to your primary endpoint, but Route 53 automatically switches to secondary if the primary fails its health checks.
The health check is required for the PRIMARY record - this is how Route 53 knows when to failover. The SECONDARY record can have a health check too, which I recommend for production systems.
The 60-second TTL is crucial for failover scenarios - you want clients to pick up the failover quickly. Some organizations use even lower values like 30 seconds for critical applications.
Pro tip: Test your failover regularly. Simulate failures of your primary endpoint to ensure the secondary can handle the full load and that the failover happens within your RTO requirements.
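The TTL and health check settings combine into a rough worst-case failover window, which is useful when checking against your RTO. This is back-of-the-envelope arithmetic, not an exact model:

```python
def worst_case_failover_seconds(request_interval: int,
                                failure_threshold: int,
                                ttl: int) -> int:
    """Rough upper bound on client-visible failover time: detection takes
    up to failure_threshold checks one interval apart, then cached answers
    live for up to one TTL. Ignores checker quorum and resolver behavior."""
    return request_interval * failure_threshold + ttl

print(worst_case_failover_seconds(30, 3, 60))  # 150 seconds with the settings above
print(worst_case_failover_seconds(10, 3, 30))  # 60 seconds with faster checks and a lower TTL
```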
Traffic Flow for Complex Routing
Traffic Flow is Route 53's visual policy engine for complex routing scenarios. When you need to combine multiple routing policies - like geolocation with weighted routing - Traffic Flow is your solution.
The policy shown here implements a sophisticated routing strategy: US users get distributed between two US regions with 80/20 weighting, UK users go to Europe, and everyone else gets a default weighted distribution.
The Rules section defines the decision tree - starting with geolocation, then applying weighted routing within each region. The Endpoints section defines the final destinations.
What's really powerful is the visual editor in the AWS console - you can see the entire decision tree graphically, making it easy to understand and modify complex routing logic.
Query Logging and Monitoring
Query logging is essential for troubleshooting DNS issues and understanding traffic patterns. Route 53 can log every DNS query to CloudWatch Logs, giving you detailed visibility into DNS resolution.
You must create the CloudWatch log group first, then enable query logging. The logs include timestamp, client IP, query name, query type, response code, and resolver location - everything you need for troubleshooting.
Be aware of costs - DNS logging can generate significant log volume for high-traffic domains. Set appropriate retention policies on your log groups to manage costs.
The query logs are incredibly valuable for security monitoring too - you can detect DNS tunneling attempts, DGA (domain generation algorithm) malware, and other DNS-based attacks.
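Route 53's public query logs are space-separated lines; this parser assumes the documented field order, and the sample line itself is invented:

```python
# Field order follows Route 53's documented public query log format.
FIELDS = ["version", "timestamp", "hosted_zone_id", "query_name",
          "query_type", "response_code", "protocol", "edge_location",
          "resolver_ip", "edns_client_subnet"]

def parse_query_log(line: str) -> dict:
    """Split one space-separated query log line into named fields."""
    return dict(zip(FIELDS, line.split()))

sample = ("1.0 2024-01-01T12:00:00Z Z1234567890ABC www.example.com "
          "A NOERROR UDP IAD12 198.51.100.7 198.51.100.0/24")
rec = parse_query_log(sample)
print(rec["query_name"], rec["response_code"])  # www.example.com NOERROR
```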
Cloud Map Service Discovery
Now let's move to Cloud Map, AWS's service discovery solution. While Route 53 handles external DNS, Cloud Map is designed for internal service-to-service communication in microservices architectures.
The diagram shows the complete service discovery flow - applications register themselves with Cloud Map, which maintains a registry of running service instances with their IP addresses and ports.
What makes Cloud Map powerful is the integration with health checks. Only healthy instances are discoverable, so your services automatically route around failures. The health checks can be HTTP/HTTPS endpoints, TCP ports, or custom health checks.
Cloud Map supports both DNS-based discovery (traditional DNS queries) and API-based discovery (programmatic queries). It also integrates automatically with ECS and EKS for hands-off service registration.
Creating Cloud Map Namespace
Let's walk through setting up Cloud Map. It starts with creating a namespace - think of this as a DNS domain for your services. I usually use ".local" for private namespaces.
Private DNS namespaces only work within the specified VPC, which is perfect for internal microservices. The services get DNS names like "web-service.my-company.local" that resolve to the current instance IPs.
For services that need to be discoverable from the internet, use create-public-dns-namespace instead. HTTP namespaces are for API-only discovery without DNS resolution.
The namespace becomes the foundation for all your services - you'll create multiple services within each namespace, each with their own instances.
Service Registry Creation
Once you have a namespace, you create services within it. Each service gets its own DNS name and can contain multiple instances.
The DnsConfig defines what type of DNS records get created - A records for IPv4, AAAA for IPv6, SRV records for port information. The TTL determines how long clients cache the DNS responses.
HealthCheckCustomConfig enables Cloud Map's health checking. The FailureThreshold determines how many consecutive health check failures mark an instance as unhealthy.
You can also use Route 53 health checks instead of custom health checks if you need more sophisticated monitoring like HTTP string matching or calculated health checks.
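A toy model of that consecutive-failure behavior - not Cloud Map itself, just the threshold logic described above:

```python
class HealthTracker:
    """Toy model of a consecutive-failure threshold: unhealthy only after
    N failures in a row, and a single success resets the count."""
    def __init__(self, failure_threshold: int = 3):
        self.threshold = failure_threshold
        self.consecutive_failures = 0

    def report(self, ok: bool) -> bool:
        """Record one probe result; return current health status."""
        self.consecutive_failures = 0 if ok else self.consecutive_failures + 1
        return self.consecutive_failures < self.threshold

t = HealthTracker(failure_threshold=3)
statuses = [t.report(ok) for ok in (False, False, True, False, False, False)]
print(statuses)  # the success in the middle resets the streak; only the last report flips unhealthy
```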
Registering Service Instances
This is where the rubber meets the road - registering actual running instances with Cloud Map. Each instance needs a unique ID and its network information.
AWS_INSTANCE_IPV4 and AWS_INSTANCE_PORT are the core attributes that determine where traffic gets routed. You can add custom attributes for additional metadata like version, environment, or capability flags.
In production, you typically don't register instances manually - ECS and EKS can do this automatically. But understanding the manual process helps with troubleshooting and custom integrations.
Custom attributes are really powerful for advanced routing. For example, you could register instances with a "version" attribute, then query for specific versions during deployment testing.
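A sketch of that version-based selection done client-side. Cloud Map's `DiscoverInstances` API can also filter server-side via its QueryParameters option; the registry data here is invented:

```python
def instances_with(instances: list, **attrs: str) -> list:
    """Return IDs of instances whose attributes match every given
    key/value pair - e.g. only version=2.0 instances during a canary."""
    return [i["Id"] for i in instances
            if all(i["Attributes"].get(k) == v for k, v in attrs.items())]

registry = [
    {"Id": "i-1", "Attributes": {"AWS_INSTANCE_IPV4": "10.0.1.100", "version": "1.0"}},
    {"Id": "i-2", "Attributes": {"AWS_INSTANCE_IPV4": "10.0.1.101", "version": "2.0"}},
]
print(instances_with(registry, version="2.0"))  # ['i-2']
```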
CloudFront DNS Integration
CloudFront adds another layer to your DNS architecture - global edge DNS resolution. When users query your CloudFront-enabled domain, they get automatically routed to the nearest edge location.
The beauty of CloudFront's DNS integration is that it's completely transparent. Users query Route 53 for your custom domain, get back an alias or CNAME pointing at the CloudFront distribution's domain name, and CloudFront's edge DNS then routes them to the optimal edge location.
Edge location selection considers not just geographic distance, but actual network performance and edge location load. CloudFront continuously measures performance and adjusts routing automatically.
Different content types get different caching behaviors - static assets might cache for 24 hours, while API responses might have very short TTLs or no caching at all.
CloudFront Distribution Setup
Creating a CloudFront distribution requires careful configuration of origins, cache behaviors, and DNS aliases. This example shows a multi-origin setup with both S3 and ALB origins.
The Aliases section is crucial - this is where you specify your custom domain names. You'll need an SSL certificate in ACM for HTTPS support.
Origins define your backend servers. S3 origins use Origin Access Identity for security, while custom origins (like ALBs) use HTTPS-only for encryption in transit.
Cache policies control how long content is cached at edge locations. Use managed policies for common patterns, or create custom policies for specific requirements.
Route 53 CloudFront Integration
The final step is connecting Route 53 to your CloudFront distribution using an ALIAS record. This is much better than a CNAME record because ALIAS records are free and can be used at the root domain.
The HostedZoneId for CloudFront is always Z2FDTNDATAQYW2 - this is CloudFront's global hosted zone ID. EvaluateTargetHealth is typically false for CloudFront because CloudFront handles its own failover.
ALIAS records automatically resolve to the IP addresses of the CloudFront edge locations, so there's no additional DNS lookup cost or latency. This makes CloudFront integration very efficient.
Remember to configure your SSL certificate in CloudFront and set up appropriate cache behaviors before creating the Route 53 ALIAS record.
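Putting those pieces together, the change batch for the alias looks like this. The domain names are placeholders; note that an alias record carries no TTL of its own:

```python
# Sketch of the Route 53 change batch for the CloudFront alias.
# Z2FDTNDATAQYW2 is CloudFront's fixed hosted zone ID.
alias_change = {
    "Changes": [{
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "www.example.com",
            "Type": "A",
            "AliasTarget": {
                "HostedZoneId": "Z2FDTNDATAQYW2",       # always this value for CloudFront
                "DNSName": "d1234abcd.cloudfront.net",  # your distribution's domain
                "EvaluateTargetHealth": False,          # CloudFront handles its own failover
            },
        },
    }]
}
```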
App Mesh DNS Architecture
App Mesh adds service mesh capabilities to your DNS strategy. It provides sophisticated service-to-service communication with features like circuit breaking, retries, and automatic mTLS.
The key insight here is that App Mesh integrates with Cloud Map for service discovery, but adds an Envoy proxy sidecar that handles all the network communication. Your application code doesn't change - it just makes normal HTTP calls.
Virtual Nodes represent your actual compute resources, while Virtual Services provide stable DNS names. The Envoy proxies automatically handle load balancing, health checking, and traffic management based on App Mesh policies.
All communication flows through the Envoy sidecars, which enables features like automatic retry, circuit breaking, and detailed observability without any code changes.
App Mesh Setup Process
Setting up App Mesh follows a specific sequence. You start by creating the mesh itself, which is the container for all your App Mesh resources.
The egress filter controls whether services can communicate with endpoints outside the mesh. ALLOW_ALL permits external calls, while DROP_ALL blocks them unless explicitly configured.
Virtual nodes represent your actual services and include listener configuration, health checks, and service discovery integration. The backends section controls which services each node can communicate with.
Virtual services provide stable DNS names that abstract the underlying virtual nodes. This enables things like traffic shifting during deployments without changing client code.
Virtual Node Configuration
Virtual nodes are where you define the characteristics of your services. This includes what ports they listen on, how health checking works, and which services they can call.
The listeners section defines the network interface - port, protocol, and health check configuration. HTTP health checks should point to your application's health endpoint.
Service discovery integration with Cloud Map enables automatic endpoint discovery. As ECS tasks or EKS pods start and stop, Cloud Map keeps the service registry updated.
The backends section implements least-privilege networking - services can only communicate with explicitly allowed backends. This provides security and helps prevent cascading failures.
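A minimal model of that allow-list behavior (the topology and service names are invented):

```python
def may_call(mesh_backends: dict, caller: str, target: str) -> bool:
    """Least-privilege check in the spirit of App Mesh backends:
    a virtual node may only reach services listed as its backends."""
    return target in mesh_backends.get(caller, set())

backends = {
    "web-node": {"api.my-company.local"},
    "api-node": {"db.my-company.local"},
}
print(may_call(backends, "web-node", "api.my-company.local"))  # True
print(may_call(backends, "web-node", "db.my-company.local"))   # False - not an allowed backend
```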
ECS App Mesh Integration
Integrating App Mesh with ECS requires specific task definition configuration. The key is the proxyConfiguration section, which sets up traffic interception.
The Envoy sidecar runs with UID 1337 and uses iptables rules to intercept all network traffic. AppPorts specifies which ports your application listens on.
EgressIgnoredIPs is crucial - it excludes AWS metadata endpoints that ECS tasks need to access directly. The APPMESH_VIRTUAL_NODE_NAME environment variable links the task to its mesh configuration.
The dependsOn configuration ensures Envoy starts and becomes healthy before your application container starts. This prevents race conditions during task startup.
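A sketch of the relevant task definition fragment, expressed as a Python dict. Container names, ports, and the mesh/virtual node names are placeholders:

```python
# Fragment of an ECS task definition showing the Envoy ordering and
# traffic-interception settings discussed above.
task_fragment = {
    "proxyConfiguration": {
        "type": "APPMESH",
        "containerName": "envoy",
        "properties": [
            {"name": "IgnoredUID", "value": "1337"},        # Envoy's own traffic skips the iptables redirect
            {"name": "AppPorts", "value": "8080"},          # ports the application listens on
            {"name": "ProxyIngressPort", "value": "15000"},
            {"name": "ProxyEgressPort", "value": "15001"},
            {"name": "EgressIgnoredIPs", "value": "169.254.170.2,169.254.169.254"},  # metadata endpoints
        ],
    },
    "containerDefinitions": [
        {
            "name": "app",
            # App starts only after Envoy reports healthy - avoids startup races
            "dependsOn": [{"containerName": "envoy", "condition": "HEALTHY"}],
        },
        {
            "name": "envoy",
            "environment": [
                {"name": "APPMESH_VIRTUAL_NODE_NAME",
                 "value": "mesh/my-mesh/virtualNode/web-node"},
            ],
        },
    ],
}
```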
Complete DNS Setup Flow
Let's wrap up with the complete setup flow that ties everything together. This is your roadmap for implementing a comprehensive DNS strategy in AWS.
The key is following the dependencies - you need foundation services before advanced features. Health checks before routing policies. Infrastructure before DNS records that point to it.
Each phase builds on the previous one. You start with basic DNS authority, add health monitoring, create basic records, then layer on advanced routing, service discovery, CDN, and service mesh capabilities.
The verification commands at each step are crucial - always test that one phase works before moving to the next. DNS issues can be difficult to troubleshoot, so validation at each step saves time later.
Verification and Testing
Testing and verification are absolutely critical for DNS implementations. The commands shown here cover the essential verification steps for each AWS DNS service.
dig and nslookup are your primary tools for DNS testing. Test from multiple locations and DNS servers to verify propagation. Don't forget to test different record types - A, AAAA, CNAME, MX.
Health check testing ensures your failover and routing policies work correctly. Use get-health-check-status to see the current status from all checking regions.
For Cloud Map, test both DNS and API-based discovery. For CloudFront, verify both the distribution status and the actual content delivery. For App Mesh, check the virtual node status and Envoy proxy health.
Query logs are invaluable for troubleshooting - they show exactly what queries are being made and how Route 53 is responding.
Key Takeaways and Best Practices
Let me share the key insights and best practices I've learned from implementing these DNS services in production environments.
First, always start simple and build complexity gradually. Begin with basic Route 53 records, then add routing policies as needed. Don't try to implement everything at once.
Health checks are absolutely critical - invest time in getting them right. Use multiple regions, appropriate thresholds, and test your health check endpoints regularly.
For microservices, the combination of Cloud Map and App Mesh provides powerful service discovery and communication management. But remember, App Mesh has a learning curve - start with simple topologies.
Monitor everything - DNS query patterns, health check status, CloudFront cache hit rates, and App Mesh metrics. DNS issues often show up first in monitoring before users complain.
Finally, document your DNS architecture thoroughly. DNS can become complex quickly, and good documentation is essential for troubleshooting and team knowledge transfer.
AWS DNS Services
Comprehensive Guide to DNS Solutions
Route 53 • Cloud Map • CloudFront • App Mesh
Route 53
Authoritative DNS with advanced routing policies and health monitoring
Cloud Map
Service discovery for microservices and containerized applications
CloudFront
Global CDN with integrated edge DNS resolution
App Mesh
Service mesh with DNS-based service communication
AWS DNS Ecosystem Overview
graph TB
    subgraph "External World"
        Users[End Users]
        Domain[Domain Registrar]
        ISP[ISP DNS Resolvers]
    end
    subgraph "AWS DNS Services"
        R53[Route 53<br/>Authoritative DNS]
        CM[Cloud Map<br/>Service Discovery]
        CF[CloudFront<br/>Edge DNS]
        AM[App Mesh<br/>Service Mesh DNS]
    end
    subgraph "AWS Infrastructure"
        ALB[Application Load Balancer]
        ECS[ECS Services]
        EKS[EKS Clusters]
        S3[S3 Buckets]
    end
    Users -->|DNS Queries| ISP
    ISP -->|Recursive Lookup| R53
    Domain -->|NS Records| R53
    R53 -->|Route Traffic| ALB
    R53 -->|Route Traffic| CF
    R53 -->|Static Hosting| S3
    CM -->|Service Discovery| ECS
    CM -->|Service Discovery| EKS
    CF -->|Origin Requests| ALB
    CF -->|Origin Requests| S3
    AM -->|Mesh Communication| ECS
    AM -->|Mesh Communication| EKS
Integrated DNS Strategy: Each service serves specific use cases, but they work together to provide comprehensive DNS and service discovery across your AWS infrastructure.
Route 53 Architecture Deep Dive
graph TB
    subgraph "DNS Query Resolution Flow"
        Browser[User Browser]
        Recursive[Recursive Resolver]
        Root[Root Name Server]
        TLD[".com TLD Server"]
        Auth[Route 53 Authoritative]
    end
    subgraph "Route 53 Core Components"
        HZ[Hosted Zone<br/>example.com]
        Records[DNS Records<br/>A, AAAA, CNAME, MX, TXT]
        HC[Health Checks<br/>HTTP/HTTPS/TCP]
        Policies[Routing Policies]
    end
    subgraph "Routing Policy Types"
        Simple[Simple Routing]
        Weighted[Weighted Routing]
        Latency[Latency-based]
        Failover[Failover]
        Geo[Geolocation]
        Multi[Multivalue Answer]
    end
    Browser -->|1. Query example.com| Recursive
    Recursive -->|2. Query Root| Root
    Root -->|3. Refer to .com| TLD
    TLD -->|4. Refer to Route 53| Auth
    Auth -->|5. Return IP| Recursive
    Recursive -->|6. Return IP| Browser
    Auth --> HZ
    HZ --> Records
    Records --> Policies
    HC --> Policies
    Policies --> Simple
    Policies --> Weighted
    Policies --> Latency
    Policies --> Failover
    Policies --> Geo
    Policies --> Multi
Route 53 Setup Command Flow
graph TD
    Start[Start Route 53 Setup]
    subgraph "Phase 1: Foundation"
        HZ[1. Create Hosted Zone<br/>aws route53 create-hosted-zone]
        NS[2. Update Domain Registrar<br/>NS Records]
    end
    subgraph "Phase 2: Health Monitoring"
        HC1[3. Create Health Check - Primary<br/>aws route53 create-health-check]
        HC2[4. Create Health Check - Secondary<br/>aws route53 create-health-check]
    end
    subgraph "Phase 3: DNS Records"
        Simple[5. Simple A Record<br/>aws route53 change-resource-record-sets]
        Weighted[6. Weighted Records<br/>aws route53 change-resource-record-sets]
        Latency[7. Latency Records<br/>aws route53 change-resource-record-sets]
        Failover[8. Failover Records<br/>aws route53 change-resource-record-sets]
    end
    subgraph "Phase 4: Advanced Features"
        TF[9. Traffic Flow Policy<br/>aws route53 create-traffic-policy]
        Log[10. Query Logging<br/>aws route53 create-query-logging-config]
    end
    Start --> HZ
    HZ --> NS
    NS --> HC1
    HC1 --> HC2
    HC2 --> Simple
    Simple --> Weighted
    Weighted --> Latency
    Latency --> Failover
    Failover --> TF
    TF --> Log
Order Matters: Always create health checks before DNS records that reference them. Foundation before complexity.
Step 1: Create Hosted Zone
Foundation Setup
aws route53 create-hosted-zone \
--name example.com \
--caller-reference "hz-$(date +%s)" \
--hosted-zone-config Comment="Production zone for example.com",PrivateZone=false
# Returns:
{
"HostedZone": {
"Id": "/hostedzone/Z1234567890ABC",
"Name": "example.com.",
"CallerReference": "hz-1640995200"
},
"DelegationSet": {
"NameServers": [
"ns-123.awsdns-12.com.",
"ns-456.awsdns-34.net.",
"ns-789.awsdns-56.org.",
"ns-012.awsdns-78.co.uk."
]
}
}
Key Parameters
- --name: Domain name for the hosted zone
- --caller-reference: Unique identifier (prevents duplicates)
- Comment: Documentation for team understanding
- PrivateZone: false = public, true = VPC-only
Next Step: Configure the returned name servers at your domain registrar to establish DNS authority.
Step 2: Create Health Checks
Health Monitoring
aws route53 create-health-check \
--caller-reference "hc-primary-$(date +%s)" \
--health-check-config '{
"Type": "HTTP",
"ResourcePath": "/health",
"FullyQualifiedDomainName": "api.example.com",
"Port": 80,
"RequestInterval": 30,
"FailureThreshold": 3,
"MeasureLatency": true,
"Regions": ["us-east-1", "us-west-2", "eu-west-1"]
}'
# Returns:
{
"HealthCheck": {
"Id": "12345678-1234-1234-1234-123456789012",
"CallerReference": "hc-primary-1640995260",
"HealthCheckConfig": { ... },
"HealthCheckVersion": 1
}
}
| Parameter | Options | Recommendation |
|---|---|---|
| Type | HTTP, HTTPS, TCP, CALCULATED | HTTPS for production |
| RequestInterval | 10 or 30 seconds | 30s for most cases |
| FailureThreshold | 1-10 consecutive failures | 3 for balanced sensitivity |
| Regions | Multiple AWS regions | Use 3+ regions |
Step 3: Simple DNS Records
Basic DNS Setup
aws route53 change-resource-record-sets \
--hosted-zone-id Z1234567890ABC \
--change-batch '{
"Changes": [{
"Action": "CREATE",
"ResourceRecordSet": {
"Name": "www.example.com",
"Type": "A",
"TTL": 300,
"ResourceRecords": [
{"Value": "192.0.2.1"}
]
}
}]
}'
# Returns:
{
"ChangeInfo": {
"Id": "/change/C123456789ABCDEF",
"Status": "PENDING",
"SubmittedAt": "2024-01-01T12:00:00.000Z"
}
}
Action Types
- CREATE: New record
- DELETE: Remove record
- UPSERT: Create or update
TTL Guidelines
- 300s: Standard default
- 60s: During changes
- 3600s: Stable records
Step 4: Weighted Routing
Traffic Distribution
aws route53 change-resource-record-sets \
--hosted-zone-id Z1234567890ABC \
--change-batch '{
"Changes": [
{
"Action": "CREATE",
"ResourceRecordSet": {
"Name": "api.example.com",
"Type": "A",
"SetIdentifier": "primary-us-east-1",
"Weight": 80,
"TTL": 60,
"ResourceRecords": [{"Value": "192.0.2.1"}],
"HealthCheckId": "12345678-1234-1234-1234-123456789012"
}
},
{
"Action": "CREATE",
"ResourceRecordSet": {
"Name": "api.example.com",
"Type": "A",
"SetIdentifier": "secondary-us-west-2",
"Weight": 20,
"TTL": 60,
"ResourceRecords": [{"Value": "192.0.2.2"}],
"HealthCheckId": "87654321-4321-4321-4321-210987654321"
}
}
]
}'
graph TD
    A[Incoming Traffic] --> B[Route 53 Weighted Routing]
    B --> C{Weight Distribution}
    C -->|80% Weight| D["Primary Endpoint<br/>192.0.2.1"]
    C -->|20% Weight| E["Secondary Endpoint<br/>192.0.2.2"]
    F[Health Check 1] --> D
    G[Health Check 2] --> E
    style D fill:#e8f5e8
    style E fill:#fff3e0
Step 5: Latency-Based Routing
Performance Optimization
aws route53 change-resource-record-sets \
--hosted-zone-id Z1234567890ABC \
--change-batch '{
"Changes": [
{
"Action": "CREATE",
"ResourceRecordSet": {
"Name": "global.example.com",
"Type": "A",
"SetIdentifier": "us-east-1-latency",
"Region": "us-east-1",
"TTL": 60,
"ResourceRecords": [{"Value": "192.0.2.1"}],
"HealthCheckId": "12345678-1234-1234-1234-123456789012"
}
},
{
"Action": "CREATE",
"ResourceRecordSet": {
"Name": "global.example.com",
"Type": "A",
"SetIdentifier": "eu-west-1-latency",
"Region": "eu-west-1",
"TTL": 60,
"ResourceRecords": [{"Value": "192.0.2.3"}],
"HealthCheckId": "11111111-2222-3333-4444-555555555555"
}
}
]
}'
How Latency Routing Works
Route 53 maintains latency measurements between user networks and AWS regions, and routes each user to the region with the lowest measured latency under current network conditions.
Step 6: Failover Routing
High Availability
aws route53 change-resource-record-sets \
--hosted-zone-id Z1234567890ABC \
--change-batch '{
"Changes": [
{
"Action": "CREATE",
"ResourceRecordSet": {
"Name": "app.example.com",
"Type": "A",
"SetIdentifier": "primary-failover",
"Failover": "PRIMARY",
"TTL": 60,
"ResourceRecords": [{"Value": "192.0.2.1"}],
"HealthCheckId": "12345678-1234-1234-1234-123456789012"
}
},
{
"Action": "CREATE",
"ResourceRecordSet": {
"Name": "app.example.com",
"Type": "A",
"SetIdentifier": "secondary-failover",
"Failover": "SECONDARY",
"TTL": 60,
"ResourceRecords": [{"Value": "192.0.2.2"}]
}
}
]
}'
sequenceDiagram
participant User
participant Route53
participant Primary
participant Secondary
participant HealthCheck
User->>Route53: Query app.example.com
Route53->>HealthCheck: Check primary health
HealthCheck->>Primary: HTTP GET /health
Primary-->>HealthCheck: 200 OK
HealthCheck-->>Route53: Healthy
Route53-->>User: 192.0.2.1 (Primary)
Note over Primary: Primary goes down
User->>Route53: Query app.example.com
Route53->>HealthCheck: Check primary health
HealthCheck->>Primary: HTTP GET /health
Primary-->>HealthCheck: Timeout
HealthCheck-->>Route53: Unhealthy
Route53-->>User: 192.0.2.2 (Secondary)
Step 7: Traffic Flow Policies
Advanced Routing Logic
aws route53 create-traffic-policy \
--name "global-app-routing" \
--document '{
  "AWSPolicyFormatVersion": "2015-10-01",
  "RecordType": "A",
  "StartRule": "geolocation_rule",
  "Rules": {
    "geolocation_rule": {
      "RuleType": "geo",
      "Locations": [
        {
          "Country": "US",
          "RuleReference": "us_weighted_rule"
        },
        {
          "Country": "GB",
          "EndpointReference": "eu_endpoint"
        },
        {
          "IsDefault": true,
          "RuleReference": "default_weighted_rule"
        }
      ]
    },
    "us_weighted_rule": {
      "RuleType": "weighted",
      "Items": [
        {
          "Weight": 80,
          "EndpointReference": "us_east_endpoint"
        },
        {
          "Weight": 20,
          "EndpointReference": "us_west_endpoint"
        }
      ]
    },
    "default_weighted_rule": {
      "RuleType": "weighted",
      "Items": [
        {
          "Weight": 50,
          "EndpointReference": "us_east_endpoint"
        },
        {
          "Weight": 50,
          "EndpointReference": "eu_endpoint"
        }
      ]
    }
  },
  "Endpoints": {
    "us_east_endpoint": {"Type": "value", "Value": "192.0.2.1"},
    "us_west_endpoint": {"Type": "value", "Value": "192.0.2.2"},
    "eu_endpoint": {"Type": "value", "Value": "192.0.2.3"}
  }
}' \
--comment "Geolocation with weighted routing for global app"
# Apply the policy
aws route53 create-traffic-policy-instance \
--hosted-zone-id Z1234567890ABC \
--name "global.example.com" \
--ttl 300 \
--traffic-policy-id "12345678-1234-1234-1234-123456789012" \
--traffic-policy-version 1
Step 8: Query Logging & Monitoring
Operational Visibility
# Create CloudWatch Log Group
aws logs create-log-group \
--log-group-name "/aws/route53/example.com" \
--region us-east-1
# Enable Route 53 query logging
aws route53 create-query-logging-config \
--hosted-zone-id Z1234567890ABC \
--cloud-watch-logs-log-group-arn "arn:aws:logs:us-east-1:123456789012:log-group:/aws/route53/example.com"
# Search query logs (e.g. for failed lookups)
aws logs filter-log-events \
--log-group-name "/aws/route53/example.com" \
--start-time $(date -d '1 hour ago' +%s)000 \
--filter-pattern "NXDOMAIN"
# Monitor health check status
aws route53 get-health-check-status \
--health-check-id 12345678-1234-1234-1234-123456789012
Log Contents
- Query timestamp
- Client IP address
- Query name & type
- Response code
- Resolver location
Use Cases
- Troubleshooting DNS issues
- Security monitoring
- Traffic pattern analysis
- Performance optimization
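As a rough illustration of log analysis, the sketch below splits one query log line into named fields. The field order assumed here (version, timestamp, hosted zone ID, query name, query type, response code, protocol, edge location, resolver IP, EDNS client subnet) should be verified against your own log output; the sample line is fabricated.

```python
# Fabricated sample line in the assumed space-separated format
SAMPLE = ("1.0 2024-01-15T08:16:02Z Z1234567890ABC example.com A "
          "NOERROR UDP IAD12 203.0.113.9 203.0.113.0/24")

FIELDS = ["version", "timestamp", "hosted_zone_id", "query_name",
          "query_type", "response_code", "protocol",
          "edge_location", "resolver_ip", "edns_client_subnet"]

def parse_query_log(line: str) -> dict:
    """Map one space-separated log line onto the assumed field names."""
    return dict(zip(FIELDS, line.split()))

rec = parse_query_log(SAMPLE)
print(rec["query_name"], rec["response_code"])  # example.com NOERROR
```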
🗺️ Cloud Map Service Discovery
graph TB
subgraph "Service Registration Flow"
App1["📱 Application 1<br/>web-service"]
App2["📱 Application 2<br/>api-service"]
App3["📱 Application 3<br/>db-service"]
end
subgraph "AWS Cloud Map"
NS["🌐 Namespace<br/>my-company.local"]
Svc1["🔧 Service Registry<br/>web-service"]
Svc2["🔧 Service Registry<br/>api-service"]
Svc3["🔧 Service Registry<br/>db-service"]
Inst1["📍 Service Instance<br/>10.0.1.100:8080"]
Inst2["📍 Service Instance<br/>10.0.2.100:3000"]
Inst3["📍 Service Instance<br/>10.0.3.100:5432"]
end
subgraph "Service Discovery Methods"
DNS["🌐 DNS-based Discovery<br/>web-service.my-company.local"]
API["🔌 API-based Discovery<br/>DiscoverInstances"]
Auto["🤖 Auto Registration<br/>ECS/EKS Integration"]
end
subgraph "Health Monitoring"
HC1["❤️ Health Check<br/>HTTP /health"]
HC2["❤️ Health Check<br/>TCP 3000"]
HC3["❤️ Health Check<br/>Custom Health Check"]
end
App1 -->|Register| Svc1
App2 -->|Register| Svc2
App3 -->|Register| Svc3
Svc1 --> Inst1
Svc2 --> Inst2
Svc3 --> Inst3
NS --> Svc1
NS --> Svc2
NS --> Svc3
Inst1 --> HC1
Inst2 --> HC2
Inst3 --> HC3
DNS --> NS
API --> NS
Auto --> NS
Service Discovery: Cloud Map provides DNS and API-based service discovery for microservices with automatic health monitoring and ECS/EKS integration.
🏗️ Cloud Map Setup: Create Namespace
Foundation for Service Discovery
# Create private DNS namespace
aws servicediscovery create-private-dns-namespace \
--name "my-company.local" \
--description "Private namespace for microservices" \
--vpc "vpc-12345678"
# Returns:
{
"OperationId": "op-12345678901234567"
}
# Check operation status
aws servicediscovery get-operation \
--operation-id "op-12345678901234567"
# Returns:
{
"Operation": {
"Id": "op-12345678901234567",
"Type": "CREATE_NAMESPACE",
"Status": "SUCCESS",
"Targets": {
"NAMESPACE": "ns-12345678901234567"
}
}
}
| Namespace Type | Use Case | Discovery Method |
| --- | --- | --- |
| Private DNS | VPC-internal services | DNS queries within VPC |
| Public DNS | Internet-accessible services | DNS queries from anywhere |
| HTTP | API-only discovery | DiscoverInstances API |
🔧 Cloud Map: Create Service Registry
Service Definition
aws servicediscovery create-service \
--name "web-service" \
--namespace-id "ns-12345678901234567" \
--dns-config '{
"NamespaceId": "ns-12345678901234567",
"DnsRecords": [
{
"Type": "A",
"TTL": 60
}
]
}' \
--health-check-custom-config '{
"FailureThreshold": 3
}' \
--description "Frontend web application service"
# Returns:
{
"Service": {
"Id": "srv-12345678901234567",
"Arn": "arn:aws:servicediscovery:us-east-1:123456789012:service/srv-12345678901234567",
"Name": "web-service",
"NamespaceId": "ns-12345678901234567",
"DnsConfig": {
"NamespaceId": "ns-12345678901234567",
"DnsRecords": [{"Type": "A", "TTL": 60}]
}
}
}
DNS Record Types
- A: IPv4 addresses
- AAAA: IPv6 addresses
- CNAME: Canonical names
- SRV: Service records
Health Check Options
- Custom: Application-managed
- Route 53: HTTP/HTTPS/TCP
- None: No health checking
📍 Cloud Map: Register Service Instance
Instance Registration
aws servicediscovery register-instance \
--service-id "srv-12345678901234567" \
--instance-id "web-service-instance-1" \
--attributes '{
"AWS_INSTANCE_IPV4": "10.0.1.100",
"AWS_INSTANCE_PORT": "8080",
"environment": "production",
"version": "1.2.3"
}'
# Returns:
{
"OperationId": "op-23456789012345678"
}
# Test service discovery
aws servicediscovery discover-instances \
--namespace-name "my-company.local" \
--service-name "web-service"
# Returns:
{
"Instances": [
{
"InstanceId": "web-service-instance-1",
"NamespaceName": "my-company.local",
"ServiceName": "web-service",
"HealthStatus": "HEALTHY",
"Attributes": {
"AWS_INSTANCE_IPV4": "10.0.1.100",
"AWS_INSTANCE_PORT": "8080",
"environment": "production",
"version": "1.2.3"
}
}
]
}
📋 Required Attributes
- AWS_INSTANCE_IPV4: IPv4 address for A records
- AWS_INSTANCE_PORT: Port number for SRV records
- Custom attributes: Additional metadata for filtering
ECS/EKS Integration: In production, use automatic service registration instead of manual registration for container-based workloads.
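On the consuming side, a client typically filters a DiscoverInstances-style response down to healthy endpoints, optionally matching the custom attributes registered above. A minimal Python sketch (the response literal is a trimmed stand-in for real API output):

```python
# Trimmed stand-in for a DiscoverInstances response; the second
# instance is hypothetical and only there to show filtering
RESPONSE = {
    "Instances": [
        {"InstanceId": "web-service-instance-1",
         "HealthStatus": "HEALTHY",
         "Attributes": {"AWS_INSTANCE_IPV4": "10.0.1.100",
                        "AWS_INSTANCE_PORT": "8080",
                        "environment": "production"}},
        {"InstanceId": "web-service-instance-2",
         "HealthStatus": "UNHEALTHY",
         "Attributes": {"AWS_INSTANCE_IPV4": "10.0.1.101",
                        "AWS_INSTANCE_PORT": "8080",
                        "environment": "production"}},
    ]
}

def healthy_endpoints(resp: dict, **attr_filter: str) -> list:
    """Return host:port for healthy instances matching every filter."""
    out = []
    for inst in resp["Instances"]:
        if inst["HealthStatus"] != "HEALTHY":
            continue
        attrs = inst["Attributes"]
        if all(attrs.get(k) == v for k, v in attr_filter.items()):
            out.append(f'{attrs["AWS_INSTANCE_IPV4"]}:{attrs["AWS_INSTANCE_PORT"]}')
    return out

print(healthy_endpoints(RESPONSE, environment="production"))
# ['10.0.1.100:8080']
```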
⚡ CloudFront DNS Integration
graph TB
subgraph "Global Users"
US["🇺🇸 US Users"]
EU["🇪🇺 EU Users"]
ASIA["🌏 Asia Users"]
end
subgraph "CloudFront Edge Locations"
EdgeUS["🌍 US Edge Location<br/>Ashburn, VA"]
EdgeEU["🌍 EU Edge Location<br/>Frankfurt"]
EdgeASIA["🌏 Asia Edge Location<br/>Tokyo"]
end
subgraph "DNS Resolution"
R53["🎯 Route 53<br/>cdn.example.com"]
CFDomain["⚡ CloudFront Domain<br/>d123456789.cloudfront.net"]
EdgeDNS["🌐 Edge DNS Resolution"]
end
subgraph "Origin Servers"
S3["🪣 S3 Origin<br/>my-website-bucket"]
ALB["⚖️ ALB Origin<br/>api.example.com"]
Custom["🌐 Custom Origin<br/>legacy-server.com"]
end
US -->|DNS Query| R53
EU -->|DNS Query| R53
ASIA -->|DNS Query| R53
R53 -->|CNAME| CFDomain
CFDomain -->|Route to Nearest| EdgeDNS
EdgeDNS -->|US Traffic| EdgeUS
EdgeDNS -->|EU Traffic| EdgeEU
EdgeDNS -->|Asia Traffic| EdgeASIA
EdgeUS -->|Cache Miss| S3
EdgeUS -->|API Calls| ALB
EdgeEU -->|Cache Miss| S3
EdgeEU -->|API Calls| ALB
EdgeASIA -->|Cache Miss| Custom
🌍 Global Reach
400+ edge locations worldwide provide low-latency access to your content from anywhere.
🧠 Intelligent Routing
Automatic routing to the optimal edge location based on network performance, not just geography.
⚡ CloudFront Distribution Setup
Global CDN Configuration
aws cloudfront create-distribution \
--distribution-config '{
"CallerReference": "cf-distribution-2024-001",
"Comment": "Global CDN for example.com",
"Enabled": true,
"Aliases": {
"Quantity": 1,
"Items": ["cdn.example.com"]
},
"DefaultRootObject": "index.html",
"Origins": {
"Quantity": 2,
"Items": [
{
"Id": "S3-my-website-bucket",
"DomainName": "my-website-bucket.s3.amazonaws.com",
"S3OriginConfig": {
"OriginAccessIdentity": "origin-access-identity/cloudfront/E123456789ABCD"
}
},
{
"Id": "ALB-api-origin",
"DomainName": "api.example.com",
"CustomOriginConfig": {
"HTTPPort": 80,
"HTTPSPort": 443,
"OriginProtocolPolicy": "https-only"
}
}
]
},
"DefaultCacheBehavior": {
"TargetOriginId": "S3-my-website-bucket",
"ViewerProtocolPolicy": "redirect-to-https",
"CachePolicyId": "658327ea-f89d-4fab-a63d-7e88639e58f6",
"Compress": true
},
"CacheBehaviors": {
"Quantity": 1,
"Items": [
{
"PathPattern": "/api/*",
"TargetOriginId": "ALB-api-origin",
"ViewerProtocolPolicy": "https-only",
"CachePolicyId": "4135ea2d-6df8-44a3-9df3-4b5a84be39ad"
}
]
}
}'
# The CachePolicyId values are AWS managed cache policies:
# CachingOptimized for static content, CachingDisabled for /api/*.
# TTLs come from the cache policy, not from the behavior itself.
# Serving the cdn.example.com alias over HTTPS also requires an ACM
# certificate in us-east-1, referenced via ViewerCertificate.
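Behavior selection works by evaluating path patterns in the order listed and falling back to the default cache behavior when nothing matches. A small Python sketch of that logic (origin IDs mirror the config above; `fnmatch` covers CloudFront's `*` and `?` wildcards for illustration purposes):

```python
from fnmatch import fnmatch

# Ordered (pattern, origin) pairs -- first match wins
BEHAVIORS = [("/api/*", "ALB-api-origin")]
DEFAULT_ORIGIN = "S3-my-website-bucket"

def origin_for(path: str) -> str:
    """Return the origin that would serve a given request path."""
    for pattern, origin in BEHAVIORS:
        if fnmatch(path, pattern):
            return origin
    return DEFAULT_ORIGIN  # default cache behavior

print(origin_for("/api/users"))   # ALB-api-origin
print(origin_for("/index.html"))  # S3-my-website-bucket
```

Because the first match wins, order more specific patterns before broader ones when you add multiple behaviors.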
🔗 Route 53 CloudFront Integration
DNS Alias Configuration
aws route53 change-resource-record-sets \
--hosted-zone-id Z1234567890ABC \
--change-batch '{
"Changes": [{
"Action": "CREATE",
"ResourceRecordSet": {
"Name": "cdn.example.com",
"Type": "A",
"AliasTarget": {
"DNSName": "d123456789abcd.cloudfront.net",
"EvaluateTargetHealth": false,
"HostedZoneId": "Z2FDTNDATAQYW2"
}
}
}]
}'
# Test CloudFront distribution
aws cloudfront get-distribution \
--id E123456789ABCD
# Verify DNS resolution
dig cdn.example.com A +short
# Returns: CloudFront edge location IPs
curl -I https://cdn.example.com
# Headers show:
# Server: CloudFront
# X-Cache: Hit from cloudfront
# X-Amz-Cf-Pop: IAD89-C1
| Record Type | Cost | Root Domain Support | Recommendation |
| --- | --- | --- | --- |
| ALIAS | Free | Yes | ✅ Preferred for AWS resources |
| CNAME | Charged | No | ❌ Avoid for CloudFront |
CloudFront Hosted Zone ID: Always use Z2FDTNDATAQYW2 for CloudFront ALIAS records - this is AWS's global CloudFront hosted zone.
🕸️ App Mesh DNS Architecture
graph TB
subgraph "Service Mesh Control Plane"
AppMesh["🕸️ App Mesh Controller"]
Envoy["🔧 Envoy Proxy Config"]
VirtualNodes["📦 Virtual Nodes"]
VirtualServices["🌐 Virtual Services"]
end
subgraph "ECS/EKS Cluster"
Task1["📦 ECS Task 1<br/>frontend-service"]
Task2["📦 ECS Task 2<br/>api-service"]
Task3["📦 ECS Task 3<br/>database-service"]
Proxy1["🔧 Envoy Sidecar 1"]
Proxy2["🔧 Envoy Sidecar 2"]
Proxy3["🔧 Envoy Sidecar 3"]
end
subgraph "Service Discovery"
CloudMap["🗺️ Cloud Map<br/>my-app.local"]
DNS1["🌐 frontend.my-app.local"]
DNS2["🌐 api.my-app.local"]
DNS3["🌐 database.my-app.local"]
end
subgraph "Traffic Management"
LoadBalancing["⚖️ Load Balancing"]
HealthCheck["❤️ Health Checking"]
Retry["🔁 Retry Logic"]
TLS["🔒 mTLS Encryption"]
end
Task1 --> Proxy1
Task2 --> Proxy2
Task3 --> Proxy3
AppMesh --> Envoy
Envoy --> Proxy1
Envoy --> Proxy2
Envoy --> Proxy3
VirtualNodes --> Task1
VirtualNodes --> Task2
VirtualNodes --> Task3
VirtualServices --> DNS1
VirtualServices --> DNS2
VirtualServices --> DNS3
CloudMap --> DNS1
CloudMap --> DNS2
CloudMap --> DNS3
Proxy1 -->|Service Call| Proxy2
Proxy2 -->|Service Call| Proxy3
Proxy1 --> LoadBalancing
Proxy2 --> HealthCheck
Proxy3 --> Retry
LoadBalancing --> TLS
HealthCheck --> TLS
Retry --> TLS
🕸️ App Mesh Setup Process
Service Mesh Foundation
# Step 1: Create App Mesh
aws appmesh create-mesh \
--mesh-name "my-application-mesh" \
--spec '{
"egressFilter": {
"type": "ALLOW_ALL"
}
}'
# Step 2: Create Virtual Service
aws appmesh create-virtual-service \
--mesh-name "my-application-mesh" \
--virtual-service-name "frontend-service.my-app.local" \
--spec '{
"provider": {
"virtualNode": {
"virtualNodeName": "frontend-service-vn"
}
}
}'
# Returns:
{
"virtualService": {
"meshName": "my-application-mesh",
"virtualServiceName": "frontend-service.my-app.local",
"spec": {
"provider": {
"virtualNode": {
"virtualNodeName": "frontend-service-vn"
}
}
},
"status": {
"status": "ACTIVE"
}
}
}
Egress Filter Options
- ALLOW_ALL: Permit external calls
- DROP_ALL: Block external traffic
Provider Types
- virtualNode: Direct routing
- virtualRouter: Complex routing
📦 Virtual Node Configuration
Service Definition
aws appmesh create-virtual-node \
--mesh-name "my-application-mesh" \
--virtual-node-name "frontend-service-vn" \
--spec '{
"listeners": [
{
"portMapping": {
"port": 8080,
"protocol": "http"
},
"healthCheck": {
"protocol": "http",
"path": "/health",
"intervalMillis": 30000,
"timeoutMillis": 5000,
"unhealthyThreshold": 3,
"healthyThreshold": 2
}
}
],
"serviceDiscovery": {
"awsCloudMap": {
"namespaceName": "my-app.local",
"serviceName": "frontend-service"
}
},
"backends": [
{
"virtualService": {
"virtualServiceName": "api-service.my-app.local"
}
}
],
"logging": {
"accessLog": {
"file": {
"path": "/dev/stdout"
}
}
}
}'
🔒 Security Through Backends
The backends section implements network-level security - services can only communicate with explicitly allowed virtual services.
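Conceptually, the backends list is an egress allow-list. The sketch below shows the check the mesh effectively enforces; node and service names mirror the examples above (the api-service entry is hypothetical), and the real enforcement happens in the Envoy proxy's configuration, not in application code:

```python
# Allow-list derived from each virtual node's "backends" section;
# the api-service-vn entry is a hypothetical example
BACKENDS = {
    "frontend-service-vn": {"api-service.my-app.local"},
    "api-service-vn": {"database-service.my-app.local"},
}

def may_call(source_node: str, target_service: str) -> bool:
    """A node may only reach virtual services listed as its backends."""
    return target_service in BACKENDS.get(source_node, set())

print(may_call("frontend-service-vn", "api-service.my-app.local"))      # True
print(may_call("frontend-service-vn", "database-service.my-app.local")) # False
```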
📦 ECS App Mesh Integration
Sidecar Configuration
{
"family": "frontend-service-task",
"networkMode": "awsvpc",
"requiresCompatibilities": ["FARGATE"],
"cpu": "512",
"memory": "1024",
"executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
"taskRoleArn": "arn:aws:iam::123456789012:role/ecsTaskRole",
"proxyConfiguration": {
"type": "APPMESH",
"containerName": "envoy",
"properties": [
{
"name": "IgnoredUID",
"value": "1337"
},
{
"name": "ProxyIngressPort",
"value": "15000"
},
{
"name": "ProxyEgressPort",
"value": "15001"
},
{
"name": "AppPorts",
"value": "8080"
},
{
"name": "EgressIgnoredIPs",
"value": "169.254.170.2,169.254.169.254"
}
]
},
"containerDefinitions": [
{
"name": "frontend-app",
"image": "my-frontend-app:latest",
"portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
"environment": [
{
"name": "API_ENDPOINT",
"value": "api-service.my-app.local:3000"
}
],
"dependsOn": [
{
"containerName": "envoy",
"condition": "HEALTHY"
}
]
},
{
"name": "envoy",
"image": "public.ecr.aws/appmesh/aws-appmesh-envoy:v1.22.2.0-prod",
"essential": true,
"environment": [
{
"name": "APPMESH_VIRTUAL_NODE_NAME",
"value": "mesh/my-application-mesh/virtualNode/frontend-service-vn"
}
],
"healthCheck": {
"command": [
"CMD-SHELL",
"curl -s http://localhost:9901/server_info | grep state | grep -q LIVE"
]
}
}
]
}
🔄 Complete DNS Setup Flow
graph TD
Start["🚀 Start DNS Setup"]
subgraph "Phase 1: Foundation"
A1["1️⃣ Plan DNS Architecture"]
A2["2️⃣ Create Route 53 Hosted Zone"]
A3["3️⃣ Update Domain Registrar"]
A4["4️⃣ Verify DNS Propagation"]
end
subgraph "Phase 2: Infrastructure"
B1["5️⃣ Deploy Infrastructure"]
B2["6️⃣ Create Health Checks"]
B3["7️⃣ Configure SSL Certificates"]
end
subgraph "Phase 3: Basic DNS"
C1["8️⃣ Create Simple A Records"]
C2["9️⃣ Create CNAME Records"]
C3["🔟 Create MX Records"]
end
subgraph "Phase 4: Advanced Routing"
D1["1️⃣1️⃣ Weighted Routing"]
D2["1️⃣2️⃣ Latency-based Routing"]
D3["1️⃣3️⃣ Failover Routing"]
D4["1️⃣4️⃣ Geolocation Routing"]
end
subgraph "Phase 5: Service Discovery"
E1["1️⃣5️⃣ Cloud Map Namespace"]
E2["1️⃣6️⃣ Service Registration"]
E3["1️⃣7️⃣ Instance Registration"]
end
subgraph "Phase 6: CDN Integration"
F1["1️⃣8️⃣ CloudFront Distribution"]
F2["1️⃣9️⃣ Route 53 Alias Records"]
F3["2️⃣0️⃣ SSL Certificate Setup"]
end
subgraph "Phase 7: Service Mesh"
G1["2️⃣1️⃣ App Mesh Setup"]
G2["2️⃣2️⃣ Virtual Nodes"]
G3["2️⃣3️⃣ Virtual Services"]
G4["2️⃣4️⃣ ECS/EKS Integration"]
end
subgraph "Phase 8: Monitoring"
H1["2️⃣5️⃣ Query Logging"]
H2["2️⃣6️⃣ CloudWatch Alarms"]
H3["2️⃣7️⃣ Dashboard Setup"]
end
Start --> A1 --> A2 --> A3 --> A4
A4 --> B1 --> B2 --> B3
B3 --> C1 --> C2 --> C3
C3 --> D1 --> D2 --> D3 --> D4
D4 --> E1 --> E2 --> E3
E3 --> F1 --> F2 --> F3
F3 --> G1 --> G2 --> G3 --> G4
G4 --> H1 --> H2 --> H3
🔍 Verification and Testing
Essential Testing Commands
# DNS Resolution Testing
dig example.com NS
dig www.example.com A
dig api.example.com A +short
# Test from multiple DNS servers
nslookup example.com 8.8.8.8
nslookup example.com 1.1.1.1
nslookup example.com 208.67.222.222
# Route 53 Health Check Testing
aws route53 get-health-check --health-check-id 12345678-1234-1234-1234-123456789012
aws route53 get-health-check-status --health-check-id 12345678-1234-1234-1234-123456789012
# CloudFront Distribution Testing
aws cloudfront get-distribution --id E123456789ABCD
curl -I https://cdn.example.com
curl -H "Host: cdn.example.com" https://d123456789abcd.cloudfront.net
# Cloud Map Service Discovery Testing
aws servicediscovery discover-instances \
--namespace-name my-company.local \
--service-name web-service
# App Mesh Virtual Node Status
aws appmesh describe-virtual-node \
--mesh-name my-application-mesh \
--virtual-node-name frontend-service-vn
# Query Log Analysis (filter on DNS response codes such as SERVFAIL;
# date -d is GNU, on macOS/BSD use: date -v-1H +%s)
aws logs filter-log-events \
--log-group-name /aws/route53/example.com \
--start-time $(date -d '1 hour ago' +%s)000 \
--filter-pattern "SERVFAIL"
# Performance Testing
time dig www.example.com A +short
curl -w "@curl-format.txt" -s -o /dev/null https://www.example.com
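The `@curl-format.txt` write-out template referenced above is not shown in this deck; a typical version (contents assumed, trim or extend the timing variables as needed) looks like:

```text
time_namelookup:    %{time_namelookup}s\n
time_connect:       %{time_connect}s\n
time_appconnect:    %{time_appconnect}s\n
time_starttransfer: %{time_starttransfer}s\n
time_total:         %{time_total}s\n
```

`time_namelookup` isolates the DNS resolution cost, which is the number to watch when comparing routing policies or TTL settings.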
💡 Key Takeaways and Best Practices
🎯 Route 53 Best Practices
- Always use health checks for routing policies
- Start with simple records, add complexity gradually
- Use lower TTL values during changes
- Monitor query patterns with logging
- Test failover scenarios regularly
🗺️ Cloud Map Best Practices
- Use private namespaces for internal services
- Implement proper health checking
- Leverage ECS/EKS auto-registration
- Include metadata in custom attributes
- Plan namespace structure carefully
⚡ CloudFront Best Practices
- Use ALIAS records instead of CNAME
- Configure appropriate cache behaviors
- Enable compression for better performance
- Set up proper SSL/TLS certificates
- Monitor cache hit ratios
🕸️ App Mesh Best Practices
- Start with simple mesh topologies
- Use backends for security boundaries
- Enable observability from day one
- Plan virtual service naming carefully
- Test proxy configurations thoroughly
🚀 Implementation Strategy
Phase-by-phase approach: Start with basic Route 53 DNS, add health checks and routing policies, then layer on service discovery, CDN, and service mesh capabilities as your architecture evolves.
⚠️ Common Pitfalls to Avoid
- Don't forget to update domain registrar name servers
- Always test health checks before production use
- Plan for DNS propagation delays (up to 48 hours)
- Monitor costs - DNS queries and health checks add up
- Document your DNS architecture thoroughly
Thank you!
Questions & Discussion