Ansible Standards
Core Rules
- All infrastructure changes go through Git - No exceptions
- Playbooks must be idempotent - Safe to run multiple times
- Use roles for reusability - Don't repeat yourself
- Test in development first - Use --check mode
- Document with comments - Explain why, not just what
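As an illustration of the idempotency rule, the sketch below guards a one-off command with the command module's creates argument, so a second run skips the task. The script path and marker file are hypothetical names, not part of this repository.

```yaml
# Sketch: make a one-off command safe to run multiple times.
# /opt/myapp/bin/init-db and the marker file are hypothetical.
- name: Initialise application database once
  command: /opt/myapp/bin/init-db
  args:
    creates: /var/lib/myapp/.db_initialised  # task is skipped when this file exists
```

This only works if the script itself writes the marker file on success; otherwise the task runs again on every play.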
Repository Structure
infrastructure/
├── ansible.cfg                  # Global Ansible config
├── inventory/
│   ├── production/
│   │   ├── hosts.yml            # Server inventory
│   │   └── group_vars/
│   │       ├── all/
│   │       │   ├── vars.yml     # Plain-text vars
│   │       │   └── vault.yml    # Encrypted secrets
├── playbooks/
│   ├── site.yml                 # Master playbook (runs all)
│   ├── web_servers.yml          # Role-specific playbooks
│   └── databases.yml
├── roles/
│   ├── common/                  # Base config for all servers
│   │   ├── tasks/main.yml
│   │   ├── handlers/main.yml
│   │   ├── templates/
│   │   └── defaults/main.yml
│   ├── docker/
│   ├── nginx/
│   └── monitoring/
└── .github/workflows/
    └── ansible-ci.yml           # CI tests for Ansible
Playbook Standards
Naming Convention
- Playbooks: verb_noun.yml (e.g., deploy_app.yml, setup_database.yml)
- Roles: noun (e.g., nginx, docker, postgresql)
- Variables: snake_case (e.g., db_host, api_port)
- Vault variables: vault_* prefix (e.g., vault_db_password)
Standard Playbook Template
---
# playbooks/deploy_app.yml
- name: Deploy application to web servers
  hosts: web_servers
  become: true
  pre_tasks:
    - name: Verify prerequisites
      assert:
        that:
          - app_version is defined
          - app_environment in ['dev', 'staging', 'production']
        msg: "Required variables not set"
  roles:
    - common
    - docker
    - nginx
    - app_deployment
  post_tasks:
    - name: Verify application is running
      uri:
        url: "http://localhost:8080/health"
        status_code: 200
      register: health_check
      # Pair retries with until; older Ansible releases ignore retries without it
      until: health_check.status == 200
      retries: 3
      delay: 5
  handlers:
    - name: restart nginx
      service:
        name: nginx
        state: restarted
Role Standards
Standard Role Structure
roles/nginx/
├── tasks/
│   └── main.yml          # Main task list
├── handlers/
│   └── main.yml          # Handlers (restart service, etc.)
├── templates/
│   └── nginx.conf.j2     # Jinja2 templates
├── files/
│   └── ssl_params.conf   # Static files
├── defaults/
│   └── main.yml          # Default variables (lowest priority)
├── vars/
│   └── main.yml          # Role variables (high priority)
└── meta/
    └── main.yml          # Role dependencies
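To make the layout more concrete, here is a minimal sketch of what defaults/main.yml and meta/main.yml might contain for this role. The variable names nginx_worker_processes and nginx_listen_port are hypothetical, and the dependency on common is an assumption based on the repository structure.

```yaml
# roles/nginx/defaults/main.yml
# Hypothetical defaults; inventory or play variables override these.
---
nginx_worker_processes: auto
nginx_listen_port: 80

# roles/nginx/meta/main.yml
# Roles listed under dependencies run before this role's own tasks.
---
dependencies:
  - role: common
```

Keeping tunable values in defaults/ rather than vars/ lets each environment override them from group_vars without editing the role.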
Task Best Practices
# ✅ Good - Idempotent, clear, safe
- name: Ensure nginx is installed
  package:
    name: nginx
    state: present
  notify: restart nginx
- name: Configure nginx virtual host
  template:
    src: vhost.conf.j2
    dest: /etc/nginx/sites-available/{{ domain }}.conf
    owner: root
    group: root
    mode: '0644'
    # Test before applying. Note: -c treats %s as the main config file, so this
    # check only passes if the rendered file is a complete nginx configuration.
    validate: 'nginx -t -c %s'
  notify: reload nginx
# ❌ Bad - Not idempotent, risky
- name: Install nginx
  shell: apt-get install -y nginx && systemctl restart nginx
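The tasks above notify two handlers, restart nginx and reload nginx, which must exist under exactly those names. A minimal sketch of the matching handlers file, assuming it lives in the nginx role:

```yaml
# roles/nginx/handlers/main.yml (assumed location)
---
- name: restart nginx
  service:
    name: nginx
    state: restarted

- name: reload nginx
  service:
    name: nginx
    state: reloaded
```

Handlers run once at the end of the play, and only if a notifying task reported a change, which is why they are safer than restarting the service inside a shell command.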
Inventory Management
Host Inventory
# inventory/production/hosts.yml
all:
  children:
    web_servers:
      hosts:
        web-01:
          ansible_host: 10.0.1.10
          ansible_user: deploy
        web-02:
          ansible_host: 10.0.1.11
          ansible_user: deploy
    databases:
      hosts:
        db-01:
          ansible_host: 10.0.2.10
          ansible_user: deploy
          postgres_primary: true
    monitoring:
      hosts:
        monitor-01:
          ansible_host: 10.0.3.10
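Because ansible_user repeats on every host in a group, one option is to set it once per group with a vars block. The sketch below shows this for web_servers only; it is an alternative layout, not a required change.

```yaml
# Sketch: group-level vars replace the per-host ansible_user entries.
all:
  children:
    web_servers:
      vars:
        ansible_user: deploy
      hosts:
        web-01:
          ansible_host: 10.0.1.10
        web-02:
          ansible_host: 10.0.1.11
```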
Group Variables
# inventory/production/group_vars/all/vars.yml
---
# Global variables for all hosts
# Named app_environment because "environment" is a reserved Ansible keyword
app_environment: production
domain: company.com
ntp_servers:
  - 0.pool.ntp.org
  - 1.pool.ntp.org
# Reference encrypted secrets
db_password: "{{ vault_db_password }}"
api_key: "{{ vault_api_key }}"
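The vault_* variables referenced above would be defined in the neighbouring vault.yml. A sketch of its decrypted contents, with placeholder values only:

```yaml
# inventory/production/group_vars/all/vault.yml
# Decrypted view; the file on disk is encrypted with ansible-vault.
# Values are placeholders, not real credentials.
---
vault_db_password: "CHANGE_ME"
vault_api_key: "CHANGE_ME"
```

Such a file can be encrypted with `ansible-vault encrypt inventory/production/group_vars/all/vault.yml` and changed later with `ansible-vault edit`. Keeping the plain-text name (db_password) in vars.yml means a search of the repository still shows where each secret is used.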
CI/CD Integration
GitHub Actions Workflow
# .github/workflows/ansible-ci.yml
name: Ansible CI
on:
  pull_request:
    # 'closed' is needed so deploy-staging can ever see merged == true
    types: [opened, synchronize, reopened, closed]
    paths:
      - 'ansible.cfg'
      - 'inventory/**'
      - 'playbooks/**'
      - 'roles/**'
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run ansible-lint
        run: |
          pip install ansible-lint
          ansible-lint playbooks/
  syntax-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Syntax check all playbooks
        run: |
          pip install ansible
          for playbook in playbooks/*.yml; do
            ansible-playbook "$playbook" --syntax-check
          done
  deploy-staging:
    runs-on: ubuntu-latest
    needs: [lint, syntax-check]
    if: github.event.pull_request.merged == true
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to staging
        env:
          ANSIBLE_VAULT_PASSWORD: ${{ secrets.ANSIBLE_VAULT_PASSWORD_STAGING }}
        run: |
          pip install ansible
          # Keep the vault password in a private temp file and remove it on exit
          vault_pass="$(mktemp)"
          trap 'rm -f "$vault_pass"' EXIT
          printf '%s' "$ANSIBLE_VAULT_PASSWORD" > "$vault_pass"
          ansible-playbook playbooks/site.yml \
            -i inventory/staging \
            --vault-password-file "$vault_pass"
Common Patterns
Conditional Execution
- name: Install package only on Ubuntu
  apt:
    name: nginx
    state: present
  when: ansible_distribution == "Ubuntu"
- name: Configure firewall in production only
  ufw:
    rule: allow
    port: '443'
  # app_environment rather than environment, which is a reserved Ansible keyword
  when: app_environment == "production"
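When a task depends on more than one condition, when also accepts a list, and every entry must be true. A short sketch using gathered facts; the 22.04 threshold is only an example value:

```yaml
# Sketch: list entries under "when" are combined with a logical AND.
- name: Install nginx on Ubuntu 22.04 or later
  apt:
    name: nginx
    state: present
  when:
    - ansible_distribution == "Ubuntu"
    - ansible_distribution_version is version("22.04", ">=")
```

Both conditions rely on facts, so the play must not disable gather_facts.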
Loops
- name: Create multiple users
  user:
    name: "{{ item.name }}"
    groups: "{{ item.groups }}"
  loop:
    - { name: 'deploy', groups: 'sudo,docker' }
    - { name: 'monitor', groups: 'monitoring' }
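For longer lists it may be cleaner to keep the data in a variable and loop over that. In this sketch managed_users is a hypothetical variable that would be defined in group_vars with the same name/groups structure as the inline list above.

```yaml
# Sketch: loop over a variable instead of an inline list.
# managed_users is a hypothetical list of {name, groups} mappings.
- name: Create multiple users
  user:
    name: "{{ item.name }}"
    groups: "{{ item.groups }}"
    append: true  # add to the listed groups without removing existing ones
  loop: "{{ managed_users }}"
  loop_control:
    label: "{{ item.name }}"  # show only the user name in task output
```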
Error Handling
- name: Try to start service, fail gracefully
  service:
    name: myapp
    state: started
  register: service_result
  ignore_errors: true
- name: Alert if service failed
  debug:
    msg: "WARNING: Service failed to start"
  when: service_result is failed
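An alternative to ignore_errors is a block with a rescue section, which runs recovery tasks only when something in the block fails. The sketch below assumes systemd hosts, since it reads logs with journalctl.

```yaml
# Sketch: block/rescue instead of ignore_errors (assumes systemd hosts).
- name: Start service with explicit recovery
  block:
    - name: Start myapp
      service:
        name: myapp
        state: started
  rescue:
    - name: Collect recent service logs
      command: journalctl -u myapp -n 50 --no-pager
      register: myapp_logs
      changed_when: false  # reading logs never changes the host
    - name: Show recent service logs
      debug:
        var: myapp_logs.stdout_lines
    - name: Fail the play after reporting
      fail:
        msg: "myapp failed to start on {{ inventory_hostname }}"
```

Unlike ignore_errors, this keeps the host marked as failed once the diagnostics have been collected, so a broken deployment is not silently reported as successful.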
Testing Workflow
# 1. Syntax check (catches YAML errors)
ansible-playbook playbooks/site.yml --syntax-check
# 2. Dry run (see what would change)
ansible-playbook playbooks/site.yml -i inventory/production --check --diff
# 3. Run on development first
ansible-playbook playbooks/site.yml -i inventory/development
# 4. Run on single staging server (test one)
ansible-playbook playbooks/site.yml -i inventory/staging -l staging-web-01
# 5. Run on all staging (test cluster)
ansible-playbook playbooks/site.yml -i inventory/staging
# 6. Run on production (after approval)
ansible-playbook playbooks/site.yml -i inventory/production
Security Best Practices
- Never commit plain-text secrets - Use Ansible Vault
- Use become: true sparingly - Only when needed
- Validate configurations - Use validate parameter
- Use secure file permissions - mode: '0600' for secrets
- Limit inventory access - Different vault passwords per environment
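A sketch combining two of the points above: a secrets file written with mode 0600, plus no_log so the value does not appear in task output. The destination path /etc/myapp/db.env is hypothetical; db_password is the variable defined in the group variables.

```yaml
# Sketch: write a secret with restrictive permissions and hide it from logs.
# /etc/myapp/db.env is a hypothetical path.
- name: Write database credentials file
  copy:
    content: "DB_PASSWORD={{ db_password }}\n"
    dest: /etc/myapp/db.env
    owner: root
    group: root
    mode: '0600'
  no_log: true  # keeps the rendered secret out of console output and logs
```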
Performance Tips
# Use strategy for faster execution
- hosts: web_servers
  strategy: free  # Don't wait for all hosts
# Gather facts only when needed
- hosts: web_servers
  gather_facts: false  # Skip if not using ansible_* variables
# Use async for long-running tasks
- name: Long running deployment
  command: /opt/deploy.sh
  async: 300
  poll: 10
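With poll: 10 the play still waits on this task. If other tasks should run in the meantime, a common variation is poll: 0 followed by a later async_status check. A sketch using the same script path as above:

```yaml
# Sketch: start the job in the background, then wait for it later.
- name: Start long-running deployment in the background
  command: /opt/deploy.sh
  async: 300  # maximum runtime in seconds
  poll: 0     # do not wait here
  register: deploy_job

- name: Wait for deployment to finish
  async_status:
    jid: "{{ deploy_job.ansible_job_id }}"
  register: deploy_result
  until: deploy_result.finished
  retries: 30
  delay: 10
```

Tasks placed between the two run while the deployment is still in progress.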
Troubleshooting
Check inventory:
ansible-inventory -i inventory/production --list
ansible-inventory -i inventory/production --host web-01
Test connectivity:
ansible all -i inventory/production -m ping
Run specific task:
ansible-playbook playbooks/site.yml --tags "nginx_config"
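For --tags to select anything, the relevant tasks need a matching tags entry. A sketch reusing the virtual host task from Task Best Practices, assuming nginx_config is the tag applied there:

```yaml
# Sketch: a task tagged so it can be run on its own with --tags "nginx_config".
- name: Configure nginx virtual host
  template:
    src: vhost.conf.j2
    dest: /etc/nginx/sites-available/{{ domain }}.conf
    owner: root
    group: root
    mode: '0644'
  notify: reload nginx
  tags:
    - nginx_config
```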
Verbose output:
ansible-playbook playbooks/site.yml -vvv