Ansible Standards

Core Rules

  1. All infrastructure changes go through Git - No exceptions
  2. Playbooks must be idempotent - Safe to run multiple times
  3. Use roles for reusability - Don't repeat yourself
  4. Test in development first - Use --check mode
  5. Document with comments - Explain why, not just what
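
Rule 2 is easiest to meet with modules that declare a desired state, but command-style tasks need extra hints. A minimal sketch, assuming a hypothetical installer at /opt/myapp/install.sh that writes /opt/myapp/.installed when it finishes (both paths are illustrative, not part of this standard):

```yaml
# Hypothetical paths; adjust to the application being installed.
- name: Run the installer only once
  command: /opt/myapp/install.sh
  args:
    creates: /opt/myapp/.installed  # task is skipped when this file exists

- name: Read the installed version without reporting a change
  command: /opt/myapp/bin/myapp --version
  register: myapp_version
  changed_when: false  # read-only command, never counts as "changed"
```

With these hints a second run should report no changes, which is a quick way to check idempotency by hand.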

Repository Structure

infrastructure/
├── ansible.cfg                 # Global Ansible config
├── inventory/
│   └── production/
│       ├── hosts.yml           # Server inventory
│       └── group_vars/
│           └── all/
│               ├── vars.yml    # Plain-text vars
│               └── vault.yml   # Encrypted secrets
├── playbooks/
│   ├── site.yml                # Master playbook (runs all)
│   ├── web_servers.yml         # Role-specific playbooks
│   └── databases.yml
├── roles/
│   ├── common/                 # Base config for all servers
│   │   ├── tasks/main.yml
│   │   ├── handlers/main.yml
│   │   ├── templates/
│   │   └── defaults/main.yml
│   ├── docker/
│   ├── nginx/
│   └── monitoring/
└── .github/workflows/
    └── ansible-ci.yml          # CI tests for Ansible
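
One way the master playbook could "run all" is by importing the role-specific playbooks in order. This is a sketch only; the play names are illustrative and the actual site.yml may be organized differently:

```yaml
---
# playbooks/site.yml
- name: Configure web servers
  import_playbook: web_servers.yml

- name: Configure databases
  import_playbook: databases.yml
```

Paths given to import_playbook are resolved relative to the importing file, so the entries above refer to files in the same playbooks/ directory.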

Playbook Standards

Naming Convention

  • Playbooks: verb_noun.yml (e.g., deploy_app.yml, setup_database.yml)
  • Roles: noun (e.g., nginx, docker, postgresql)
  • Variables: snake_case (e.g., db_host, api_port)
  • Vault variables: vault_* prefix (e.g., vault_db_password)

Standard Playbook Template

---
# playbooks/deploy_app.yml
- name: Deploy application to web servers
  hosts: web_servers
  become: yes

  pre_tasks:
    - name: Verify prerequisites
      assert:
        that:
          - app_version is defined
          - app_environment in ['dev', 'staging', 'production']
        msg: "Required variables not set"

  roles:
    - common
    - docker
    - nginx
    - app_deployment

  post_tasks:
    - name: Verify application is running
      uri:
        url: "http://localhost:8080/health"
        status_code: 200
      register: health_check
      until: health_check is succeeded  # retry until the request succeeds
      retries: 3
      delay: 5

  handlers:
    - name: restart nginx
      service:
        name: nginx
        state: restarted

Role Standards

Standard Role Structure

roles/nginx/
├── tasks/
│   └── main.yml          # Main task list
├── handlers/
│   └── main.yml          # Handlers (restart service, etc.)
├── templates/
│   └── nginx.conf.j2     # Jinja2 templates
├── files/
│   └── ssl_params.conf   # Static files
├── defaults/
│   └── main.yml          # Default variables (lowest priority)
├── vars/
│   └── main.yml          # Role variables (high priority)
└── meta/
    └── main.yml          # Role dependencies
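
To illustrate the lowest-priority defaults, a sketch of a defaults file follows. The variable names are hypothetical; any inventory or playbook variable with the same name would override them:

```yaml
# roles/nginx/defaults/main.yml
# Hypothetical variable names, shown only to illustrate overridable defaults.
nginx_worker_processes: auto
nginx_listen_port: 80
```

Role dependencies go in meta/main.yml. Assuming the nginx role relies on the common role from the repository layout, that file might look like this:

```yaml
# roles/nginx/meta/main.yml
dependencies:
  - role: common  # runs before nginx whenever nginx is applied
```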

Task Best Practices

# ✅ Good - Idempotent, clear, safe
- name: Ensure nginx is installed
  package:
    name: nginx
    state: present
  notify: restart nginx

- name: Configure nginx
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
    owner: root
    group: root
    mode: '0644'
    # Test before applying. nginx -t needs a complete config file;
    # a vhost fragment on its own would fail this check.
    validate: 'nginx -t -c %s'
  notify: reload nginx

# ❌ Bad - Not idempotent, risky
- name: Install nginx
  shell: apt-get install -y nginx && systemctl restart nginx
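
The good tasks above notify two handlers, restart nginx and reload nginx, and a notify only works if a handler with exactly that name exists. A minimal sketch of the role's handlers file, assuming both are managed through the service module:

```yaml
# roles/nginx/handlers/main.yml
- name: restart nginx
  service:
    name: nginx
    state: restarted

- name: reload nginx
  service:
    name: nginx
    state: reloaded
```

Handlers run once at the end of the play, no matter how many tasks notified them.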

Inventory Management

Host Inventory

# inventory/production/hosts.yml
all:
  children:
    web_servers:
      hosts:
        web-01:
          ansible_host: 10.0.1.10
          ansible_user: deploy
        web-02:
          ansible_host: 10.0.1.11
          ansible_user: deploy

    databases:
      hosts:
        db-01:
          ansible_host: 10.0.2.10
          ansible_user: deploy
          postgres_primary: true

    monitoring:
      hosts:
        monitor-01:
          ansible_host: 10.0.3.10

Group Variables

# inventory/production/group_vars/all/vars.yml
---
# Global variables for all hosts
app_environment: production  # "environment" is a reserved Ansible keyword
domain: company.com
ntp_servers:
  - 0.pool.ntp.org
  - 1.pool.ntp.org

# Reference encrypted secrets
db_password: "{{ vault_db_password }}"
api_key: "{{ vault_api_key }}"
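
The vault_* variables referenced above would live in the sibling vault.yml. A sketch of that file before encryption, with placeholder values only:

```yaml
# inventory/production/group_vars/all/vault.yml (contents before encryption)
# Placeholder values; real secrets never appear in plain text in Git.
vault_db_password: "CHANGE_ME"
vault_api_key: "CHANGE_ME"
```

Encrypt the file with `ansible-vault encrypt inventory/production/group_vars/all/vault.yml` before committing. Keeping the plain names in vars.yml means a search for db_password still finds where it is defined even though the value itself is encrypted.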

CI/CD Integration

GitHub Actions Workflow

# .github/workflows/ansible-ci.yml
name: Ansible CI

on:
  pull_request:
    # "closed" is needed so the merged check in deploy-staging can ever be true
    types: [opened, synchronize, reopened, closed]
    paths:
      - 'inventory/**'
      - 'playbooks/**'
      - 'roles/**'

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run ansible-lint
        run: |
          pip install ansible-lint
          ansible-lint playbooks/

  syntax-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Syntax check all playbooks
        run: |
          for playbook in playbooks/*.yml; do
            ansible-playbook "$playbook" --syntax-check
          done

  deploy-staging:
    runs-on: ubuntu-latest
    if: github.event.pull_request.merged == true
    steps:
      # The repository must be checked out before playbooks can run
      - uses: actions/checkout@v4
      - name: Deploy to staging
        env:
          ANSIBLE_VAULT_PASSWORD: ${{ secrets.ANSIBLE_VAULT_PASSWORD_STAGING }}
        run: |
          echo "$ANSIBLE_VAULT_PASSWORD" > /tmp/vault_pass
          ansible-playbook playbooks/site.yml \
            -i inventory/staging \
            --vault-password-file /tmp/vault_pass

Common Patterns

Conditional Execution

- name: Install package only on Ubuntu
  apt:
    name: nginx
    state: present
  when: ansible_distribution == "Ubuntu"

- name: Configure firewall in production only
  ufw:
    rule: allow
    port: '443'
  # app_environment, not "environment", which is a reserved Ansible keyword
  when: app_environment == "production"

Loops

- name: Create multiple users
  user:
    name: "{{ item.name }}"
    groups: "{{ item.groups }}"
  loop:
    - { name: 'deploy', groups: 'sudo,docker' }
    - { name: 'monitor', groups: 'monitoring' }
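
The inline list above can also be moved into a variable so that each environment supplies its own users. A sketch, assuming a hypothetical managed_users variable defined in group_vars or role defaults:

```yaml
# Hypothetical variable, e.g. in group_vars/all/vars.yml
managed_users:
  - name: deploy
    groups: sudo,docker
  - name: monitor
    groups: monitoring
```

The task then loops over the variable. loop_control.label keeps the output to one short name per item instead of printing the whole dictionary:

```yaml
- name: Create multiple users
  user:
    name: "{{ item.name }}"
    groups: "{{ item.groups }}"
    append: yes  # add to these groups without removing existing memberships
  loop: "{{ managed_users }}"
  loop_control:
    label: "{{ item.name }}"
```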

Error Handling

- name: Try to start service, fail gracefully
  service:
    name: myapp
    state: started
  register: service_result
  ignore_errors: yes

- name: Alert if service failed
  debug:
    msg: "WARNING: Service failed to start"
  when: service_result is failed
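
When a failure should stop the play after some cleanup or diagnostics, block and rescue may be a better fit than ignore_errors. A sketch, assuming the hosts use systemd (journalctl is an assumption, not something this standard requires):

```yaml
- name: Start service with explicit failure handling
  block:
    - name: Start myapp
      service:
        name: myapp
        state: started
  rescue:
    # Runs only if a task in the block failed
    - name: Collect recent log lines for diagnosis
      command: journalctl -u myapp -n 50 --no-pager
      register: myapp_logs
      changed_when: false

    - name: Stop the play with a clear message
      fail:
        msg: "myapp failed to start; see the log lines captured by the previous task"
```

Unlike ignore_errors, this keeps the run marked as failed, so CI and operators still notice the problem.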

Testing Workflow

# 1. Syntax check (catches YAML errors)
ansible-playbook playbooks/site.yml --syntax-check

# 2. Dry run (see what would change)
ansible-playbook playbooks/site.yml -i inventory/production --check --diff

# 3. Run on development first
ansible-playbook playbooks/site.yml -i inventory/development

# 4. Run on single staging server (test one)
ansible-playbook playbooks/site.yml -i inventory/staging -l staging-web-01

# 5. Run on all staging (test cluster)
ansible-playbook playbooks/site.yml -i inventory/staging

# 6. Run on production (after approval)
ansible-playbook playbooks/site.yml -i inventory/production

Security Best Practices

  1. Never commit plain-text secrets - Use Ansible Vault
  2. Use become: yes sparingly - Only when needed
  3. Validate configurations - Use validate parameter
  4. Use secure file permissions - mode: '0600' for secrets
  5. Limit inventory access - Different vault passwords per environment

Performance Tips

# Use strategy for faster execution
- hosts: web_servers
  strategy: free  # Don't wait for all hosts

# Gather facts only when needed
- hosts: web_servers
  gather_facts: no  # Skip if not using ansible_* variables

# Use async for long-running tasks
- name: Long running deployment
  command: /opt/deploy.sh
  async: 300
  poll: 10
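
With poll: 10 the play still waits on that task, checking every 10 seconds. To let other tasks run in the meantime, the job can be started with poll: 0 and collected later. A sketch using the same /opt/deploy.sh script; the retry counts are illustrative:

```yaml
- name: Start long running deployment in the background
  command: /opt/deploy.sh
  async: 300  # allow up to 300 seconds
  poll: 0     # do not wait here
  register: deploy_job

# ...other tasks can run here while the deployment continues...

- name: Wait for the deployment to finish
  async_status:
    jid: "{{ deploy_job.ansible_job_id }}"
  register: deploy_result
  until: deploy_result.finished
  retries: 30  # 30 checks x 10 seconds covers the 300 second limit
  delay: 10
```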

Troubleshooting

Check inventory:

ansible-inventory -i inventory/production --list
ansible-inventory -i inventory/production --host web-01

Test connectivity:

ansible all -i inventory/production -m ping

Run specific task:

ansible-playbook playbooks/site.yml --tags "nginx_config"
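
The --tags filter only selects tasks that carry a matching tag. A sketch of how a task might be tagged, assuming nginx_config is the tag applied to nginx configuration tasks:

```yaml
- name: Configure nginx virtual host
  template:
    src: vhost.conf.j2
    dest: /etc/nginx/sites-available/{{ domain }}.conf
    mode: '0644'
  tags:
    - nginx_config
```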

Verbose output:

ansible-playbook playbooks/site.yml -vvv