ansible_proxmox_VM/README.md
Jose d171a4a7b9 refactor ♻️: Refactored all task files to use centralized helper functions from tasks/helpers.yml, improving code consistency, maintainability, and idempotency.
All task files now use centralized helper functions, ensuring idempotency across all stages. Code is cleaner, more maintainable, and no breaking changes were introduced.
2025-11-18 20:24:43 +01:00


Ansible Role: Proxmox VM → Template → Clones (CloudInit)

Production-grade automation for Debian GenericCloud VMs on Proxmox with error handling, idempotency, and comprehensive validation.

Automates the complete lifecycle:

  • Pre-flight environment validation (20+ checks)
  • Download & cache Debian GenericCloud image
  • Create base VM with error recovery
  • Configure disk, networking, Cloud-Init, TPM, GPU
  • Convert VM to template (idempotent - safe to re-run!)
  • Deploy multiple clones with custom networking
  • Per-clone error handling (failures don't cascade)

Features

  • Error Handling - Automatic retry (3x, 5-sec delay) with clear messages
  • Idempotency - Truly safe to re-run; skips already-completed operations
  • Modular Design - Helper functions encapsulate common VM operations
  • Pre-flight Validation - 20+ environment checks before execution
  • Image Caching - Downloads once, reuses on re-runs (faster!)
  • DHCP or Static IP - Flexible networking configuration
  • Cloud-Init - Users, SSH keys, passwords, timezone, packages
  • TPM 2.0 + SecureBoot - Optional UEFI firmware support
  • GPU Passthrough - Optional PCI device or VirtIO GPU
  • Disk Resize - Optional automatic disk expansion
  • Multi-Clone - Deploy multiple clones independently with per-clone error handling
  • Rich Logging - Progress tracking and debug output

Folder Structure

ansible_proxmox_VM/
├─ defaults/
│   └─ main.yml                    # All configuration (comprehensive docs)
├─ tasks/
│   ├─ main.yml                    # Orchestrator (calls subtasks)
│   ├─ preflight-checks.yml        # Environment validation (20+ checks)
│   ├─ download-image.yml          # Download Debian image (with caching)
│   ├─ create-vm.yml               # Create VM (idempotent)
│   ├─ configure-vm.yml            # Configure disk, Cloud-Init, TPM, GPU
│   ├─ create-template.yml         # Convert to template (idempotent - FIXED!)
│   ├─ create-clones.yml           # Deploy clones (per-clone error handling)
│   └─ helpers.yml                 # 9 utility functions
├─ templates/
│   ├─ cloudinit_userdata.yaml.j2  # Cloud-Init user data template
│   └─ cloudinit_vendor.yaml.j2    # Cloud-Init vendor data template
└─ README.md                        # This file

Requirements

  • Proxmox VE 7.x or 8.x installed and accessible
  • Ansible 2.9+ with SSH access to Proxmox host
  • Proxmox user with permission to run qm commands (root recommended)
  • Storage pool configured (e.g., local-lvm)
  • Snippets storage enabled for Cloud-Init (Datacenter → Storage)

Quick Start

1. Validate Environment

ansible-playbook tasks/main.yml --tags preflight -vvv

Checks Proxmox connectivity, storage, SSH keys, permissions.

2. Dry Run (Preview Changes)

ansible-playbook tasks/main.yml --check -vv

Shows what would happen without making any changes.

3. Full Deployment

ansible-playbook tasks/main.yml -i inventory

Creates VM → configures it → converts to template → deploys clones

4. Re-run (Test Idempotency)

ansible-playbook tasks/main.yml -i inventory

Second run is much faster (~30 sec)! Skips already-completed operations.

Configuration Variables

All variables are in defaults/main.yml with comprehensive inline documentation.

Base VM Configuration

vm_id: 150                           # Unique Proxmox VM ID (≥100)
hostname: debian-template-base       # VM hostname
memory: 4096                         # RAM in MB
cores: 4                             # CPU cores
cpu_type: host                       # CPU type
bridge: vmbr0                        # Network bridge
storage: local-lvm                   # Storage pool

Networking

ip_mode: dhcp                        # 'dhcp' or 'static'
ip_address: "192.168.1.60/24"       # Static IP if ip_mode: static
gateway: "192.168.1.1"              # Gateway
dns:
  - "1.1.1.1"
  - "8.8.8.8"
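
As an illustration, switching the base VM to a static address would mean overriding the variables above, for example in a group_vars file. The file location is an assumption; the values simply reuse the defaults shown above.

```yaml
# group_vars/proxmox_host.yml (hypothetical location)
ip_mode: static                  # switch from the default 'dhcp'
ip_address: "192.168.1.60/24"    # CIDR notation, as in the defaults
gateway: "192.168.1.1"
dns:
  - "1.1.1.1"
  - "8.8.8.8"
```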

Cloud-Init

ci_user: debian                      # Default user
ci_password: "SecurePass123"        # Use Vault in production!
ssh_key_path: "~/.ssh/id_rsa.pub"   # SSH public key path
timezone: "Europe/Berlin"            # Timezone
packages:
  - qemu-guest-agent
  - curl
  - htop

Advanced Options

enable_tpm: false                    # UEFI + TPM 2.0
gpu_passthrough: false               # PCI GPU passthrough
virtio_gpu: false                    # VirtIO GPU
resize_disk: true                    # Auto-resize disk
resize_size: "16G"                   # Target disk size
make_template: true                  # Convert to template
create_clones: true                  # Deploy clones

Clone Definition

clones:
  - id: 301
    hostname: app01
    ip: "192.168.1.81/24"
    gateway: "192.168.1.1"
    full: 1                          # 1=full, 0=linked
  - id: 302
    hostname: app02
    ip: "192.168.1.82/24"
    gateway: "192.168.1.1"
    full: 0                          # Linked clones are faster

See defaults/main.yml for all options with detailed documentation.
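
A minimal sketch of how a clone list like this could be turned into `qm` calls. This is not the role's actual create-clones.yml; it assumes the VM config files live under /etc/pve/qemu-server/ on the target node and that the template has the ID held in `vm_id`.

```yaml
# Sketch only: illustrates the mapping from `clones` entries to qm commands.
- name: Clone the template for each entry in `clones`
  ansible.builtin.command: >-
    qm clone {{ vm_id }} {{ item.id }}
    --name {{ item.hostname }}
    --full {{ item.full }}
  args:
    # Assumed config path; an existing file means the clone is skipped.
    creates: "/etc/pve/qemu-server/{{ item.id }}.conf"
  loop: "{{ clones }}"

- name: Apply per-clone static networking through Cloud-Init
  ansible.builtin.command: >-
    qm set {{ item.id }}
    --ipconfig0 ip={{ item.ip }},gw={{ item.gateway }}
  loop: "{{ clones }}"
```

The `creates` argument is what makes the first task safe to re-run: Ansible skips the command when the named file already exists.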

Usage

Include in Playbook

- hosts: proxmox_host
  become: true
  roles:
    - ansible_proxmox_vm
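
The play above targets a group named `proxmox_host`, so the inventory needs to define it. A minimal YAML inventory might look like the following; the file name, host name, and connection user are placeholders, not values taken from this role.

```yaml
# inventory.yml (hypothetical file name and host)
all:
  children:
    proxmox_host:
      hosts:
        pve01.example.com:
          ansible_user: root   # the README recommends root for qm commands
```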

Run Directly

ansible-playbook tasks/main.yml -i inventory

Using Tags (Run Specific Stages)

# Pre-flight checks only
ansible-playbook tasks/main.yml --tags preflight -vvv

# Create VM and template (skip clones)
ansible-playbook tasks/main.yml --skip-tags clones

# Add clones to existing template
ansible-playbook tasks/main.yml --tags clones

# Skip image re-download
ansible-playbook tasks/main.yml --skip-tags image
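
For `--tags` and `--skip-tags` to select whole stages, each stage has to carry a tag in the orchestrator. The sketch below shows one way that could look. It is an assumption about the layout, not a copy of the role's tasks/main.yml; only the tag names `preflight`, `image`, and `clones` come from the commands above.

```yaml
# Sketch of a tagged orchestrator (assumed structure).
- name: Run pre-flight checks
  ansible.builtin.import_tasks: preflight-checks.yml
  tags: [preflight]

- name: Download the Debian image
  ansible.builtin.import_tasks: download-image.yml
  tags: [image]

- name: Deploy clones
  ansible.builtin.import_tasks: create-clones.yml
  tags: [clones]
```

With `import_tasks`, the tag is applied statically to every task in the imported file, which is why a single tag can include or skip an entire stage.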

Playbook Stages (6 Stages)

| Stage | Task | Purpose | Idempotent |
|-------|------|---------|------------|
| 1 | preflight-checks.yml | Validate environment (20+ checks) | Yes |
| 2 | download-image.yml | Download/cache Debian image | Yes |
| 3 | create-vm.yml | Create base VM | Yes |
| 4 | configure-vm.yml | Configure disk, network, Cloud-Init | Yes |
| 5 | create-template.yml | Convert to template | Yes (FIXED!) |
| 6 | create-clones.yml | Deploy clones from template | Yes |

Helper Functions (tasks/helpers.yml)

All task files use centralized helper functions for consistency and maintainability:

| Helper | Purpose | Sets Variable |
|--------|---------|---------------|
| check_vm_exists | Check if VM config file exists | vm_exists |
| check_template | Check if VM is a template | is_template |
| check_vm_status | Get current VM status (running/stopped) | vm_status |
| check_disk_attached | Check if disk is attached via qm config | disk_attached |
| check_storage | Check available storage space | storage_available |
| validate_vm_id | Validate VM ID format (100-999999) | (assertions only) |
| get_vm_info | Read and parse VM config file | vm_info |
| list_vms | Get list of all VMs | vm_list |
| cleanup_snippets | Remove old Cloud-Init snippets | (side effect) |

Example: Using Helpers in Tasks

- name: Check if VM already exists
  ansible.builtin.include_tasks: helpers.yml
  vars:
    helper_task: check_vm_exists
    target_vm_id: "{{ vm_id }}"

- name: Display status
  ansible.builtin.debug:
    msg: "VM status: {{ 'EXISTS' if vm_exists else 'WILL BE CREATED' }}"
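
On the receiving side, a helper file called this way has to branch on `helper_task`. The following is a sketch of what one such branch might look like, not the actual contents of helpers.yml. The config path and the `_vm_conf` register name are assumptions.

```yaml
# helpers.yml (sketch): run only the branch named by `helper_task`.
- name: check_vm_exists | Look for the VM config file
  ansible.builtin.stat:
    path: "/etc/pve/qemu-server/{{ target_vm_id }}.conf"   # assumed location
  register: _vm_conf
  when: helper_task == 'check_vm_exists'

- name: check_vm_exists | Expose the result as vm_exists
  ansible.builtin.set_fact:
    vm_exists: "{{ _vm_conf.stat.exists }}"
  when: helper_task == 'check_vm_exists'
```

Because `stat` only reads the filesystem, a branch like this also behaves sensibly under `--check`.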

Key Improvements

Error Handling

  • Automatic retry with configurable delays (3x, 5-sec)
  • Context-aware error messages
  • Per-clone error isolation (doesn't cascade)
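
The 3x / 5-second retry behaviour maps naturally onto Ansible's `until`, `retries`, and `delay` keywords. A minimal sketch, assuming the command being retried is a plain `qm` call (the role's own tasks may wrap different commands):

```yaml
- name: Query VM status, retrying on transient failures
  ansible.builtin.command: "qm status {{ vm_id }}"
  register: qm_result
  until: qm_result.rc == 0    # stop retrying as soon as the command succeeds
  retries: 3
  delay: 5                    # seconds between attempts
  changed_when: false         # read-only command, never reports a change
```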

Idempotency

  • Safe to re-run multiple times
  • Skips already-completed operations
  • Image cached and reused
  • Template conversion idempotent (was broken in v1!)
  • Per-clone checking - skips existing clones
  • Snippet cleanup - old files removed before re-configuration

Idempotency per stage:

  • create-vm.yml - Checks if VM exists before creating
  • configure-vm.yml - Re-applies Cloud-Init (safe to re-run)
  • create-template.yml - Skips if already a template
  • create-clones.yml - Skips clones that already exist (per-clone checks)
  • download-image.yml - Skips download if file exists locally
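
To make the "skips if already a template" step concrete, here is one possible check-then-act pattern. It is a sketch rather than the role's implementation, and it relies on `qm config` printing a `template: 1` line for VMs that have already been converted.

```yaml
- name: Read the VM configuration
  ansible.builtin.command: "qm config {{ vm_id }}"
  register: vm_config
  changed_when: false   # read-only
  check_mode: false     # still run under --check so stdout is available

- name: Convert to template only when not already one
  ansible.builtin.command: "qm template {{ vm_id }}"
  when: "'template: 1' not in vm_config.stdout"
```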

Pre-flight Validation

  • Proxmox connectivity & permissions
  • Storage pool availability
  • SSH key readiness
  • IP address format validation
  • VM ID uniqueness checks
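
Some of these checks need no contact with Proxmox at all and can be expressed as plain assertions on the role variables. The example below is illustrative and covers only a few of the checks; the regular expression is a loose CIDR shape check, not full IP validation.

```yaml
- name: Validate basic inputs before touching Proxmox
  ansible.builtin.assert:
    that:
      - vm_id | int >= 100
      - ip_mode in ['dhcp', 'static']
      - "ip_mode == 'dhcp' or ip_address is match('^[0-9]{1,3}([.][0-9]{1,3}){3}/[0-9]{1,2}$')"
    fail_msg: "Invalid vm_id, ip_mode, or ip_address; see defaults/main.yml"
```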

Advanced Features

  • UEFI/TPM 2.0 support
  • GPU passthrough (PCI or VirtIO)
  • Automatic disk resize
  • Cloud-Init with user/password/SSH
  • DHCP or static networking
  • Multi-clone deployment

Testing & Validation

Preflight Checks

ansible-playbook tasks/main.yml --tags preflight -vvv

Dry Run (Preview)

ansible-playbook tasks/main.yml --check -vv

Test Idempotency

# First run
ansible-playbook tasks/main.yml -vv

# Second run (should be much faster)
ansible-playbook tasks/main.yml -vv

Cloud-Init Templates

cloudinit_userdata.yaml.j2

Configures:

  • User creation with sudo access
  • SSH key injection
  • Password authentication
  • Timezone setting
  • Package updates
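
For orientation, user data covering the items above could render to something like the following cloud-config. This is an illustrative output built from the default variables, not the role's actual template, and the public key is a placeholder.

```yaml
#cloud-config
# Illustrative rendered output; the role's template may differ.
users:
  - name: debian
    sudo: "ALL=(ALL) NOPASSWD:ALL"
    shell: /bin/bash
    lock_passwd: false               # keep password login available
    ssh_authorized_keys:
      - ssh-rsa AAAA... user@host    # placeholder public key
ssh_pwauth: true
timezone: Europe/Berlin
package_update: true
```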

cloudinit_vendor.yaml.j2

Configures:

  • Package installation
  • DNS settings (optional)

Security Notes

⚠️ Passwords: Use Ansible Vault in production:

ansible-vault create group_vars/proxmox/vault.yml

Then reference: ci_password: "{{ vault_ci_password }}"
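
As a sketch, the vault file would hold only the secret under the `vault_` name. The value is a placeholder.

```yaml
# group_vars/proxmox/vault.yml (contents before encryption)
vault_ci_password: "change-me"   # placeholder value
```

Note that Ansible only loads a group_vars directory for hosts in a group of the same name, so `group_vars/proxmox/` assumes an inventory group called `proxmox`. Playbooks that run with a vaulted file also need `--ask-vault-pass` or `--vault-password-file`.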

  • SSH Keys: Automatically validated before use
  • Permissions: Checks if user can run qm commands
  • No Hardcoded Secrets: All in variables

Best Practices

  1. Always run with --check first
  2. Validate environment with --tags preflight
  3. Skip image re-download with --skip-tags image
  4. Monitor Cloud-Init: cloud-init status inside VM
  5. Test in dev environment first
  6. Use linked clones (full: 0) for faster deployments
  7. Enable Proxmox snippets storage

Performance

  • First run: ~5-10 minutes (downloads image, creates VM)
  • Re-runs: ~30 seconds (operations skipped)
  • Linked clones: Much faster than full clones

Troubleshooting

Preflight validation fails

ansible-playbook tasks/main.yml --tags preflight -vvv

Cloud-Init not applying

# Inside VM:
cloud-init status
sudo cloud-init collect-logs   # bundles the cloud-init logs into a tarball

# Check snippets:
ls -la /var/lib/vz/snippets/

SSH key issues

# Verify SSH key
ls -la ~/.ssh/id_rsa.pub

# Run with verbose
ansible-playbook tasks/main.yml -vvv

Common Proxmox Commands

# List all VMs
qm list

# Check VM status
qm status 150

# View VM config
qm config 150

# Connect to console
qm terminal 150

# SSH into VM
ssh debian@<vm-ip>

# Check Cloud-Init (inside the VM)
cloud-init status --long

Compatibility

  • Proxmox: 7.x, 8.x (uses qm CLI)
  • Debian: Bookworm GenericCloud (configurable)
  • Ansible: 2.9+ (standard modules)
  • Backward Compatible: 100%

Support

Refer to:

  • defaults/main.yml - Complete variable documentation
  • Task files - Inline comments explaining implementation
  • Run with -vvv flag for debug output
  • Check /var/lib/vz/snippets/ for Cloud-Init files

License

Open source - use as-is for Proxmox automation.