# Ansible Role: Proxmox VM → Template → Clones (Cloud-Init)

**Production-grade automation** for Debian GenericCloud VMs on Proxmox with error handling, idempotency, and comprehensive validation.

Automates the complete lifecycle:
- ✅ Pre-flight environment validation (20+ checks)
- ✅ Download & cache Debian GenericCloud image
- ✅ Create base VM with error recovery
- ✅ Configure disk, networking, Cloud-Init, TPM, GPU
- ✅ Convert VM to template (**idempotent** - safe to re-run!)
- ✅ Deploy multiple clones with custom networking
- ✅ Per-clone error handling (failures don't cascade)

## Features

- ✅ **Error Handling** - Automatic retry (3x, 5-sec delay) with clear messages
- ✅ **Idempotency** - Safe to re-run; skips already-completed operations
- ✅ **Modular Design** - Helper functions encapsulate common VM operations
- ✅ **Pre-flight Validation** - 20+ environment checks before execution
- ✅ **Image Caching** - Downloads once, reuses on re-runs (faster!)
- ✅ **DHCP or Static IP** - Flexible networking configuration
- ✅ **Cloud-Init** - Users, SSH keys, passwords, timezone, packages
- ✅ **TPM 2.0 + SecureBoot** - Optional UEFI firmware support
- ✅ **GPU Passthrough** - Optional PCI device or VirtIO GPU
- ✅ **Disk Resize** - Optional automatic disk expansion
- ✅ **Multi-Clone** - Deploy multiple clones independently with per-clone error handling
- ✅ **Rich Logging** - Progress tracking and debug output

## Folder Structure

```
ansible_proxmox_VM/
├─ defaults/
│  └─ main.yml                     # All configuration (comprehensive docs)
├─ tasks/
│  ├─ main.yml                     # Orchestrator (calls subtasks)
│  ├─ preflight-checks.yml         # Environment validation (20+ checks)
│  ├─ download-image.yml           # Download Debian image (with caching)
│  ├─ create-vm.yml                # Create VM (idempotent)
│  ├─ configure-vm.yml             # Configure disk, Cloud-Init, TPM, GPU
│  ├─ create-template.yml          # Convert to template (idempotent - FIXED!)
│  ├─ create-clones.yml            # Deploy clones (per-clone error handling)
│  └─ helpers.yml                  # Utility functions (see Helper Functions)
├─ templates/
│  ├─ cloudinit_userdata.yaml.j2   # Cloud-Init user data template
│  └─ cloudinit_vendor.yaml.j2     # Cloud-Init vendor data template
└─ README.md                       # This file
```

## Requirements

- **Proxmox VE** 7.x or 8.x installed and accessible
- **Ansible** 2.9+ with SSH access to Proxmox host
- **Proxmox user** with permission to run `qm` commands (root recommended)
- **Storage pool** configured (e.g., `local-lvm`)
- **Snippets storage** enabled for Cloud-Init (`Datacenter → Storage`)

## Quick Start

### 1. Validate Environment
```bash
ansible-playbook tasks/main.yml --tags preflight -vvv
```
Checks Proxmox connectivity, storage, SSH keys, and permissions.

### 2. Dry Run (Preview Changes)
```bash
ansible-playbook tasks/main.yml --check -vv
```
Shows what would happen without making any changes.

### 3. Full Deployment
```bash
ansible-playbook tasks/main.yml -i inventory
```
Creates the VM → configures it → converts it to a template → deploys clones.

### 4. Re-run (Test Idempotency)
```bash
ansible-playbook tasks/main.yml -i inventory
```
The second run is much faster (~30 sec) because already-completed operations are skipped.

## Configuration Variables

All variables are in `defaults/main.yml` with comprehensive inline documentation.

### Base VM Configuration
```yaml
vm_id: 150                       # Unique Proxmox VM ID (≥100)
hostname: debian-template-base   # VM hostname
memory: 4096                     # RAM in MB
cores: 4                         # CPU cores
cpu_type: host                   # CPU type
bridge: vmbr0                    # Network bridge
storage: local-lvm               # Storage pool
```

### Networking
```yaml
ip_mode: dhcp                    # 'dhcp' or 'static'
ip_address: "192.168.1.60/24"    # Static IP if ip_mode: static
gateway: "192.168.1.1"           # Gateway
dns:
  - "1.1.1.1"
  - "8.8.8.8"
```

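As a rough illustration of how these variables could translate into a Proxmox Cloud-Init network setting, the sketch below builds the value that would be passed to `qm set <vmid> --ipconfig0`. The function name `ipconfig0_value` is hypothetical and not part of the role; the role's own tasks may assemble this string differently.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the role): build the --ipconfig0 value
# from ip_mode, ip_address and gateway as documented above.
ipconfig0_value() {
  local mode="${1:-}" ip="${2:-}" gw="${3:-}"
  case "$mode" in
    dhcp)   echo "ip=dhcp" ;;
    static) echo "ip=${ip},gw=${gw}" ;;
    *)      echo "unknown ip_mode: ${mode}" >&2; return 1 ;;
  esac
}

ipconfig0_value static "192.168.1.60/24" "192.168.1.1"
# prints: ip=192.168.1.60/24,gw=192.168.1.1
```

On a Proxmox host the result would be used as `qm set 150 --ipconfig0 "$(ipconfig0_value dhcp)"`.
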
### Cloud-Init
```yaml
ci_user: debian                      # Default user
ci_password: "SecurePass123"         # Use Vault in production!
ssh_key_path: "~/.ssh/id_rsa.pub"    # SSH public key path
timezone: "Europe/Berlin"            # Timezone
packages:
  - qemu-guest-agent
  - curl
  - htop
```

### Advanced Options
```yaml
enable_tpm: false                # UEFI + TPM 2.0
gpu_passthrough: false           # PCI GPU passthrough
virtio_gpu: false                # VirtIO GPU
resize_disk: true                # Auto-resize disk
resize_size: "16G"               # Target disk size
make_template: true              # Convert to template
create_clones: true              # Deploy clones
```

### Clone Definition
```yaml
clones:
  - id: 301
    hostname: app01
    ip: "192.168.1.81/24"
    gateway: "192.168.1.1"
    full: 1                      # 1=full, 0=linked
  - id: 302
    hostname: app02
    ip: "192.168.1.82/24"
    gateway: "192.168.1.1"
    full: 0                      # Linked clones are faster
```

See `defaults/main.yml` for all options with detailed documentation.

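To make the mapping from a clone entry to the `qm` CLI more concrete, here is a minimal dry-run sketch that only prints the commands. The function `print_clone_commands` and its argument order are hypothetical; it assumes `full` is passed straight to `qm clone --full` and that static addressing is applied with `--ipconfig0`. The role's actual clone task may do more (or differ).

```bash
#!/usr/bin/env bash
# Hypothetical dry-run helper (not part of the role): print the qm commands
# that one entry of the `clones` list could translate to.
# Arguments: template_id clone_id hostname ip_cidr gateway full(0|1)
print_clone_commands() {
  local template_id="$1" clone_id="$2" name="$3" ip="$4" gw="$5" full="$6"
  echo "qm clone ${template_id} ${clone_id} --name ${name} --full ${full}"
  echo "qm set ${clone_id} --ipconfig0 ip=${ip},gw=${gw}"
}

# First clone from the example above, cloned from template 150:
print_clone_commands 150 301 app01 192.168.1.81/24 192.168.1.1 1
```

Printing instead of executing keeps the sketch safe to try anywhere; on a Proxmox host the output could be reviewed before running it.
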
## Usage

### Include in Playbook
```yaml
- hosts: proxmox_host
  become: true
  roles:
    - ansible_proxmox_vm
```

### Run Directly
```bash
ansible-playbook tasks/main.yml -i inventory
```

### Using Tags (Run Specific Stages)
```bash
# Pre-flight checks only
ansible-playbook tasks/main.yml --tags preflight -vvv

# Create VM and template (skip clones)
ansible-playbook tasks/main.yml --skip-tags clones

# Add clones to existing template
ansible-playbook tasks/main.yml --tags clones

# Skip image re-download
ansible-playbook tasks/main.yml --skip-tags image
```

## Playbook Stages (6 Stages)

| Stage | Task | Purpose | Idempotent |
|-------|------|---------|------------|
| 1 | `preflight-checks.yml` | Validate environment (20+ checks) | ✅ Yes |
| 2 | `download-image.yml` | Download/cache Debian image | ✅ Yes |
| 3 | `create-vm.yml` | Create base VM | ✅ Yes |
| 4 | `configure-vm.yml` | Configure disk, network, Cloud-Init | ✅ Yes |
| 5 | `create-template.yml` | Convert to template | ✅ Yes (FIXED!) |
| 6 | `create-clones.yml` | Deploy clones from template | ✅ Yes |

## Helper Functions (tasks/helpers.yml)

All task files use centralized helper functions for consistency and maintainability:

| Helper | Purpose | Sets Variable |
|--------|---------|---------------|
| `check_vm_exists` | Check if VM config file exists | `vm_exists` |
| `check_template` | Check if VM is a template | `is_template` |
| `check_vm_status` | Get current VM status (running/stopped) | `vm_status` |
| `check_disk_attached` | Check if disk is attached via `qm config` | `disk_attached` |
| `check_storage` | Check available storage space | `storage_available` |
| `validate_vm_id` | Validate VM ID format (100-999999) | (assertions only) |
| `get_vm_info` | Read and parse VM config file | `vm_info` |
| `list_vms` | Get list of all VMs | `vm_list` |
| `cleanup_snippets` | Remove old Cloud-Init snippets | (side effect) |

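For readers who want to see what the first two checks amount to on the Proxmox host, here is a minimal shell sketch. It is not the role's implementation: the function names are illustrative, it assumes VM configs live at `/etc/pve/qemu-server/<vmid>.conf`, and `PVE_CONF_DIR` is a hypothetical override added so the checks can be tried against a scratch directory.

```bash
#!/usr/bin/env bash
# Assumption: QEMU guest configs live at /etc/pve/qemu-server/<vmid>.conf.
# PVE_CONF_DIR is a hypothetical override, not a Proxmox or role variable.
PVE_CONF_DIR="${PVE_CONF_DIR:-/etc/pve/qemu-server}"

# Succeeds when a config file exists for the given VM ID.
vm_exists() {
  [ -f "${PVE_CONF_DIR}/$1.conf" ]
}

# Succeeds when the config marks the VM as a template ("template: 1").
is_template() {
  grep -q '^template: 1$' "${PVE_CONF_DIR}/$1.conf" 2>/dev/null
}

if vm_exists 150; then
  if is_template 150; then
    echo "150 is a template"
  else
    echo "150 is a regular VM"
  fi
else
  echo "150 does not exist"
fi
```

Both functions communicate through their exit status, which mirrors how the helpers above set boolean variables such as `vm_exists` and `is_template`.
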
### Example: Using Helpers in Tasks
```yaml
- name: Check if VM already exists
  ansible.builtin.include_tasks: helpers.yml
  vars:
    helper_task: check_vm_exists
    target_vm_id: "{{ vm_id }}"

- name: Display status
  ansible.builtin.debug:
    msg: "VM status: {{ 'EXISTS' if vm_exists else 'WILL BE CREATED' }}"
```

## Key Improvements

### ✅ Error Handling
- Automatic retry with configurable delays (3x, 5-sec)
- Context-aware error messages
- Per-clone error isolation (doesn't cascade)

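The retry behaviour described above (up to 3 attempts, 5 seconds apart) can be sketched in plain shell as follows. The `retry` function is a hypothetical illustration, not code from the role; inside Ansible the same effect is usually expressed with the `retries`, `delay`, and `until` task keywords.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the role): run a command up to $1 times,
# sleeping $2 seconds after each failed attempt.
# Returns 0 on the first success, 1 once all attempts have failed.
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "failed after ${attempts} attempts: $*" >&2
      return 1
    fi
    echo "attempt ${n} failed, retrying in ${delay}s: $*" >&2
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage on a Proxmox host, matching the documented 3x / 5-sec policy:
# retry 3 5 qm status 150
```
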
### ✅ Idempotency
- Safe to re-run multiple times
- Skips already-completed operations
- Image cached and reused
- **Template conversion idempotent** (was broken in v1!)
- **Per-clone checking** - skips existing clones
- **Snippet cleanup** - old files removed before re-configuration

**Idempotency per stage**:
- ✅ `create-vm.yml` - Checks if VM exists before creating
- ✅ `configure-vm.yml` - Re-applies Cloud-Init (safe to re-run)
- ✅ `create-template.yml` - Skips if already a template
- ✅ `create-clones.yml` - Skips clones that already exist (per-clone checks)
- ✅ `download-image.yml` - Skips download if file exists locally

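The check-before-act pattern behind these guarantees can be illustrated with template conversion. The sketch below is not the role's task; `ensure_template` is a hypothetical name, and it assumes `qm config <vmid>` prints a `template: 1` line for VMs that are already templates.

```bash
#!/usr/bin/env bash
# Hypothetical guard (not part of the role): convert a VM to a template only
# if it is not one already, so repeated runs change nothing.
# Assumes `qm config <vmid>` prints "template: 1" for templates.
ensure_template() {
  local vmid="$1"
  if qm config "$vmid" | grep -q '^template: 1$'; then
    echo "skipped: ${vmid} is already a template"
  else
    qm template "$vmid" && echo "changed: ${vmid} converted to template"
  fi
}

# Usage on a Proxmox host:
# ensure_template 150
```

The first call reports `changed`, and every later call reports `skipped`, which is the behaviour the stage table marks as idempotent.
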
### ✅ Pre-flight Validation
- Proxmox connectivity & permissions
- Storage pool availability
- SSH key readiness
- IP address format validation
- VM ID uniqueness checks

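Two of these checks are simple enough to sketch in shell. The validators below are hypothetical stand-ins, not the role's assertions: `valid_vm_id` uses the 100-999999 range documented for `validate_vm_id`, and `valid_ipv4_cidr` accepts only IPv4 addresses in CIDR notation such as the `ip_address` example.

```bash
#!/usr/bin/env bash
# Hypothetical validators (not part of the role) mirroring two pre-flight checks.

# VM ID must be an integer in the documented 100-999999 range.
valid_vm_id() {
  [[ "$1" =~ ^[0-9]+$ ]] && [ "$1" -ge 100 ] && [ "$1" -le 999999 ]
}

# IPv4 address in CIDR notation: four octets 0-255 and a prefix length 0-32.
valid_ipv4_cidr() {
  local re='^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})/([0-9]{1,2})$'
  [[ "$1" =~ $re ]] || return 1
  local i
  for i in 1 2 3 4; do
    [ "${BASH_REMATCH[$i]}" -le 255 ] || return 1
  done
  [ "${BASH_REMATCH[5]}" -le 32 ]
}

valid_vm_id 150 && echo "vm_id ok"
valid_ipv4_cidr "192.168.1.60/24" && echo "ip_address ok"
```
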
### ✅ Advanced Features
- UEFI/TPM 2.0 support
- GPU passthrough (PCI or VirtIO)
- Automatic disk resize
- Cloud-Init with user/password/SSH
- DHCP or static networking
- Multi-clone deployment

## Testing & Validation

### Preflight Checks
```bash
ansible-playbook tasks/main.yml --tags preflight -vvv
```

### Dry Run (Preview)
```bash
ansible-playbook tasks/main.yml --check -vv
```

### Test Idempotency
```bash
# First run
ansible-playbook tasks/main.yml -vv

# Second run (should be much faster)
ansible-playbook tasks/main.yml -vv
```

## Cloud-Init Templates

### `cloudinit_userdata.yaml.j2`
Configures:
- User creation with sudo access
- SSH key injection
- Password authentication
- Timezone setting
- Package updates

### `cloudinit_vendor.yaml.j2`
Configures:
- Package installation
- DNS settings (optional)

## Security Notes

⚠️ **Passwords**: Use Ansible Vault in production:
```bash
ansible-vault create group_vars/proxmox/vault.yml
```
Then reference: `ci_password: "{{ vault_ci_password }}"`

- ✅ **SSH Keys**: Automatically validated before use
- ✅ **Permissions**: Checks if user can run `qm` commands
- ✅ **No Hardcoded Secrets**: All in variables

## Best Practices

1. Always run with `--check` first
2. Validate environment with `--tags preflight`
3. Skip image re-download with `--skip-tags image`
4. Monitor Cloud-Init: `cloud-init status` inside VM
5. Test in dev environment first
6. Use linked clones (`full: 0`) for faster deployments
7. Enable Proxmox snippets storage

## Performance

- **First run**: ~5-10 minutes (downloads image, creates VM)
- **Re-runs**: ~30 seconds (operations skipped)
- **Linked clones**: Created much faster than full clones, since the template's disk is referenced rather than copied

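Most of the re-run speed-up comes from not downloading the image again. A minimal sketch of that cache check is shown below; `fetch_if_missing` is a hypothetical function, and the URL and destination are supplied by the caller rather than taken from the role's defaults.

```bash
#!/usr/bin/env bash
# Hypothetical cache check (not part of the role): download only when the
# destination file is missing or empty.
fetch_if_missing() {
  local url="$1" dest="$2"
  if [ -s "$dest" ]; then
    echo "cached: ${dest}"
    return 0
  fi
  # Download to a temporary name first so an interrupted transfer is never
  # mistaken for a complete cached image.
  curl --fail --location --silent --show-error --output "${dest}.part" "$url" &&
    mv "${dest}.part" "$dest" &&
    echo "downloaded: ${dest}"
}

# Usage (placeholders, not the role's real URL or path):
# fetch_if_missing "<image-url>" "<cache-dir>/debian-genericcloud.qcow2"
```

Writing to a `.part` file and renaming it afterwards is a design choice of this sketch; a plain existence check alone would treat a half-finished download as a valid cache entry.
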
## Troubleshooting

### Preflight validation fails
```bash
ansible-playbook tasks/main.yml --tags preflight -vvv
```

### Cloud-Init not applying
```bash
# Inside VM:
cloud-init status --long
sudo tail -n 50 /var/log/cloud-init-output.log

# On the Proxmox host, check snippets:
ls -la /var/lib/vz/snippets/
```

### SSH key issues
```bash
# Verify SSH key
ls -la ~/.ssh/id_rsa.pub

# Run with verbose
ansible-playbook tasks/main.yml -vvv
```

## Common Proxmox Commands

```bash
# List all VMs
qm list

# Check VM status
qm status 150

# View VM config
qm config 150

# Connect to console
qm terminal 150

# SSH into VM
ssh debian@<vm-ip>

# Check Cloud-Init (inside the VM)
cloud-init status --long
```

## Compatibility

- **Proxmox**: 7.x, 8.x (uses `qm` CLI)
- **Debian**: Bookworm GenericCloud (configurable)
- **Ansible**: 2.9+ (standard modules)
- **Backward Compatible**: 100% ✅

## Support

Refer to:
- `defaults/main.yml` - Complete variable documentation
- Task files - Inline comments explaining implementation
- Run with `-vvv` flag for debug output
- Check `/var/lib/vz/snippets/` for Cloud-Init files

## License

Open source - use as-is for Proxmox automation.