# Ansible Role: Proxmox VM → Template → Clones (CloudInit)
**Production-grade automation** for Debian GenericCloud VMs on Proxmox with error handling, idempotency, and comprehensive validation.
Automates the complete lifecycle:
- ✅ Pre-flight environment validation (20+ checks)
- ✅ Download & cache Debian GenericCloud image
- ✅ Create base VM with error recovery
- ✅ Configure disk, networking, Cloud-Init, TPM, GPU
- ✅ Convert VM to template (**idempotent** - safe to re-run!)
- ✅ Deploy multiple clones with custom networking
- ✅ Per-clone error handling (failures don't cascade)
## Features
- **Error Handling** - Automatic retry (3x, 5-sec delay) with clear messages
- **Idempotency** - Truly safe to re-run; skips already-completed operations
- **Modular Design** - Helper functions encapsulate common VM operations
- **Pre-flight Validation** - 20+ environment checks before execution
- **Image Caching** - Downloads once, reuses on re-runs (faster!)
- **DHCP or Static IP** - Flexible networking configuration
- **Cloud-Init** - Users, SSH keys, passwords, timezone, packages
- **TPM 2.0 + SecureBoot** - Optional UEFI firmware support
- **GPU Passthrough** - Optional PCI device or VirtIO GPU
- **Disk Resize** - Optional automatic disk expansion
- **Multi-Clone** - Deploy multiple clones independently with per-clone error handling
- **Rich Logging** - Progress tracking and debug output
## Folder Structure
```
ansible_proxmox_VM/
├─ defaults/
│ └─ main.yml # All configuration (comprehensive docs)
├─ tasks/
│ ├─ main.yml # Orchestrator (calls subtasks)
│ ├─ preflight-checks.yml # Environment validation (20+ checks)
│ ├─ download-image.yml # Download Debian image (with caching)
│ ├─ create-vm.yml # Create VM (idempotent)
│ ├─ configure-vm.yml # Configure disk, Cloud-Init, TPM, GPU
│ ├─ create-template.yml # Convert to template (idempotent - FIXED!)
│ ├─ create-clones.yml # Deploy clones (per-clone error handling)
│ └─ helpers.yml # Shared utility helpers (see Helper Functions)
├─ templates/
│ ├─ cloudinit_userdata.yaml.j2 # Cloud-Init user data template
│ └─ cloudinit_vendor.yaml.j2 # Cloud-Init vendor data template
└─ README.md # This file
```
## Requirements
- **Proxmox VE** 7.x or 8.x installed and accessible
- **Ansible** 2.9+ with SSH access to Proxmox host
- **Proxmox user** with permission to run `qm` commands (root recommended)
- **Storage pool** configured (e.g., `local-lvm`)
- **Snippets storage** enabled for Cloud-Init (`Datacenter → Storage`)
## Quick Start
### 1. Validate Environment
```bash
ansible-playbook tasks/main.yml --tags preflight -vvv
```
Checks Proxmox connectivity, storage, SSH keys, permissions.
### 2. Dry Run (Preview Changes)
```bash
ansible-playbook tasks/main.yml --check -vv
```
Shows what would happen without making any changes.
### 3. Full Deployment
```bash
ansible-playbook tasks/main.yml -i inventory
```
Creates the VM → configures it → converts it to a template → deploys clones.
### 4. Re-run (Test Idempotency)
```bash
ansible-playbook tasks/main.yml -i inventory
```
The second run should be much faster (about 30 seconds) because already-completed operations are skipped.
## Configuration Variables
All variables are in `defaults/main.yml` with comprehensive inline documentation.
### Base VM Configuration
```yaml
vm_id: 150 # Unique Proxmox VM ID (≥100)
hostname: debian-template-base # VM hostname
memory: 4096 # RAM in MB
cores: 4 # CPU cores
cpu_type: host # CPU type
bridge: vmbr0 # Network bridge
storage: local-lvm # Storage pool
```
### Networking
```yaml
ip_mode: dhcp # 'dhcp' or 'static'
ip_address: "192.168.1.60/24" # Static IP if ip_mode: static
gateway: "192.168.1.1" # Gateway
dns:
  - "1.1.1.1"
  - "8.8.8.8"
```
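As a rough sketch of how these variables could map onto Proxmox's Cloud-Init network options, the task below builds an `--ipconfig0` value from `ip_mode`. It is an assumption about the role's internals, not a copy of them; `--ipconfig0` and `--nameserver` are standard `qm set` options.

```yaml
# Hypothetical task: translate ip_mode / ip_address / gateway / dns into qm options.
- name: Apply Cloud-Init network settings to the base VM
  ansible.builtin.command: >-
    qm set {{ vm_id }}
    --ipconfig0 {{ ipconfig0 }}
    --nameserver "{{ dns | join(' ') }}"
  vars:
    # DHCP needs only ip=dhcp; static mode needs the CIDR address and the gateway.
    ipconfig0: "{{ 'ip=dhcp' if ip_mode == 'dhcp' else 'ip=' ~ ip_address ~ ',gw=' ~ gateway }}"
```

With the defaults shown above this would resolve to `--ipconfig0 ip=dhcp`; in static mode it would resolve to `--ipconfig0 ip=192.168.1.60/24,gw=192.168.1.1`.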
### Cloud-Init
```yaml
ci_user: debian # Default user
ci_password: "SecurePass123" # Use Vault in production!
ssh_key_path: "~/.ssh/id_rsa.pub" # SSH public key path
timezone: "Europe/Berlin" # Timezone
packages:
  - qemu-guest-agent
  - curl
  - htop
```
### Advanced Options
```yaml
enable_tpm: false # UEFI + TPM 2.0
gpu_passthrough: false # PCI GPU passthrough
virtio_gpu: false # VirtIO GPU
resize_disk: true # Auto-resize disk
resize_size: "16G" # Target disk size
make_template: true # Convert to template
create_clones: true # Deploy clones
```
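To illustrate what these toggles might drive, here is a minimal sketch of conditional `qm` calls. The disk name `scsi0` and the variable `gpu_pci_address` are assumptions made for illustration and are not documented above; no idempotency guards are shown.

```yaml
# Hypothetical tasks; scsi0 and gpu_pci_address are assumed names.
- name: Enable UEFI firmware and a TPM 2.0 state volume
  ansible.builtin.command: >-
    qm set {{ vm_id }}
    --bios ovmf
    --machine q35
    --efidisk0 {{ storage }}:1,efitype=4m,pre-enrolled-keys=1
    --tpmstate0 {{ storage }}:1,version=v2.0
  when: enable_tpm | bool

- name: Use a VirtIO display adapter
  ansible.builtin.command: qm set {{ vm_id }} --vga virtio
  when: virtio_gpu | bool

- name: Pass a host PCI device through to the VM
  ansible.builtin.command: qm set {{ vm_id }} --hostpci0 {{ gpu_pci_address }}
  when: gpu_passthrough | bool

- name: Grow the boot disk to the target size
  ansible.builtin.command: qm resize {{ vm_id }} scsi0 {{ resize_size }}
  when: resize_disk | bool
```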
### Clone Definition
```yaml
clones:
  - id: 301
    hostname: app01
    ip: "192.168.1.81/24"
    gateway: "192.168.1.1"
    full: 1 # 1=full, 0=linked
  - id: 302
    hostname: app02
    ip: "192.168.1.82/24"
    gateway: "192.168.1.1"
    full: 0 # Linked clones are faster
```
See `defaults/main.yml` for all options with detailed documentation.
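
A list like this could be consumed with a loop similar to the sketch below. This is an illustration under the assumption that clones are created with `qm clone` and addressed with `qm set --ipconfig0`; the role's actual `create-clones.yml` may differ.

```yaml
# Hypothetical loop over the `clones` list.
- name: Clone the template for each entry in clones
  ansible.builtin.command: >-
    qm clone {{ vm_id }} {{ item.id }}
    --name {{ item.hostname }}
    --full {{ item.full }}
  args:
    # Skip the clone when its config file already exists on this node.
    creates: "/etc/pve/qemu-server/{{ item.id }}.conf"
  loop: "{{ clones }}"
  loop_control:
    label: "{{ item.hostname }}"

- name: Set each clone's static address and gateway
  ansible.builtin.command: >-
    qm set {{ item.id }}
    --ipconfig0 ip={{ item.ip }},gw={{ item.gateway }}
  loop: "{{ clones }}"
  loop_control:
    label: "{{ item.hostname }}"
```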
## Usage
### Include in Playbook
```yaml
- hosts: proxmox_host
  become: true
  roles:
    - ansible_proxmox_VM # Must match the role directory name (case-sensitive)
```
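Defaults can also be overridden per play. The sketch below assumes the role directory name is used as the role name; the values are purely illustrative.

```yaml
# Illustrative playbook: override a few defaults and skip clone deployment.
- hosts: proxmox_host
  become: true
  roles:
    - role: ansible_proxmox_VM
      vars:
        vm_id: 160
        hostname: debian-template-test
        ip_mode: static
        ip_address: "192.168.1.61/24"
        gateway: "192.168.1.1"
        create_clones: false
```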
### Run Directly
```bash
ansible-playbook tasks/main.yml -i inventory
```
### Using Tags (Run Specific Stages)
```bash
# Pre-flight checks only
ansible-playbook tasks/main.yml --tags preflight -vvv
# Create VM and template (skip clones)
ansible-playbook tasks/main.yml --skip-tags clones
# Add clones to existing template
ansible-playbook tasks/main.yml --tags clones
# Skip image re-download
ansible-playbook tasks/main.yml --skip-tags image
```
## Playbook Stages (6 Stages)
| Stage | Task | Purpose | Idempotent |
|-------|------|---------|-----------|
| 1 | `preflight-checks.yml` | Validate environment (20+ checks) | ✅ Yes |
| 2 | `download-image.yml` | Download/cache Debian image | ✅ Yes |
| 3 | `create-vm.yml` | Create base VM | ✅ Yes |
| 4 | `configure-vm.yml` | Configure disk, network, Cloud-Init | ✅ Yes |
| 5 | `create-template.yml` | Convert to template | ✅ Yes (FIXED!) |
| 6 | `create-clones.yml` | Deploy clones from template | ✅ Yes |
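
One way an orchestrator could wire these stages to tags is sketched below. Only the `preflight`, `image`, and `clones` tags appear in this README; the other tag names are assumptions. `import_tasks` is used here because tags on a static import are inherited by the imported tasks.

```yaml
# Hypothetical tasks/main.yml layout.
- name: Stage 1 - validate environment
  ansible.builtin.import_tasks: preflight-checks.yml
  tags: [preflight]

- name: Stage 2 - download or reuse the cloud image
  ansible.builtin.import_tasks: download-image.yml
  tags: [image]

- name: Stage 3 - create the base VM
  ansible.builtin.import_tasks: create-vm.yml
  tags: [vm]          # assumed tag name

- name: Stage 4 - configure disk, network, Cloud-Init
  ansible.builtin.import_tasks: configure-vm.yml
  tags: [configure]   # assumed tag name

- name: Stage 5 - convert to template
  ansible.builtin.import_tasks: create-template.yml
  when: make_template | bool
  tags: [template]    # assumed tag name

- name: Stage 6 - deploy clones
  ansible.builtin.import_tasks: create-clones.yml
  when: create_clones | bool
  tags: [clones]
```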
## Helper Functions (tasks/helpers.yml)
All task files use centralized helper functions for consistency and maintainability:
| Helper | Purpose | Sets Variable |
|--------|---------|----------------|
| `check_vm_exists` | Check if VM config file exists | `vm_exists` |
| `check_template` | Check if VM is a template | `is_template` |
| `check_vm_status` | Get current VM status (running/stopped) | `vm_status` |
| `check_disk_attached` | Check if disk is attached via `qm config` | `disk_attached` |
| `check_storage` | Check available storage space | `storage_available` |
| `validate_vm_id` | Validate VM ID format (100-999999) | (assertions only) |
| `get_vm_info` | Read and parse VM config file | `vm_info` |
| `list_vms` | Get list of all VMs | `vm_list` |
| `cleanup_snippets` | Remove old Cloud-Init snippets | (side effect) |
### Example: Using Helpers in Tasks
```yaml
- name: Check if VM already exists
  ansible.builtin.include_tasks: helpers.yml
  vars:
    helper_task: check_vm_exists
    target_vm_id: "{{ vm_id }}"

- name: Display status
  ansible.builtin.debug:
    msg: "VM status: {{ 'EXISTS' if vm_exists else 'WILL BE CREATED' }}"
```
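For orientation, a helper such as `check_vm_exists` could be implemented roughly as follows. This is a hypothetical excerpt, assuming `helpers.yml` dispatches on `helper_task` and that the VM config lives at the standard Proxmox path.

```yaml
# Hypothetical excerpt of tasks/helpers.yml.
- name: Look for the VM's config file
  ansible.builtin.stat:
    path: "/etc/pve/qemu-server/{{ target_vm_id }}.conf"
  register: vm_conf
  when: helper_task == 'check_vm_exists'

- name: Expose the result as vm_exists
  ansible.builtin.set_fact:
    vm_exists: "{{ vm_conf.stat.exists }}"
  when: helper_task == 'check_vm_exists'
```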
## Key Improvements
### ✅ Error Handling
- Automatic retry with configurable delays (3x, 5-sec)
- Context-aware error messages
- Per-clone error isolation (doesn't cascade)
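
The retry and isolation patterns described above can be sketched with standard Ansible keywords. The task names, the `image_path` variable, and the `clone` loop variable are assumptions for illustration; the second task would sit in a file included once per clone.

```yaml
# Hypothetical: retry a qm call up to 3 times, 5 seconds apart.
- name: Import the cloud image as a VM disk
  ansible.builtin.command: qm importdisk {{ vm_id }} {{ image_path }} {{ storage }}
  register: import_result
  retries: 3
  delay: 5
  until: import_result.rc == 0

# Hypothetical per-clone file, run via include_tasks with loop_var: clone.
- name: Deploy one clone without aborting the others
  block:
    - name: Clone the template
      ansible.builtin.command: >-
        qm clone {{ vm_id }} {{ clone.id }} --name {{ clone.hostname }}
  rescue:
    - name: Report the failure and continue with the next clone
      ansible.builtin.debug:
        msg: "Clone {{ clone.id }} failed: {{ ansible_failed_result.msg | default('unknown error') }}"
```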
### ✅ Idempotency
- Safe to re-run multiple times
- Skips already-completed operations
- Image cached and reused
- **Template conversion idempotent** (was broken in v1!)
- **Per-clone checking** - skips existing clones
- **Snippet cleanup** - old files removed before re-configuration
**Idempotency per stage**:
- `create-vm.yml` - Checks if VM exists before creating
- `configure-vm.yml` - Re-applies Cloud-Init (safe to re-run)
- `create-template.yml` - Skips if already a template
- `create-clones.yml` - Skips clones that already exist (per-clone checks)
- `download-image.yml` - Skips download if file exists locally
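
Two of these guards might look like the sketch below. It assumes the template flag is read from `qm config` output and that the image location is held in variables named `image_url` and `image_path`, which are hypothetical names.

```yaml
# Hypothetical guard for download-image.yml: get_url leaves an existing file alone.
- name: Download the image only when no cached copy exists
  ansible.builtin.get_url:
    url: "{{ image_url }}"
    dest: "{{ image_path }}"
    force: false

# Hypothetical guard for create-template.yml.
- name: Read the current VM config
  ansible.builtin.command: qm config {{ vm_id }}
  register: vm_config
  changed_when: false

- name: Convert the VM to a template only if it is not one already
  ansible.builtin.command: qm template {{ vm_id }}
  when: "'template: 1' not in vm_config.stdout"
```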
### ✅ Pre-flight Validation
- Proxmox connectivity & permissions
- Storage pool availability
- SSH key readiness
- IP address format validation
- VM ID uniqueness checks
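
A few of these checks could be expressed as in the sketch below. It is an illustration only, not the role's actual 20+ checks, and the regular expression is a loose format check rather than full IP validation.

```yaml
# Hypothetical preflight tasks.
- name: Validate VM ID and networking inputs
  ansible.builtin.assert:
    that:
      - "vm_id | int >= 100"
      - "ip_mode in ['dhcp', 'static']"
      - "ip_mode == 'dhcp' or ip_address is match('^([0-9]{1,3}[.]){3}[0-9]{1,3}/[0-9]{1,2}$')"
    fail_msg: "Check vm_id, ip_mode, and ip_address in defaults/main.yml"

- name: Confirm the qm CLI is usable by the current user
  ansible.builtin.command: qm list
  changed_when: false

- name: Confirm the storage pool is defined
  ansible.builtin.command: pvesm status --storage {{ storage }}
  changed_when: false
```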
### ✅ Advanced Features
- UEFI/TPM 2.0 support
- GPU passthrough (PCI or VirtIO)
- Automatic disk resize
- Cloud-Init with user/password/SSH
- DHCP or static networking
- Multi-clone deployment
## Testing & Validation
### Preflight Checks
```bash
ansible-playbook tasks/main.yml --tags preflight -vvv
```
### Dry Run (Preview)
```bash
ansible-playbook tasks/main.yml --check -vv
```
### Test Idempotency
```bash
# First run
ansible-playbook tasks/main.yml -vv
# Second run (should be much faster)
ansible-playbook tasks/main.yml -vv
```
## Cloud-Init Templates
### `cloudinit_userdata.yaml.j2`
Configures:
- User creation with sudo access
- SSH key injection
- Password authentication
- Timezone setting
- Package updates
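
A user-data template covering these points might look roughly like the sketch below. It is a hypothetical template body: `ssh_public_key` is an assumed variable holding the key text, and the `chpasswd.users` form assumes a reasonably recent cloud-init release.

```yaml
#cloud-config
# Hypothetical body for cloudinit_userdata.yaml.j2.
users:
  - name: "{{ ci_user }}"
    sudo: "ALL=(ALL) NOPASSWD:ALL"
    shell: /bin/bash
    lock_passwd: false
    ssh_authorized_keys:
      - "{{ ssh_public_key }}"
chpasswd:
  expire: false
  users:
    - name: "{{ ci_user }}"
      password: "{{ ci_password }}"
      type: text
ssh_pwauth: true
timezone: "{{ timezone }}"
package_update: true
```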
### `cloudinit_vendor.yaml.j2`
Configures:
- Package installation
- DNS settings (optional)
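
Rendering the templates and attaching them to the VM could be done along these lines. The snippet file names are assumptions, and the sketch assumes the `local` storage (backed by `/var/lib/vz`) has snippets enabled.

```yaml
# Hypothetical tasks: render both templates and attach them with --cicustom.
- name: Render Cloud-Init user data into snippets storage
  ansible.builtin.template:
    src: cloudinit_userdata.yaml.j2
    dest: "/var/lib/vz/snippets/{{ vm_id }}-user.yaml"
    mode: "0644"

- name: Render Cloud-Init vendor data into snippets storage
  ansible.builtin.template:
    src: cloudinit_vendor.yaml.j2
    dest: "/var/lib/vz/snippets/{{ vm_id }}-vendor.yaml"
    mode: "0644"

- name: Attach both snippets to the VM
  ansible.builtin.command: >-
    qm set {{ vm_id }}
    --cicustom user=local:snippets/{{ vm_id }}-user.yaml,vendor=local:snippets/{{ vm_id }}-vendor.yaml
```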
## Security Notes
⚠️ **Passwords**: Use Ansible Vault in production:
```bash
ansible-vault create group_vars/proxmox/vault.yml
```
Then reference: `ci_password: "{{ vault_ci_password }}"`
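
As a minimal sketch, the two files could be laid out as follows. The plain-text file name and the placeholder value are assumptions; only `vault.yml` would be encrypted.

```yaml
# group_vars/proxmox/vault.yml, encrypted with ansible-vault (placeholder value)
vault_ci_password: "change-me"
---
# group_vars/proxmox/vars.yml, plain text (file name is an assumption)
ci_password: "{{ vault_ci_password }}"
```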
- **SSH Keys**: Automatically validated before use
- **Permissions**: Checks if user can run `qm` commands
- **No Hardcoded Secrets**: All in variables
## Best Practices
1. Always run with `--check` first
2. Validate environment with `--tags preflight`
3. Skip image re-download with `--skip-tags image`
4. Monitor Cloud-Init: `cloud-init status` inside VM
5. Test in dev environment first
6. Use linked clones (`full: 0`) for faster deployments
7. Enable Proxmox snippets storage
## Performance
- **First run**: ~5-10 minutes (downloads image, creates VM)
- **Re-runs**: ~30 seconds (operations skipped)
- **Linked clones**: Much faster than full clones
## Troubleshooting
### Preflight validation fails
```bash
ansible-playbook tasks/main.yml --tags preflight -vvv
```
### Cloud-Init not applying
```bash
# Inside VM:
cloud-init status
sudo tail -n 50 /var/log/cloud-init-output.log
# Check snippets:
ls -la /var/lib/vz/snippets/
```
### SSH key issues
```bash
# Verify SSH key
ls -la ~/.ssh/id_rsa.pub
# Run with verbose
ansible-playbook tasks/main.yml -vvv
```
## Common Proxmox Commands
```bash
# List all VMs
qm list
# Check VM status
qm status 150
# View VM config
qm config 150
# Connect to console
qm terminal 150
# SSH into VM
ssh debian@<vm-ip>
# Check Cloud-Init
cloud-init status --long
```
## Compatibility
- **Proxmox**: 7.x, 8.x (uses `qm` CLI)
- **Debian**: Bookworm GenericCloud (configurable)
- **Ansible**: 2.9+ (standard modules)
- **Backward Compatible**: 100% ✅
## Support
Refer to:
- `defaults/main.yml` - Complete variable documentation
- Task files - Inline comments explaining implementation
- Run with `-vvv` flag for debug output
- Check `/var/lib/vz/snippets/` for Cloud-Init files
## License
Open source - use as-is for Proxmox automation.