ansible_proxmox_VM/README.md
Jose d171a4a7b9 refactor ♻️: Refactored all task files to use centralized helper functions from tasks/helpers.yml, improving code consistency, maintainability, and idempotency.
All task files now use centralized helper functions, ensuring idempotency across all stages. Code is cleaner, more maintainable, and no breaking changes were introduced.
2025-11-18 20:24:43 +01:00

# Ansible Role: Proxmox VM → Template → Clones (Cloud-Init)
**Production-grade automation** for Debian GenericCloud VMs on Proxmox with error handling, idempotency, and comprehensive validation.
Automates the complete lifecycle:
- ✅ Pre-flight environment validation (20+ checks)
- ✅ Download & cache Debian GenericCloud image
- ✅ Create base VM with error recovery
- ✅ Configure disk, networking, Cloud-Init, TPM, GPU
- ✅ Convert VM to template (**idempotent** - safe to re-run!)
- ✅ Deploy multiple clones with custom networking
- ✅ Per-clone error handling (failures don't cascade)
## Features
- **Error Handling** - Automatic retry (3x, 5-sec delay) with clear messages
- **Idempotency** - Truly safe to re-run; skips already-completed operations
- **Modular Design** - Helper functions encapsulate common VM operations
- **Pre-flight Validation** - 20+ environment checks before execution
- **Image Caching** - Downloads once, reuses on re-runs (faster!)
- **DHCP or Static IP** - Flexible networking configuration
- **Cloud-Init** - Users, SSH keys, passwords, timezone, packages
- **TPM 2.0 + SecureBoot** - Optional UEFI firmware support
- **GPU Passthrough** - Optional PCI device or VirtIO GPU
- **Disk Resize** - Optional automatic disk expansion
- **Multi-Clone** - Deploy multiple clones independently with per-clone error handling
- **Rich Logging** - Progress tracking and debug output
## Folder Structure
```
ansible_proxmox_VM/
├─ defaults/
│ └─ main.yml # All configuration (comprehensive docs)
├─ tasks/
│ ├─ main.yml # Orchestrator (calls subtasks)
│ ├─ preflight-checks.yml # Environment validation (20+ checks)
│ ├─ download-image.yml # Download Debian image (with caching)
│ ├─ create-vm.yml # Create VM (idempotent)
│ ├─ configure-vm.yml # Configure disk, Cloud-Init, TPM, GPU
│ ├─ create-template.yml # Convert to template (idempotent - FIXED!)
│ ├─ create-clones.yml # Deploy clones (per-clone error handling)
│ └─ helpers.yml # Shared helper tasks (see Helper Functions)
├─ templates/
│ ├─ cloudinit_userdata.yaml.j2 # Cloud-Init user data template
│ └─ cloudinit_vendor.yaml.j2 # Cloud-Init vendor data template
└─ README.md # This file
```
## Requirements
- **Proxmox VE** 7.x or 8.x installed and accessible
- **Ansible** 2.9+ with SSH access to Proxmox host
- **Proxmox user** with permission to run `qm` commands (root recommended)
- **Storage pool** configured (e.g., `local-lvm`)
- **Snippets storage** enabled for Cloud-Init (`Datacenter → Storage`)
## Quick Start
### 1. Validate Environment
```bash
ansible-playbook tasks/main.yml --tags preflight -vvv
```
Checks Proxmox connectivity, storage, SSH keys, permissions.
### 2. Dry Run (Preview Changes)
```bash
ansible-playbook tasks/main.yml --check -vv
```
Shows what would happen without making any changes.
### 3. Full Deployment
```bash
ansible-playbook tasks/main.yml -i inventory
```
Creates the VM → configures it → converts it to a template → deploys clones.
### 4. Re-run (Test Idempotency)
```bash
ansible-playbook tasks/main.yml -i inventory
```
Second run is much faster (~30 sec)! Skips already-completed operations.
## Configuration Variables
All variables are in `defaults/main.yml` with comprehensive inline documentation.
### Base VM Configuration
```yaml
vm_id: 150 # Unique Proxmox VM ID (≥100)
hostname: debian-template-base # VM hostname
memory: 4096 # RAM in MB
cores: 4 # CPU cores
cpu_type: host # CPU type
bridge: vmbr0 # Network bridge
storage: local-lvm # Storage pool
```
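As a rough illustration of how these variables could map onto the `qm` CLI, the sketch below creates the VM only when its config file is missing. The task layout and the `virtio` NIC model are assumptions, not the role's actual `create-vm.yml`.

```yaml
# Sketch only: the role's real create-vm.yml may differ.
- name: Create base VM unless its config file already exists
  ansible.builtin.command:
    argv:
      - qm
      - create
      - "{{ vm_id }}"
      - --name
      - "{{ hostname }}"
      - --memory
      - "{{ memory }}"
      - --cores
      - "{{ cores }}"
      - --cpu
      - "{{ cpu_type }}"
      - --net0
      - "virtio,bridge={{ bridge }}"   # NIC model is an assumption
    # Proxmox keeps one config file per VM; if it exists the task is skipped.
    creates: "/etc/pve/qemu-server/{{ vm_id }}.conf"
```

Using `creates` keeps the task idempotent without a separate existence check.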
### Networking
```yaml
ip_mode: dhcp # 'dhcp' or 'static'
ip_address: "192.168.1.60/24" # Static IP if ip_mode: static
gateway: "192.168.1.1" # Gateway
dns:
- "1.1.1.1"
- "8.8.8.8"
```
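The two modes plausibly translate into different `--ipconfig0` values for `qm set`. The following is a minimal sketch under that assumption; the task names and the use of `--nameserver` for the `dns` list are illustrative, not taken from the role.

```yaml
# Sketch only: how ip_mode might select the Cloud-Init network settings.
- name: Use DHCP on the first interface
  ansible.builtin.command:
    argv:
      - qm
      - set
      - "{{ vm_id }}"
      - --ipconfig0
      - ip=dhcp
  when: ip_mode == 'dhcp'

- name: Use a static address, gateway, and DNS servers
  ansible.builtin.command:
    argv:
      - qm
      - set
      - "{{ vm_id }}"
      - --ipconfig0
      - "ip={{ ip_address }},gw={{ gateway }}"
      - --nameserver
      - "{{ dns | join(' ') }}"   # space-separated list of resolvers
  when: ip_mode == 'static'
```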
### Cloud-Init
```yaml
ci_user: debian # Default user
ci_password: "SecurePass123" # Use Vault in production!
ssh_key_path: "~/.ssh/id_rsa.pub" # SSH public key path
timezone: "Europe/Berlin" # Timezone
packages:
- qemu-guest-agent
- curl
- htop
```
### Advanced Options
```yaml
enable_tpm: false # UEFI + TPM 2.0
gpu_passthrough: false # PCI GPU passthrough
virtio_gpu: false # VirtIO GPU
resize_disk: true # Auto-resize disk
resize_size: "16G" # Target disk size
make_template: true # Convert to template
create_clones: true # Deploy clones
```
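To make the toggles concrete, here is a hedged sketch of how `resize_disk` and `enable_tpm` might gate `qm` calls. The disk name `scsi0` and the EFI/TPM volume options are assumptions; check them against your VM's `qm config` output before reuse.

```yaml
# Sketch only: option-gated tasks, not the role's actual configure-vm.yml.
- name: Grow the boot disk to the target size
  ansible.builtin.command:
    argv:
      - qm
      - resize
      - "{{ vm_id }}"
      - scsi0                      # assumed disk slot
      - "{{ resize_size }}"
  when: resize_disk | bool

- name: Switch to UEFI firmware and add a TPM 2.0 state volume
  ansible.builtin.command:
    argv:
      - qm
      - set
      - "{{ vm_id }}"
      - --bios
      - ovmf
      - --efidisk0
      - "{{ storage }}:1,efitype=4m,pre-enrolled-keys=1"
      - --tpmstate0
      - "{{ storage }}:1,version=v2.0"
  when: enable_tpm | bool
```

Note that `qm resize` cannot shrink a disk, so `resize_size` should never be smaller than the current size.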
### Clone Definition
```yaml
clones:
  - id: 301
    hostname: app01
    ip: "192.168.1.81/24"
    gateway: "192.168.1.1"
    full: 1 # 1=full, 0=linked
  - id: 302
    hostname: app02
    ip: "192.168.1.82/24"
    gateway: "192.168.1.1"
    full: 0 # Linked clones are faster
```
See `defaults/main.yml` for all options with detailed documentation.
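One way a `clones` list like this could be consumed is a loop around `qm clone` that skips existing IDs and keeps going when a single clone fails. This is a sketch under those assumptions; `clone_results` is a hypothetical variable name and the real `create-clones.yml` may be structured differently.

```yaml
# Sketch only: per-clone loop with isolated failures.
- name: Clone the template once per entry in clones
  ansible.builtin.command:
    argv:
      - qm
      - clone
      - "{{ vm_id }}"
      - "{{ item.id }}"
      - --name
      - "{{ item.hostname }}"
      - --full
      - "{{ item.full }}"
    # An existing config file means the clone is already there.
    creates: "/etc/pve/qemu-server/{{ item.id }}.conf"
  loop: "{{ clones }}"
  loop_control:
    label: "{{ item.hostname }}"
  register: clone_results
  ignore_errors: true   # one failed clone should not stop the others

- name: Report clones that failed
  ansible.builtin.debug:
    msg: "Clone {{ item.item.hostname }} failed: {{ item.stderr | default('unknown error') }}"
  loop: "{{ clone_results.results | selectattr('failed', 'defined') | selectattr('failed') | list }}"
  loop_control:
    label: "{{ item.item.hostname }}"
```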
## Usage
### Include in Playbook
```yaml
- hosts: proxmox_host
  become: true
  roles:
    - ansible_proxmox_VM # must match the role directory name (case-sensitive)
```
### Run Directly
```bash
ansible-playbook tasks/main.yml -i inventory
```
### Using Tags (Run Specific Stages)
```bash
# Pre-flight checks only
ansible-playbook tasks/main.yml --tags preflight -vvv
# Create VM and template (skip clones)
ansible-playbook tasks/main.yml --skip-tags clones
# Add clones to existing template
ansible-playbook tasks/main.yml --tags clones
# Skip image re-download
ansible-playbook tasks/main.yml --skip-tags image
```
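The tag names above (`preflight`, `image`, `clones`) imply that the orchestrator tags each stage. A hypothetical excerpt of `tasks/main.yml` could look like the following; the task names and the `when` condition are assumptions.

```yaml
# tasks/main.yml (hypothetical excerpt, not copied from the role)
- name: Stage 1 - pre-flight checks
  ansible.builtin.import_tasks: preflight-checks.yml
  tags: [preflight]

- name: Stage 2 - download image
  ansible.builtin.import_tasks: download-image.yml
  tags: [image]

- name: Stage 6 - deploy clones
  ansible.builtin.import_tasks: create-clones.yml
  tags: [clones]
  when: create_clones | bool
```

With `import_tasks`, the tag is inherited by every task in the imported file, which is what lets `--tags` and `--skip-tags` select whole stages.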
## Playbook Stages (6 Stages)
| Stage | Task | Purpose | Idempotent |
|-------|------|---------|-----------|
| 1 | `preflight-checks.yml` | Validate environment (20+ checks) | ✅ Yes |
| 2 | `download-image.yml` | Download/cache Debian image | ✅ Yes |
| 3 | `create-vm.yml` | Create base VM | ✅ Yes |
| 4 | `configure-vm.yml` | Configure disk, network, Cloud-Init | ✅ Yes |
| 5 | `create-template.yml` | Convert to template | ✅ Yes (FIXED!) |
| 6 | `create-clones.yml` | Deploy clones from template | ✅ Yes |
## Helper Functions (tasks/helpers.yml)
All task files use centralized helper functions for consistency and maintainability:
| Helper | Purpose | Sets Variable |
|--------|---------|----------------|
| `check_vm_exists` | Check if VM config file exists | `vm_exists` |
| `check_template` | Check if VM is a template | `is_template` |
| `check_vm_status` | Get current VM status (running/stopped) | `vm_status` |
| `check_disk_attached` | Check if disk is attached via `qm config` | `disk_attached` |
| `check_storage` | Check available storage space | `storage_available` |
| `validate_vm_id` | Validate VM ID format (100-999999) | (assertions only) |
| `get_vm_info` | Read and parse VM config file | `vm_info` |
| `list_vms` | Get list of all VMs | `vm_list` |
| `cleanup_snippets` | Remove old Cloud-Init snippets | (side effect) |
### Example: Using Helpers in Tasks
```yaml
- name: Check if VM already exists
  ansible.builtin.include_tasks: helpers.yml
  vars:
    helper_task: check_vm_exists
    target_vm_id: "{{ vm_id }}"
- name: Display status
  debug:
    msg: "VM status: {{ 'EXISTS' if vm_exists else 'WILL BE CREATED' }}"
```
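On the receiving side, `helpers.yml` presumably selects tasks based on `helper_task`. The sketch below shows one plausible shape for the `check_vm_exists` helper; the `_vm_conf` register name and the config path check are assumptions about the implementation.

```yaml
# tasks/helpers.yml (hypothetical excerpt)
- name: check_vm_exists | look for the VM config file
  ansible.builtin.stat:
    path: "/etc/pve/qemu-server/{{ target_vm_id }}.conf"
  register: _vm_conf
  when: helper_task == 'check_vm_exists'

- name: check_vm_exists | expose the result as vm_exists
  ansible.builtin.set_fact:
    vm_exists: "{{ _vm_conf.stat.exists }}"
  when: helper_task == 'check_vm_exists'
```

Guarding every task with the same `when` keeps all helpers in one file while running only the requested one.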
## Key Improvements
### ✅ Error Handling
- Automatic retry with configurable delays (3x, 5-sec)
- Context-aware error messages
- Per-clone error isolation (doesn't cascade)
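A retry with a readable failure message can be expressed with `until` inside a `block`/`rescue` pair. This is a minimal sketch of that pattern, assuming a `qm start` call as the flaky operation; it is not the role's actual task.

```yaml
# Sketch only: retry pattern with a context-aware error.
- name: Start the VM, retrying on transient failures
  block:
    - name: Start VM
      ansible.builtin.command:
        argv: [qm, start, "{{ vm_id }}"]
      register: qm_start
      retries: 3
      delay: 5                      # seconds between attempts
      until: qm_start.rc == 0
  rescue:
    - name: Fail with context
      ansible.builtin.fail:
        msg: >-
          Could not start VM {{ vm_id }} after retrying:
          {{ qm_start.stderr | default('no error output') }}
```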
### ✅ Idempotency
- Safe to re-run multiple times
- Skips already-completed operations
- Image cached and reused
- **Template conversion idempotent** (was broken in v1!)
- **Per-clone checking** - skips existing clones
- **Snippet cleanup** - old files removed before re-configuration
**Idempotency per stage**:
- `create-vm.yml` - Checks if VM exists before creating
- `configure-vm.yml` - Re-applies Cloud-Init (safe to re-run)
- `create-template.yml` - Skips if already a template
- `create-clones.yml` - Skips clones that already exist (per-clone checks)
- `download-image.yml` - Skips download if file exists locally
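The template check can be sketched as follows: read the VM config and run `qm template` only when the `template: 1` flag is absent. The task and variable names are illustrative assumptions.

```yaml
# Sketch only: skip the conversion when the VM is already a template.
- name: Read the current VM config
  ansible.builtin.command:
    argv: [qm, config, "{{ vm_id }}"]
  register: vm_config
  changed_when: false               # read-only query

- name: Convert VM to template
  ansible.builtin.command:
    argv: [qm, template, "{{ vm_id }}"]
  when: "'template: 1' not in vm_config.stdout"
```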
### ✅ Pre-flight Validation
- Proxmox connectivity & permissions
- Storage pool availability
- SSH key readiness
- IP address format validation
- VM ID uniqueness checks
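Checks of this kind are commonly written with `ansible.builtin.assert`. The sketch below covers the VM ID range and the CIDR shape of the static address; the regular expression is a loose format check (it does not reject octets above 255) and is an assumption, not the role's exact rule.

```yaml
# Sketch only: two representative pre-flight assertions.
- name: Validate VM ID range
  ansible.builtin.assert:
    that:
      - vm_id | int >= 100
      - vm_id | int <= 999999
    fail_msg: "vm_id must be between 100 and 999999, got {{ vm_id }}"

- name: Validate static IP looks like address/prefix
  ansible.builtin.assert:
    that:
      - "ip_address is match('^([0-9]{1,3}[.]){3}[0-9]{1,3}/[0-9]{1,2}$')"
    fail_msg: "ip_address must be in CIDR form, e.g. 192.168.1.60/24"
  when: ip_mode == 'static'
```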
### ✅ Advanced Features
- UEFI/TPM 2.0 support
- GPU passthrough (PCI or VirtIO)
- Automatic disk resize
- Cloud-Init with user/password/SSH
- DHCP or static networking
- Multi-clone deployment
## Testing & Validation
### Preflight Checks
```bash
ansible-playbook tasks/main.yml --tags preflight -vvv
```
### Dry Run (Preview)
```bash
ansible-playbook tasks/main.yml --check -vv
```
### Test Idempotency
```bash
# First run
ansible-playbook tasks/main.yml -vv
# Second run (should be much faster)
ansible-playbook tasks/main.yml -vv
```
## Cloud-Init Templates
### `cloudinit_userdata.yaml.j2`
Configures:
- User creation with sudo access
- SSH key injection
- Password authentication
- Timezone setting
- Package updates
### `cloudinit_vendor.yaml.j2`
Configures:
- Package installation
- DNS settings (optional)
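For orientation, a rendered user-data file covering the items listed for `cloudinit_userdata.yaml.j2` might resemble the following. The values mirror the defaults shown earlier; the SSH key is a placeholder and the exact keys the template emits are assumptions.

```yaml
#cloud-config
# Sketch of possible rendered output, not the template's actual content.
users:
  - name: debian
    sudo: "ALL=(ALL) NOPASSWD:ALL"
    shell: /bin/bash
    lock_passwd: false
    ssh_authorized_keys:
      - ssh-ed25519 AAAA...placeholder user@workstation
ssh_pwauth: true          # allow password logins over SSH
timezone: Europe/Berlin
package_update: true      # refresh the package index on first boot
```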
## Security Notes
⚠️ **Passwords**: Use Ansible Vault in production:
```bash
ansible-vault create group_vars/proxmox/vault.yml
```
Then reference: `ci_password: "{{ vault_ci_password }}"`
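A common layout keeps the secret in the encrypted file and points the role variable at it from a plain-text file. The sketch below assumes a `vars.yml` next to the vault file, which is a hypothetical name; note that `group_vars/proxmox/` only applies to hosts in an inventory group called `proxmox`.

```yaml
# group_vars/proxmox/vault.yml (encrypted by ansible-vault; shown decrypted)
vault_ci_password: "change-me"

# group_vars/proxmox/vars.yml (plain text, hypothetical file)
ci_password: "{{ vault_ci_password }}"
```

Run the playbook with `--ask-vault-pass` or `--vault-password-file` so Ansible can decrypt the value.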
- **SSH Keys**: Automatically validated before use
- **Permissions**: Checks if user can run `qm` commands
- **No Hardcoded Secrets**: All in variables
## Best Practices
1. Always run with `--check` first
2. Validate environment with `--tags preflight`
3. Skip image re-download with `--skip-tags image`
4. Monitor Cloud-Init: `cloud-init status` inside VM
5. Test in dev environment first
6. Use linked clones (`full: 0`) for faster deployments
7. Enable Proxmox snippets storage
## Performance
- **First run**: ~5-10 minutes (downloads image, creates VM)
- **Re-runs**: ~30 seconds (operations skipped)
- **Linked clones**: Much faster than full clones
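Much of the re-run speedup comes from not downloading the image again. With `ansible.builtin.get_url`, leaving `force` at `false` downloads only when the destination file is missing. In this sketch `image_url` and the destination directory are hypothetical names, not variables confirmed by the role.

```yaml
# Sketch only: cached image download.
# Example value (assumption):
#   image_url: https://cloud.debian.org/images/cloud/bookworm/latest/debian-12-genericcloud-amd64.qcow2
- name: Download the Debian GenericCloud image once
  ansible.builtin.get_url:
    url: "{{ image_url }}"
    dest: "/var/lib/vz/template/iso/{{ image_url | basename }}"
    mode: "0644"
    force: false          # skip the download if dest already exists
  tags: [image]
```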
## Troubleshooting
### Preflight validation fails
```bash
ansible-playbook tasks/main.yml --tags preflight -vvv
```
### Cloud-Init not applying
```bash
# Inside VM:
cloud-init status
sudo cloud-init collect-logs # bundles Cloud-Init logs into a tarball
# Check snippets:
ls -la /var/lib/vz/snippets/
```
### SSH key issues
```bash
# Verify SSH key
ls -la ~/.ssh/id_rsa.pub
# Run with verbose
ansible-playbook tasks/main.yml -vvv
```
## Common Proxmox Commands
```bash
# List all VMs
qm list
# Check VM status
qm status 150
# View VM config
qm config 150
# Connect to console
qm terminal 150
# SSH into VM
ssh debian@<vm-ip>
# Check Cloud-Init (inside the VM)
cloud-init status --long
```
## Compatibility
- **Proxmox**: 7.x, 8.x (uses `qm` CLI)
- **Debian**: Bookworm GenericCloud (configurable)
- **Ansible**: 2.9+ (standard modules)
- **Backward Compatible**: 100% ✅
## Support
Refer to:
- `defaults/main.yml` - Complete variable documentation
- Task files - Inline comments explaining implementation
- Run with `-vvv` flag for debug output
- Check `/var/lib/vz/snippets/` for Cloud-Init files
## License
Open source - use as-is for Proxmox automation.