# Ansible Role: Proxmox VM → Template → Clones (Cloud‑Init)

Automates the entire lifecycle of a Debian GenericCloud VM on Proxmox:

- Download the Debian image
- Create a base VM
- Optionally enable UEFI, SecureBoot, TPM 2.0, GPU passthrough
- Convert the VM into a template
- Spin up any number of Cloud‑Init clones with static or dynamic networking

## Features

- ✅ Auto‑download Debian Bookworm GenericCloud image
- ✅ Create VM (CPU, RAM, networking, storage)
- ✅ DHCP or static IP support
- ✅ Cloud‑Init: users, SSH keys, passwords, timezone, packages
- ✅ Optional TPM 2.0 + SecureBoot (OVMF)
- ✅ Optional GPU passthrough or VirtIO GPU
- ✅ Optional disk resize
- ✅ Convert base VM into a template
- ✅ Create multiple clones from template
- ✅ Start clones after creation

## Folder Structure

```
ANSIBLE_PROXMOX_VM/
├─ defaults/
│  └─ main.yml
├─ tasks/
│  └─ main.yml
├─ templates/
│  ├─ cloudinit_userdata.yaml.j2
│  └─ cloudinit_vendor.yaml.j2
└─ README.md
```

## Requirements

- Proxmox VE installed and accessible
- Role runs on the Proxmox host via localhost, using `qm` CLI commands
- Ansible must have SSH access to the Proxmox node
- User must have permission to run `qm` commands (root recommended)
- Proxmox storage pool configured (e.g., `local-lvm`)
- Snippets storage enabled for Cloud-Init (`Datacenter → Storage`)

## Quick Start

### 1. Validate Environment

```bash
ansible-playbook tasks/main.yml --tags preflight -vvv
```

Checks Proxmox, storage, SSH keys, and permissions before anything is changed.

### 2. Dry Run (No Changes)

```bash
ansible-playbook tasks/main.yml --check -vv
```

Shows what would happen without making changes.

### 3. Full Deployment

```bash
ansible-playbook tasks/main.yml -i inventory
```

Runs all stages: preflight → image → VM → configure → template → clones

### 4. Re-run (Test Idempotency)

```bash
ansible-playbook tasks/main.yml -i inventory
```

Re-runs are much faster because already-completed operations are skipped (image cached, VM exists, etc.).

## Configuration Variables

All variables are defined and documented in `defaults/main.yml`:

### Base VM Configuration

```yaml
vm_id: 150                       # Unique Proxmox VM ID (≥100)
hostname: debian-template-base   # VM hostname
memory: 4096                     # RAM in MB
cores: 4                         # CPU cores
cpu_type: host                   # CPU type (host, kvm64, etc.)
bridge: vmbr0                    # Network bridge
storage: local-lvm               # Storage pool
```

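As a rough illustration of how these variables could map onto the `qm` CLI, the sketch below creates the base VM only when no config file for `vm_id` exists yet. It is an assumption about the approach, not the role's actual task file; the `virtio-scsi-pci` controller, the serial console, and the `creates` guard path are illustrative choices.

```yaml
# Hypothetical sketch; the role's real create-vm tasks may differ.
- name: Create the base VM unless it already exists
  command:
    cmd: >-
      qm create {{ vm_id }}
      --name {{ hostname }}
      --memory {{ memory }}
      --cores {{ cores }}
      --cpu {{ cpu_type }}
      --net0 virtio,bridge={{ bridge }}
      --scsihw virtio-scsi-pci
      --serial0 socket
    # Proxmox keeps one config file per VM, so its presence skips the command.
    creates: "/etc/pve/qemu-server/{{ vm_id }}.conf"
```

Guarding on the config file is one simple way to keep the task safe to re-run; checking `qm status {{ vm_id }}` first would be an alternative.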
### Networking

```yaml
ip_mode: dhcp                    # 'dhcp' or 'static'
ip_address: "192.168.1.60/24"    # Static IP (CIDR, if static)
gateway: "192.168.1.1"           # Gateway IP
dns:                             # DNS servers
  - "1.1.1.1"
  - "8.8.8.8"
```

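The following sketch shows one way the networking variables could be translated into Proxmox Cloud-Init options. It assumes the role uses `qm set` with `--ipconfig0` and `--nameserver`; the role's real task may be structured differently.

```yaml
# Hypothetical sketch; variable names are the ones listed in defaults/main.yml.
- name: Apply Cloud-Init network settings to the base VM
  command:
    cmd: >-
      qm set {{ vm_id }}
      --ipconfig0 {{ 'ip=dhcp' if ip_mode == 'dhcp' else 'ip=' ~ ip_address ~ ',gw=' ~ gateway }}
      --nameserver "{{ dns | join(' ') }}"
```

With `ip_mode: static` and the values above, the command would render as `qm set 150 --ipconfig0 ip=192.168.1.60/24,gw=192.168.1.1 --nameserver "1.1.1.1 8.8.8.8"`.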
### Cloud-Init

```yaml
ci_user: debian                      # Default user
ci_password: "SecurePass123"         # Password (use Vault in production!)
ssh_key_path: "~/.ssh/id_rsa.pub"    # SSH public key
timezone: "Europe/Berlin"            # Timezone
packages:
  - qemu-guest-agent
  - curl
  - htop
```

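For orientation, here is a minimal sketch of how the user, password, and SSH key could be applied through Proxmox's built-in Cloud-Init options rather than a custom snippet. It assumes the public key file is readable on the Proxmox host; whether the role uses these flags or its own templates is not stated here.

```yaml
# Hypothetical sketch; not taken from the role's task files.
- name: Set Cloud-Init user, password, and SSH key
  command:
    cmd: >-
      qm set {{ vm_id }}
      --ciuser {{ ci_user }}
      --cipassword {{ ci_password | quote }}
      --sshkeys {{ ssh_key_path | expanduser }}
  no_log: true   # keep the password out of task output and logs
```

The `expanduser` filter resolves `~` before the path reaches `qm`, because the `command` module does not run a shell.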
### Advanced Options

```yaml
# UEFI + TPM 2.0
enable_tpm: false

# GPU Passthrough
gpu_passthrough: false
gpu_device: "0000:01:00.0"
virtio_gpu: false

# Disk
resize_disk: true
resize_size: "16G"

# Template & Clones
make_template: true              # Convert VM to template
create_clones: true              # Create clones from template
```

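The sketch below illustrates how these switches might gate individual `qm` calls. The disk name `scsi0`, the `q35` machine type, and the EFI disk options are assumptions made for the example, and the idempotency guards a real role needs (for instance, not allocating a second EFI disk on re-run) are left out for brevity.

```yaml
# Hypothetical sketch; disk name and machine type are assumptions.
- name: Enable UEFI with pre-enrolled Secure Boot keys and a TPM 2.0 state volume
  command:
    cmd: >-
      qm set {{ vm_id }}
      --bios ovmf
      --machine q35
      --efidisk0 {{ storage }}:1,efitype=4m,pre-enrolled-keys=1
      --tpmstate0 {{ storage }}:1,version=v2.0
  when: enable_tpm | bool

- name: Pass a host PCI device through to the VM
  command:
    cmd: qm set {{ vm_id }} --hostpci0 {{ gpu_device }}
  when: gpu_passthrough | bool

- name: Use the VirtIO display adapter when no device is passed through
  command:
    cmd: qm set {{ vm_id }} --vga virtio
  when:
    - virtio_gpu | bool
    - not (gpu_passthrough | bool)

- name: Grow the boot disk to the requested size
  command:
    cmd: qm resize {{ vm_id }} scsi0 {{ resize_size }}
  when: resize_disk | bool
```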
### Clone Definition

```yaml
clones:
  - id: 301                      # Unique VM ID
    hostname: app01              # Clone hostname
    ip: "192.168.1.81/24"        # Clone IP (CIDR)
    gateway: "192.168.1.1"
    full: 1                      # 1=full clone, 0=linked clone
  - id: 302
    hostname: app02
    ip: "192.168.1.82/24"
    gateway: "192.168.1.1"
    full: 0                      # Faster, space-saving
```

See `defaults/main.yml` for all available options with documentation.

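To make the clone definition concrete, the following sketch loops over `clones` with plain `qm` commands. It is a simplified assumption of what the clone stage could look like, not the role's actual `create-clones` tasks.

```yaml
# Hypothetical sketch; assumes vm_id already refers to a template.
- name: Clone the template once per entry in the clones list
  command:
    cmd: >-
      qm clone {{ vm_id }} {{ item.id }}
      --name {{ item.hostname }}
      --full {{ item.full | default(1) }}
    creates: "/etc/pve/qemu-server/{{ item.id }}.conf"
  loop: "{{ clones }}"

- name: Give each clone its own address
  command:
    cmd: qm set {{ item.id }} --ipconfig0 ip={{ item.ip }},gw={{ item.gateway }}
  loop: "{{ clones }}"

- name: Read each clone's power state
  command:
    cmd: qm status {{ item.id }}
  loop: "{{ clones }}"
  register: clone_status
  changed_when: false

- name: Start clones that are not running yet
  command:
    cmd: qm start {{ item.item.id }}
  loop: "{{ clone_status.results }}"
  when: "'status: running' not in item.stdout"
```

The `creates` guard and the status check keep a second run from failing on clones that already exist or are already running.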
## Usage

### 1. Include in a Playbook

```yaml
- hosts: proxmox_host
  become: true
  roles:
    - ansible_proxmox_vm
```

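Defaults can also be overridden per play. The values in this sketch are placeholders chosen for illustration; only the variable names come from `defaults/main.yml`.

```yaml
# Hypothetical example values; adjust IDs and addresses to your environment.
- hosts: proxmox_host
  become: true
  roles:
    - role: ansible_proxmox_vm
      vars:
        vm_id: 200
        hostname: debian-template-lab
        ip_mode: static
        ip_address: "192.168.1.70/24"
        gateway: "192.168.1.1"
        create_clones: false
```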
### 2. Run Directly

```bash
ansible-playbook tasks/main.yml -i inventory
```

### 3. Run Specific Stages (with tags)

```bash
# Pre-flight checks only
ansible-playbook tasks/main.yml --tags preflight -vvv

# Create VM and template (skip clones)
ansible-playbook tasks/main.yml --skip-tags clones

# Add clones to existing template
ansible-playbook tasks/main.yml --tags clones

# Skip re-downloading image
ansible-playbook tasks/main.yml --skip-tags image
```

## Playbook Stages

The playbook executes in 6 stages:

| Stage | Task | Purpose |
|-------|------|---------|
| 1 | `preflight-checks.yml` | Validate environment (20+ checks) |
| 2 | `download-image.yml` | Download/cache Debian image |
| 3 | `create-vm.yml` | Create base VM |
| 4 | `configure-vm.yml` | Configure disk, network, Cloud-Init |
| 5 | `create-template.yml` | Convert VM to template (idempotent!) |
| 6 | `create-clones.yml` | Deploy clones from template |

Each stage can be skipped or re-run independently using tags.

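One plausible way to wire the six stages to tags is a `tasks/main.yml` that imports one file per stage. This is a sketch under assumptions: only the tags `preflight`, `image`, and `clones` appear in this README, so `vm`, `configure`, and `template` are invented names.

```yaml
# Hypothetical layout; tag names vm, configure, and template are assumptions.
- import_tasks: preflight-checks.yml
  tags: [preflight]

- import_tasks: download-image.yml
  tags: [image]

- import_tasks: create-vm.yml
  tags: [vm]

- import_tasks: configure-vm.yml
  tags: [configure]

- import_tasks: create-template.yml
  tags: [template]
  when: make_template | bool

- import_tasks: create-clones.yml
  tags: [clones]
  when: create_clones | bool
```

With `import_tasks`, the tag and the `when` condition are applied to every task in the imported file, which is what lets `--tags` and `--skip-tags` select whole stages.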
## Key Improvements

### ✅ Error Handling

- Automatic retry (3x, 5-second delays)
- Context-aware error messages
- Per-clone error isolation (failures don't cascade)

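The retry and isolation behaviour described above can be approximated with standard task keywords. The sketch below is an assumption about the pattern, not a copy of the role's tasks.

```yaml
# Hypothetical sketch of retry plus per-clone error isolation.
- name: Clone the template, retrying transient failures
  command:
    cmd: qm clone {{ vm_id }} {{ item.id }} --name {{ item.hostname }}
    creates: "/etc/pve/qemu-server/{{ item.id }}.conf"
  loop: "{{ clones }}"
  register: clone_result
  retries: 3              # up to 3 retries after the first attempt
  delay: 5                # seconds between attempts
  until: clone_result is succeeded
  ignore_errors: true     # a failed clone must not abort the rest of the play

- name: Report clones that still failed after all retries
  debug:
    msg: "Clone {{ item.item.id }} ({{ item.item.hostname }}) failed: {{ item.stderr | default('no stderr captured') }}"
  loop: "{{ clone_result.results }}"
  when: item is failed
```

Each loop item is retried on its own, so one broken clone definition only costs its own retries while the remaining clones are still created.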
### ✅ Idempotency

- Safe to re-run multiple times
- Already-created VMs/templates are skipped
- Image is cached and reused
- **Template conversion is now idempotent!** (was broken in v1)

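A minimal sketch of how image caching and an idempotent template conversion could be expressed is shown below. The variables `image_url` and `image_dest` are hypothetical names introduced for this example, and the check relies on `qm config` printing `template: 1` for templates.

```yaml
# Hypothetical sketch; image_url and image_dest are illustrative variables.
- name: Download the Debian image only if it is not cached yet
  get_url:
    url: "{{ image_url }}"
    dest: "{{ image_dest }}"   # full file path, so an existing file is reused
    mode: "0644"

- name: Read the current VM configuration
  command:
    cmd: qm config {{ vm_id }}
  register: vm_config
  changed_when: false

- name: Convert the VM to a template only once
  command:
    cmd: qm template {{ vm_id }}
  when: "'template: 1' not in vm_config.stdout"
```

Because `get_url` leaves an existing destination file alone unless `force` is set, the second run skips the download, and the `when` guard skips the conversion.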
### ✅ Pre-flight Validation

- Proxmox connectivity
- Storage pool availability
- SSH key readiness
- IP address format validation
- Permission verification
- VM ID uniqueness checks

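A few of these checks can be sketched with `command`, `stat`, and `assert`. This is an illustrative subset under assumptions (for example, that the key file lives on the host where the role runs), not the role's full list of checks.

```yaml
# Hypothetical sketch covering a subset of the pre-flight checks.
- name: Check that the qm CLI is usable by the current user
  command:
    cmd: qm list
  changed_when: false

- name: Check that the target storage pool is available
  command:
    cmd: pvesm status --storage {{ storage }}
  changed_when: false

- name: Look for the SSH public key
  stat:
    path: "{{ ssh_key_path | expanduser }}"
  register: ssh_key_file

- name: Validate variables before anything is created
  assert:
    that:
      - ssh_key_file.stat.exists
      - vm_id | int >= 100
      - "ip_mode in ['dhcp', 'static']"
      - "ip_mode == 'dhcp' or ip_address is match('^([0-9]{1,3}[.]){3}[0-9]{1,3}/[0-9]{1,2}$')"
      - "clones | map(attribute='id') | unique | list | length == clones | length"
      - "vm_id not in (clones | map(attribute='id') | list)"
    fail_msg: "Pre-flight validation failed; check the SSH key, vm_id, ip_address, and clone IDs"
```

The regular expression only checks the shape of the CIDR string; it does not verify that each octet is within 0 to 255.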
### ✅ Advanced Features

- UEFI/TPM 2.0 support
- GPU support (PCI passthrough or VirtIO GPU)
- Automatic disk resize
- Cloud-Init user/password/SSH keys
- DHCP or static networking
- Multi-clone deployment

## Cloud-Init Templates

### `cloudinit_userdata.yaml.j2`

Configured with:

- User creation (`{{ ci_user }}`)
- SSH key injection
- Password authentication
- Timezone setting
- Package updates
- Custom commands

### `cloudinit_vendor.yaml.j2`

Configured with:

- Package installation
- DNS configuration (optional)

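To give a feel for what a user-data template covering the items above might contain, here is a minimal cloud-config sketch. It is not the shipped `cloudinit_userdata.yaml.j2`; the `runcmd` entry is a placeholder, and the rendered file is assumed to be placed in the snippets directory and attached with `qm set <vmid> --cicustom "user=local:snippets/<file>"`.

```yaml
#cloud-config
# Hypothetical sketch of a user-data template; the shipped template may differ.
hostname: {{ hostname }}
timezone: {{ timezone }}
users:
  - name: {{ ci_user }}
    shell: /bin/bash
    sudo: "ALL=(ALL) NOPASSWD:ALL"
    lock_passwd: false
    plain_text_passwd: "{{ ci_password }}"
    ssh_authorized_keys:
      - {{ lookup('file', ssh_key_path | expanduser) }}
ssh_pwauth: true          # allow password authentication over SSH
package_update: true      # refresh the package index on first boot
runcmd:
  # Placeholder custom command, for illustration only
  - echo "provisioned by cloud-init" > /etc/motd
```

Cloud-Init only treats the file as cloud-config when `#cloud-config` is its very first line, so the header must stay at the top of the rendered output.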
## Testing & Validation

### Preflight Checks

```bash
ansible-playbook tasks/main.yml --tags preflight -vvv
```

Shows all validation checks (Proxmox, storage, SSH, IPs, permissions, etc.)

### Dry Run (Preview Changes)

```bash
ansible-playbook tasks/main.yml --check -vv
```

Shows what would happen without making any changes.

### Idempotency Test

```bash
# Run once
ansible-playbook tasks/main.yml -vv

# Run again (should be much faster)
ansible-playbook tasks/main.yml -vv
```

Second run should skip most operations and complete in ~30 seconds.

## Security Notes

- ⚠️ **Password**: Use Ansible Vault for `ci_password` in production:
  ```bash
  ansible-vault create group_vars/proxmox/vault.yml
  ```
  Then reference: `ci_password: "{{ vault_ci_password }}"`

- ✅ **SSH Key**: Automatically validated before use
- ✅ **Permissions**: Role checks if user can run `qm` commands
- ✅ **No Hardcoded Secrets**: All sensitive data in variables

## Best Practices

1. **Always run with `--check` first** to preview changes
2. **Run `--tags preflight` to validate** environment setup
3. **Use `--skip-tags image`** when re-running to save time
4. **Monitor Cloud-Init inside VMs**: `cloud-init status`
5. **Test in dev environment first** before production
6. **Use linked clones** (`full: 0`) for faster deployments
7. **Enable Proxmox snippets storage** for Cloud-Init

## Troubleshooting

### VM creation fails

```bash
# Validate environment first
ansible-playbook tasks/main.yml --tags preflight -vvv

# Check Proxmox
qm list
pveversion
pvesm status --storage local-lvm
```

### Cloud-Init not applying

```bash
# Check inside VM
cloud-init status --long
sudo cat /var/log/cloud-init-output.log

# Check snippets directory
ls -la /var/lib/vz/snippets/
```

### SSH key issues

```bash
# Verify SSH key exists and is readable
ls -la ~/.ssh/id_rsa.pub

# Run with verbose output
ansible-playbook tasks/main.yml -vvv
```

## Common Commands

```bash
# List all VMs
qm list

# Check VM status
qm status 150

# Connect to VM console
qm terminal 150

# View VM config
qm config 150

# SSH into VM
ssh debian@<vm-ip>

# Check Cloud-Init status (run inside the VM)
cloud-init status --long
```

## Performance Tips

- **First run**: ~5-10 minutes (downloads image, creates VM)
- **Re-runs**: ~30 seconds (image cached, operations skipped)
- **Linked clones**: Much faster than full clones
- **Tag-based execution**: Skip expensive operations

## Compatibility

- **Proxmox**: 7.x, 8.x (uses `qm` CLI)
- **Debian**: Bookworm GenericCloud (configurable)
- **Ansible**: 2.9+ (uses standard modules)
- **Backward Compatible**: 100% (all old variables still work)

## Support & Documentation

Refer to `defaults/main.yml` for complete variable documentation with examples and explanations for every option.

## License

This role is provided as-is for Proxmox automation.