deleted: 00_README_FIRST.md

deleted:    ARCHITECTURE.md
	deleted:    CHANGELOG.md
	deleted:    GET_STARTED.md
	deleted:    IMPLEMENTATION_SUMMARY.md
	deleted:    IMPROVEMENTS.md
	deleted:    QUICK_REFERENCE.md
	modified:   README.md
	deleted:    README_NEW.md
	deleted:    VERIFICATION_CHECKLIST.md
	deleted:    _FINAL_SUMMARY.txt
2025-11-15 17:39:23 +01:00
parent 0a8f5c6124
commit 53ee97bc59
11 changed files with 144 additions and 3587 deletions

README.md

@@ -1,47 +1,59 @@
# Ansible Role: Proxmox VM → Template → Clones (Cloud-Init)
Automates the entire lifecycle of a Debian GenericCloud VM on Proxmox:
- Download the Debian image
- Create a base VM
- Optionally enable UEFI, SecureBoot, TPM 2.0, GPU passthrough
- Convert the VM into a template
- Spin up any number of Cloud-Init clones with static or dynamic networking
**Production-grade automation** for Debian GenericCloud VMs on Proxmox with error handling, idempotency, and comprehensive validation.
Automates the complete lifecycle:
- ✅ Pre-flight environment validation (20+ checks)
- ✅ Download & cache Debian GenericCloud image
- ✅ Create base VM with error recovery
- ✅ Configure disk, networking, Cloud-Init, TPM, GPU
- ✅ Convert VM to template (**idempotent** - safe to re-run!)
- ✅ Deploy multiple clones with custom networking
- ✅ Per-clone error handling (failures don't cascade)
## Features
- Auto-download Debian Bookworm GenericCloud image
- Create VM (CPU, RAM, networking, storage)
- DHCP or static IP support
- Cloud-Init: users, SSH keys, passwords, timezone, packages
- Optional TPM 2.0 + SecureBoot (OVMF)
- Optional GPU passthrough or VirtIO GPU
- Optional disk resize
- Convert base VM into a template
- Create multiple clones from template
- Start clones after creation
- **Error Handling** - Automatic retry (3x, 5-sec delay) with clear messages
- **Idempotency** - Truly safe to re-run; skips already-completed operations
- **Pre-flight Validation** - 20+ environment checks before execution
- **Modular Design** - 6 independent task stages with tag-based execution
- **Image Caching** - Downloads once, reuses on re-runs (faster!)
- **DHCP or Static IP** - Flexible networking configuration
- **Cloud-Init** - Users, SSH keys, passwords, timezone, packages
- **TPM 2.0 + SecureBoot** - Optional UEFI firmware support
- **GPU Passthrough** - Optional PCI device or VirtIO GPU
- **Disk Resize** - Optional automatic disk expansion
- **Multi-Clone** - Deploy multiple clones independently
- **Rich Logging** - Progress tracking and debug output
## Folder Structure
```
ANSIBLE_PROXMOX_VM/
ansible_proxmox_VM/
├─ defaults/
│ └─ main.yml
│ └─ main.yml # All configuration (comprehensive docs)
├─ tasks/
│ └─ main.yml
│ ├─ main.yml # Orchestrator (calls subtasks)
│ ├─ preflight-checks.yml # Environment validation (20+ checks)
│ ├─ download-image.yml # Download Debian image (with caching)
│ ├─ create-vm.yml # Create VM (idempotent)
│ ├─ configure-vm.yml # Configure disk, Cloud-Init, TPM, GPU
│ ├─ create-template.yml # Convert to template (idempotent - FIXED!)
│ ├─ create-clones.yml # Deploy clones (per-clone error handling)
│ └─ helpers.yml # 8 utility functions
├─ templates/
│ ├─ cloudinit_userdata.yaml.j2
│ └─ cloudinit_vendor.yaml.j2
└─ README.md
│ ├─ cloudinit_userdata.yaml.j2 # Cloud-Init user data template
│ └─ cloudinit_vendor.yaml.j2 # Cloud-Init vendor data template
└─ README.md # This file
```
## Requirements
- Proxmox VE installed and accessible
- Role runs on the Proxmox host via localhost, using `qm` CLI commands
- Ansible must have SSH access to the Proxmox node
- User must have permission to run `qm` commands (root recommended)
- Proxmox storage pool configured (e.g., `local-lvm`)
- Snippets storage enabled for Cloud-Init (`Datacenter → Storage`)
- **Proxmox VE** 7.x or 8.x installed and accessible
- **Ansible** 2.9+ with SSH access to Proxmox host
- **Proxmox user** with permission to run `qm` commands (root recommended)
- **Storage pool** configured (e.g., `local-lvm`)
- **Snippets storage** enabled for Cloud-Init (`Datacenter → Storage`)
## Quick Start
@@ -49,29 +61,29 @@ ANSIBLE_PROXMOX_VM/
```bash
ansible-playbook tasks/main.yml --tags preflight -vvv
```
Checks Proxmox, storage, SSH keys, permissions before running.
Checks Proxmox connectivity, storage, SSH keys, permissions.
### 2. Dry Run (No Changes)
### 2. Dry Run (Preview Changes)
```bash
ansible-playbook tasks/main.yml --check -vv
```
Shows what would happen without making changes.
Shows what would happen without making any changes.
### 3. Full Deployment
```bash
ansible-playbook tasks/main.yml -i inventory
```
Runs all stages: preflight → image → VM → configure → template → clones
Creates VM → configures it → converts to template → deploys clones
### 4. Re-run (Test Idempotency)
```bash
ansible-playbook tasks/main.yml -i inventory
```
Much faster! Skips already-completed operations (image cached, VM exists, etc.)
Second run is much faster (~30 sec)! Skips already-completed operations.
## Configuration Variables
All variables are in `defaults/main.yml` with comprehensive documentation:
All variables are in `defaults/main.yml` with comprehensive inline documentation.
### Base VM Configuration
```yaml
@@ -79,7 +91,7 @@ vm_id: 150 # Unique Proxmox VM ID (≥100)
hostname: debian-template-base # VM hostname
memory: 4096 # RAM in MB
cores: 4 # CPU cores
cpu_type: host # CPU type (host, kvm64, etc.)
cpu_type: host # CPU type
bridge: vmbr0 # Network bridge
storage: local-lvm # Storage pool
```
@@ -87,18 +99,18 @@ storage: local-lvm # Storage pool
### Networking
```yaml
ip_mode: dhcp # 'dhcp' or 'static'
ip_address: "192.168.1.60/24" # Static IP (CIDR, if static)
gateway: "192.168.1.1" # Gateway IP
ip_address: "192.168.1.60/24" # Static IP if ip_mode: static
gateway: "192.168.1.1" # Gateway
dns:
- "1.1.1.1"
- "8.8.8.8" # DNS servers
- "8.8.8.8"
```
### Cloud-Init
```yaml
ci_user: debian # Default user
ci_password: "SecurePass123" # Password (use Vault in production!)
ssh_key_path: "~/.ssh/id_rsa.pub" # SSH public key
ci_password: "SecurePass123" # Use Vault in production!
ssh_key_path: "~/.ssh/id_rsa.pub" # SSH public key path
timezone: "Europe/Berlin" # Timezone
packages:
- qemu-guest-agent
@@ -108,44 +120,35 @@ packages:
### Advanced Options
```yaml
# UEFI + TPM 2.0
enable_tpm: false
# GPU Passthrough
gpu_passthrough: false
gpu_device: "0000:01:00.0"
virtio_gpu: false
# Disk
resize_disk: true
resize_size: "16G"
# Template & Clones
make_template: true # Convert VM to template
create_clones: true # Create clones from template
enable_tpm: false # UEFI + TPM 2.0
gpu_passthrough: false # PCI GPU passthrough
virtio_gpu: false # VirtIO GPU
resize_disk: true # Auto-resize disk
resize_size: "16G" # Target disk size
make_template: true # Convert to template
create_clones: true # Deploy clones
```
### Clone Definition
```yaml
clones:
- id: 301 # Unique VM ID
hostname: app01 # Clone hostname
ip: "192.168.1.81/24" # Clone IP (CIDR)
- id: 301
hostname: app01
ip: "192.168.1.81/24"
gateway: "192.168.1.1"
full: 1 # 1=full clone, 0=linked clone
full: 1 # 1=full, 0=linked
- id: 302
hostname: app02
ip: "192.168.1.82/24"
gateway: "192.168.1.1"
full: 0 # Faster, space-saving
full: 0 # Linked clones are faster
```
See `defaults/main.yml` for all available options with documentation.
See `defaults/main.yml` for all options with detailed documentation.
## Usage
### 1. Include in a Playbook
### Include in Playbook
```yaml
- hosts: proxmox_host
become: true
@@ -153,12 +156,12 @@ See `defaults/main.yml` for all available options with documentation.
- ansible_proxmox_vm
```
### 2. Run Directly
### Run Directly
```bash
ansible-playbook tasks/main.yml -i inventory
```
### 3. Run Specific Stages (with tags)
### Using Tags (Run Specific Stages)
```bash
# Pre-flight checks only
ansible-playbook tasks/main.yml --tags preflight -vvv
@@ -169,150 +172,140 @@ ansible-playbook tasks/main.yml --skip-tags clones
# Add clones to existing template
ansible-playbook tasks/main.yml --tags clones
# Skip re-downloading image
# Skip image re-download
ansible-playbook tasks/main.yml --skip-tags image
```
## Playbook Stages
## Playbook Stages (6 Stages)
The playbook executes in 6 stages:
| Stage | Task | Purpose |
|-------|------|---------|
| 1 | `preflight-checks.yml` | Validate environment (20+ checks) |
| 2 | `download-image.yml` | Download/cache Debian image |
| 3 | `create-vm.yml` | Create base VM |
| 4 | `configure-vm.yml` | Configure disk, network, Cloud-Init |
| 5 | `create-template.yml` | Convert VM to template (idempotent!) |
| 6 | `create-clones.yml` | Deploy clones from template |
Each stage can be skipped or re-run independently using tags.
| Stage | Task | Purpose | Idempotent |
|-------|------|---------|-----------|
| 1 | `preflight-checks.yml` | Validate environment (20+ checks) | ✅ Yes |
| 2 | `download-image.yml` | Download/cache Debian image | ✅ Yes |
| 3 | `create-vm.yml` | Create base VM | ✅ Yes |
| 4 | `configure-vm.yml` | Configure disk, network, Cloud-Init | ✅ Yes |
| 5 | `create-template.yml` | Convert to template | ✅ Yes (FIXED!) |
| 6 | `create-clones.yml` | Deploy clones from template | ✅ Yes |
## Key Improvements
### ✅ Error Handling
- Automatic retry (3x, 5-second delays)
- Automatic retry with configurable delays (3x, 5-sec)
- Context-aware error messages
- Per-clone error isolation (failures don't cascade)
- Per-clone error isolation (doesn't cascade)
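The retry behaviour listed above (three tries, five seconds apart) can be sketched in plain shell. This is only an illustration, not the role's actual implementation, which lives in the task files. The `retry` function name, its argument order, and the reading of "3x" as three attempts in total are assumptions made for this sketch.

```bash
# Hypothetical helper: run a command up to <attempts> times,
# sleeping <delay> seconds after each failed attempt.
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2
  while true; do
    # Stop as soon as the command succeeds.
    "$@" && return 0
    if [ "$n" -ge "$attempts" ]; then
      echo "failed after $n attempts: $*" >&2
      return 1
    fi
    echo "attempt $n failed, retrying in ${delay}s: $*" >&2
    sleep "$delay"
    n=$((n + 1))
  done
}

# Example: three attempts, five seconds apart.
# retry 3 5 qm start 150
```

Printing the failing command on every attempt keeps the error message tied to the operation that caused it, which is the kind of context-aware message the list describes.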
### ✅ Idempotency
- Safe to re-run multiple times
- Already-created VMs/templates are skipped
- Image is cached and reused
- **Template conversion is now idempotent!** (was broken in v1)
- Skips already-completed operations
- Image cached and reused
- **Template conversion idempotent** (was broken in v1!)
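One way to make template conversion safe to re-run is to check the VM configuration before converting. The sketch below assumes that `qm config <vmid>` prints a `template: 1` line for a VM that is already a template. The function names `is_template` and `convert_to_template` are hypothetical, and the role's own check in `create-template.yml` may work differently.

```bash
# Succeeds if the VM configuration read from stdin carries "template: 1".
is_template() {
  grep -Eq '^template:[[:space:]]*1[[:space:]]*$'
}

# Hypothetical wrapper: convert only when the VM is not yet a template.
convert_to_template() {
  local vmid="$1"
  if qm config "$vmid" | is_template; then
    echo "VM $vmid is already a template, skipping"
  else
    qm template "$vmid"
  fi
}

# Example: convert_to_template 150
```

Because the check reads the current state rather than remembering what an earlier run did, a second invocation leaves the template untouched.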
### ✅ Pre-flight Validation
- Proxmox connectivity
- Proxmox connectivity & permissions
- Storage pool availability
- SSH key readiness
- IP address format validation
- Permission verification
- VM ID uniqueness checks
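As an illustration of the IP address format check, the sketch below accepts only an IPv4 address in CIDR notation, the form used by `ip_address` and the clone `ip` values. The function name `is_ipv4_cidr` is hypothetical, and the role's real preflight logic may validate more or less than this.

```bash
# Hypothetical check: succeeds only for a dotted-quad IPv4 address
# followed by a /prefix between 0 and 32.
is_ipv4_cidr() {
  printf '%s\n' "$1" | awk -F'[./]' '
    # Reject anything that is not four numbers, a slash, and a number.
    !/^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\/[0-9]+$/ { exit 1 }
    {
      # Each octet must fit in one byte; the prefix must fit IPv4.
      for (i = 1; i <= 4; i++) if ($i > 255) exit 1
      if ($5 > 32) exit 1
    }'
}

# Example: is_ipv4_cidr "192.168.1.60/24" && echo "valid"
```

Running a check like this before any `qm` command means a malformed address is reported up front instead of surfacing later as a failed clone.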
### ✅ Advanced Features
- UEFI/TPM 2.0 support
- GPU passthrough (PCI or VirtIO)
- Disk automatic resize
- Cloud-Init user/password/SSH keys
- Automatic disk resize
- Cloud-Init with user/password/SSH
- DHCP or static networking
- Multi-clone deployment
## Cloud-Init Templates
### `cloudinit_userdata.yaml.j2`
Configured with:
- User creation ({{ ci_user }})
- SSH key injection
- Password authentication
- Timezone setting
- Package updates
- Custom commands
### `cloudinit_vendor.yaml.j2`
Configured with:
- Package installation
- DNS configuration (optional)
## Testing & Validation
### Preflight Checks
```bash
ansible-playbook tasks/main.yml --tags preflight -vvv
```
Shows all validation checks (Proxmox, storage, SSH, IPs, permissions, etc.)
### Dry Run (Preview Changes)
### Dry Run (Preview)
```bash
ansible-playbook tasks/main.yml --check -vv
```
Shows what would happen without making any changes.
### Idempotency Test
### Test Idempotency
```bash
# Run once
# First run
ansible-playbook tasks/main.yml -vv
# Run again (should be much faster)
# Second run (should be much faster)
ansible-playbook tasks/main.yml -vv
```
Second run should skip most operations and complete in ~30 seconds.
## Cloud-Init Templates
### `cloudinit_userdata.yaml.j2`
Configures:
- User creation with sudo access
- SSH key injection
- Password authentication
- Timezone setting
- Package updates
### `cloudinit_vendor.yaml.j2`
Configures:
- Package installation
- DNS settings (optional)
## Security Notes
- ⚠️ **Password**: Use Ansible Vault for `ci_password` in production:
```bash
ansible-vault create group_vars/proxmox/vault.yml
```
Then reference: `ci_password: "{{ vault_ci_password }}"`
⚠️ **Passwords**: Use Ansible Vault in production:
```bash
ansible-vault create group_vars/proxmox/vault.yml
```
Then reference: `ci_password: "{{ vault_ci_password }}"`
- ✅ **SSH Key**: Automatically validated before use
- ✅ **Permissions**: Role checks if user can run `qm` commands
- ✅ **No Hardcoded Secrets**: All sensitive data in variables
**SSH Keys**: Automatically validated before use
**Permissions**: Checks if user can run `qm` commands
**No Hardcoded Secrets**: All in variables
## Best Practices
1. **Always run with `--check` first** to preview changes
2. **Run `--tags preflight` to validate** environment setup
3. **Use `--skip-tags image`** when re-running to save time
4. **Monitor Cloud-Init inside VMs**: `cloud-init status`
5. **Test in dev environment first** before production
6. **Use linked clones** (`full: 0`) for faster deployments
7. **Enable Proxmox snippets storage** for Cloud-Init
1. Always run with `--check` first
2. Validate environment with `--tags preflight`
3. Skip image re-download with `--skip-tags image`
4. Monitor Cloud-Init: `cloud-init status` inside VM
5. Test in dev environment first
6. Use linked clones (`full: 0`) for faster deployments
7. Enable Proxmox snippets storage
## Performance
- **First run**: ~5-10 minutes (downloads image, creates VM)
- **Re-runs**: ~30 seconds (operations skipped)
- **Linked clones**: Much faster than full clones
## Troubleshooting
### VM creation fails
### Preflight validation fails
```bash
# Validate environment first
ansible-playbook tasks/main.yml --tags preflight -vvv
# Check Proxmox
qm list
pveversion
pvesm status --storage local-lvm
```
### Cloud-Init not applying
```bash
# Check inside VM
# Inside VM:
cloud-init status
cloud-init collect-logs   # bundles the Cloud-Init logs into a tarball
# Check snippets directory
# Check snippets:
ls -la /var/lib/vz/snippets/
```
### SSH key issues
```bash
# Verify SSH key exists and is readable
# Verify SSH key
ls -la ~/.ssh/id_rsa.pub
# Run with verbose output
# Run with verbose
ansible-playbook tasks/main.yml -vvv
```
## Common Commands
## Common Proxmox Commands
```bash
# List all VMs
@@ -321,37 +314,34 @@ qm list
# Check VM status
qm status 150
# Connect to VM console
qm terminal 150
# View VM config
qm config 150
# Connect to console
qm terminal 150
# SSH into VM
ssh debian@<vm-ip>
# Check Cloud-Init status
# Check Cloud-Init
cloud-init status --long
```
## Performance Tips
- **First run**: ~5-10 minutes (downloads image, creates VM)
- **Re-runs**: ~30 seconds (image cached, operations skipped)
- **Linked clones**: Much faster than full clones
- **Tag-based execution**: Skip expensive operations
## Compatibility
- **Proxmox**: 7.x, 8.x (uses `qm` CLI)
- **Debian**: Bookworm GenericCloud (configurable)
- **Ansible**: 2.9+ (uses standard modules)
- **Backward Compatible**: 100% (all old variables still work)
- **Ansible**: 2.9+ (standard modules)
- **Backward Compatible**: 100%
## Support & Documentation
## Support
Refer to `defaults/main.yml` for complete variable documentation with examples and explanations for every option.
Refer to:
- `defaults/main.yml` - Complete variable documentation
- Task files - Inline comments explaining implementation
- Run with `-vvv` flag for debug output
- Check `/var/lib/vz/snippets/` for Cloud-Init files
## License
This role is provided as-is for Proxmox automation.
Open source - use as-is for Proxmox automation.