# ESP32 Piano Note Detection System

A real-time piano note detection system implemented on ESP32 using I2S microphone input. This system can detect musical notes from C2 to C6 with adjustable sensitivity and visualization options.

## Features

- Real-time audio processing using I2S microphone
- FFT-based frequency analysis
- Note detection from C2 (65.41 Hz) to C6 (1046.50 Hz)
- Dynamic threshold calibration
- Multiple note detection (up to 7 simultaneous notes)
- Harmonic filtering
- Real-time spectrum visualization
- Note timing and duration tracking
- Interactive Serial commands for system tuning

## Hardware Requirements

- ESP32 development board
- I2S MEMS microphone (e.g., INMP441, SPH0645)
- USB connection for Serial monitoring

## Pin Configuration

The system uses the following I2S pins by default (configurable in `Config.h`):

- SCK (Serial Clock): GPIO 8
- WS/LRC (Word Select/Left-Right Clock): GPIO 9
- SD (Serial Data): GPIO 10

## Getting Started

1. Connect the I2S microphone to the ESP32 according to the pin configuration
2. Build and flash the project to your ESP32
3. Open a Serial monitor at 115200 baud
4. Follow the calibration process on first run

## Serial Commands

The system can be controlled via Serial commands:

- `h` - Display help menu
- `c` - Start calibration process
- `+` - Increase sensitivity (threshold up)
- `-` - Decrease sensitivity (threshold down)
- `s` - Toggle spectrum visualization

## Configuration Options

All system parameters can be adjusted in `Config.h`:

### Audio Processing

- `SAMPLE_RATE`: 8000 Hz (good for frequencies up to 4 kHz)
- `BITS_PER_SAMPLE`: 16-bit resolution
- `SAMPLE_BUFFER_SIZE`: 1024 samples
- `FFT_SIZE`: 1024 points

### Note Detection

- `NOTE_FREQ_C2`: 65.41 Hz (lowest detectable note)
- `NOTE_FREQ_C6`: 1046.50 Hz (highest detectable note)
- `FREQUENCY_TOLERANCE`: 3.0 Hz
- `MAX_SIMULTANEOUS_NOTES`: 7
- `MIN_NOTE_DURATION_MS`: 50 ms
- `NOTE_RELEASE_TIME_MS`: 100 ms

### Calibration

- `CALIBRATION_DURATION_MS`: 5000 ms
- `CALIBRATION_PEAK_PERCENTILE`: 0.95 (95th percentile)

## Visualization

The system provides two visualization modes:

1. Note Display:
   ```
   Current Notes:
   A4 (440.0 Hz, Magnitude: 2500, Duration: 250ms)
   E5 (659.3 Hz, Magnitude: 1800, Duration: 150ms)
   ```
2. Spectrum Display (when enabled):
   ```
   Frequency Spectrum:
   0Hz |▄▄▄▄▄
   100Hz |██████▄
   200Hz |▄▄▄
   ...
   ```

## Performance Tuning

1. Start with calibration by pressing 'c' in a quiet environment
2. Play notes and observe the detection accuracy
3. Use '+' and '-' to adjust sensitivity if needed
4. Enable spectrum display with 's' to visualize frequency content
5. Adjust `Config.h` parameters if needed for your specific setup

## Implementation Details

- Uses FFT for frequency analysis
- Implements peak detection with dynamic thresholding
- Filters out harmonics to prevent duplicate detections
- Tracks note timing and duration
- Uses ring buffer for real-time processing
- Calibration collects ambient noise profile

## Troubleshooting

1. No notes detected:
   - Check microphone connection
   - Run calibration
   - Increase sensitivity with '+'
   - Verify audio input level in spectrum display
2. False detections:
   - Run calibration in a quiet environment
   - Decrease sensitivity with '-'
   - Adjust `PEAK_RATIO_THRESHOLD` in `Config.h`
3. Missing notes:
   - Check if notes are within C2-C6 range
   - Increase `FREQUENCY_TOLERANCE`
   - Decrease `MIN_MAGNITUDE_THRESHOLD`

## Contributing

Contributions are welcome! Please read the contributing guidelines before submitting pull requests.

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Development Environment Setup

### Prerequisites

- PlatformIO IDE (recommended) or Arduino IDE
- ESP32 board support package
- Required libraries:
  - arduino-audio-tools
  - arduino-audio-driver
  - WiFiManager
  - AsyncTCP
  - ESPAsyncWebServer
  - arduinoFFT

### Building with PlatformIO

1. Clone the repository
2. Open the project in PlatformIO
3. Install dependencies:
   ```
   pio lib install
   ```
4. Build and upload:
   ```
   pio run -t upload
   ```

## Memory Management

### Memory Usage

- Program Memory: ~800 KB
- RAM Usage: ~100 KB
- DMA Buffers: 4 x 512 bytes
- FFT Working Buffer: 2048 bytes (1024 samples x 2 bytes)

### Optimization Tips

- Adjust `DMA_BUFFER_COUNT` based on available RAM
- Reduce `SAMPLE_BUFFER_SIZE` for lower latency
- Use `PSRAM` if available for larger buffer sizes

## Advanced Configuration

### Task Management

- Audio processing task on Core 1:
  - I2S sample reading
  - Audio level tracking
  - Note detection and FFT analysis
- Visualization task on Core 0:
  - WebSocket communication
  - Spectrum visualization
  - Serial interface
  - Network operations
- Inter-core communication via FreeRTOS queue
- Configurable priorities in `Config.h`

### Audio Pipeline

1. I2S DMA Input
2. Sample Buffer Collection
3. FFT Processing
4. Peak Detection
5. Note Identification
6. Output Generation

### Timing Parameters

- Audio Buffer Processing: ~8 ms
- FFT Computation: ~5 ms
- Note Detection: ~2 ms
- Total Latency: ~15-20 ms

## Performance Optimization

### CPU Usage

- Core 1 (Audio Processing):
  - I2S DMA handling: ~15%
  - Audio analysis: ~20%
  - FFT processing: ~15%
- Core 0 (Visualization):
  - WebSocket updates: ~5%
  - Visualization: ~5%
  - Network handling: ~5%

### Memory Optimization

1. Buffer Size Selection:
   - Larger buffers: Better frequency resolution
   - Smaller buffers: Lower latency
2. DMA Configuration:
   - More buffers: Better continuity
   - Fewer buffers: Lower memory usage

### Frequency Analysis

- FFT Resolution: 7.8125 Hz (8000/1024)
- Frequency Bins: 512 (Nyquist limit)
- Useful Range: 65.41 Hz to 1046.50 Hz
- Window Function: Hamming

## Technical Details

### Microphone Specifications

- Supply Voltage: 3.3 V
- Sampling Rate: 8 kHz
- Bit Depth: 16-bit
- SNR: >65 dB (typical)

### Signal Processing

1. Pre-processing:
   - DC offset removal
   - Windowing function application
2. FFT Processing:
   - 1024-point real FFT
   - Magnitude calculation
3. Post-processing:
   - Peak detection
   - Harmonic filtering
   - Note matching

### Calibration Process

1. Ambient Noise Collection (5 seconds)
2. Frequency Bin Analysis
3. Threshold Calculation:
   - Base threshold from 95th percentile
   - Per-bin noise floor mapping
4. Dynamic Adjustment

## Error Handling

### Common Issues

1. I2S Communication Errors:
   - Check pin connections
   - Verify I2S configuration
   - Monitor serial output for error codes
2. Memory Issues:
   - Watch heap fragmentation
   - Monitor stack usage
   - Check DMA buffer allocation

### Error Recovery

- Automatic I2S reset on communication errors
- Dynamic threshold adjustment
- Watchdog timer protection

## Project Structure

### Core Components

1. AudioLevelTracker
   - Real-time audio level monitoring
   - Peak detection
   - Threshold management
2. NoteDetector
   - Frequency analysis
   - Note identification
   - Harmonic filtering
3. SpectrumVisualizer
   - Real-time spectrum display
   - Magnitude scaling
   - ASCII visualization

### File Organization

- `/src`: Core implementation files
- `/include`: Header files and configurations
- `/data`: Additional resources
- `/test`: Unit tests

## Inter-Core Communication

### Queue Management

- FreeRTOS queue for audio data transfer
- 4-slot queue buffer
- Zero-copy data passing
- Non-blocking queue operations
- Automatic overflow protection

### Data Flow

1. Core 1 (Audio Task):
   - Processes audio samples
   - Performs FFT analysis
   - Queues processed data
2. Core 0 (Visualization Task):
   - Receives processed data
   - Updates visualization
   - Handles network communication

### Network Communication

- Asynchronous WebSocket updates
- JSON-formatted spectrum data
- Configurable update rate (50 ms default)
- Automatic client cleanup
- Efficient connection management
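
To illustrate the JSON-formatted spectrum updates and the 50 ms update rate listed above, here is a minimal, portable C++ sketch. It is not taken from the project source: the function names `spectrumToJson` and `shouldSendFrame`, and the JSON field names `type`, `binHz`, and `magnitudes`, are hypothetical and chosen only for illustration. The actual message schema may differ.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Serialize one spectrum frame as a compact JSON object.
// Assumption: the field names "type", "binHz" and "magnitudes" are
// illustrative only and are not the project's confirmed schema.
std::string spectrumToJson(const std::vector<float>& magnitudes, float binHz) {
    std::ostringstream out;
    out << "{\"type\":\"spectrum\",\"binHz\":" << binHz << ",\"magnitudes\":[";
    for (std::size_t i = 0; i < magnitudes.size(); ++i) {
        if (i > 0) {
            out << ',';
        }
        // Round each magnitude to a whole number to keep frames small.
        out << std::lround(magnitudes[i]);
    }
    out << "]}";
    return out.str();
}

// Rate limiter for outgoing frames (hypothetical helper).
// Returns true once at least intervalMs has elapsed since the last send.
// Unsigned subtraction keeps the comparison valid when a 32-bit
// millisecond counter wraps around.
bool shouldSendFrame(std::uint32_t nowMs, std::uint32_t lastSentMs,
                     std::uint32_t intervalMs = 50) {
    return static_cast<std::uint32_t>(nowMs - lastSentMs) >= intervalMs;
}
```

On the device, the visualization task would call `shouldSendFrame` with the current millisecond tick, and only then build the string and pass it to whatever WebSocket send call the project uses. With the default `SAMPLE_RATE` and `FFT_SIZE`, `binHz` would be 8000/1024 = 7.8125.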