- Updated the `y-axis scale` based on the maximum value of the frequency spectrum, added logarithmic scaling to the x-axis labels, and improved interpolation logic for better display.
feat ✨: Improved visualization by adding logarithmic scaling to the x-axis labels and updating the y-axis scale based on the maximum value of the frequency spectrum.
feat ✨: Added .gitignore, README files, partitions, build options, dependencies, and configuration for ESP32-S3 development board using Arduino.
refactor ♻️: Refactored the audio processing and visualization tasks into separate cores, improved CPU usage monitoring, optimized memory usage, managed inter-core communication, and enhanced network functionality.
ESP32 Piano Note Detection System
A real-time piano note detection system implemented on ESP32 using I2S microphone input. This system can detect musical notes from C2 to C6 with adjustable sensitivity and visualization options.
Features
- Real-time audio processing using I2S microphone
- FFT-based frequency analysis
- Note detection from C2 (65.41 Hz) to C6 (1046.50 Hz)
- Dynamic threshold calibration
- Multiple note detection (up to 7 simultaneous notes)
- Harmonic filtering
- Real-time spectrum visualization
- Note timing and duration tracking
- Interactive Serial commands for system tuning
Hardware Requirements
- ESP32 development board
- I2S MEMS microphone (e.g., INMP441, SPH0645)
- USB connection for Serial monitoring
Pin Configuration
The system uses the following I2S pins by default (configurable in Config.h):
- SCK (Serial Clock): GPIO 8
- WS/LRC (Word Select/Left-Right Clock): GPIO 9
- SD (Serial Data): GPIO 10
Getting Started
- Connect the I2S microphone to the ESP32 according to the pin configuration
- Build and flash the project to your ESP32
- Open a Serial monitor at 115200 baud
- Follow the calibration process on first run
Serial Commands
The system can be controlled via Serial commands:
- 'h': Display help menu
- 'c': Start calibration process
- '+': Increase sensitivity (threshold up)
- '-': Decrease sensitivity (threshold down)
- 's': Toggle spectrum visualization
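The command set above could be dispatched with a single switch on the received character. This is a minimal sketch, not the project's actual handler: the `TunerState` struct, the `handleCommand` name, the starting threshold, and the 10% step size are all assumptions made for illustration. The direction of '+' and '-' follows the list above.

```cpp
#include <cassert>
#include <string>

// Hypothetical tuning state; the real project's variables may differ.
struct TunerState {
    float threshold = 1000.0f;   // detection threshold (assumed starting value)
    bool showSpectrum = false;   // spectrum visualization on/off
    bool calibrating = false;    // set when a calibration run is requested
};

// Applies one command character and returns a short status message.
// The 10% step per key press is an assumption, not a project value.
std::string handleCommand(char cmd, TunerState& state) {
    switch (cmd) {
        case 'h':
            return "h: help, c: calibrate, +/-: sensitivity, s: spectrum";
        case 'c':
            state.calibrating = true;
            return "calibration started";
        case '+':
            state.threshold *= 1.1f;   // "threshold up", as documented
            return "threshold up";
        case '-':
            state.threshold /= 1.1f;   // "threshold down", as documented
            return "threshold down";
        case 's':
            state.showSpectrum = !state.showSpectrum;
            return state.showSpectrum ? "spectrum on" : "spectrum off";
        default:
            return "unknown command, press h for help";
    }
}
```

On the board, a loop could feed this function from the Arduino Serial object, for example by calling it with `static_cast<char>(Serial.read())` whenever `Serial.available()` is non-zero.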
Configuration Options
All system parameters can be adjusted in Config.h:
Audio Processing
- SAMPLE_RATE: 8000 Hz (good for frequencies up to 4 kHz)
- BITS_PER_SAMPLE: 16-bit resolution
- SAMPLE_BUFFER_SIZE: 1024 samples
- FFT_SIZE: 1024 points
Note Detection
- NOTE_FREQ_C2: 65.41 Hz (lowest detectable note)
- NOTE_FREQ_C6: 1046.50 Hz (highest detectable note)
- FREQUENCY_TOLERANCE: 3.0 Hz
- MAX_SIMULTANEOUS_NOTES: 7
- MIN_NOTE_DURATION_MS: 50 ms
- NOTE_RELEASE_TIME_MS: 100 ms
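To illustrate how the range limits and FREQUENCY_TOLERANCE might interact, the sketch below maps a detected frequency to the nearest equal-tempered note (A4 = 440 Hz) and rejects it if it is out of range or too far from the ideal pitch. The `matchNote` function is hypothetical and the project's own matching logic may work differently.

```cpp
#include <cassert>
#include <cmath>
#include <string>

constexpr float NOTE_FREQ_C2 = 65.41f;        // lowest detectable note
constexpr float NOTE_FREQ_C6 = 1046.50f;      // highest detectable note
constexpr float FREQUENCY_TOLERANCE = 3.0f;   // Hz

// Returns the nearest note name such as "A4", or an empty string when the
// frequency is outside C2..C6 or further than the tolerance from that note.
std::string matchNote(float freqHz) {
    static const char* const kNames[12] = {"C", "C#", "D", "D#", "E", "F",
                                           "F#", "G", "G#", "A", "A#", "B"};
    if (freqHz < NOTE_FREQ_C2 - FREQUENCY_TOLERANCE ||
        freqHz > NOTE_FREQ_C6 + FREQUENCY_TOLERANCE) {
        return "";
    }
    // MIDI numbering: note 69 is A4, and each semitone is a factor of 2^(1/12).
    const int midi = static_cast<int>(
        std::lround(69.0 + 12.0 * std::log2(freqHz / 440.0)));
    const double ideal = 440.0 * std::pow(2.0, (midi - 69) / 12.0);
    if (std::fabs(freqHz - ideal) > FREQUENCY_TOLERANCE) {
        return "";
    }
    const int octave = midi / 12 - 1;   // MIDI 60 is C4
    return std::string(kNames[midi % 12]) + std::to_string(octave);
}
```

With these values, a fixed 3 Hz tolerance is strict near C6, where neighbouring notes are roughly 60 Hz apart, and loose near C2, where they are under 4 Hz apart.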
Calibration
- CALIBRATION_DURATION_MS: 5000 ms
- CALIBRATION_PEAK_PERCENTILE: 0.95 (95th percentile)
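For orientation, the options listed above could be expressed in Config.h roughly as follows. The parameter names and values come from this README. The C++ types, the pin constant names (`I2S_SCK_PIN`, `I2S_WS_PIN`, `I2S_SD_PIN`), and the compile-time checks are assumptions, so treat this as a sketch rather than the actual header.

```cpp
#include <cstdint>

// I2S pins (constant names are hypothetical; GPIO numbers from Pin Configuration)
constexpr int I2S_SCK_PIN = 8;
constexpr int I2S_WS_PIN  = 9;
constexpr int I2S_SD_PIN  = 10;

// Audio processing
constexpr uint32_t SAMPLE_RATE        = 8000;   // Hz
constexpr uint32_t BITS_PER_SAMPLE    = 16;
constexpr uint32_t SAMPLE_BUFFER_SIZE = 1024;   // samples
constexpr uint32_t FFT_SIZE           = 1024;   // points

// Note detection
constexpr float    NOTE_FREQ_C2           = 65.41f;    // Hz
constexpr float    NOTE_FREQ_C6           = 1046.50f;  // Hz
constexpr float    FREQUENCY_TOLERANCE    = 3.0f;      // Hz
constexpr uint32_t MAX_SIMULTANEOUS_NOTES = 7;
constexpr uint32_t MIN_NOTE_DURATION_MS   = 50;
constexpr uint32_t NOTE_RELEASE_TIME_MS   = 100;

// Calibration
constexpr uint32_t CALIBRATION_DURATION_MS     = 5000;
constexpr float    CALIBRATION_PEAK_PERCENTILE = 0.95f;

// Sanity checks that catch inconsistent edits at compile time.
static_assert((FFT_SIZE & (FFT_SIZE - 1)) == 0,
              "FFT_SIZE must be a power of two");
static_assert(NOTE_FREQ_C6 < SAMPLE_RATE / 2.0f,
              "highest note must lie below the Nyquist frequency");
```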
Visualization
The system provides two visualization modes:
- Note Display:
Current Notes:
A4 (440.0 Hz, Magnitude: 2500, Duration: 250ms)
E5 (659.3 Hz, Magnitude: 1800, Duration: 150ms)
- Spectrum Display (when enabled):
Frequency Spectrum:
0Hz |▄▄▄▄▄
100Hz |██████▄
200Hz |▄▄▄
...
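A text spectrum like the one above can be produced by scaling each bar to the largest magnitude in the frame. The sketch below is illustrative only: `renderSpectrum` is a hypothetical function, it uses '#' instead of the block glyphs shown in the sample output, and the row spacing is passed in rather than taken from the project.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Renders one text row per band, with bars scaled so that the largest
// magnitude fills maxBarWidth characters.
std::vector<std::string> renderSpectrum(const std::vector<float>& magnitudes,
                                        float hzPerRow, int maxBarWidth) {
    std::vector<std::string> rows;
    if (magnitudes.empty() || maxBarWidth <= 0) {
        return rows;
    }
    const float peak = *std::max_element(magnitudes.begin(), magnitudes.end());
    for (std::size_t i = 0; i < magnitudes.size(); ++i) {
        int width = 0;
        if (peak > 0.0f) {
            // Round to the nearest character and never go below zero.
            width = std::max(0, static_cast<int>(
                magnitudes[i] / peak * maxBarWidth + 0.5f));
        }
        const std::string label =
            std::to_string(static_cast<int>(i * hzPerRow)) + "Hz";
        rows.push_back(label + " |" +
                       std::string(static_cast<std::size_t>(width), '#'));
    }
    return rows;
}
```

Calling `renderSpectrum({5.0f, 7.0f, 3.0f}, 100.0f, 7)` yields three rows labelled 0Hz, 100Hz, and 200Hz with bars of 5, 7, and 3 characters.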
Performance Tuning
- Start with calibration by pressing 'c' in a quiet environment
- Play notes and observe the detection accuracy
- Use '+' and '-' to adjust sensitivity if needed
- Enable spectrum display with 's' to visualize frequency content
- Adjust Config.h parameters if needed for your specific setup
Implementation Details
- Uses FFT for frequency analysis
- Implements peak detection with dynamic thresholding
- Filters out harmonics to prevent duplicate detections
- Tracks note timing and duration
- Uses ring buffer for real-time processing
- Calibration collects ambient noise profile
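Two of the steps above, threshold-based peak detection and harmonic filtering, can be sketched as follows. The `Peak` struct and both function names are hypothetical, and the rule used here (drop a peak that sits near an integer multiple of a lower, stronger peak) is an assumption about how harmonics might be filtered, not the project's confirmed method.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Peak {
    float freqHz;
    float magnitude;
};

// Collects local maxima of the magnitude spectrum that exceed the threshold.
// binHz is the width of one FFT bin in Hz.
std::vector<Peak> findPeaks(const std::vector<float>& mags, float binHz,
                            float threshold) {
    std::vector<Peak> peaks;
    for (std::size_t i = 1; i + 1 < mags.size(); ++i) {
        if (mags[i] > threshold && mags[i] > mags[i - 1] &&
            mags[i] >= mags[i + 1]) {
            peaks.push_back({static_cast<float>(i) * binHz, mags[i]});
        }
    }
    return peaks;
}

// Expects peaks sorted by ascending frequency (as findPeaks returns them).
// A peak is dropped when it lies within toleranceHz of 2x..maxHarmonic times
// an already kept peak and is weaker than that peak.
std::vector<Peak> filterHarmonics(const std::vector<Peak>& peaks,
                                  float toleranceHz, int maxHarmonic) {
    std::vector<Peak> kept;
    for (const Peak& p : peaks) {
        bool isHarmonic = false;
        for (const Peak& f : kept) {
            for (int h = 2; h <= maxHarmonic && !isHarmonic; ++h) {
                if (std::fabs(p.freqHz - h * f.freqHz) <= toleranceHz &&
                    p.magnitude < f.magnitude) {
                    isHarmonic = true;
                }
            }
        }
        if (!isHarmonic) {
            kept.push_back(p);
        }
    }
    return kept;
}
```

A real octave played softly would also be removed by this rule, which is one reason a magnitude-ratio setting such as PEAK_RATIO_THRESHOLD can matter in practice.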
Troubleshooting
- No notes detected:
- Check microphone connection
- Run calibration
- Increase sensitivity with '+'
- Verify audio input level in spectrum display
- False detections:
- Run calibration in a quiet environment
- Decrease sensitivity with '-'
- Adjust PEAK_RATIO_THRESHOLD in Config.h
- Missing notes:
- Check if notes are within C2-C6 range
- Increase FREQUENCY_TOLERANCE
- Decrease MIN_MAGNITUDE_THRESHOLD
Contributing
Contributions are welcome! Please read the contributing guidelines before submitting pull requests.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Development Environment Setup
Prerequisites
- PlatformIO IDE (recommended) or Arduino IDE
- ESP32 board support package
- Required libraries:
- arduino-audio-tools
- arduino-audio-driver
- WiFiManager
- AsyncTCP
- ESPAsyncWebServer
- arduinoFFT
Building with PlatformIO
- Clone the repository
- Open the project in PlatformIO
- Install dependencies:
pio lib install
- Build and upload:
pio run -t upload
Memory Management
Memory Usage
- Program Memory: ~800KB
- RAM Usage: ~100KB
- DMA Buffers: 4 x 512 bytes
- FFT Working Buffer: 2048 bytes (1024 samples x 2 bytes)
Optimization Tips
- Adjust DMA_BUFFER_COUNT based on available RAM
- Reduce SAMPLE_BUFFER_SIZE for lower latency
- Use PSRAM if available for larger buffer sizes
Advanced Configuration
Task Management
- Audio processing task on Core 1:
- I2S sample reading
- Audio level tracking
- Note detection and FFT analysis
- Visualization task on Core 0:
- WebSocket communication
- Spectrum visualization
- Serial interface
- Network operations
- Inter-core communication via FreeRTOS queue
- Configurable priorities in Config.h
Audio Pipeline
- I2S DMA Input
- Sample Buffer Collection
- FFT Processing
- Peak Detection
- Note Identification
- Output Generation
Timing Parameters
- Audio Buffer Processing: ~8ms
- FFT Computation: ~5ms
- Note Detection: ~2ms
- Total Latency: ~15-20ms
Performance Optimization
CPU Usage
- Core 1 (Audio Processing):
- I2S DMA handling: ~15%
- Audio analysis: ~20%
- FFT processing: ~15%
- Core 0 (Visualization):
- WebSocket updates: ~5%
- Visualization: ~5%
- Network handling: ~5%
Memory Optimization
- Buffer Size Selection:
- Larger buffers: Better frequency resolution
- Smaller buffers: Lower latency
- DMA Configuration:
- More buffers: Better continuity
- Fewer buffers: Lower memory usage
Frequency Analysis
- FFT Resolution: 7.8125 Hz (8000/1024)
- Frequency Bins: 512 (Nyquist limit)
- Useful Range: 65.41 Hz to 1046.50 Hz
- Window Function: Hamming
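The figures above follow directly from the sample rate and FFT size. The sketch below shows the bin arithmetic and a standard Hamming window. The helper names are hypothetical and the window is the textbook formula, which may differ in detail from what the FFT library applies.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

constexpr float SAMPLE_RATE = 8000.0f;   // Hz
constexpr std::size_t FFT_SIZE = 1024;   // points

// Width of one FFT bin: 8000 / 1024 = 7.8125 Hz.
constexpr float binWidthHz() { return SAMPLE_RATE / FFT_SIZE; }

// Centre frequency of a bin.
float binToFrequency(std::size_t bin) {
    return static_cast<float>(bin) * binWidthHz();
}

// Nearest bin for a non-negative frequency.
std::size_t frequencyToBin(float hz) {
    assert(hz >= 0.0f);
    return static_cast<std::size_t>(std::lround(hz / binWidthHz()));
}

// Multiplies the samples in place by w[i] = 0.54 - 0.46 * cos(2*pi*i/(N-1)).
void applyHamming(std::vector<float>& samples) {
    const std::size_t n = samples.size();
    if (n < 2) {
        return;
    }
    const double kPi = 3.14159265358979323846;
    for (std::size_t i = 0; i < n; ++i) {
        const double w = 0.54 - 0.46 * std::cos(2.0 * kPi * i / (n - 1));
        samples[i] *= static_cast<float>(w);
    }
}
```

With these settings the useful range spans roughly bins 8 (C2) to 134 (C6). Near C2 neighbouring notes are less than one bin apart, so the lowest notes are the hardest to separate at this resolution.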
Technical Details
Microphone Specifications
- Supply Voltage: 3.3V
- Sampling Rate: 8kHz
- Bit Depth: 16-bit
- SNR: >65dB (typical)
Signal Processing
- Pre-processing:
- DC offset removal
- Windowing function application
- FFT Processing:
- 1024-point real FFT
- Magnitude calculation
- Post-processing:
- Peak detection
- Harmonic filtering
- Note matching
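As an illustration of the pre-processing and magnitude steps, the sketch below removes the DC offset by subtracting the block mean and computes the magnitude of one complex FFT bin. Both helper names are hypothetical, and mean subtraction is only one common way to remove DC; the project may use a different filter.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Converts 16-bit samples to float and subtracts their mean, so the block
// is centred on zero before windowing and the FFT.
std::vector<float> removeDcOffset(const std::vector<int16_t>& raw) {
    std::vector<float> out(raw.size());
    if (raw.empty()) {
        return out;
    }
    const double mean =
        std::accumulate(raw.begin(), raw.end(), 0.0) / raw.size();
    for (std::size_t i = 0; i < raw.size(); ++i) {
        out[i] = static_cast<float>(raw[i] - mean);
    }
    return out;
}

// Magnitude of one FFT bin from its real and imaginary parts.
float binMagnitude(float re, float im) {
    return std::sqrt(re * re + im * im);
}
```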
Calibration Process
- Ambient Noise Collection (5 seconds)
- Frequency Bin Analysis
- Threshold Calculation:
- Base threshold from 95th percentile
- Per-bin noise floor mapping
- Dynamic Adjustment
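The base threshold step can be sketched as a nearest-rank percentile over the peak magnitudes collected during the quiet period. The function names and the `margin` multiplier are assumptions added for illustration. Only the 0.95 percentile comes from the configuration described in this README.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

constexpr float CALIBRATION_PEAK_PERCENTILE = 0.95f;

// Nearest-rank percentile: the smallest value such that at least
// `fraction` of the samples are less than or equal to it.
// Takes the vector by value because nth_element reorders it.
float percentile(std::vector<float> values, float fraction) {
    assert(!values.empty() && fraction > 0.0f && fraction <= 1.0f);
    const std::size_t rank = static_cast<std::size_t>(
        std::ceil(static_cast<double>(fraction) * values.size()));
    const std::size_t index = rank - 1;   // rank is at least 1
    std::nth_element(values.begin(), values.begin() + index, values.end());
    return values[index];
}

// Base threshold from ambient-noise peaks. The margin above the noise
// percentile is a hypothetical tuning factor.
float calibrateThreshold(const std::vector<float>& ambientPeaks, float margin) {
    return percentile(ambientPeaks, CALIBRATION_PEAK_PERCENTILE) * margin;
}
```

Using a high percentile rather than the maximum keeps a single click or pop during calibration from inflating the threshold.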
Error Handling
Common Issues
- I2S Communication Errors:
- Check pin connections
- Verify I2S configuration
- Monitor serial output for error codes
- Memory Issues:
- Watch heap fragmentation
- Monitor stack usage
- Check DMA buffer allocation
Error Recovery
- Automatic I2S reset on communication errors
- Dynamic threshold adjustment
- Watchdog timer protection
Project Structure
Core Components
- AudioLevelTracker
- Real-time audio level monitoring
- Peak detection
- Threshold management
- NoteDetector
- Frequency analysis
- Note identification
- Harmonic filtering
- SpectrumVisualizer
- Real-time spectrum display
- Magnitude scaling
- ASCII visualization
File Organization
- /src: Core implementation files
- /include: Header files and configurations
- /data: Additional resources
- /test: Unit tests
Inter-Core Communication
Queue Management
- FreeRTOS queue for audio data transfer
- 4-slot queue buffer
- Zero-copy data passing
- Non-blocking queue operations
- Automatic overflow protection
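The queue behaviour listed above (fixed slot count, non-blocking operations, dropping on overflow) can be modelled in portable C++ with a single-producer, single-consumer ring buffer. This is a stand-in for illustration, not the FreeRTOS queue the project uses, and the `BoundedQueue` class is hypothetical.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Fixed-capacity queue for exactly one producer and one consumer.
// One extra slot distinguishes "full" from "empty".
template <typename T, std::size_t Slots>
class BoundedQueue {
public:
    // Non-blocking send: returns false (item dropped) when the queue is full.
    bool trySend(const T& item) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % (Slots + 1);
        if (next == tail_.load(std::memory_order_acquire)) {
            return false;
        }
        buffer_[head] = item;
        head_.store(next, std::memory_order_release);
        return true;
    }

    // Non-blocking receive: returns false when the queue is empty.
    bool tryReceive(T& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) {
            return false;
        }
        out = buffer_[tail];
        tail_.store((tail + 1) % (Slots + 1), std::memory_order_release);
        return true;
    }

private:
    std::array<T, Slots + 1> buffer_{};
    std::atomic<std::size_t> head_{0};   // next write position
    std::atomic<std::size_t> tail_{0};   // next read position
};
```

With FreeRTOS, the equivalent calls would be `xQueueCreate`, `xQueueSend`, and `xQueueReceive` with a timeout of 0 ticks. Because FreeRTOS queues copy each item, passing data without copying the payload usually means queueing a pointer to a buffer rather than the buffer itself.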
Data Flow
- Core 1 (Audio Task):
- Processes audio samples
- Performs FFT analysis
- Queues processed data
- Core 0 (Visualization Task):
- Receives processed data
- Updates visualization
- Handles network communication
Network Communication
- Asynchronous WebSocket updates
- JSON-formatted spectrum data
- Configurable update rate (50ms default)
- Automatic client cleanup
- Efficient connection management