# ESP32 Piano Note Detection System

A real-time piano note detection system implemented on ESP32 using I2S microphone input. This system can detect musical notes from C2 to C6 with adjustable sensitivity and visualization options.

## Features

- Real-time audio processing using I2S microphone
- FFT-based frequency analysis
- Note detection from C2 (65.41 Hz) to C6 (1046.50 Hz)
- Dynamic threshold calibration
- Multiple note detection (up to 7 simultaneous notes)
- Harmonic filtering
- Real-time spectrum visualization
- Note timing and duration tracking
- Interactive Serial commands for system tuning

## Hardware Requirements

- ESP32 development board
- I2S MEMS microphone (e.g., INMP441, SPH0645)
- USB connection for Serial monitoring

## Pin Configuration

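On classic ESP32 modules (e.g., ESP32-WROOM-32), GPIO 6 to 11 are wired to the on-module SPI flash, so choose different pins on those boards. The system uses the following I2S pins by default (configurable in `Config.h`):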
- SCK (Serial Clock): GPIO 8
- WS/LRC (Word Select/Left-Right Clock): GPIO 9
- SD (Serial Data): GPIO 10

## Getting Started

1. Connect the I2S microphone to the ESP32 according to the pin configuration
2. Build and flash the project to your ESP32
3. Open a Serial monitor at 115200 baud
4. Follow the calibration process on first run

## Serial Commands

The system can be controlled via Serial commands:

- `h` - Display help menu
- `c` - Start calibration process
- `+` - Increase sensitivity (lowers the detection threshold)
- `-` - Decrease sensitivity (raises the detection threshold)
- `s` - Toggle spectrum visualization
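
The single-character commands above lend themselves to a simple `switch` dispatcher. The following is a minimal host-side sketch, not the project's actual handler: `TunerState`, `handleCommand`, the 0 to 10 sensitivity scale, and the status strings are all hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical tuning state; the real firmware may track different fields.
struct TunerState {
    int  sensitivityLevel = 5;     // assumed scale: 0 (least) .. 10 (most)
    bool spectrumEnabled  = false;
    bool calibrating      = false;
};

// Applies one single-character command and returns a short status message.
std::string handleCommand(TunerState& state, char cmd) {
    assert(state.sensitivityLevel >= 0 && state.sensitivityLevel <= 10);
    switch (cmd) {
        case 'h':
            return "Commands: h c + - s";
        case 'c':
            state.calibrating = true;
            return "Calibration started";
        case '+':
            if (state.sensitivityLevel < 10) ++state.sensitivityLevel;
            return "Sensitivity: " + std::to_string(state.sensitivityLevel);
        case '-':
            if (state.sensitivityLevel > 0) --state.sensitivityLevel;
            return "Sensitivity: " + std::to_string(state.sensitivityLevel);
        case 's':
            state.spectrumEnabled = !state.spectrumEnabled;
            return state.spectrumEnabled ? "Spectrum on" : "Spectrum off";
        default:
            return "Unknown command, press h for help";
    }
}
```

On the device, the character would typically come from `Serial.read()` and the returned message would be printed back to the monitor.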

## Configuration Options

All system parameters can be adjusted in `Config.h`:

### Audio Processing
- `SAMPLE_RATE`: 8000 Hz (captures frequencies up to the 4 kHz Nyquist limit)
- `BITS_PER_SAMPLE`: 16-bit resolution
- `SAMPLE_BUFFER_SIZE`: 1024 samples
- `FFT_SIZE`: 1024 points
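
The sample rate and FFT size together fix the frequency resolution and the time needed to fill one buffer. The sketch below works through that arithmetic with the default values; the helper names (`binWidthHz`, `bufferDurationMs`, `frequencyToBin`, `binToFrequency`) are illustrative and not taken from the project.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

constexpr double SAMPLE_RATE = 8000.0;   // Hz, default from Config.h
constexpr std::size_t FFT_SIZE = 1024;   // points, default from Config.h

// Width of one FFT bin in Hz.
constexpr double binWidthHz() { return SAMPLE_RATE / FFT_SIZE; }

// Time needed to collect one full buffer, in milliseconds.
constexpr double bufferDurationMs() { return 1000.0 * FFT_SIZE / SAMPLE_RATE; }

// Index of the bin whose centre is nearest to the given frequency.
std::size_t frequencyToBin(double freqHz) {
    assert(freqHz >= 0.0 && freqHz <= SAMPLE_RATE / 2.0);
    return static_cast<std::size_t>(std::lround(freqHz / binWidthHz()));
}

// Centre frequency of a bin in Hz.
double binToFrequency(std::size_t bin) { return bin * binWidthHz(); }
```

With the defaults, one bin spans 7.8125 Hz and a buffer takes 128 ms to fill; A4 (440 Hz) lands in bin 56, whose centre is 437.5 Hz.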

### Note Detection
- `NOTE_FREQ_C2`: 65.41 Hz (lowest detectable note)
- `NOTE_FREQ_C6`: 1046.50 Hz (highest detectable note)
- `FREQUENCY_TOLERANCE`: 3.0 Hz
- `MAX_SIMULTANEOUS_NOTES`: 7
- `MIN_NOTE_DURATION_MS`: 50ms
- `NOTE_RELEASE_TIME_MS`: 100ms
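
To illustrate how the range limits and `FREQUENCY_TOLERANCE` could interact, here is a minimal sketch that maps a measured frequency to the nearest equal-tempered note (A4 = 440 Hz) and rejects anything outside C2 to C6 or further than the tolerance from the ideal pitch. The function `matchNote` is hypothetical; the project's own matching logic may differ.

```cpp
#include <cassert>
#include <cmath>
#include <string>

constexpr double NOTE_FREQ_C2 = 65.41;        // Hz
constexpr double NOTE_FREQ_C6 = 1046.50;      // Hz
constexpr double FREQUENCY_TOLERANCE = 3.0;   // Hz

// Returns the nearest note name (e.g. "A4"), or an empty string when the
// frequency is outside C2..C6 or too far from the nearest ideal pitch.
std::string matchNote(double freqHz) {
    assert(freqHz > 0.0);
    static const char* const NAMES[12] = {"C", "C#", "D", "D#", "E", "F",
                                          "F#", "G", "G#", "A", "A#", "B"};
    // MIDI note number: 69 is A4, with 12 semitones per octave.
    const int midi = static_cast<int>(
        std::lround(69.0 + 12.0 * std::log2(freqHz / 440.0)));
    const double ideal = 440.0 * std::pow(2.0, (midi - 69) / 12.0);

    // Small slack absorbs rounding in the two-decimal range constants.
    if (ideal < NOTE_FREQ_C2 - 0.01 || ideal > NOTE_FREQ_C6 + 0.01) return "";
    if (std::fabs(freqHz - ideal) > FREQUENCY_TOLERANCE) return "";

    const int octave = midi / 12 - 1;   // MIDI 60 is C4
    return std::string(NAMES[midi % 12]) + std::to_string(octave);
}
```

Near C2, adjacent semitones are less than 4 Hz apart, so picking the nearest pitch first and applying the tolerance second avoids matching one frequency to two notes.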

### Calibration
- `CALIBRATION_DURATION_MS`: 5000ms
- `CALIBRATION_PEAK_PERCENTILE`: 0.95 (95th percentile)

## Visualization

The system provides two visualization modes:

1. Note Display:
```
Current Notes:
A4 (440.0 Hz, Magnitude: 2500, Duration: 250ms)
E5 (659.3 Hz, Magnitude: 1800, Duration: 150ms)
```

2. Spectrum Display (when enabled):
```
Frequency Spectrum:
0Hz    |▄▄▄▄▄
100Hz  |██████▄
200Hz  |▄▄▄
...
```
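
A row of the spectrum display can be produced by scaling each magnitude to a fixed maximum bar width. This is a simplified sketch that draws only full blocks; `formatSpectrumLine` and its parameters are hypothetical, and the label width of seven characters is inferred from the sample output above.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Builds one spectrum row such as "100Hz  |" followed by block characters,
// scaling the magnitude to at most maxBarWidth blocks.
std::string formatSpectrumLine(int freqHz, double magnitude,
                               double maxMagnitude, int maxBarWidth) {
    assert(maxMagnitude > 0.0 && maxBarWidth > 0);

    // Left-align the label and pad it to seven characters.
    std::string label = std::to_string(freqHz) + "Hz";
    if (label.size() < 7) label.append(7 - label.size(), ' ');

    // Clamp to [0, 1] so out-of-range magnitudes cannot overflow the bar.
    const double ratio = std::max(0.0, std::min(1.0, magnitude / maxMagnitude));
    const int blocks = static_cast<int>(ratio * maxBarWidth + 0.5);

    std::string line = label + "|";
    for (int i = 0; i < blocks; ++i) {
        line += "\xE2\x96\x88";   // UTF-8 encoding of the full block character
    }
    return line;
}
```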

## Performance Tuning

1. Start with calibration by pressing `c` in a quiet environment
2. Play notes and observe the detection accuracy
3. Use `+` and `-` to adjust sensitivity if needed
4. Enable spectrum display with `s` to visualize frequency content
5. Adjust `Config.h` parameters if needed for your specific setup

## Implementation Details

- Uses FFT for frequency analysis
- Implements peak detection with dynamic thresholding
- Filters out harmonics to prevent duplicate detections
- Tracks note timing and duration
- Uses ring buffer for real-time processing
- Calibration collects ambient noise profile
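
One common way to filter harmonics is to walk the detected peaks from low to high frequency and drop any peak that lies close to an integer multiple of a peak already kept. The sketch below shows that approach under stated assumptions: the `Peak` struct and `filterHarmonics` are hypothetical, and the project's actual rule may also weigh magnitudes.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Peak {
    double freqHz;
    double magnitude;
};

// Removes peaks within toleranceHz of an integer multiple (2x, 3x, ...)
// of a lower-frequency peak that has already been kept.
std::vector<Peak> filterHarmonics(std::vector<Peak> peaks, double toleranceHz) {
    assert(toleranceHz >= 0.0);
    std::sort(peaks.begin(), peaks.end(),
              [](const Peak& a, const Peak& b) { return a.freqHz < b.freqHz; });

    std::vector<Peak> fundamentals;
    for (const Peak& p : peaks) {
        bool isHarmonic = false;
        for (const Peak& f : fundamentals) {
            const double multiple = std::round(p.freqHz / f.freqHz);
            if (multiple >= 2.0 &&
                std::fabs(p.freqHz - multiple * f.freqHz) <= toleranceHz) {
                isHarmonic = true;
                break;
            }
        }
        if (!isHarmonic) fundamentals.push_back(p);
    }
    return fundamentals;
}
```

A purely frequency-based rule like this also suppresses genuine octaves played together (A3 and A4, for example), so a practical filter would likely compare the harmonic's magnitude against the fundamental before discarding it.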

## Troubleshooting

1. No notes detected:
   - Check microphone connection
   - Run calibration
   - Increase sensitivity with `+`
   - Verify audio input level in spectrum display

2. False detections:
   - Run calibration in a quiet environment
   - Decrease sensitivity with `-`
   - Adjust `PEAK_RATIO_THRESHOLD` in `Config.h`

3. Missing notes:
   - Check if notes are within C2-C6 range
   - Increase `FREQUENCY_TOLERANCE`
   - Decrease `MIN_MAGNITUDE_THRESHOLD`

## Contributing

Contributions are welcome! Please read the contributing guidelines before submitting pull requests.

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Development Environment Setup

### Prerequisites
- PlatformIO IDE (recommended) or Arduino IDE
- ESP32 board support package
- Required libraries:
  - arduino-audio-tools
  - arduino-audio-driver
  - WiFiManager
  - AsyncTCP
  - ESPAsyncWebServer
  - arduinoFFT

### Building with PlatformIO
1. Clone the repository
2. Open the project in PlatformIO
3. Install dependencies:
   ```
   pio pkg install
   ```
4. Build and upload:
   ```
   pio run -t upload
   ```

## Memory Management

### Memory Usage
- Program Memory: ~800KB
- RAM Usage: ~100KB
- DMA Buffers: 4 x 512 bytes
- FFT Working Buffer: 2048 bytes (1024 samples x 2 bytes)

### Optimization Tips
- Adjust `DMA_BUFFER_COUNT` based on available RAM
- Reduce `SAMPLE_BUFFER_SIZE` for lower latency
- Use `PSRAM` if available for larger buffer sizes

## Advanced Configuration

### Task Management
- Audio processing task on Core 1:
  - I2S sample reading
  - Audio level tracking
  - Note detection and FFT analysis
- Visualization task on Core 0:
  - WebSocket communication
  - Spectrum visualization
  - Serial interface
  - Network operations
- Inter-core communication via FreeRTOS queue
- Configurable priorities in `Config.h`

### Audio Pipeline
1. I2S DMA Input
2. Sample Buffer Collection
3. FFT Processing
4. Peak Detection
5. Note Identification
6. Output Generation
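
The peak detection stage of the pipeline can be illustrated with a simple local-maximum test over the FFT magnitudes. This is a sketch only: `SpectralPeak` and `findPeaks` are hypothetical names, and a single global threshold stands in for whatever dynamic threshold the firmware computes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct SpectralPeak {
    std::size_t bin;
    double magnitude;
};

// Returns bins that exceed the threshold and are strictly larger than both
// neighbours. Bin 0 (DC) and the last bin are skipped because they lack
// a neighbour on one side.
std::vector<SpectralPeak> findPeaks(const std::vector<double>& magnitudes,
                                    double threshold) {
    assert(threshold >= 0.0);
    std::vector<SpectralPeak> peaks;
    for (std::size_t i = 1; i + 1 < magnitudes.size(); ++i) {
        const double m = magnitudes[i];
        if (m > threshold && m > magnitudes[i - 1] && m > magnitudes[i + 1]) {
            peaks.push_back({i, m});
        }
    }
    return peaks;
}
```

Each returned bin index converts to a frequency by multiplying with the bin width (sample rate divided by FFT size), which feeds the note identification step.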

### Timing Parameters
- Audio Buffer Processing: ~8ms
- FFT Computation: ~5ms
- Note Detection: ~2ms
- Total Processing Latency: ~15-20ms per buffer (filling a 1024-sample buffer at 8 kHz takes a further 128 ms)

## Performance Optimization

### CPU Usage
- Core 1 (Audio Processing):
  - I2S DMA handling: ~15%
  - Audio analysis: ~20%
  - FFT processing: ~15%
- Core 0 (Visualization):
  - WebSocket updates: ~5%
  - Visualization: ~5%
  - Network handling: ~5%

### Memory Optimization
1. Buffer Size Selection:
   - Larger buffers: Better frequency resolution
   - Smaller buffers: Lower latency
2. DMA Configuration:
   - More buffers: Better continuity
   - Fewer buffers: Lower memory usage

### Frequency Analysis
- FFT Resolution: 7.8125 Hz (8000/1024)
- Frequency Bins: 512 usable bins (0 Hz up to the 4 kHz Nyquist limit)
- Useful Range: 65.41 Hz to 1046.50 Hz
- Window Function: Hamming

## Technical Details

### Microphone Specifications
- Supply Voltage: 3.3V
- Sampling Rate: 8 kHz (as configured by `SAMPLE_RATE`)
- Bit Depth: 16-bit (as configured by `BITS_PER_SAMPLE`)
- SNR: >65dB (typical)

### Signal Processing
1. Pre-processing:
   - DC offset removal
   - Windowing function application
2. FFT Processing:
   - 1024-point real FFT
   - Magnitude calculation
3. Post-processing:
   - Peak detection
   - Harmonic filtering
   - Note matching
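
The pre-processing step can be sketched as follows: subtract the buffer mean to remove the DC offset, then multiply each sample by a Hamming window, w[n] = 0.54 - 0.46 cos(2πn / (N - 1)). The function name `prepareForFft` is hypothetical, and the firmware may instead rely on the windowing built into its FFT library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts raw 16-bit samples to doubles, removes the DC offset (mean),
// and applies a Hamming window to reduce spectral leakage.
std::vector<double> prepareForFft(const std::vector<std::int16_t>& samples) {
    assert(samples.size() >= 2);
    const std::size_t n = samples.size();
    const double pi = std::acos(-1.0);

    double mean = 0.0;
    for (std::int16_t s : samples) mean += s;
    mean /= static_cast<double>(n);

    std::vector<double> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double w = 0.54 - 0.46 * std::cos(2.0 * pi * i / (n - 1));
        out[i] = (samples[i] - mean) * w;
    }
    return out;
}
```

The output vector would serve as the real part of the FFT input, with the imaginary part set to zero.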

### Calibration Process
1. Ambient Noise Collection (5 seconds)
2. Frequency Bin Analysis
3. Threshold Calculation:
   - Base threshold from 95th percentile
   - Per-bin noise floor mapping
4. Dynamic Adjustment
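
The base threshold in step 3 can be derived by sorting the peak magnitudes collected during the ambient-noise window and reading off the 95th percentile. The sketch below assumes a rounded-index percentile; `percentile` and `baseThreshold` are hypothetical helpers, and the per-bin noise floor mapping is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr double CALIBRATION_PEAK_PERCENTILE = 0.95;

// Returns the element of the sorted values at index round(fraction * (N - 1)).
double percentile(std::vector<double> values, double fraction) {
    assert(!values.empty() && fraction >= 0.0 && fraction <= 1.0);
    std::sort(values.begin(), values.end());
    const std::size_t idx =
        static_cast<std::size_t>(fraction * (values.size() - 1) + 0.5);
    return values[idx];
}

// Base detection threshold from ambient-noise peak magnitudes.
double baseThreshold(const std::vector<double>& noisePeaks) {
    return percentile(noisePeaks, CALIBRATION_PEAK_PERCENTILE);
}
```

Using a high percentile rather than the maximum keeps a single loud transient during calibration from inflating the threshold.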

## Error Handling

### Common Issues
1. I2S Communication Errors:
   - Check pin connections
   - Verify I2S configuration
   - Monitor serial output for error codes
2. Memory Issues:
   - Watch heap fragmentation
   - Monitor stack usage
   - Check DMA buffer allocation

### Error Recovery
- Automatic I2S reset on communication errors
- Dynamic threshold adjustment
- Watchdog timer protection

## Project Structure

### Core Components
1. AudioLevelTracker
   - Real-time audio level monitoring
   - Peak detection
   - Threshold management
2. NoteDetector
   - Frequency analysis
   - Note identification
   - Harmonic filtering
3. SpectrumVisualizer
   - Real-time spectrum display
   - Magnitude scaling
   - ASCII visualization

### File Organization
- `/src`: Core implementation files
- `/include`: Header files and configurations
- `/data`: Additional resources
- `/test`: Unit tests

## Inter-Core Communication

### Queue Management
- FreeRTOS queue for audio data transfer
- 4-slot queue buffer
- Zero-copy data passing
- Non-blocking queue operations
- Automatic overflow protection
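
The queue behaviour described above (four slots, non-blocking, drop on overflow) can be modelled on a host machine with a small fixed-capacity FIFO. This is an illustrative model, not the firmware's code: `BoundedQueue` is a hypothetical class, and on the ESP32 the equivalent would be a FreeRTOS queue created with `xQueueCreate` and written with `xQueueSend` using a timeout of 0 ticks.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Fixed-capacity FIFO that never blocks: trySend() fails when full and
// tryReceive() fails when empty. Not thread-safe; a real inter-core queue
// needs the synchronisation that a FreeRTOS queue provides.
template <typename T, std::size_t Capacity>
class BoundedQueue {
    static_assert(Capacity > 0, "queue needs at least one slot");

public:
    bool trySend(const T& item) {
        assert(count_ <= Capacity);
        if (count_ == Capacity) return false;   // full: caller drops the frame
        slots_[(head_ + count_) % Capacity] = item;
        ++count_;
        return true;
    }

    bool tryReceive(T& out) {
        if (count_ == 0) return false;          // empty: nothing to display
        out = slots_[head_];
        head_ = (head_ + 1) % Capacity;
        --count_;
        return true;
    }

    std::size_t size() const { return count_; }

private:
    std::array<T, Capacity> slots_{};
    std::size_t head_ = 0;
    std::size_t count_ = 0;
};
```

Dropping a frame when the queue is full keeps the audio task from stalling behind a slow visualization or network client, at the cost of an occasional skipped display update.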

### Data Flow
1. Core 1 (Audio Task):
   - Processes audio samples
   - Performs FFT analysis
   - Queues processed data
2. Core 0 (Visualization Task):
   - Receives processed data
   - Updates visualization
   - Handles network communication

### Network Communication
- Asynchronous WebSocket updates
- JSON-formatted spectrum data
- Configurable update rate (50ms default)
- Automatic client cleanup
- Efficient connection management