ESP32 MP3 Player Memory Optimizations

This document summarizes the memory optimizations implemented to resolve out-of-memory issues in your ESP32 MP3 player.

Implemented Optimizations

1. Improvement for /directory

Implemented a backpressure-safe, low-heap HTML streaming solution to prevent AsyncTCP cbuf resize OOM during /directory.

Root cause

  • The previous implementation used AsyncResponseStream (a Print) and wrote faster than the TCP stack could drain. Under client/network backpressure, AsyncTCP's cbuf tried to grow and failed: cbuf.resize() -> WebResponses write(): Failed to allocate.

Fix implemented

  • Switched /directory to AsyncChunkedResponse with a stateful generator that only produces bytes when the TCP layer is ready.
  • Generates one entry at a time, respecting maxLen provided by the framework. This prevents buffer growth and heap spikes.
  • No yield() needed; backpressure is handled by the chunked response callback scheduling.
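
To make the maxLen contract concrete, here is a minimal, framework-free sketch of a stateful filler. It is not the code in src/main.cpp: LineStreamState, drain() and the sample lines are hypothetical names used only for illustration. The point it shows is that the generator remembers how far it got inside the current entry, so an entry longer than maxLen is split across calls instead of forcing a larger buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the /directory generator state: it remembers
// which entry it is on and how many bytes of that entry were already sent.
struct LineStreamState {
    std::vector<std::string> lines;  // entries to emit
    size_t lineIdx = 0;              // index of the current entry
    size_t offset = 0;               // bytes of the current entry already emitted

    // Same shape as a chunked-response filler: write at most maxLen bytes
    // into buffer and return the count. Returning 0 signals completion.
    size_t next(uint8_t* buffer, size_t maxLen) {
        assert(buffer != nullptr || maxLen == 0);
        size_t written = 0;
        while (written < maxLen && lineIdx < lines.size()) {
            const std::string& line = lines[lineIdx];
            size_t remaining = line.size() - offset;
            size_t space = maxLen - written;
            size_t n = remaining < space ? remaining : space;
            std::memcpy(buffer + written, line.data() + offset, n);
            written += n;
            offset += n;
            if (offset == line.size()) {  // entry finished, move to the next one
                ++lineIdx;
                offset = 0;
            }
        }
        return written;
    }
};

// Drains the generator with a fixed chunk size, the way a slow client would.
std::string drain(LineStreamState& st, size_t maxLen) {
    std::string out;
    std::vector<uint8_t> buf(maxLen);
    size_t n;
    while ((n = st.next(buf.data(), maxLen)) > 0) {
        out.append(reinterpret_cast<const char*>(buf.data()), n);
    }
    return out;
}
```

On the device, a filler of this shape would typically be handed to request->beginChunkedResponse("text/html", ...), whose callback receives (uint8_t *buffer, size_t maxLen, size_t index). Memory use is bounded by the buffer the framework passes in, regardless of how many entries the directory tree holds.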

Code changes

  1. Added a small accessor to fetch the file ID at a given index
  • Header: src/DirectoryNode.h

    • Added: uint16_t getFileIdAt(size_t i) const;
  • Source: src/DirectoryNode.cpp

    • Implemented: uint16_t DirectoryNode::getFileIdAt(size_t i) const { return (i < ids.size()) ? ids[i] : 0; }
  2. Replaced /directory handler with AsyncChunkedResponse generator
  • File: src/main.cpp

  • New logic (high level):

    • DirectoryHtmlStreamState holds an explicit traversal stack of frames {node, fileIdx, childIdx, headerDone}.

    • next(buffer, maxLen) fills output up to maxLen with:

      • Single top-level \n
      • A name\n for non-root directories (kept original behavior—no nested per subdir)
      • One filename\n per file
      • Depth-first traversal across subdirectories
      • Closes with \n when done
    • Uses snprintf into the chunk buffer and a simple copy loop for filenames, avoiding extra heap allocations.

    • Frees generator state when finished and also on client disconnect.

  3. Minor improvements in the chunked generator
  • Normalized newline literals to \n (not escaped).
  • Used single quotes around HTML attribute values to simplify C string escaping and reduce mistakes.
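
The explicit traversal stack described in step 2 can be sketched as follows. This is a simplified model under stated assumptions, not the actual DirectoryHtmlStreamState: Node, DirWalker and the "[name]" header format are hypothetical, and the real generator writes HTML into the chunk buffer rather than returning strings. What it illustrates is how a frame of {node, fileIdx, childIdx, headerDone} lets a depth-first walk pause after any line and resume later without recursion.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified directory node; the real DirectoryNode differs.
struct Node {
    std::string name;
    std::vector<std::string> files;
    std::vector<Node> children;
};

// One frame per directory currently being visited.
struct Frame {
    const Node* node;
    size_t fileIdx;
    size_t childIdx;
    bool headerDone;
};

class DirWalker {
public:
    explicit DirWalker(const Node& root) { stack_.push_back({&root, 0, 0, false}); }

    // Produces the next output line; returns false once traversal is complete.
    bool nextLine(std::string& out) {
        while (!stack_.empty()) {
            Frame& f = stack_.back();
            assert(f.node != nullptr);
            if (!f.headerDone) {
                f.headerDone = true;
                if (stack_.size() > 1) {  // the root directory gets no header
                    out = "[" + f.node->name + "]";
                    return true;
                }
                continue;
            }
            if (f.fileIdx < f.node->files.size()) {
                out = f.node->files[f.fileIdx++];
                return true;
            }
            if (f.childIdx < f.node->children.size()) {
                const Node* child = &f.node->children[f.childIdx++];
                stack_.push_back({child, 0, 0, false});  // f is stale after this
                continue;
            }
            stack_.pop_back();  // directory exhausted, resume the parent
        }
        return false;
    }

private:
    std::vector<Frame> stack_;
};
```

Because all progress lives in the frames, the only heap cost is one small frame per directory level currently open, which is why this approach suits a callback that may be invoked many times with small buffers.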

What remains unchanged

  • DirectoryNode::streamDirectoryHTML(Print&) is left intact but no longer used by /directory. Mapping/State endpoints continue using their existing streaming; they are small and safe.

Why this eliminates the crashes

  • AsyncChunkedResponse only invokes the generator when there's space to send more, so AsyncTCP's cbuf won't grow unbounded. The generator respects maxLen and returns 0 on completion, eliminating the resize path that previously caused the OOM.

Build and flash instructions

  • The PlatformIO CLI isn't available in your environment. Options:

    1. VSCode PlatformIO extension: Use the “Build” and “Upload” tasks from the PlatformIO toolbar.

    2. Install PlatformIO CLI:

      • python3 -m pip install --user platformio
      • $HOME/.local/bin must be in PATH (or use full path).
      • Then build: pio run -e d1_mini32
      • Upload: pio run -e d1_mini32 -t upload
    3. Arduino IDE/CLI: Import and build the sketch there if preferred.

Runtime test checklist

  • Open serial monitor at 115200, reset device.

  • Hit http://DEVICE_IP/directory in a browser; the page should render fully without OOM or crash.

  • Simulate slow client backpressure:

    • curl --limit-rate 5k http://DEVICE_IP/directory -v -o /dev/null
    • Observe no “[E][cbuf.cpp:104] resize(): failed to allocate temporary buffer” or “WebResponses write(): Failed to allocate”
  • Watch heap logs during serving; you should see stable heap with no large dips.

  • If desired, repeat with multiple concurrent connections to /directory to verify robustness.
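
For the heap check in the list above, a small helper can turn "no large dips" into a number. The sketch below is an assumption about how one might track it, not code from the project: HeapWatch is a hypothetical name, and the readings are passed in as plain integers so the logic can be exercised off-device.

```cpp
#include <cassert>
#include <cstdint>

// Tracks the lowest free-heap reading and the largest drop from the first reading.
struct HeapWatch {
    uint32_t baseline = 0;  // first reading taken
    uint32_t lowest = 0;    // smallest reading seen so far
    bool started = false;

    void sample(uint32_t freeBytes) {
        if (!started) {
            baseline = lowest = freeBytes;
            started = true;
            return;
        }
        if (freeBytes < lowest) lowest = freeBytes;
    }

    // Largest observed drop below the baseline, in bytes.
    uint32_t maxDip() const {
        assert(lowest <= baseline);
        return baseline - lowest;
    }
};
```

On the device, sample() would presumably be fed from ESP.getFreeHeap() in loop() or a timer while the curl test runs, with maxDip() printed to the serial monitor afterwards. A dip that stays close to one chunk buffer per connection would be consistent with the generator behaving as intended.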

Optional follow-ups

  • If mapping ever grows large, convert /mapping to AsyncChunkedResponse using the same pattern.
  • If your ESP32 has PSRAM, enabling it can further reduce heap pressure, but the chunked approach is already robust.
  • Consider tuning CONFIG_ASYNC_TCP_MAX_ACK_TIME if you want more aggressive backpressure timing; your platformio.ini already contains some AsyncTCP stack tweaks.

Summary

  • Replaced Print-based recursive streaming with a chunked, backpressure-aware generator for /directory.
  • This removes the cbuf resize failure path and should stop the crashes you observed while still using minimal heap.

2. DirectoryNode Structure Optimization (COMPLETED)

  • Added vector reserve calls in buildDirectoryTree() to reduce heap fragmentation
  • Memory saved: no fixed amount; the benefit is lower fragmentation and fewer reallocations while the tree is built
  • Location: src/DirectoryNode.cpp - lines with reserve(8), reserve(16)

3. Memory Pool Management (COMPLETED)

  • Pre-allocated vector memory to prevent frequent reallocations
  • Subdirectories: Reserved space for 8 subdirectories
  • MP3 files: Reserved space for 16 MP3 files per directory
  • IDs: Reserved space for 16 IDs per directory
  • Memory saved: ~1-2KB depending on directory structure
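
The effect of the reserve calls can be observed directly by counting capacity changes while a vector is filled. The helper below is an illustrative sketch (countReallocations is a hypothetical name, and uint16_t mirrors the ID type used by getFileIdAt); it is not taken from buildDirectoryTree().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Counts how often a vector's capacity changes while n elements are appended.
// Each change means a new heap block was allocated and the old one released.
size_t countReallocations(size_t n, size_t reserveHint) {
    std::vector<uint16_t> ids;
    if (reserveHint > 0) ids.reserve(reserveHint);
    size_t changes = 0;
    size_t lastCap = ids.capacity();
    for (size_t i = 0; i < n; ++i) {
        ids.push_back(static_cast<uint16_t>(i));
        if (ids.capacity() != lastCap) {
            ++changes;
            lastCap = ids.capacity();
        }
    }
    assert(ids.size() == n);
    return changes;
}
```

With a hint of 16, appending 16 IDs causes no reallocation at all, whereas an unreserved vector grows in several steps and leaves freed blocks of assorted sizes behind. Directories with more entries than the hint still work; they simply fall back to normal growth.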

4. Task Stack Optimization (COMPLETED)

  • Reduced RFID task stack size from 10,000 to 4,096 bytes (on the ESP32, xTaskCreatePinnedToCore() takes the stack size in bytes, not words)
  • Memory saved: ~5.9KB (10,000 - 4,096 = 5,904 bytes)
  • Location: src/main.cpp - xTaskCreatePinnedToCore() call
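
One way to sanity-check a reduced stack size is to derive it from a measured high-water mark rather than guessing. The function below is a sketch under assumptions: suggestStack is a hypothetical helper, the 256-unit rounding is arbitrary, and the inputs are expected in the same units as the stack size passed at task creation.

```cpp
#include <cassert>
#include <cstdint>

// Suggests a stack size from a measured high-water mark.
// allocated:     stack size passed when the task was created
// minFree:       smallest amount of stack that was ever left unused
// marginPercent: safety margin added on top of the observed peak usage
uint32_t suggestStack(uint32_t allocated, uint32_t minFree, uint32_t marginPercent) {
    assert(minFree <= allocated);
    uint32_t peakUsed = allocated - minFree;
    uint32_t withMargin = peakUsed + (peakUsed * marginPercent) / 100;
    // Round up to a multiple of 256 (arbitrary granularity for this sketch).
    return (withMargin + 255) / 256 * 256;
}
```

On the device, minFree would typically come from uxTaskGetStackHighWaterMark() called on the RFID task handle after the task has run through its busiest code paths. As a purely illustrative calculation, a task created with 10,000 that never had less than 7,200 free peaks at 2,800, and a 40% margin rounds up to 4,096.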

5. JSON Buffer Optimization (COMPLETED)

  • Reduced JSON buffer size in getState() from 1024 to 512 bytes
  • Memory saved: 512 bytes per JSON state request
  • Location: src/main.cpp - DynamicJsonDocument jsonState(512)

Additional Recommendations (Not Yet Implemented)

ESP32-audioI2S Library Optimizations (HIGH IMPACT!)

The ESP32-audioI2S library has several configurable memory settings that can significantly reduce RAM usage:

1. Audio Buffer Size Optimization

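// In your main.cpp setup(), add after audio initialization.
// Check Audio.h in your installed library version for the exact setter name and
// signature; some releases expose setBufsize(ramBytes, psramBytes) instead.
audio.setBufferSize(8192);  // Default is much larger (655350 bytes for PSRAM, 16000 for RAM)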

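Potential savings: 40-600KB depending on your current buffer size. The upper end applies only when the PSRAM-sized default is in use; with the 16000-byte RAM default quoted above, dropping to 8192 frees about 7.8KB (16000 - 8192 = 7808 bytes).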

2. Audio Task Stack Optimization

The library uses a static audio task with 3300 words (13.2KB). You can modify this in the library:

// In Audio.h, change:
static const size_t AUDIO_STACK_SIZE = 2048;  // Instead of 3300

Potential savings: ~5KB

3. Frame Size Optimization

The library allocates different frame sizes for different codecs. For MP3-only usage:

// You can reduce buffer sizes for unused codecs by modifying Audio.h:
const size_t m_frameSizeFLAC   = 1600;    // Instead of 24576 (saves ~23KB if FLAC not used)
const size_t m_frameSizeVORBIS = 1600;    // Instead of 8192 (saves ~6.5KB if Vorbis not used)

4. Disable Unused Features

Add these build flags to platformio.ini:

build_flags = 
    -DAUDIO_NO_SD_FS          ; If you don't use SD file streaming
    -DAUDIO_NO_PSRAM          ; If you want to force RAM usage only
    -DCORE_DEBUG_LEVEL=0      ; Disable debug output

String Optimization with F() Macro

  • Use F("string") macro to store string literals in flash memory instead of RAM
  • Example: jsonState[F("playing")] instead of jsonState["playing"]
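  • Potential savings: 2-3KB on cores that copy string literals into RAM. On the ESP32 Arduino core, string literals are normally already kept in flash-mapped read-only memory, so expect little or no gain there and measure before relying on this.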

Web Content Optimization

  • CSS is already moved to SD card (done)
  • JavaScript should be moved to SD card using the provided script
  • Potential savings: ~7KB for JavaScript

Compiler Optimizations

Add to platformio.ini:

build_flags = 
    -Os                    ; Optimize for size
    -DCORE_DEBUG_LEVEL=0   ; Disable debug output
    -DARDUINO_LOOP_STACK_SIZE=4096  ; Reduce loop stack

Total Memory Savings Achieved

| Optimization | Memory Saved |
|---|---|
| Vector reserves | ~1-2KB |
| RFID task stack reduction | ~6KB |
| JSON buffer reduction | 512 bytes |
| Current Total Savings | ~7-8KB |

Potential Additional Savings (ESP32-audioI2S Library)

| ESP32-audioI2S Optimization | Potential Memory Saved |
|---|---|
| Audio buffer size reduction | 40-600KB |
| Audio task stack reduction | ~5KB |
| Unused codec frame buffers | ~30KB |
| Disable unused features | 5-10KB |
| Potential Additional Total | 80-645KB |

Next Steps

  1. Copy web files to SD card:

    ./copy_to_sd.sh
    

    (Adjust the SD card mount point in the script as needed)

  2. Test the optimizations:

    • Monitor free heap using the web interface
    • Check for any stability issues
    • Verify RFID functionality with reduced stack size
  3. High-impact ESP32-audioI2S optimizations:

    // Add to setup() after audio.setPinout():
    audio.setBufferSize(8192);  // Reduce from default large buffer
    
  4. Optional further optimizations:

    • Implement F() macro for string literals
    • Add compiler optimization flags
    • Consider narrower data types (for example uint8_t instead of uint16_t for file IDs) if you have fewer than 256 files
    • Modify Audio.h for unused codec optimizations

Files Modified

  • src/DirectoryNode.cpp - Added vector reserve calls
  • src/main.cpp - Reduced task stack and JSON buffer sizes
  • copy_to_sd.sh - Script to copy web files to SD card

Monitoring

The web interface displays current free heap memory. Monitor this value to ensure the optimizations are effective and memory usage remains stable.