[ai] tcp optimizations
This commit is contained in:
parent
c14624ef92
commit
ea4461cc54
|
|
@ -4,6 +4,109 @@ This document summarizes the memory optimizations implemented to resolve out-of-
|
|||
|
||||
## Implemented Optimizations
|
||||
|
||||
|
||||
### Improvement for /directory
|
||||
Implemented a backpressure-safe, low-heap HTML streaming solution to prevent AsyncTCP cbuf resize OOM during /directory.
|
||||
|
||||
Root cause
|
||||
|
||||
- The previous implementation used AsyncResponseStream (a Print) and wrote faster than the TCP stack could drain. Under client/network backpressure, AsyncTCP’s cbuf tried to grow and failed: cbuf.resize() -> WebResponses write(): Failed to allocate.
|
||||
|
||||
Fix implemented
|
||||
|
||||
- Switched /directory to AsyncChunkedResponse with a stateful generator that only produces bytes when the TCP layer is ready.
|
||||
- Generates one entry at a time, respecting maxLen provided by the framework. This prevents buffer growth and heap spikes.
|
||||
- No yield() needed; backpressure is handled by the chunked response callback scheduling.
|
||||
|
||||
Code changes
|
||||
|
||||
1. Added a tiny accessor to fetch file id at index
|
||||
|
||||
- Header: src/DirectoryNode.h
|
||||
|
||||
- Added: uint16_t getFileIdAt(size_t i) const;
|
||||
|
||||
- Source: src/DirectoryNode.cpp
|
||||
|
||||
- Implemented: uint16_t DirectoryNode::getFileIdAt(size_t i) const { return (i < ids.size()) ? ids[i] : 0; }
|
||||
|
||||
2. Replaced /directory handler with AsyncChunkedResponse generator
|
||||
|
||||
- File: src/main.cpp
|
||||
|
||||
- New logic (high level):
|
||||
|
||||
- DirectoryHtmlStreamState holds an explicit traversal stack of frames {node, fileIdx, childIdx, headerDone}.
|
||||
|
||||
- next(buffer, maxLen) fills output up to maxLen with:
|
||||
|
||||
- Single top-level \n
|
||||
- A name\n for non-root directories (kept original behavior—no nested per subdir)
|
||||
- One filename\n per file
|
||||
- Depth-first traversal across subdirectories
|
||||
- Closes with \n when done
|
||||
|
||||
- Uses snprintf into the chunk buffer and a simple copy loop for filenames, avoiding extra heap allocations.
|
||||
|
||||
- Frees generator state when finished and also on client disconnect.
|
||||
|
||||
3. Minor improvements in the chunked generator
|
||||
|
||||
- Normalized newline literals to \n (not escaped).
|
||||
- Used single quotes around HTML attribute values to simplify C string escaping and reduce mistakes.
|
||||
|
||||
What remains unchanged
|
||||
|
||||
- DirectoryNode::streamDirectoryHTML(Print&) is left intact but no longer used by /directory. Mapping/State endpoints continue using their existing streaming; they are small and safe.
|
||||
|
||||
Why this eliminates the crashes
|
||||
|
||||
- AsyncChunkedResponse only invokes the generator when there’s space to send more, so AsyncTCP’s cbuf won’t grow unbounded. The generator respects the maxLen and yields 0 on completion, eliminating the resize path that previously caused OOM.
|
||||
|
||||
Build and flash instructions
|
||||
|
||||
- Your environment doesn’t have PlatformIO CLI available. Options:
|
||||
|
||||
1. VSCode PlatformIO extension: Use the “Build” and “Upload” tasks from the PlatformIO toolbar.
|
||||
|
||||
2. Install PlatformIO CLI:
|
||||
|
||||
- python3 -m pip install --user platformio
|
||||
- $HOME/.local/bin must be in PATH (or use full path).
|
||||
- Then build: pio run -e d1_mini32
|
||||
- Upload: pio run -e d1_mini32 -t upload
|
||||
|
||||
3. Arduino IDE/CLI: Import and build the sketch there if preferred.
|
||||
|
||||
Runtime test checklist
|
||||
|
||||
- Open serial monitor at 115200, reset device.
|
||||
|
||||
- Hit [](http://DEVICE_IP/directory)<http://DEVICE_IP/directory> in a browser; the page should render fully without OOM or crash.
|
||||
|
||||
- Simulate slow client backpressure:
|
||||
|
||||
- curl --limit-rate 5k [](http://DEVICE_IP/directory)<http://DEVICE_IP/directory> -v -o /dev/null
|
||||
- Observe no “[E][cbuf.cpp:104] resize(): failed to allocate temporary buffer” or “WebResponses write(): Failed to allocate”
|
||||
|
||||
- Watch heap logs during serving; you should see stable heap with no large dips.
|
||||
|
||||
- If desired, repeat with multiple concurrent connections to /directory to verify robustness.
|
||||
|
||||
Optional follow-ups
|
||||
|
||||
- If mapping ever grows large, convert /mapping to AsyncChunkedResponse using the same pattern.
|
||||
- If your ESP32 has PSRAM, enabling it can further reduce heap pressure, but the chunked approach is already robust.
|
||||
- Consider enabling CONFIG_ASYNC_TCP_MAX_ACK_TIME tune if you want more aggressive backpressure timing; your platformio.ini already has some AsyncTCP stack tweaks noted.
|
||||
|
||||
Summary
|
||||
|
||||
- Replaced Print-based recursive streaming with a chunked, backpressure-aware generator for /directory.
|
||||
- This removes the cbuf resize failure path and should stop the crashes you observed while still using minimal heap.
|
||||
|
||||
|
||||
|
||||
|
||||
### 2. DirectoryNode Structure Optimization (✅ COMPLETED)
|
||||
- **Added vector reserve calls** in `buildDirectoryTree()` to reduce heap fragmentation
|
||||
- **Memory saved**: Reduces fragmentation and improves allocation efficiency
|
||||
|
|
|
|||
|
|
@ -13,25 +13,23 @@ platform = https://github.com/pioarduino/platform-espressif32/releases/download/
|
|||
board = wemos_d1_mini32
|
||||
framework = arduino
|
||||
lib_deps =
|
||||
ESP32Async/AsyncTCP@3.3.8
|
||||
ESP32Async/ESPAsyncWebServer@3.7.9
|
||||
ESP32Async/ESPAsyncWebServer@3.7.10
|
||||
alanswx/ESPAsyncWiFiManager@0.31
|
||||
miguelbalboa/MFRC522@^1.4.12
|
||||
bblanchon/ArduinoJson@^6.21.3
|
||||
monitor_speed = 115200
|
||||
build_flags =
|
||||
-Os ; Optimize for size
|
||||
; -DDEBUG ; Hannabox Debugging
|
||||
-DCORE_DEBUG_LEVEL=0 ; Disable all debug output
|
||||
-DARDUINO_LOOP_STACK_SIZE=3072 ; Further reduce from 4096
|
||||
-DWIFI_TASK_STACK_SIZE=3072 ; Reduce WiFi task stack
|
||||
-DARDUINO_EVENT_TASK_STACK_SIZE=2048 ; Reduce event task stack
|
||||
-DTCPIP_TASK_STACK_SIZE=2048 ; Reduce TCP/IP stack
|
||||
-DESP_TASK_WDT_TIMEOUT_S=10 ; Reduce watchdog timeout
|
||||
; -DDEBUG ; Hannabox Debugging
|
||||
; -DCORE_DEBUG_LEVEL=0 ; Disable all debug output
|
||||
; -DARDUINO_LOOP_STACK_SIZE=4096 ; Balanced to avoid stack canary without starving heap
|
||||
; -DWIFI_TASK_STACK_SIZE=3072 ; Reduce WiFi task stack
|
||||
; -DARDUINO_EVENT_TASK_STACK_SIZE=2048 ; Reduce event task stack
|
||||
; -DTCPIP_TASK_STACK_SIZE=2048 ; Reduce TCP/IP stack
|
||||
; -DESP_TASK_WDT_TIMEOUT_S=10 ; Reduce watchdog timeout
|
||||
; -DCONFIG_ASYNC_TCP_MAX_ACK_TIME=3000
|
||||
; -DCONFIG_ASYNC_TCP_PRIORITY=10 ; (keep default)
|
||||
; -DCONFIG_ASYNC_TCP_QUEUE_SIZE=64 ; (keep default)
|
||||
; -DCONFIG_ASYNC_TCP_RUNNING_CORE=1 ; force async_tcp task to be on same core as Arduino app (default is any core)
|
||||
-DCONFIG_ASYNC_TCP_STACK_SIZE=4096 ; reduce the stack size (default is 16K)
|
||||
; -DCONFIG_ASYNC_TCP_PRIORITY=10 ; (keep default)
|
||||
; -DCONFIG_ASYNC_TCP_QUEUE_SIZE=64 ; (keep default)
|
||||
; -DCONFIG_ASYNC_TCP_RUNNING_CORE=1 ; force async_tcp task to be on same core as Arduino app (default is any core)
|
||||
-DCONFIG_ASYNC_TCP_STACK_SIZE=4096 ; reduce AsyncTCP task stack (default can be large)
|
||||
monitor_filters = esp32_exception_decoder
|
||||
board_build.partitions = huge_app.csv
|
||||
|
|
|
|||
|
|
@ -46,6 +46,11 @@ const String &DirectoryNode::getDirPath() const
|
|||
return dirPath;
|
||||
}
|
||||
|
||||
uint16_t DirectoryNode::getFileIdAt(size_t i) const
|
||||
{
|
||||
return (i < ids.size()) ? ids[i] : 0;
|
||||
}
|
||||
|
||||
|
||||
|
||||
String DirectoryNode::buildFullPath(const String &fileName) const
|
||||
|
|
@ -279,10 +284,7 @@ void DirectoryNode::printDirectoryTree(int level) const
|
|||
{
|
||||
Serial.print(F(" "));
|
||||
}
|
||||
|
||||
// Use buffer for building path
|
||||
buildFullPath(mp3File, buffer, buffer_size);
|
||||
Serial.println(buffer);
|
||||
Serial.println(mp3File);
|
||||
}
|
||||
|
||||
for (DirectoryNode *childNode : subdirectories)
|
||||
|
|
@ -647,7 +649,12 @@ DirectoryNode *DirectoryNode::advanceToNextMP3(const String ¤tGlobal)
|
|||
return this;
|
||||
}
|
||||
|
||||
|
||||
/**
|
||||
* @brief Not used anymore due to new
|
||||
* backpressure-safe, low-heap HTML streaming solution to prevent AsyncTCP cbuf resize OOM during /directory.
|
||||
*
|
||||
* @param out
|
||||
*/
|
||||
void DirectoryNode::streamDirectoryHTML(Print &out) const {
|
||||
#ifdef DEBUG
|
||||
Serial.printf("StreamDirectoryHTML name=%s numOfFiles=%i\n", name, mp3Files.size());
|
||||
|
|
@ -670,20 +677,15 @@ void DirectoryNode::streamDirectoryHTML(Print &out) const {
|
|||
out.print(F("<li data-id=\""));
|
||||
out.print(ids[i]);
|
||||
out.print(F("\">"));
|
||||
buildFullPath(mp3Files[i], buffer, buffer_size);
|
||||
out.print(buffer);
|
||||
out.print(mp3Files[i].c_str());
|
||||
out.println(F("</li>"));
|
||||
#ifdef DEBUG
|
||||
Serial.printf("stream song: %s\n", buffer);
|
||||
#endif
|
||||
|
||||
// Yield every few items to allow the async web server to send buffered data
|
||||
if (i % 5 == 4) {
|
||||
yield();
|
||||
}
|
||||
}
|
||||
|
||||
out.flush();
|
||||
|
||||
for (DirectoryNode* child : subdirectories) {
|
||||
child->streamDirectoryHTML(out);
|
||||
}
|
||||
|
|
|
|||
|
|
@ -42,6 +42,7 @@ public:
|
|||
const std::vector<DirectoryNode*>& getSubdirectories() const;
|
||||
const std::vector<String>& getMP3Files() const;
|
||||
const String& getDirPath() const;
|
||||
uint16_t getFileIdAt(size_t i) const;
|
||||
|
||||
size_t getNumOfFiles() const;
|
||||
|
||||
|
|
|
|||
144
src/main.cpp
144
src/main.cpp
|
|
@ -1280,15 +1280,139 @@ void init_webserver() {
|
|||
server.on("/directory", HTTP_GET, [](AsyncWebServerRequest *request)
|
||||
{
|
||||
webreq_enter();
|
||||
request->onDisconnect([](){ webreq_exit(); });
|
||||
// Stream the response directly from the directory tree to avoid large temporary Strings
|
||||
AsyncResponseStream* stream = request->beginResponseStream(txt_html_charset, buffer_size);
|
||||
Serial.printf("Serving /directory heap=%u webreq_cnt=%u numOfFiles=%u\n", (unsigned)xPortGetFreeHeapSize(), (unsigned)webreq_cnt, rootNode.getNumOfFiles());
|
||||
stream->addHeader(hdr_cache_control_key, hdr_cache_control_val);
|
||||
stream->addHeader(hdr_connection_key, hdr_connection_val);
|
||||
// Generate HTML directly into the stream under lock
|
||||
rootNode.streamDirectoryHTML(*stream);
|
||||
request->send(stream);
|
||||
// Backpressure-safe, chunked HTML streaming to avoid cbuf growth/OOM
|
||||
struct DirectoryHtmlStreamState {
|
||||
struct Frame {
|
||||
const DirectoryNode* node;
|
||||
size_t fileIdx;
|
||||
size_t childIdx;
|
||||
bool headerDone;
|
||||
};
|
||||
|
||||
Frame stack[MAX_DEPTH];
|
||||
int top;
|
||||
bool openedUL;
|
||||
bool closedUL;
|
||||
explicit DirectoryHtmlStreamState(const DirectoryNode* root)
|
||||
: top(-1), openedUL(false), closedUL(false) {
|
||||
push(root);
|
||||
}
|
||||
inline void push(const DirectoryNode* n) {
|
||||
if (top + 1 < (int)MAX_DEPTH) {
|
||||
++top;
|
||||
stack[top] = { n, 0, 0, false };
|
||||
} else {
|
||||
// Depth exceeded: stop descending further. Listing will be truncated but safe.
|
||||
}
|
||||
}
|
||||
inline void pop() { if (top >= 0) --top; }
|
||||
inline Frame& cur() { return stack[top]; }
|
||||
|
||||
size_t next(uint8_t* out, size_t maxLen) {
|
||||
char* p = (char*)out;
|
||||
size_t remaining = maxLen;
|
||||
|
||||
auto putLiteral = [&](const char* s) {
|
||||
for (const char* q = s; *q && remaining; ++q) { *p++ = *q; --remaining; }
|
||||
return remaining != 0;
|
||||
};
|
||||
|
||||
auto putNumberLiOpen = [&](unsigned id) {
|
||||
int n = snprintf(p, remaining, "<li data-id='%u'>", id);
|
||||
if (n <= 0) return false;
|
||||
if ((size_t)n > remaining) { p += remaining; remaining = 0; return false; }
|
||||
p += n; remaining -= (size_t)n; return remaining != 0;
|
||||
};
|
||||
|
||||
auto putNumberDirHeaderOpen = [&](unsigned id) {
|
||||
int n = snprintf(p, remaining, "<li data-id='%u'><b>", id);
|
||||
if (n <= 0) return false;
|
||||
if ((size_t)n > remaining) { p += remaining; remaining = 0; return false; }
|
||||
p += n; remaining -= (size_t)n; return remaining != 0;
|
||||
};
|
||||
|
||||
auto putStrUnsafe = [&](const String& s) {
|
||||
// Follow existing behavior: raw text (no escaping)
|
||||
for (size_t i = 0; i < s.length() && remaining; ++i) { *p++ = s[i]; --remaining; }
|
||||
return remaining != 0;
|
||||
};
|
||||
|
||||
if (!openedUL) {
|
||||
putLiteral("<ul>\n");
|
||||
openedUL = true;
|
||||
if (remaining == 0) return maxLen - remaining;
|
||||
}
|
||||
|
||||
while (remaining && top >= 0) {
|
||||
Frame &fr = cur();
|
||||
const DirectoryNode* node = fr.node;
|
||||
|
||||
// Emit directory header for non-root
|
||||
if (!fr.headerDone) {
|
||||
const String& nm = node->getName();
|
||||
if (nm != "/") {
|
||||
if (!putNumberDirHeaderOpen(node->getId())) break;
|
||||
if (!putStrUnsafe(nm)) break;
|
||||
if (!putLiteral("</b></li>\n")) break;
|
||||
}
|
||||
fr.headerDone = true;
|
||||
}
|
||||
|
||||
// Emit files
|
||||
const auto& files = node->getMP3Files();
|
||||
while (remaining && fr.fileIdx < files.size()) {
|
||||
uint16_t fid = node->getFileIdAt(fr.fileIdx);
|
||||
if (!putNumberLiOpen(fid)) break;
|
||||
if (!putStrUnsafe(files[fr.fileIdx])) break;
|
||||
if (!putLiteral("</li>\n")) break;
|
||||
++fr.fileIdx;
|
||||
}
|
||||
if (remaining == 0) break;
|
||||
|
||||
// Descend into children
|
||||
const auto& children = node->getSubdirectories();
|
||||
if (fr.childIdx < children.size()) {
|
||||
const DirectoryNode* child = children[fr.childIdx++];
|
||||
push(child);
|
||||
continue;
|
||||
}
|
||||
|
||||
// Done with this node
|
||||
pop();
|
||||
}
|
||||
|
||||
if (remaining && top < 0 && !closedUL) {
|
||||
putLiteral("</ul>\n");
|
||||
closedUL = true;
|
||||
}
|
||||
|
||||
return maxLen - remaining;
|
||||
}
|
||||
};
|
||||
|
||||
struct StreamCtx { DirectoryHtmlStreamState* state; };
|
||||
auto* ctx = new StreamCtx{ new DirectoryHtmlStreamState(&rootNode) };
|
||||
auto resp = request->beginChunkedResponse(
|
||||
txt_html_charset,
|
||||
[ctx](uint8_t* buffer, size_t maxLen, size_t /*index*/) -> size_t {
|
||||
// Generate next chunk; return 0 when done, and free state
|
||||
size_t n = ctx->state ? ctx->state->next(buffer, maxLen) : 0;
|
||||
if (n == 0 && ctx->state) { delete ctx->state; ctx->state = nullptr; }
|
||||
return n;
|
||||
}
|
||||
);
|
||||
#ifdef DEBUG
|
||||
Serial.printf("Serving /directory (chunked) heap=%u webreq_cnt=%u numOfFiles=%u\n", (unsigned)xPortGetFreeHeapSize(), (unsigned)webreq_cnt, rootNode.getNumOfFiles());
|
||||
#endif
|
||||
resp->addHeader(hdr_cache_control_key, hdr_cache_control_val);
|
||||
resp->addHeader(hdr_connection_key, hdr_connection_val);
|
||||
// Ensure cleanup after transfer completes or client aborts
|
||||
request->onDisconnect([ctx](){
|
||||
if (ctx->state) { delete ctx->state; }
|
||||
delete ctx;
|
||||
webreq_exit();
|
||||
});
|
||||
request->send(resp);
|
||||
});
|
||||
|
||||
server.on("/mapping", HTTP_GET, [](AsyncWebServerRequest *request)
|
||||
|
|
@ -1468,7 +1592,7 @@ void setup()
|
|||
volume = config.initialVolume; // Update global volume variable
|
||||
|
||||
// Optimize audio buffer size to save heap (lower = less RAM, but risk of underflow on high bitrates)
|
||||
audio.setBufferSize(8000);
|
||||
audio.setBufferSize(8192);
|
||||
|
||||
Serial.println(F("Audio init"));
|
||||
|
||||
|
|
|
|||
|
|
@ -215,4 +215,6 @@ bool folderModeActive = false;
|
|||
bool pendingSeek = false;
|
||||
uint32_t pendingSeekSeconds = 0;
|
||||
|
||||
static const size_t MAX_DEPTH = 32;
|
||||
|
||||
#endif
|
||||
|
|
|
|||
Loading…
Reference in New Issue