Hey Rudolph, first off I'd like to thank you for engaging on whether this trick is necessary or even useful.
The main problem this is meant to solve is prediction of the size of a snippet produced by a series of co-processor commands, or in my case, by higher-level API calls which make use of widgets plus our own display elements. In the source code you linked earlier, you're building a snippet once during initialization, and getting the size upon completion. This works well enough for static snippets, but generalizing this approach to snippets that are updated between frames requires synchronization between the EVE and MCU.
Writing the snippets directly to RAM_G as you mentioned would avoid the problem of RAM_DL contention, but as you point out, it would introduce a new problem of ensuring that the snippets are not modified during the CMD_APPEND calls. That problem is easily mitigated by double-buffering the snippets on update. Moot point though, as this is not an option provided by the co-processor.
Pre-calculating multiple states of snippets, and appending the desired one based on MCU state, is the closest to what I set out to accomplish. The main issue here is that the API I'm building has more than two states per button, as we also allow non-touch navigation using a d-pad. With a highlight state added to support the d-pad, the number of per-button states increases from two to four, and quickly becomes impractical should other states be added. In our use-case, we also use different colors/blinking to indicate other states through the appearance of the buttons.
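To put a number on how quickly this grows, here is a minimal sketch that multiplies out the state axes. The function name and the extra axes in the usage note (three colors, blink on/off) are made up for illustration; only the touch and highlight axes come from the description above.

```c
#include <stddef.h>

/* Hypothetical helper: number of snippets needed per button if every
 * combination of independent state axes is pre-rendered.
 * axis_sizes[i] is the number of values the i-th axis can take. */
static size_t prerendered_variants(const size_t *axis_sizes, size_t axis_count)
{
    size_t total = 1;
    for (size_t i = 0; i < axis_count; ++i) {
        total *= axis_sizes[i];
    }
    return total;
}
```

With touch (2) and highlight (2) this gives 4 variants per button; adding, say, an assumed 3-color axis and a blink axis (2) would already require 24.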
Re-issuing all the draw commands each frame where the screen is updated is probably the safest workaround. I'll admit, the main reason I didn't pursue this method is that we have some screens where very little changes per frame, and others where most of the screen content is dynamic (mostly text, but not all). Again, the goal is to provide a consistent API that handles these two cases in a uniform way. Also, in the case of text, the most SPI traffic efficient way to update a large number of text fields is using CMD_TEXT, provided the text strings are more than a few characters. Obviously, this is no longer true if the screen content is static and can be loaded to flash ahead of time.
With regard to allocating enough space for worst case size, that's exactly what I'm doing. But due to the desire to save these snippets to RAM, and the structure of the append command, the actual size of the snippet is still required. It is required both for the copy to RAM_DL, and for the corresponding CMD_APPEND.
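As a rough sketch of why both numbers are needed, the bookkeeping could look like the following. The struct and function names are hypothetical and not taken from any existing library; the CMD_APPEND opcode value and its (ptr, num) argument layout, with num a multiple of 4, are as I understand them from the EVE programming guide, so please double-check them against your device's documentation.

```c
#include <stddef.h>
#include <stdint.h>

/* Co-processor opcode for CMD_APPEND (assumed from the programming guide). */
#define CMD_APPEND UINT32_C(0xFFFFFF1E)

/* Hypothetical bookkeeping for one snippet slot in RAM_G. */
typedef struct {
    uint32_t ram_g_addr; /* start of the region reserved for this snippet */
    uint32_t capacity;   /* worst-case size reserved, in bytes */
    uint32_t size;       /* actual size of the current snippet, in bytes */
} snippet_slot;

/* Record the measured snippet size. Rejects sizes that exceed the reserved
 * region or are not a multiple of 4 (display list entries are 32-bit words).
 * Returns 0 on success, -1 on rejection (slot left unchanged). */
static int snippet_set_size(snippet_slot *slot, uint32_t measured_bytes)
{
    if (measured_bytes > slot->capacity || (measured_bytes & 3u) != 0u) {
        return -1;
    }
    slot->size = measured_bytes;
    return 0;
}

/* Fill in the three 32-bit words of CMD_APPEND(ptr, num) for this slot.
 * Note that num is the actual size, not the reserved capacity.
 * Returns the number of words written. */
static size_t encode_cmd_append(const snippet_slot *slot, uint32_t out[3])
{
    out[0] = CMD_APPEND;
    out[1] = slot->ram_g_addr;
    out[2] = slot->size;
    return 3;
}
```

The capacity only decides where the next slot starts; the command words always carry the actual size, which is why it has to be known after every update.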
The primary advantage of this approach is the number of snippets can be arbitrarily large, with no impact on the amount of SPI traffic. In essence, it allows arbitrary granularity on how many elements are updated as a batch or snippet. Managing worst case performance can then be tackled by splitting larger snippets into smaller ones, updated less often. The commands sent by the MCU do not change regardless of how many snippets are created.
Perhaps your constraints aren't the same as the ones I'm managing in my own work?
For what it's worth, developing this trick and dealing with the FIFO wraparound didn't take much time at all.
Cheers!