1. Is there a need for some delay within the CMD_FLASHWRITE command?
Yes, if you exceed the size of the CMD-FIFO.
2. What does the BT815 do exactly? Does it simply pipe the bytes into the flash
directly? Or does it store the bytes into a buffer that has to be flushed?
I don't know.
However, assuming a best-case scenario in which the BT815 just writes out the data
immediately to the flash-chip, it has to do so in chunks of 256 bytes since that is
the page-size the chips are working with.
Transferring the data from the BT815 to the flash-chip should take maybe 40µs.
At a 30MHz SPI clock with single SPI, transferring 256 bytes takes about 70µs.
So the transfer time alone is negligible.
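As a rough check of the 70µs figure: the transfer time is the number of bits divided by the clock rate and the number of data lines. This is a minimal sketch in Python for brevity. The function name and parameters are made up for illustration, and command and address overhead is ignored.

```python
def spi_transfer_time_s(num_bytes, clock_hz, data_lines=1):
    """Time to shift num_bytes over SPI, ignoring command and address overhead."""
    bits = num_bytes * 8
    return bits / (clock_hz * data_lines)


# One 256-byte page over single SPI at 30MHz, in microseconds.
page_us = spi_transfer_time_s(256, 30e6) * 1e6
print(f"{page_us:.1f}us")
```

This comes out at roughly 68µs, in line with the estimate above.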
But if you check the datasheet of, for example, the W25Q64JVSSIQ, you will find
that the "Page Program Time" is typically 0.4ms and has a maximum of 3ms.
0.4ms / 8 / 256 = 195ns per bit, 1 / 195ns = 5MHz
3ms / 8 / 256 = 1.5µs per bit, 1 / 1.5µs = 666kHz
Sure, this is simplified.
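The two calculations above can be written out as a small sketch. It computes the SPI clock at which the host delivers one page exactly as fast as the flash can program it, under the same simplification (no command overhead, no buffering). The function name is hypothetical.

```python
PAGE_SIZE = 256  # bytes per flash page


def break_even_clock_hz(page_program_time_s, page_size=PAGE_SIZE):
    """Single-SPI clock at which one page arrives in exactly one page program time."""
    time_per_bit_s = page_program_time_s / (page_size * 8)
    return 1.0 / time_per_bit_s


typical_hz = break_even_clock_hz(0.4e-3)  # typical page program time from above
worst_hz = break_even_clock_hz(3e-3)      # maximum page program time from above
print(f"typical: {typical_hz / 1e6:.2f}MHz, worst case: {worst_hz / 1e3:.0f}kHz")
```

Without rounding the intermediate values this gives about 5.1MHz and 683kHz. The 666kHz above comes from rounding the bit time up to 1.5µs.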
But even if the data is copied with no delay whatsoever, writing a page still takes time,
so the CMD-FIFO will overrun when you feed it several 4k chunks back to back without waiting at SPI clocks above 5MHz.
Sure, there may be some buffering involved, but it is highly unlikely that the command co-processor
can use some additional xxx kB of undocumented SRAM, since embedded SRAM is expensive.
So it may work at slow SPI speeds, but the above does not account for additional overhead:
the flash needs to be unlocked for writing and the pages need to be addressed.
3. Is there any limit of the data buffer length of CMD_FLASHWRITE when doing
it in a single DMA burst?
3840 bytes
The data must be a multiple of 256 bytes, and 15 * 256 = 3840 is the largest multiple that still fits into the 4k CMD-FIFO together with the command itself.
After that EVE needs some time to clear out the CMD-FIFO.
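A minimal sketch of how a larger image could be split up accordingly. `send_flashwrite` and `wait_fifo_empty` are hypothetical callbacks standing in for whatever your own code uses to issue CMD_FLASHWRITE and to wait until the CMD-FIFO is empty again; they are not functions of any particular library. Padding the last chunk with 0xFF and rejecting unaligned destination addresses are assumptions on my part.

```python
PAGE_SIZE = 256
MAX_CHUNK = 15 * PAGE_SIZE  # 3840 bytes per CMD_FLASHWRITE


def write_in_chunks(dest, data, send_flashwrite, wait_fifo_empty):
    """Send data in chunks of at most 3840 bytes, waiting after each one."""
    if dest % PAGE_SIZE:
        # Assumption: the destination has to start on a page boundary.
        raise ValueError("destination must be page-aligned")
    # Pad to a whole number of pages (assumption: 0xFF, the erased state).
    padding = -len(data) % PAGE_SIZE
    data = bytes(data) + b"\xff" * padding
    for offset in range(0, len(data), MAX_CHUNK):
        chunk = data[offset:offset + MAX_CHUNK]
        send_flashwrite(dest + offset, chunk)
        wait_fifo_empty()  # let EVE clear out the CMD-FIFO first
    return len(data)
```

For 5000 bytes of input this issues one write of 3840 bytes and one of 1280 bytes, the second including 120 bytes of padding.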
You could try reading back the free space in the CMD-FIFO directly after a block is transferred,
just to get an idea of how long it takes to process the data.
But this may yield different results at a different temperature, and a different chip has different timing.
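One way to take that measurement, sketched under assumptions: `read_fifo_space` is a hypothetical callback that returns the free space of the CMD-FIFO in bytes, for example by reading REG_CMDB_SPACE, which as far as I know reports 4092 when the FIFO is completely empty. The timeout value is arbitrary.

```python
import time

FIFO_EMPTY = 4092  # assumed free space reported when the CMD-FIFO is empty


def measure_drain_time_s(read_fifo_space, timeout_s=1.0):
    """Poll the free space until the FIFO is empty; return the elapsed seconds."""
    start = time.perf_counter()
    while read_fifo_space() < FIFO_EMPTY:
        if time.perf_counter() - start > timeout_s:
            raise TimeoutError("CMD-FIFO did not drain in time")
    return time.perf_counter() - start
```

Call it directly after the last byte of a chunk has been sent. The result includes the time spent on the polling reads themselves, so it is only a rough figure.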
I just read that the page programming time depends on the data; the typical value is for 50% zero bits.
Page program time (256 bytes)
MT25QL128ABA1ESE-0SIT 0.12ms to 1.8ms
MX25L12872FM2I-10G 0.33ms to 1.2ms
GD25Q127CSIGR 0.5 to 2.4ms
You could use CMD_FLASHUPDATE instead if you have enough free space in RAM_G.