SPI Transaction Host Memory Read and Write When Reach The End of Circular Buffer

Qrenz · June 13, 2020, 01:23:07 PM

Hi, let's say I want to do a Host Memory Write/Read to the RAM_CMD circular buffer.
First I pull the SPI CS (Chip Select) line low, then I send the address I'm going to write to (this is the RAM_CMD address + cmdOffset),
and then I send byte 0, byte 1, byte 2, and so on.
I keep clocking out data without pulling the SPI CS line high.
When the write reaches the end address of RAM_CMD (the end of the circular buffer), will it loop back by itself, or do I need to handle this in my program?
(That is: check when it reaches the end, pull CS high, then send the address again.)
The same goes for reading from the RAM_CMD circular buffer: will it loop back automatically when it reaches the end address?

Thanks

Rudolph · June 15, 2020, 10:31:49 AM

Yes, the command FIFO will automatically loop around.

darkjezter · June 15, 2020, 03:27:39 PM

The RAM_CMD circular buffer will automatically wrap around IF you fill it by writing to REG_CMDB_WRITE.

If instead you write directly to the FIFO by writing to RAM_CMD plus offset, then no, it will not wrap around. I'm actually using this technique to fill the command FIFO with self-modifying code to ensure that the code is sent before it begins self-modification.

Once you've written the data to the FIFO, the EVE will not begin execution until you update REG_CMD_WRITE to indicate how much of the FIFO has been filled. Thus, in the example you've given, either once the FIFO has been filled, or at safe-points along the way, you'll still need to pull CS high and initiate a write to REG_CMD_WRITE before the EVE will consume anything you've written in this way.

By far, the easiest way to write the FIFO is to use REG_CMDB_WRITE instead, which will automatically wrap around AND update REG_CMD_WRITE as you go. I can't think of any reason not to do it this way unless you are writing self-modifying code to the FIFO.

Hope this helps.

Rudolph · June 15, 2020, 05:54:56 PM

Quote from: darkjezter on June 15, 2020, 03:27:39 PM
The RAM_CMD circular buffer will automatically wrap around IF you fill it by writing to REG_CMDB_WRITE.

If instead you write directly to the FIFO by writing to RAM_CMD plus offset, then no, it will not wrap around.

This is not correct: writing to RAM_CMD + offset *does* automatically wrap around.
My library is not even using REG_CMDB_WRITE and I am using DMA to write the display lists.
This would fail pretty fast if the command FIFO did not wrap around.

The one that does not wrap around is the MEDIAFIFO.
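
For reference, here is a minimal C sketch of the 3-byte header that starts a host memory write to RAM_CMD + offset. It is not taken from any library mentioned in this thread: `eve_write_header` is a made-up name, and the RAM_CMD value is the FT81x base address as I understand it from the datasheet (FT80x uses a different base), so check it against your chip. The payload bytes follow the header while CS stays low.

```c
#include <stdint.h>

/* Assumed FT81x/BT81x base address of the command FIFO; verify for your chip. */
#define RAM_CMD 0x308000u

/* Build the 3-byte SPI header for a host memory write.
   The two top bits '10' mark a write, followed by the 22-bit address. */
static void eve_write_header(uint32_t addr, uint8_t hdr[3])
{
    hdr[0] = (uint8_t)(0x80u | ((addr >> 16) & 0x3Fu));
    hdr[1] = (uint8_t)(addr >> 8);
    hdr[2] = (uint8_t)addr;
}
```

For a write starting at RAM_CMD + 4080, for example, the header would be B0 8F F0.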

darkjezter · June 16, 2020, 03:16:49 PM

My mistake, upon reviewing my code it turns out that it's CMD_MEMCPY which does not wrap at the FIFO boundary.

Though, I'm still at a loss as to the benefit of doing so with the exception of suppressing updates to REG_CMD_WRITE.

Rudolph · June 16, 2020, 07:56:04 PM

Quote from: darkjezter on June 16, 2020, 03:16:49 PM
Though, I'm still at a loss as to the benefit of doing so with the exception of suppressing updates to REG_CMD_WRITE.

There is no benefit in writing to RAM_CMD + offset over writing to REG_CMDB_WRITE.
Well okay, it works on FT80x as well.

But what is REG_CMDB_WRITE actually meant to be used for?
In AN_390 we find this:
"To offload work from the MCU for checking the free space in the circular buffer, the FT81x offers
two auxiliary registers “REG_CMDB_SPACE” and “REG_CMDB_WRITE” for bulk transfers. It
enables the MCU to write commands and data to the co-processor in a bulk transfer, without
computing the free space in the circular buffer and increasing the address. As long as the amount
of data to be transferred is less than the value in the register “REG_CMDB_SPACE”, the MCU is
able to safely write all the data to “REG_CMDB_WRITE” in one write transfer."
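
A rough sketch of how the quoted rule could be applied when sending a long buffer in pieces. The helper name `cmdb_chunk_size` is hypothetical, and the sketch takes the "less than" wording literally, so it always stays 4 bytes below the reported free space. It also assumes the remaining byte count has already been padded to a multiple of 4.

```c
#include <stdint.h>

/* Largest chunk (in bytes) to write to REG_CMDB_WRITE in one transfer,
   given the value just read from REG_CMDB_SPACE.
   Stays strictly below the reported free space and keeps 4-byte alignment.
   'remaining' is assumed to be a multiple of 4. */
static uint32_t cmdb_chunk_size(uint32_t space, uint32_t remaining)
{
    uint32_t limit = (space >= 4u) ? (space - 4u) : 0u;
    uint32_t chunk = (remaining < limit) ? remaining : limit;
    return chunk & ~3u; /* round down to a multiple of 4 */
}
```

A sender would read REG_CMDB_SPACE, write that many bytes to REG_CMDB_WRITE, and repeat until nothing remains. A result of 0 means: wait and read the register again.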

Well, when I started to play with EVE back in December 2015 I was using a VM800B35A with a FT800.
So no REG_CMDB_SPACE.
But I also did not have the issue that led to implementing REG_CMDB_WRITE. I never had to "offload work from the MCU for checking the free space" as my library never checked the free space.
That is a concept that I decided very early on not to copy from the original FTDI driver.
I still remember that I was very surprised when I found out that the FTDI driver was reading REG_CMD_READ and REG_CMD_WRITE on every single command to calculate the free space in the FIFO.

My library still does not calculate the free space in the FIFO and I am also not using REG_CMDB_SPACE.
My EVE_busy() function only reads REG_CMD_READ and compares it with my own "cmdOffset" variable, which all my commands update and which gets written to REG_CMD_WRITE.
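
The comparison itself is small. A minimal sketch of the idea, not the actual EVE_busy() from the library: `fifo_busy` is a hypothetical name, and the value of REG_CMD_READ is passed in so that the SPI read stays outside the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* reg_cmd_read: value just read from REG_CMD_READ.
   cmd_offset:   host-side write offset, i.e. what was last written
                 to REG_CMD_WRITE.
   The co-processor is still working while its read pointer has not
   caught up with the write offset. */
static bool fifo_busy(uint16_t reg_cmd_read, uint16_t cmd_offset)
{
    return reg_cmd_read != cmd_offset;
}
```

No free-space calculation is involved: the host only waits until the FIFO has been drained before sending the next list.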

And this works just fine since the FIFO is always empty anyways.

At least when writing a chunk of data of <4k for a new display list every 20ms.

So my library already has very low overhead and it supports FT80x as well, so I never really had to change
it to use REG_CMDB_WRITE. I never had the problem that led to implementing it.
And even less so since I started to use DMA to transfer the display list to the cmd FIFO.

I plan to drop FT80x support with 5.0 of my library and this will happen when I implement support for BT817/BT818.
Switching over to REG_CMDB_WRITE then should save a few additional clock cycles per command, like <10 or so.
Hmm, the longer I think about this the more I know that I need to test this. :-)

What could potentially benefit from REG_CMDB_WRITE and REG_CMDB_SPACE would be CMD_LOADIMAGE.

2k at 8MHz is ~3ms (calculated with 12 bits per byte to account for the pauses between bytes).
I did a test a while back with a FT813, and decoding a 3867-byte .png took 53ms.
Decoding a .jpg with 3903 bytes took 480µs.

So sending a 64k .jpg in 2k chunks would take about 96ms,
and the FIFO would always be empty again by the time the next packet is sent.
At a theoretical 80MHz for the SPI the transfer would take less time than the decoding.
I am sending the data in chunks of 3840 bytes now, so 64k takes 17 chunks and a few bytes.
With about ~5.8ms for 3840 bytes and 17 times waiting for 0.5ms, this is about 107ms.

So that would shave off about 10ms, and not even with 10 images of 64k would this make a noticeable difference.
At least not when you only load the images once when starting the program.
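
The arithmetic above can be reproduced with a small C sketch. The function names are made up for illustration. The 12 bits per byte and the roughly 480µs decode time per chunk are the figures from this post, and the sketch assumes the host waits for the decode after every chunk.

```c
#include <stdint.h>

/* Time in microseconds to clock 'bytes' over SPI at 'hz',
   counting 12 bit-times per byte to allow for gaps between bytes. */
static uint64_t spi_transfer_us(uint32_t bytes, uint32_t hz)
{
    return ((uint64_t)bytes * 12u * 1000000u) / hz;
}

/* Total time to send 'total' bytes in chunks of 'chunk' bytes,
   waiting 'decode_us' after every chunk (including a partial last one). */
static uint64_t load_time_us(uint32_t total, uint32_t chunk,
                             uint32_t hz, uint32_t decode_us)
{
    uint32_t chunks = (total + chunk - 1u) / chunk; /* round up */
    return spi_transfer_us(total, hz) + (uint64_t)chunks * decode_us;
}
```

With these assumptions, `spi_transfer_us(2048, 8000000)` gives 3072µs, and `load_time_us(65536, 3840, 8000000, 480)` gives 106944µs, which matches the "about 107ms" estimate.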

I wanted to calculate that for .png as well but decided after a couple of steps to not do it fully. :-)
Using .png is not really a valid option in the first place, even less so since we got ASTC.

PhilipJ · June 18, 2020, 10:17:00 AM

Hi
thanks for this info.
I was unaware that I could let the EVE take care of the wraparound for the RAM_CMD buffer. Currently my library functions write to RAM_CMD + offset, and my code increments offset with Modulo 4096 to do the wrap-around.

i.e. offset = (offset + 4) % 4096;

I had been trying to work out how I could build a Command List in memory and then use a DMA channel to transfer the buffer to the EVE via the SPI interface. I couldn't work out how the DMA would deal with the Modulo 4096 on the address it was writing to but with what you guys have described above, it isn't necessary. The DMA would just write to REG_CMDB_WRITE each time!!

Thanks
PhilipJ

Rudolph · June 18, 2020, 07:14:36 PM

Quote from: PhilipJ on June 18, 2020, 10:17:00 AM
I couldn't work out how the DMA would deal with the Modulo 4096 on the address it was writing to but with what you guys have described above, it isn't necessary. The DMA would just write to REG_CMDB_WRITE each time!!

Yes, the DMA does not have to deal with the Modulo 4096.
Yes, using REG_CMDB_WRITE for this works if you do not use a FT800/FT801.

But you still can write for example 1024 bytes of data with the DMA to RAM_CMD + 4080, this just works as the FIFO wraps around by itself.
The difference is that without REG_CMDB_WRITE you need to keep track of the offset in your program and make
it wrap around to follow the FIFO. All my commands just do this now since, DMA or not, REG_CMD_WRITE needs to be written after the command or the buffer is sent.

But yes, using REG_CMDB_WRITE instead is not only easier but also a little faster, and it takes less program memory since you do not have to keep track of the offset anymore.
I have tested this now and this really is another good reason to give up FT80x support.

One nice thing: at the moment, after transmitting the DMA buffer, my end-of-DMA interrupt needs
to set REG_CMD_WRITE to make EVE start processing the buffer.
With REG_CMDB_WRITE the interrupt only raises CS and clears my DMA-busy flag.
