Graphics engine freezing

Maxzillian:
I have a somewhat random problem where the graphics engine appears to be freezing. Before generating and swapping a new display list I'm checking REG_DLSWAP, REG_ID and REG_CMD_READ to monitor for errors and reset as needed. Something I'm getting pretty often at this point is that REG_DLSWAP is getting stuck with a value of 2 on a particular page in my UI. When this happens the display goes black or blank.

This page isn't terribly special. The behavior appears to revolve around a command that draws a 55-character string of text with a few line breaks. Nothing too crazy, but I have seemingly narrowed the behavior down to this particular string.

Anyway, per the documentation I'm giving the display some time to recover and polling REG_DLSWAP, but after about half a second I've got it programmed to time out and just do a hard reset of the EVE IC (pulling PD low to reset and then running all the same code that is run at boot). At no time am I getting a co-processor fault, and I've checked REG_CMD_READ and REG_CMD_WRITE to verify their positions are equal when this occurs.
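
A minimal sketch of the poll-with-timeout step described above, not the poster's actual code. The hooks `read_dlswap_fn` and `delay_ms_fn`, the `fake_eve` struct and the function names are all hypothetical; a real build would wrap the SPI read of REG_DLSWAP and an RTOS delay. The only fact taken from the thread is that a non-zero REG_DLSWAP (here stuck at 2) means the swap has not completed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DLSWAP_DONE 0u /* REG_DLSWAP reads 0 once the swap has completed */

/* Hypothetical hooks: real ones would wrap the SPI read and an RTOS delay. */
typedef uint8_t (*read_dlswap_fn)(void *ctx);
typedef void (*delay_ms_fn)(void *ctx, uint32_t ms);

/* Returns true once REG_DLSWAP reads 0, false if timeout_ms elapses first. */
bool wait_dlswap_done(read_dlswap_fn read_dlswap, delay_ms_fn delay_ms,
                      void *ctx, uint32_t timeout_ms, uint32_t poll_ms)
{
    assert(read_dlswap != NULL && delay_ms != NULL && poll_ms > 0);

    uint32_t waited_ms = 0;
    for (;;) {
        if (read_dlswap(ctx) == DLSWAP_DONE) {
            return true;
        }
        if (waited_ms >= timeout_ms) {
            return false; /* caller decides how to recover, e.g. a hard reset */
        }
        delay_ms(ctx, poll_ms);
        waited_ms += poll_ms;
    }
}

/* Fake register source, used only to exercise the loop without hardware. */
typedef struct {
    uint32_t polls_until_done; /* reads that return 2 before 0 is returned */
    uint32_t polls;            /* number of reads performed so far */
} fake_eve;

uint8_t fake_read_dlswap(void *ctx)
{
    fake_eve *e = ctx;
    e->polls++;
    return (e->polls > e->polls_until_done) ? 0u : 2u;
}

void fake_delay_ms(void *ctx, uint32_t ms)
{
    (void)ctx;
    (void)ms; /* no real waiting in the fake */
}
```

With a 500 ms timeout and a 10 ms poll interval, a permanently stuck register is read 51 times before the function gives up, which bounds the extra SPI traffic the check adds on a shared bus.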

Unfortunately this is a real bear to replicate as it can go a few minutes between failures and sometimes as much as half an hour. I half wonder if I'm getting some errant data on the SPI bus, but I would expect to see some co-processor faults if that were the case.

What should I be looking for to delve into this further? Is there a more graceful way to recover from this than resetting the whole EVE IC?

Rudolph:
What EVE chip with which library on which controller, connected how, at what SPI clock?
Have you tried a different software, a different hardware, lowered the SPI clock?

Why are you reading REG_DLSWAP in the first place?
In order to update as fast as possible?

What does reading from REG_CMD_READ return when REG_DLSWAP "hangs"?

If you are using a BT81x, what is stored in RAM_ERR_REPORT when such an event occurs?

I never encountered this - or at least I did not notice. :-)
But then I am never reading REG_DLSWAP and only indirectly have it written to by using CMD_SWAP.

When anything goes wrong when using a new display module, the first thing I do is to get a SPI trace with the logic analyzer.

BRT Community:
Hello,

Would you be able to provide the display list for the page that this occurs on?

Best Regards,
BRT Community

Maxzillian:

--- Quote from: Rudolph on December 05, 2023, 05:50:45 PM ---What EVE chip with which library on which controller, connected how, at what SPI clock?
Have you tried a different software, a different hardware, lowered the SPI clock?
--- End quote ---

BT816Q with an ESP32-WROOM-32E connected over VSPI at a 26 MHz clock. We're using a modified version of your library and the ESP-IDF. I haven't tried this on different hardware yet, and I have considered lowering the clock. Occasionally I do see co-processor fault warnings that suggest the data bits aren't always correct and we're getting some corruption in that regard; the SPI bus is a bit long, and when looking at it on a scope the rise and fall time of the data is not great for the speed we're running. It's definitely a good idea to consider lowering the rate.

--- Quote ---Why are you reading REG_DLSWAP in the first place?
In order to update as fast as possible?
--- End quote ---

Update rate is locked in at 20 Hz using an RTOS task. During static discharge testing we observed that we could get the screen to freeze when corrupted SPI communication led to co-processor faults, so some fault checking was added to the code; it runs before the display list is reconstructed and swapped. In all we check:

-REG_ID for detecting if the EVE IC is communicating over SPI
-REG_DLSWAP to check for the graphics engine hanging
-REG_CMD_READ to check for any co-processor faults

Depending on what happens we generally perform a hard reset of the EVE IC to get the display working again. The downside is this creates a visible blackout of the screen as the EVE IC is reinitialized and images are decompressed into RAM_G.

--- Quote ---What does reading from REG_CMD_READ return when REG_DLSWAP "hangs"?
--- End quote ---

I'm glad you mentioned this because I realized that we're checking the state of REG_DLSWAP before checking REG_CMD_READ. The code is written such that if REG_DLSWAP isn't 0, it handles that case and returns before REG_CMD_READ is ever read. So in all reality I may have been getting co-processor faults this whole time without realizing it. I'm currently running a test to check this and plan to change the order of operations in the code to resolve this oversight.
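
One way the reordered check could look, as a rough sketch rather than the code actually used here. The type and function names (`eve_snapshot`, `eve_health`, `eve_classify`) are made up for illustration, and the three register values are assumed to have been read over SPI already. It also assumes the usual EVE conventions that REG_ID reads 0x7C and that REG_CMD_READ reads 0xFFF after a co-processor fault; check both against the programming guide for the chip in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EVE_CHIP_ID     0x7Cu  /* assumed expected REG_ID value */
#define EVE_COPRO_FAULT 0xFFFu /* assumed REG_CMD_READ value after a fault */
#define EVE_DLSWAP_DONE 0u     /* REG_DLSWAP value once a swap has completed */

typedef enum {
    EVE_OK,
    EVE_NO_COMMS,      /* REG_ID did not read back as expected */
    EVE_COPRO_FAULTED, /* co-processor reported a fault */
    EVE_SWAP_PENDING   /* REG_DLSWAP is still non-zero */
} eve_health;

/* Hypothetical snapshot of the three registers, read before each frame. */
typedef struct {
    uint8_t  reg_id;
    uint16_t reg_cmd_read;
    uint8_t  reg_dlswap;
} eve_snapshot;

eve_health eve_classify(const eve_snapshot *s)
{
    assert(s != NULL);

    if (s->reg_id != EVE_CHIP_ID) {
        return EVE_NO_COMMS;
    }
    /* Checked before REG_DLSWAP so a pending swap cannot mask a fault. */
    if ((s->reg_cmd_read & 0xFFFu) == EVE_COPRO_FAULT) {
        return EVE_COPRO_FAULTED;
    }
    if (s->reg_dlswap != EVE_DLSWAP_DONE) {
        return EVE_SWAP_PENDING;
    }
    return EVE_OK;
}
```

Returning a single classification instead of returning early from the first failed check makes the priority order explicit, so a stuck REG_DLSWAP and a co-processor fault occurring together are reported as the fault.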

Edit: I finally reproduced the problem with a build that has the check-order bug fixed. I still managed to get a graphics engine freeze; at the time REG_CMD_READ reported a value of 0x560 and the co-processor fault detection did not get triggered. At different times during this test I did get one random co-processor error (overflow in cmd_append()), so there is probably still some data corruption going on with the SPI communication.

--- Quote ---If you are using a BT81x, what is stored in RAM_ERR_REPORT when such an event occurs?
--- End quote ---

Refer to above.

--- Quote ---I never encountered this - or at least I did not notice. :-)
But then I am never reading REG_DLSWAP and only indirectly have it written to by using CMD_SWAP.

When anything goes wrong when using a new display module, the first thing I do is to get a SPI trace with the logic analyzer.

--- End quote ---

More than likely I've been getting co-processor faults and not realizing it because of the bug in our error checking. I do need to invest in a good logic analyzer as it's too difficult to pick out the data using our scope even though it does have a decode function; doesn't help that we also have two CAN controllers on the same SPI bus as the display so there's a fair bit of traffic going around.

So ultimately I need to wait for the issue to pop up again, but I'm pretty certain I'm getting a co-processor fault and that has been masked by how my error/fault checking is performed. Long term I probably just need to reduce the clock rate to improve reliability and we're also working towards moving the majority of our images to the display flash so the initialization time won't be as long.

--- Quote from: BRT Community on December 07, 2023, 10:54:45 AM ---Hello,

Would you be able to provide the display list for the page that this occurs on?

Best Regards,
BRT Community

--- End quote ---

What format would you like that to be in? Do you just want some pseudo code, or would I be better served dumping the raw data of the actual display list? Although based on the conversation above I'm pretty sure there's nothing amiss in the display list.

BRT Community:
Hello,

The raw data contained in RAM_DL at the time the issue occurs would be preferable, as this would allow us to see exactly what EVE is trying to render.
It would also allow us to verify if there are any discrepancies between the expected data in RAM_DL and what is actually in RAM_DL.

There are a couple of things that may cause REG_DLSWAP to freeze on the value of 2:

* The display list itself contains an infinite loop (I think unlikely in this case), which could be caused by a JUMP/CALL instruction.
* The display list contains no DISPLAY instruction (this is likely if there is an issue with the RAM_DL write).

Best Regards,
BRT Community
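
The two causes listed above can be screened for in a RAM_DL dump with a rough linear scan such as the sketch below. The names `dl_report` and `dl_scan` are hypothetical, and the opcode values (DISPLAY 0x00, CALL 0x1D, JUMP 0x1E in the top byte of each 32-bit word) are an assumption taken from the BT81x programming guide that should be verified there. The scan does not follow jumps, so it only flags lists worth inspecting by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding: the opcode is the top byte of each display list word. */
#define DL_OPCODE(word) ((uint8_t)((word) >> 24))
#define DL_OP_DISPLAY 0x00u
#define DL_OP_CALL    0x1Du
#define DL_OP_JUMP    0x1Eu

typedef struct {
    long   first_display;        /* word index of first DISPLAY, -1 if none */
    size_t jumps_before_display; /* JUMP words seen before that point */
    size_t calls_before_display; /* CALL words seen before that point */
} dl_report;

/* Linear scan of a RAM_DL dump; it does not trace JUMP/CALL targets. */
dl_report dl_scan(const uint32_t *words, size_t count)
{
    assert(words != NULL || count == 0);

    dl_report r = { -1, 0, 0 };
    for (size_t i = 0; i < count; i++) {
        switch (DL_OPCODE(words[i])) {
        case DL_OP_DISPLAY:
            r.first_display = (long)i;
            return r; /* stop at the first terminator found */
        case DL_OP_JUMP:
            r.jumps_before_display++;
            break;
        case DL_OP_CALL:
            r.calls_before_display++;
            break;
        default:
            break;
        }
    }
    return r;
}
```

A result of `first_display == -1` points at the missing-DISPLAY case, while a non-zero jump or call count marks a list where a loop is at least possible. Note that a zeroed word also decodes as DISPLAY under this encoding, so a dump that ends early in zeros deserves a second look.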
