- Turns out I had a nasty bug that was masked by using my own RX to loopback the TX. Since the new RX is very benign with a programmable fifo full flag the timing is quite relaxed.
- The legacy elink for e16 has a strict wait policy. When wait is raised high, you must stop pretty much immediately.
- I struggled with testing this bug on the parallella for 2 days.
- Putting together the test environment uncovered the bug in a couple of hours. F**K, I should know better!!
- The axi slave can never drive enough reads to saturate the maxi fifo since it's only sending out one read at a time.
- Changing the system so that a raw elink sits in front of stimulus..
-Need to get into this again! (don't like this part of code still..)
-One lesson, if you are unsure of something leave the old code in comment...can save a lot of time.
- Turns out there was a bug hidden in the emaxi that can only be found by properly driving a master device with reads. This could not happen in the old environment.
- Note that due to limitations in the esaxi, I had to add the etx_fifo block as an interface (simplest).
- The ESAXI is very limited in that it MUST interface to a fifo with spare entries. (so prog_full). This should be FIXED!
- Minimal test passes, now to try to reproduce the DMA bug..
- Clearing the "done" register with tx_burst. Kind of makes sense logically since while we are in burst mode we are not done.
- Still not 100% happy with this circuit, but there arent' a lot of lines of code left...
- But elink now passes 500 random burst transactions!!!
- Adding transaction counter to speed up debugging
- Clearing access signal on wait ("bubble")
- Adding back special propagation when there is a wait after io_wait.
- This is a pain in the ass and should never have been implemented in the first place!
- Burst information is contained in two places, once in the first byte being transmitted and once by the frame staying high
- This was done because there was a second special bursting mode where data is streamed into the same address, so bit[2] becomes a "command bit".
-Solved a speed path in synchronizing the wait signal, had to use the first edge signal fo the IO and the lclk_div4 for the core logic. It seems that the FPGA has a really hard time mixing clock domains, the routing delay between domains explodes
-Put in some special case logic for edge cases, like when there is a wait coming in from the IO and there is a wait from the IO. In that case, the packet gets sampled by the IO and not by the current logic.
-This needs to be cleaned up eventually, not clean enough but it's good enough for now.
- The burst signal was going fro lclk_div4 domain straight into the io high speed domain. There is quite a bit of logic on this signal. Instead of starting with false paths or multi cycle paths with firstedge, I changed the pipeline.
- The logic was a mess, causing me to go around in circles for days. In the end, by adding a missing sync circuit (duh!) between the fast and slow clock to align the edges and removing a redundant pipeline stage ("double") the nasty logic just fell away. Looks good now.
-Write bursts mostly works and design looks clean.
-one bug left to fix on streams of writes...
- The burst signal needs to be pipelined like everything else (0th order..)
- Don't look at write signal when pushing back wait...WILL GO BACK AND REVISIT THIS ONE LATER.
- Yeah, burst write test now passes!!!!
- Old design was not workable with bursting and long waits. The wait signal needs to be very carfully handled since it's asynchronous to the clock.
-The TX needs to be stopped quickly so the sync needs to be done at the high speed clock, not at div4 clock
-Since there are synchronizers here, there should be only one point of sync. This is not completely the case still, but I think??? it should be safe by constructiona at this point.
-bursting working at this point for writes!!!!!