-The current testbench has a big pause between frames, whereas the chip might push out back-to-back frames with only a single-cycle pause between them. It seems possible that the old logic would have been a problem, since there were two incorrect states that took 2 cycles to settle. This would not have been a problem with bursting or with frames that have many nops between them. Let's see....
-The correct way to verify this is to 1.) Improve TX to make performance as good as on the chip (fewer stalls) 2.) Create a testbench with the chip reference code.
-In the meantime, we compile and pray...
-Mailbox is a pretty useful little block; its registers don't belong in the RX space
-Moved registers to the "MESH" group block at bits [10:8].
-Feel good about this, should not change...
-Has been tested to work with test/test_regs.emf
-For new register address, see README.md
cc @olajep @peteasa
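
For reference, the group field at bits [10:8] can be pulled out of a register address with a shift and a mask. This is only a minimal Python sketch: the names `GROUP_LSB`, `GROUP_WIDTH` and `reg_group` are made up for illustration, and the actual code assigned to the "MESH" group is not stated here (see README.md).

```python
# Behavioral sketch only, not the RTL.
# Assumption: the group select is the 3-bit field addr[10:8], as noted above.
GROUP_LSB = 8      # hypothetical name: lowest bit of the group field
GROUP_WIDTH = 3    # hypothetical name: bits [10:8] -> 3 bits


def reg_group(addr: int) -> int:
    """Return the 3-bit group field addr[10:8] of a register address."""
    return (addr >> GROUP_LSB) & ((1 << GROUP_WIDTH) - 1)
```

For example, an address with bits [10:8] equal to `0b011` decodes to group 3, regardless of what the upper and lower address bits are.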
 - Removed the cfgif block, too confusing. There is a good lesson here. Probably the nth time that I have been overzealous about reuse. When you end up adding a parameter to a block that duplicates the logic 2X, it's always better to create two separate blocks...
- Changed the register access interface to packet format
 - Changed the priority on the etx_arbiter to pick read responses first
 - Removed redundant signals
 - Took away the read response bypass on remap in tx for now..
- Removed defparams (convention)
- Unified wait signal on tx
- Fixed cfg wait
- Bypass path was ugly! Always try to go through the same logic path as much as possible.
 - Note: when the MMU is enabled, you need to put in an entry for the read return (i.e. 810)
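
The arbiter change above amounts to a fixed-priority pick with read responses at the top. A rough Python sketch of that behavior follows. The function name and the request names `rr`, `rd`, `wr` are illustrative, and the relative order of read requests and writes below the read response is an assumption, not something this log states.

```python
# Behavioral sketch only, not the RTL.
from typing import Optional


def etx_arbitrate(rr_req: bool, rd_req: bool, wr_req: bool) -> Optional[str]:
    """Fixed-priority arbiter: grant exactly one requester per call.

    Read responses ("rr") always win. The order of the remaining two
    requesters (read request, then write) is an assumption.
    """
    if rr_req:
        return "rr"   # read response has the highest priority
    if rd_req:
        return "rd"   # assumed second
    if wr_req:
        return "wr"   # assumed last
    return None       # nothing is requesting
```

With all three requesting at once the grant goes to `"rr"`. A pending read response is therefore never stuck behind new outgoing traffic.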
 - Gating mailbox not_empty with irq_en, bit [28] of RXCFG
- Changing elink output interrupt to "or" of not_empty and full
- Adding mailbox status register (mostly for debug)
- Moving register addresses to make space for mailbox status register
- Fixing wrappers for DV
- Updating README docs with new register map
- Removing mailbox from RX status reg. Doesn't belong there, should be coupled with mailbox for modularity.
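
One way to read the two interrupt changes above, as a small Python sketch. The function and constant names are hypothetical. Whether `full` is also gated by `irq_en` is not stated in this log, so the sketch assumes it is not; check the RTL before relying on this.

```python
# Behavioral sketch only, not the RTL.
IRQ_EN_BIT = 28  # irq_en lives at bit [28] of RXCFG (from the note above)


def mailbox_irq(rxcfg: int, not_empty: bool, full: bool) -> bool:
    """Interrupt = (not_empty gated by irq_en) OR full.

    Assumption: only not_empty is gated; full passes through ungated.
    """
    irq_en = bool((rxcfg >> IRQ_EN_BIT) & 1)
    return (not_empty and irq_en) or full
```

Under this reading, with `irq_en` cleared a non-empty mailbox stays quiet, but a full mailbox still raises the interrupt.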
 - Turns out I was debugging ghosts for ~1 day today. Everything was working in simulation but nothing worked in the FPGA. Since I was only changing small logic stuff, I didn't bother checking the warning messages in Vivado. Turns out for some reason it was throwing away some logic and disconnecting all the important rr signals
 - This is where I was making changes, but I still can't figure out what exactly was happening...doesn't make sense. Either there is a bug in Icarus or in Vivado, this shouldn't happen!
 - After finding the bug in the reference model and wasting countless hours going back and forth with FPGA timing optimization and bug tweaks, I realized that the design was fundamentally broken. The decision to use two clock domains (high speed and low speed) was correct from the beginning. The FPGA is dreadfully slow (you definitely don't want to do much logic at 300MHz...), but the handoff between tclk and tclk_div4 was too complicated. The puzzle of having to respond to wait quickly, covering the corner cases, and meeting timing was just too ugly.
- The "new" design goes back to the method of using the high speed logic only for doing a "dumb" parallel to serial converter and preparing all the necessary signals in the low speed domain.
 - This feels A LOT cleaner and it already passes basic tests with the chip reference and the loopback after less than 3 hours of redesign work!
- The TX meets timing but there is still some work to do with wait pushback testing.
-Need to get into this again! (don't like this part of code still..)
-One lesson: if you are unsure of something, leave the old code in a comment...can save a lot of time.
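
To make the "dumb" parallel-to-serial idea concrete, here is a rough Python model of the split: the slow domain hands over a whole word, and the fast domain does nothing but shift bytes out. The 64-bit word size, the 8-bit lane, the two-bytes-per-fast-cycle (rising/falling edge) pairing and the MSB-first order are all assumptions made for the sketch, and `serialize_word` is a made-up name.

```python
# Behavioral sketch only, not the RTL.
from typing import List, Tuple


def serialize_word(word: int) -> List[Tuple[int, int]]:
    """Split one 64-bit word into 4 fast-clock cycles of 2 bytes each.

    Assumptions: 8-bit output lane, one byte per clock edge, most
    significant byte first. No decisions are made here; everything else
    is prepared in the slow domain before the word is handed over.
    """
    # Peel off bytes from bit 63 down to bit 0.
    bytes_msb_first = [(word >> shift) & 0xFF for shift in range(56, -8, -8)]
    # Pair them up as (rising-edge byte, falling-edge byte) per fast cycle.
    return [(bytes_msb_first[i], bytes_msb_first[i + 1]) for i in range(0, 8, 2)]
```

Four fast cycles per word lines up with a divide-by-4 slow clock, which is why those sizes were picked for the sketch.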
- Clearing the "done" register with tx_burst. Kind of makes sense logically since while we are in burst mode we are not done.
 - Still not 100% happy with this circuit, but there aren't a lot of lines of code left...
- But elink now passes 500 random burst transactions!!!
- Adding transaction counter to speed up debugging
- Clearing access signal on wait ("bubble")
- Adding back special propagation when there is a wait after io_wait.
- This is a pain in the ass and should never have been implemented in the first place!
- Burst information is contained in two places, once in the first byte being transmitted and once by the frame staying high
- This was done because there was a second special bursting mode where data is streamed into the same address, so bit[2] becomes a "command bit".