@phsauter phsauter commented Oct 28, 2025

Requires #28 #29 #21

Implements the delay-line changes needed to use Hyperbus on the Genesys 2 board without a clock divider.
Moves the outgoing clock delay line from the top level into phy_if to create a better split point for hard macros of the PHY.

phsauter and others added 14 commits October 28, 2025 16:53
Prevents performance degradation (FIFO bubbles)
in situations where the system and
PHY are running on the same clock.
According to the spec:
t_DSV (data strobe valid), the time from
CS# going low to RWDS becoming valid, can be
at most 2 clock periods long (12 ns at 166 MHz).
This shrinks the RWDS valid window down to
one period centered on CA4 (5th data transaction),
meaning it is valid around the 3rd rising edge of CK.

Problem:
With additional routing delay this may cause the
RWDS sample register (clocked by clk_i) to miss
the stable period of RWDS.

Solution:
Delaying the clock is allowed and gives RWDS more
time to arrive and creates a larger stable window.

It is possible to set this to zero to increase throughput.
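As a quick check of the numbers quoted above, the following Python sketch converts the 12 ns t_DSV limit into clock periods at 166 MHz. The function and constant names are illustrative only and do not appear in the RTL.

```python
def clock_period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def periods_spanned(duration_ns: float, freq_mhz: float) -> float:
    """How many clock periods a duration covers at the given frequency."""
    return duration_ns / clock_period_ns(freq_mhz)


# Values taken from the commit message above.
T_DSV_MAX_NS = 12.0
CK_FREQ_MHZ = 166.0

period = clock_period_ns(CK_FREQ_MHZ)                 # about 6.02 ns
spanned = periods_spanned(T_DSV_MAX_NS, CK_FREQ_MHZ)  # just under 2 periods
print(f"period = {period:.2f} ns, t_DSV max spans {spanned:.2f} periods")
```

At 166 MHz the limit therefore consumes almost exactly two CK periods, which is why only the region around the 3rd rising edge remains usable for sampling.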
For the worst case RWDS timing
(t_DSV max, t_CSS min and t_CKDS min)
the window of validity for RWDS is around
one clock period centered around the
3rd rising edge of CK.

This ensures we sample exactly then.
Other sampling may lead to improper results
(from sampling high Z) and increases the risk of metastability.

For long chip-to-chip delays (or slow pads)
it may still be necessary to increase the
CS falling edge to first CK edge time.
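The effect of lengthening the CS-to-CK time can be sketched with a deliberately simplified arrival-time model in Python. It only looks at the leading edge of the RWDS window (when RWDS becomes valid at the controller) and lumps all pad and routing delay into one number. The 3 ns and 4 ns values and all names below are assumptions chosen for illustration, not figures from the spec or this PR.

```python
def rwds_arrival_margin_ns(
    freq_mhz: float,
    cs_to_ck_ns: float,
    t_dsv_ns: float,
    extra_delay_ns: float,
    sample_edge: int = 3,
) -> float:
    """Time between RWDS becoming valid and the sampling edge.

    Simplified model (assumption): CS# falls at t = 0, the first rising
    CK edge leaves the controller at cs_to_ck_ns, and RWDS is valid at
    the controller at t_dsv_ns + extra_delay_ns. A positive result means
    RWDS is already valid when the chosen rising edge samples it.
    """
    period_ns = 1000.0 / freq_mhz
    sample_time_ns = cs_to_ck_ns + (sample_edge - 1) * period_ns
    rwds_valid_ns = t_dsv_ns + extra_delay_ns
    return sample_time_ns - rwds_valid_ns


PERIOD_NS = 1000.0 / 166.0

# Hypothetical case: short CS-to-CK time plus 4 ns of extra delay.
tight = rwds_arrival_margin_ns(166.0, cs_to_ck_ns=3.0,
                               t_dsv_ns=12.0, extra_delay_ns=4.0)

# Same case with the first CK edge pushed out by one more period.
relaxed = rwds_arrival_margin_ns(166.0, cs_to_ck_ns=3.0 + PERIOD_NS,
                                 t_dsv_ns=12.0, extra_delay_ns=4.0)

print(f"tight margin:   {tight:.2f} ns")
print(f"relaxed margin: {relaxed:.2f} ns")
```

In this model every nanosecond added between the CS falling edge and the first CK edge is gained directly as arrival margin, at the cost of a longer transaction.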
Decouples the clock domains better:
only the rwds_sample_o signal crosses
between the PHY and system clocks.
Exact sampling edge is adjustable.
We want to give the RWDS sampler as much time as possible to get a value.
So we delay the additional latency decision to the latest point possible.
The reset was not being triggered since the gated clock stops before chip select goes high.
A sticky bit driven by the ungated clock is used to indicate start of transfer.
The counter reaching the target value is used to reset the sticky bit.
Counter only counts while it is set (when the transfer starts until the target is reached).
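The sticky-bit scheme described above can be modelled cycle by cycle in Python. This is a behavioural sketch only: it uses a single clock for both the sticky bit and the counter, and the class, method, and signal names are hypothetical rather than taken from the RTL.

```python
class StickyCounter:
    """Counter gated by a sticky 'transfer active' bit.

    The sticky bit is set when a transfer starts and cleared when the
    counter reaches its target, so no external reset is needed.
    """

    def __init__(self, target: int) -> None:
        self.target = target
        self.active = False  # sticky bit
        self.count = 0

    def tick(self, start: bool) -> bool:
        """Advance one clock cycle; return True when the target is hit."""
        done = False
        if self.active:
            # Count only while the sticky bit is set.
            self.count += 1
            if self.count == self.target:
                self.active = False  # reaching the target clears the bit
                done = True
        elif start:
            # Start of transfer sets the sticky bit and restarts the count.
            self.active = True
            self.count = 0
        return done


counter = StickyCounter(target=3)
start_pulses = [True, False, False, False, False, False]
done_flags = [counter.tick(s) for s in start_pulses]
print(done_flags)
```

After the target is reached the counter holds its value and stays idle until the next start pulse, which mirrors the "counts only while it is set" behaviour.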
@phsauter phsauter changed the title Delay-line based FPGA implementation Delay-line refactor and FPGA implementation Oct 28, 2025
Adds the Xilinx/Genesys 2 delay lines for 200 MHz operation.
@phsauter phsauter force-pushed the phsauter/xilinx-fpga branch from 66cca28 to 141fca8 Compare October 28, 2025 16:46
2 participants