|
| 1 | +- Start Date: 2024-03-18 |
| 2 | +- RFC PR: [amaranth-lang/rfcs#56](https://github.com/amaranth-lang/rfcs/pull/56) |
| 3 | +- Amaranth Issue: [amaranth-lang/amaranth#1211](https://github.com/amaranth-lang/amaranth/issues/1211) |
| 4 | + |
| 5 | +# Asymmetric memory port width |
| 6 | + |
| 7 | +## Summary |
| 8 | +[summary]: #summary |
| 9 | + |
| 10 | +Memory read and write ports can have different widths, allowing, e.g., for memories with an 8-bit read path and a 32-bit write path. |
| 11 | + |
| 12 | +## Motivation |
| 13 | +[motivation]: #motivation |
| 14 | + |
| 15 | +This is a common hardware feature. It allows, e.g., having a slow but wide port in one domain and a fast but narrow port in another domain. On platforms lacking dedicated hardware support, it can often be emulated almost for free. |
| 16 | + |
| 17 | + |
| 18 | +## Guide-level explanation |
| 19 | +[guide-level-explanation]: #guide-level-explanation |
| 20 | + |
| 21 | +Memories can have asymmetric port width. To use that feature, instantiate the memory with the shape of the narrowest desired port, then pass the `aggregate` argument on ports that should be wider than that: |
| 22 | + |
| 23 | +```py |
| 24 | +m.submodules.mem = mem = Memory(shape=unsigned(8), depth=4096, init=[]) |
| 25 | +# 8-bit write port |
| 26 | +wp = mem.write_port() |
| 27 | +# 32-bit read port |
| 28 | +rp = mem.read_port(aggregate=4) |
| 29 | +# Address 0x123 on rp is equivalent to addresses (0x123 * 4, 0x123 * 4 + 1, 0x123 * 4 + 2, 0x123 * 4 + 3) on wp. |
| 30 | +# Shape of rp.data is ArrayLayout(unsigned(8), 4) |
| 31 | +``` |
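The address relationship between a wide port and the narrow rows it covers can be checked with plain Python, without Amaranth. This is only a behavioral sketch: `narrow_addresses` and `wide_read` are hypothetical helper names that are not part of the proposal, and the memory is modeled as an ordinary list of 8-bit rows.

```python
def narrow_addresses(wide_addr, aggregate):
    """Narrow-port addresses covered by one wide-port address (illustrative helper)."""
    return [wide_addr * aggregate + i for i in range(aggregate)]


def wide_read(rows, wide_addr, aggregate):
    """Model a wide read as a list of `aggregate` consecutive narrow rows."""
    return [rows[a] for a in narrow_addresses(wide_addr, aggregate)]


# 4096 rows of 8 bits, filled with the low byte of each row's own address.
rows = [i & 0xFF for i in range(4096)]

print([hex(a) for a in narrow_addresses(0x123, 4)])
print(wide_read(rows, 0x123, 4))
```

Element `i` of the returned list corresponds to `rp.data[i]` in the example above.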
| 32 | + |
| 33 | +## Reference-level explanation |
| 34 | +[reference-level-explanation]: #reference-level-explanation |
| 35 | + |
| 36 | +Both `lib.memory.Memory.read_port` and `lib.memory.Memory.write_port` have a new `aggregate=None` keyword-only argument. If `aggregate` is not `None`, the behavior is as follows: |
| 37 | + |
| 38 | +- `aggregate` has to be a power of two |
| 39 | +- `mem.depth` must be divisible by `aggregate` |
| 40 | +- the `shape` passed to the `*Port.Signature` constructor becomes `ArrayLayout(memory.shape, aggregate)` |
| 41 | +- as implied by the previous point, `granularity` on wide write ports is counted in units of single memory rows |
| 42 | +- the `addr_width` passed to the `*Port.Signature` constructor becomes `ceil_log2(memory.depth // aggregate)` |
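The constraints and derived signature parameters listed above can be sketched as a small standalone check. The function name `wide_port_params` and the returned dictionary are assumptions made for illustration, and `ceil_log2` is a local stand-in defined here rather than an import from Amaranth.

```python
def ceil_log2(n):
    """Smallest k with 2**k >= n (local stand-in; returns 0 for n <= 1)."""
    return (n - 1).bit_length() if n > 1 else 0


def wide_port_params(depth, aggregate):
    """Validate `aggregate` and derive the port's element count and address width."""
    if aggregate < 1 or aggregate & (aggregate - 1):
        raise ValueError("aggregate must be a power of two")
    if depth % aggregate:
        raise ValueError("depth must be divisible by aggregate")
    return {
        "elements": aggregate,  # length of the ArrayLayout seen on `data`
        "addr_width": ceil_log2(depth // aggregate),
    }


print(wide_port_params(depth=4096, aggregate=4))
```

For the 4096-row memory from the guide-level example, `aggregate=4` leaves 1024 wide rows, so the wide port needs a 10-bit address.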
| 43 | + |
| 44 | +The behavior of wide ports is defined by expanding them to `aggregate` narrow ports: |
| 45 | + |
| 46 | +- the `data` of subport `i` is connected to `data[i]` of the wide port |
| 47 | +- the `addr` of subport `i` is `addr * aggregate + i`, where `addr` is the address of the wide port |
| 48 | +- for read ports and write ports without granularity, `en` is broadcast to all subports |
| 49 | +- for write ports with granularity, `en` of subport `i` is connected to `en[i // granularity]` of the wide port |
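The expansion rules above can be modeled in plain Python for a write port. This is a behavioral sketch rather than the actual lowering: `expand_wide_write` and `apply_write` are hypothetical names, `en` is a single flag when `granularity` is `None`, and otherwise a sequence with one entry per group of `granularity` rows.

```python
def expand_wide_write(addr, data, en, aggregate, granularity=None):
    """Expand one wide write into (narrow_addr, value, enable) per subport."""
    subports = []
    for i in range(aggregate):
        if granularity is None:
            sub_en = bool(en)                    # single enable, broadcast
        else:
            sub_en = bool(en[i // granularity])  # one enable bit per group of rows
        subports.append((addr * aggregate + i, data[i], sub_en))
    return subports


def apply_write(rows, subports):
    """Commit the enabled narrow writes to a list-backed memory model."""
    for narrow_addr, value, enabled in subports:
        if enabled:
            rows[narrow_addr] = value


rows = [0] * 16
apply_write(rows, expand_wide_write(2, [10, 11, 12, 13], [True, False],
                                    aggregate=4, granularity=2))
print(rows[8:12])
```

With `granularity=2`, the first enable bit covers rows 8 and 9 and the second covers rows 10 and 11, so only the first two rows are written.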
| 50 | + |
| 51 | +No change is made to signature types or port types. Wide ports are recognized solely by their relation to `memory.shape`. |
| 52 | + |
| 53 | +The rules for `MemoryInstance.read_port` and `MemoryInstance.write_port` change as follows: |
| 54 | + |
| 55 | +- define `aggregate_log2 = ceil_log2(depth) - len(addr)`, `aggregate = 1 << aggregate_log2` |
| 56 | +- `aggregate_log2` must be non-negative |
| 57 | +- `depth` must be divisible by `aggregate` |
| 58 | +- `len(data)` must be equal to `width * aggregate` |
| 59 | +- for write ports, one of the following must hold: |
| 60 | + - `aggregate` is divisible by `len(en)` |
| 61 | + - `len(en)` is divisible by `aggregate` and `len(data)` is divisible by `len(en)` |
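The `MemoryInstance` rules above amount to a handful of integer checks, sketched below as a standalone function. The name `check_instance_port` and its plain-integer parameters (`addr_len`, `data_len`, `en_len` standing for `len(addr)`, `len(data)`, `len(en)`) are assumptions for illustration. The guard against an empty `en` is an addition of this sketch, not a rule from the list.

```python
def ceil_log2(n):
    """Smallest k with 2**k >= n (local stand-in; returns 0 for n <= 1)."""
    return (n - 1).bit_length() if n > 1 else 0


def check_instance_port(depth, width, addr_len, data_len, en_len=None):
    """Return the inferred `aggregate`, or raise ValueError if a rule is violated.

    Pass `en_len` only for write ports.
    """
    aggregate_log2 = ceil_log2(depth) - addr_len
    if aggregate_log2 < 0:
        raise ValueError("address is wider than the memory depth requires")
    aggregate = 1 << aggregate_log2
    if depth % aggregate:
        raise ValueError("depth must be divisible by aggregate")
    if data_len != width * aggregate:
        raise ValueError("len(data) must equal width * aggregate")
    if en_len is not None:
        if en_len < 1:  # guard added by this sketch to avoid dividing by zero
            raise ValueError("len(en) must be at least 1")
        coarse = aggregate % en_len == 0
        fine = en_len % aggregate == 0 and data_len % en_len == 0
        if not (coarse or fine):
            raise ValueError("len(en) is incompatible with aggregate")
    return aggregate


print(check_instance_port(depth=4096, width=8, addr_len=10, data_len=32, en_len=2))
```

Note that wide ports are inferred purely from `len(addr)` relative to `depth`, which is why no explicit `aggregate` parameter appears in the signature.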
| 62 | + |
| 63 | +## Drawbacks |
| 64 | +[drawbacks]: #drawbacks |
| 65 | + |
| 66 | +More complexity. |
| 67 | + |
| 68 | +Wide write ports with sub-row write granularity cannot be expressed. However, there is no hardware that would actually natively support such a combination. |
| 69 | + |
| 70 | +## Rationale and alternatives |
| 71 | +[rationale-and-alternatives]: #rationale-and-alternatives |
| 72 | + |
| 73 | +The design is straightforward enough. |
| 74 | + |
| 75 | +An alternative is not doing this. Yosys already has an optimization pass that recognizes wide ports from a collection of narrow ports, so this is not necessarily an expressiveness hole. However, platforms with non-Yosys toolchains could still benefit from custom lowering for this case. |
| 76 | + |
| 77 | +## Prior art |
| 78 | +[prior-art]: #prior-art |
| 79 | + |
| 80 | +This proposal is directly based on the Yosys memory model. |
| 81 | + |
| 82 | +## Unresolved questions |
| 83 | +[unresolved-questions]: #unresolved-questions |
| 84 | + |
| 85 | +None. |
| 86 | + |
| 87 | +## Future possibilities |
| 88 | +[future-possibilities]: #future-possibilities |
| 89 | + |
| 90 | +Similar functionality could potentially be added to `lib.fifo`. |