Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update data preloading #61

Closed
wants to merge 0 commits into from
Closed

update data preloading #61

wants to merge 0 commits into from

Conversation

yyan7223
Copy link
Collaborator

@yyan7223 yyan7223 commented Jan 8, 2025

example

@yyan7223 yyan7223 requested a review from tancheng January 8, 2025 02:17
lib/messages.py Outdated
@@ -392,6 +392,31 @@ def str_func(s):
namespace = {'__str__': str_func}
)

#=========================================================================
# Ring for delivering data across SPM banks
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ring is the NoC for multiple CGRAs? Or for multiple tiles?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean the multiple banks in the SPM of one CGRA.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then all the banks are connected to one controller:

s.data_mem.recv_raddr[height] //= s.controller.send_to_tile_load_request_addr
s.data_mem.recv_waddr[height] //= s.controller.send_to_tile_store_request_addr
s.data_mem.recv_wdata[height] //= s.controller.send_to_tile_store_request_data

Why do we need to send pkt/msg on ring for SRAM banks?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, do you mean the SRAM banks on other CGRAs?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, just local SRAM banks for current CGRA.
This is for preloading data to SRAM banks before executing, just like we load ctrl signals to tiles
1736314675362

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better if this message just have data and addr? The cgra id and bank id could be inferred based on the addr2controller_lut, and send through

s.send_to_tile_store_request_addr_queue = ChannelRTL(CGRAAddrType, latency = 1)
s.send_to_tile_store_request_data_queue = ChannelRTL(CGRADataType, latency = 1)
, which are already connected to the write ports of the data memory.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I draw the arch figure for you:

image

controller/ControllerRTL.py Outdated Show resolved Hide resolved
s.send_to_ctrl_ring_ctrl_pkt = SendIfcRTL(CtrlPktType)
# s.recv_from_cpu_ctrl_pkt = RecvIfcRTL(CtrlPktType)
# s.send_to_ctrl_ring_ctrl_pkt = SendIfcRTL(CtrlPktType)
s.recv_ctrl_pkt = RecvIfcRTL(CtrlPktType)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we remove _cpu_ here? It can come from other places?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have 100 cgras, I don't think let CPU connect all of them is a good idea.
Instead, only the first cgra connect to cpu, and the rest of cgras are connected sequentially, which will be more synthesised&P&R friendly I believe?
Or I change it back.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we are aligned on this, CPU only talks to the first controller. Thus, we already have:

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand, then what is the purpose of recv_from_cpu_ctrl_pkt?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't worry, let me read CgraTemplateRTL.py and understand your design.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. And to briefly clarify, recv_from_cpu_ctrl_pkt of the first CGRA/controller would get pkt from CPU then send into the queue recv_ctrl_pkt_queue, then the pkt in the queue would be popped to the inter-CGRA NoC (now is a ring), note that even the CGRA_0's pkt would be sent into the ring and popped, to be uniformed/simplified.

@tancheng
Copy link
Owner

tancheng commented Jan 8, 2025

example

Why do we remove most of the ifcs? How to handle different types if they are removed.

@yyan7223
Copy link
Collaborator Author

yyan7223 commented Jan 8, 2025

example

Why do we remove most of the ifcs? How to handle different types if they are removed.

I don't remove them, the green are the updated stuff based on the original

@yyan7223 yyan7223 closed this Jan 8, 2025
@yyan7223 yyan7223 reopened this Jan 8, 2025
@tancheng
Copy link
Owner

tancheng commented Jan 8, 2025

example

Why do we remove most of the ifcs? How to handle different types if they are removed.

I don't remove them, the green are the updated stuff based on the original

You mean "added"?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants