
[Question] mem_req buffer and queues #138

Open
5surim opened this issue Sep 7, 2022 · 1 comment
Labels: question (Further information is requested)

5surim commented Sep 7, 2022

Hello folks,

My name is Surim; I am a Ph.D. student at UCSC advised by Prof. Litz. I have been using Scarab for our research. On top of Scarab, I am working on the Fetch-Directed Instruction Prefetching (FDIP) mechanism to improve performance on frontend-bound applications. In our FDIP implementation, when FDIP emits a prefetch for an instruction cache line, we call the new_mem_req() function with a newly added MRT_FDIPPRF request type. We have also added logic for merging requests based on the relative priority of the IFETCH and FDIPPRF types. I would like to ask some questions about the memory implementation. Could you take a look at the following questions when you have a chance?

  1. I can see that there are queues of 7 different types and a single req_buffer per core shared by all of them. The buffer has a fixed size, MEM_REQ_BUFFER_ENTRIES, when PRIVATE_MSHR_ON is off. Could you briefly explain why the implementation separates the buffer from the queues instead of using the buffer alone? Is it to handle the different priorities of memory requests? I would also appreciate a brief explanation of when and how requests move between queues.

  2. I believe the original intent was to implement QUEUE_MEM and to enqueue a request when it is sent to memory. Currently, a request's 'queue' pointer becomes NULL when the request is sent to ramulator, which makes the request undiscoverable in any of the queues. This means there is a window during which a matching request cannot be found even though it is still in flight. We expect all requests to the same cache line address to be merged into the first request for the entire period until that first request completes and the cache line is actually loaded into the cache. Losing merge opportunities wastes memory bandwidth on redundant fetches of the same cache line. As a quick experiment, I searched the entire req_buffer for a valid matching request and merged into it directly, instead of searching the queues, and I saw IPC gains on all the benchmarks I used. Do you think we can use the buffer directly during the merging process instead of the queues?

  3. Also, in mem_search_queue, a matching request is only returned if it is not in a final state (MRS_MLC_HIT_DONE, MRS_L1_HIT_DONE, MRS_MEM_DONE, MRS_FILL_DONE). Could you explain why these final states are excluded from the merge candidates?

Best,
Surim

@5surim 5surim added the question Further information is requested label Sep 7, 2022
@spruett spruett assigned spruett and bencplin and unassigned spruett Sep 7, 2022

5surim commented Sep 28, 2022

Hello, do you have any updates on these questions?
