
[Question] mem_req buffer and queues #138

Open
5surim opened this issue Sep 7, 2022 · 1 comment
Labels: question (Further information is requested)

5surim commented Sep 7, 2022

Hello folks,

My name is Surim; I am a Ph.D. student at UCSC advised by Prof. Litz. I have been using Scarab for our research. On top of Scarab, I am working on the Fetch-Directed Instruction Prefetching (FDIP) mechanism to improve performance on frontend-bound applications. In our FDIP implementation, when FDIP emits a prefetch for an instruction cache line, we call the new_mem_req() function with a newly added MRT_FDIPPRF request type. We have also added logic for merging requests based on the relative priority of the IFETCH and FDIPPRF types. I would like to ask some questions about the memory implementation. Could you take a look at the following questions when you have a chance?

  1. I can see that there are queues of 7 different types and a single req_buffer per core shared by all of them. The buffer has a fixed size, MEM_REQ_BUFFER_ENTRIES, when PRIVATE_MSHR_ON is off. Could you briefly explain why the implementation separates the buffer from the queues instead of using the buffer alone? Is it to handle the different priorities of memory requests? I would also appreciate a brief explanation of when and how requests move between queues.

  2. I believe the original intent was to implement QUEUE_MEM and to enqueue a request when it is sent to memory. Currently, a request's 'queue' pointer becomes NULL when the request is sent to ramulator, which makes the request undiscoverable in any of the queues. This means there is a window during which a matching request cannot be found even though it is still in flight. We expect all requests to the same cache line address to be merged into the first request for the entire period until that first request completes and the cache line is actually loaded into the cache. Losing merge opportunities wastes memory bandwidth on redundant fetches of the same cache line. As a quick experiment, I searched the entire req_buffer for a valid matching request and merged into it directly, instead of searching the queues, and I saw IPC gains on all the benchmarks I used. Do you think we can use the buffer directly during the merging process instead of the queues?

  3. Also, in mem_search_queue, a matching request is only returned if it is not in a final state (MRS_MLC_HIT_DONE, MRS_L1_HIT_DONE, MRS_MEM_DONE, MRS_FILL_DONE). Could you explain why these final states are excluded from the merge candidates?

Best,
Surim

@5surim 5surim added the question Further information is requested label Sep 7, 2022
@spruett spruett assigned spruett and bencplin and unassigned spruett Sep 7, 2022

5surim commented Sep 28, 2022

Hello, do you have any updates on these questions?
