RSR per client #193
Comments
We need to change spark-evaluate to start producing this new data. Caveat: one retrieval task (miner_id, payload_cid) can test deals from multiple clients, which is why each task has an array of client IDs. When calculating retrieval-based RSR, each measurement should contribute exactly +1 to the total number of measurements performed, irrespective of how many clients/deals were covered by this measurement. We need to figure out how to define and implement per-client measurement-based RSR and how to expand one measurement covering X clients into +1 measurement in total for each client. This change must not affect per-miner RSR (i.e. we cannot add |
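To illustrate the counting rule above, here is a minimal sketch, not the spark-evaluate implementation. The measurement shape (`minerId`, `clients`, `retrievalResult`) and the function name `aggregateRsr` are assumptions made for this example. Each measurement adds exactly +1 to its miner's total and +1 to the total of every client it covers, so per-miner RSR is unaffected by the number of clients.

```javascript
// Hypothetical measurement shape: { minerId, clients: string[], retrievalResult }
function aggregateRsr (measurements) {
  const perMiner = new Map()
  const perClient = new Map()

  // Add one measurement to the stats stored under `key`.
  const bump = (map, key, ok) => {
    const stats = map.get(key) ?? { total: 0, successful: 0 }
    stats.total += 1
    if (ok) stats.successful += 1
    map.set(key, stats)
  }

  for (const m of measurements) {
    // Assumption: a retrieval counts as successful when retrievalResult is 'OK'.
    const ok = m.retrievalResult === 'OK'
    // Per-miner RSR: always +1, no matter how many clients are covered.
    bump(perMiner, m.minerId, ok)
    // Per-client RSR: +1 for each distinct client covered by this measurement.
    for (const clientId of new Set(m.clients)) {
      bump(perClient, clientId, ok)
    }
  }
  return { perMiner, perClient }
}
```

With this approach the sum of per-client totals can exceed the number of measurements, which is expected: the per-client numbers are only meaningful per client, not as a global total.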
The first task is to record per-client retrieval stats in
Context
In
Implementation
I propose we create a new function that stores the per-client RSR scores. This function will be called inside of
To summarize, the implementation will do the following:
For persisting this new data we will need a new table in our database. The proposed table format is the following:
Feedback is much appreciated @juliangruber @bajtos @pyropy |
It was awesome that you listed all the context, linking to source code. That was easy to follow and got me up to speed 🙏
To clarify:
Is my understanding correct?
👍 |
Yes, that is my intention. |
@NikolasHaimerl The implementation plan looks great, thanks for taking the time to write it! My proposal for the table name would be something like |
Great! Please add that to the implementation plan then, so a reviewer doesn't need to look at these comments :) |
The plan looks good to me 👍🏻
Implementation-wise, it may be simpler to ignore committees and work directly with the list of measurements. I think we needed to use committees in
Illustration of what I mean:

    for (const m of allMeasurements) {
      if (m.taskingEvaluation !== 'OK') continue
      const clients = findDealClients(m)
      for (const c of clients) {
        // update the aggregated data for client `c` using measurement `m`
      }
    }

OTOH, if we start from committees, we can call
It can also be said that it's better to make
The current

    (minerId, cid) => sparkRoundDetails.retrievalTasks
      .find(t => t.cid === cid && t.minerId === minerId)?.clients

On second thought, I am fine with either approach: the one described in the original proposal or the measurement-based one I described in this comment. It's something easy to change in the future if needed. |
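As a side note on the lookup shown in this comment: calling `.find` for every measurement scans the whole task list each time. A possible alternative, sketched here under assumptions, is to index the tasks once per round. The factory name `createFindDealClients`, the `::` key separator, and the task shape `{ minerId, cid, clients }` are hypothetical and only mirror the fields used in the snippet above.

```javascript
// Hypothetical task shape: { minerId, cid, clients: string[] }
function createFindDealClients (retrievalTasks) {
  // Build the index once; the key combines minerId and cid.
  // Assumption: neither ID contains the '::' separator.
  const index = new Map()
  for (const t of retrievalTasks) {
    const key = `${t.minerId}::${t.cid}`
    // Keep the first matching task, like Array.prototype.find does.
    if (!index.has(key)) index.set(key, t.clients)
  }
  // Returns the clients array, or undefined when no task matches.
  return (minerId, cid) => index.get(`${minerId}::${cid}`)
}
```

Usage would look like `const findDealClients = createFindDealClients(sparkRoundDetails.retrievalTasks)`, followed by `findDealClients(m.minerId, m.cid)` inside the loop over measurements. Whether the extra index is worth it depends on how many tasks and measurements a round contains.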
Provide retrieval-based RSR calculated on a per-client basis.
Blocks #216
Related discussions:
^^ The goal is to calculate RSR for deals made by a particular data onramp provider. See https://app.hex.tech/protocol/app/ccbb785c-7fad-42b5-9553-609cde5c6acc/latest