Simple Geolocation Probing #445

Merged
merged 13 commits into from
Sep 9, 2021

Contributor

@Soptq Soptq commented Sep 3, 2021

This PR tries to implement simple geolocation probing in Phala Network.

Goal:

  • Implement a contract to collect desensitized geolocation data from workers.
  • Provide contract query entries.
  • Use SecretMessageChannel to secure communication between pRuntime and the contract.

For safety, this PR currently targets the geo-probing branch, but if you think it is ready, feel free to merge it into the master branch.

Soptq added a commit to Soptq/prpc-protos that referenced this pull request Sep 3, 2021
SendCoordinateInfo RPC Sends geolocation data to pruntime to initialize a secret message channel toward geolocation contract.
Echo RPC can be used to measure network RTT.

Relevant to Phala-Network/phala-blockchain#445
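
As a rough illustration of how an echo call can be used to measure network RTT, here is a minimal sketch. The `echo` closure is a hypothetical stand-in for the actual Echo RPC call (its real signature is defined in prpc-protos and is not reproduced here); the helper names `measure_rtt` and `min_rtt` are also assumptions made for this example.

```rust
use std::time::{Duration, Instant};

/// Times one echo round trip. `echo` is a hypothetical stand-in for the RPC
/// call and is expected to return the bytes it was given.
/// Returns `None` when the reply does not match the request.
pub fn measure_rtt<F>(mut echo: F, payload: &[u8]) -> Option<Duration>
where
    F: FnMut(&[u8]) -> Vec<u8>,
{
    let start = Instant::now();
    let reply = echo(payload);
    let elapsed = start.elapsed();
    if reply == payload {
        Some(elapsed)
    } else {
        None
    }
}

/// Takes the smallest valid sample out of `samples` attempts, which reduces
/// the influence of transient delays. Returns `None` if no attempt succeeded.
pub fn min_rtt<F>(mut echo: F, payload: &[u8], samples: usize) -> Option<Duration>
where
    F: FnMut(&[u8]) -> Vec<u8>,
{
    (0..samples)
        .filter_map(|_| measure_rtt(&mut echo, payload))
        .min()
}
```

Taking the minimum over several samples is one common choice; a median would work as well if outliers in both directions are a concern.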
@Soptq
Contributor Author

Soptq commented Sep 3, 2021

The Protobuf changes are proposed in Phala-Network/prpc-protos#2

@Soptq
Contributor Author

Soptq commented Sep 3, 2021

Ping @kvinwang: I will close the previous PR. This PR is now finalized.

@Soptq Soptq mentioned this pull request Sep 3, 2021
3 tasks
@h4x3rotab
Contributor

Could you rebase to the latest master branch? Now it's full of history and hard to review.

@Soptq Soptq changed the base branch from geo-probing to master September 3, 2021 03:46
@Soptq
Contributor Author

Soptq commented Sep 3, 2021

> Could you rebase to the latest master branch? Now it's full of history and hard to review.

Done. This PR will now be merged into the master branch directly.

.send_mq
.channel(sender, id_pair);
let secret_mq = SecretMessageChannel::new(&ecdh_key,
&mq,
Collaborator

@kvinwang kvinwang Sep 3, 2021

It looks like this will break the determinism of the mq egress.
Would it be better for pherry to report the information to the chain directly?

Contributor Author

I previously implemented a version that uses pallets to redirect contract commands, but @shelvenzhou and I later decided to refactor the code to initiate the contract command call directly in pRuntime, mainly to take advantage of SecretMessageChannel.

Contributor

(For @kvinwang and @shelvenzhou)

We need some general discussion about feeding external data into the consensus system.

Maybe we should allow the sender to feed "loosely consistent" messages to the blockchain. The message is still signed by the sender (e.g. a worker) and has an ingress sequence id associated with it. However, the sender no longer sends it deterministically. Determinism only takes effect once the message arrives at the blockchain and is thus permanently timestamped. Before it arrives at the blockchain, there is no guarantee on whether the message will be processed or not.

A very naive implementation for this specific case could be:

  1. pherry gets the geolocation in some way
  2. pherry gets the latest ingress sequence of WorkerLoose(worker_pubkey)
  3. pherry sends the geolocation data to pruntime by RPC
  4. pruntime encrypts the data, signs it with the worker identity key, and returns it to pherry
  5. pherry sends the signed payload to the blockchain
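
The five steps above can be sketched roughly as follows. This is only an illustration under several assumptions: `SignedPayload`, `LooseIngress`, `seal`, and `toy_sign` are hypothetical names, the worker is identified by a plain `u32` instead of a public key, and the "encryption" and "signature" are toy stand-ins with no security value. The point is just the sequence check: a message is either accepted at the expected ingress sequence or rejected, and a rejected message can simply be rebuilt and resubmitted later.

```rust
use std::collections::HashMap;

/// A worker-signed message carrying an ingress sequence id.
#[derive(Clone, Debug, PartialEq)]
pub struct SignedPayload {
    pub worker: u32,         // stand-in for the worker public key
    pub sequence: u64,       // ingress sequence the message was signed for
    pub ciphertext: Vec<u8>, // "encrypted" geolocation data
    pub signature: u64,      // toy checksum, NOT a real signature
}

/// Toy stand-in for signing with the worker identity key.
fn toy_sign(worker: u32, sequence: u64, data: &[u8]) -> u64 {
    data.iter().fold(u64::from(worker) ^ sequence, |acc, b| {
        acc.wrapping_mul(31).wrapping_add(u64::from(*b))
    })
}

/// Step 4: pruntime "encrypts" (here: XOR with a fixed byte) and signs.
pub fn seal(worker: u32, sequence: u64, geo: &[u8]) -> SignedPayload {
    let ciphertext: Vec<u8> = geo.iter().map(|b| b ^ 0x5a).collect();
    let signature = toy_sign(worker, sequence, &ciphertext);
    SignedPayload { worker, sequence, ciphertext, signature }
}

#[derive(Debug, PartialEq)]
pub enum Rejected {
    BadSignature,
    StaleSequence { expected: u64 },
}

/// Chain side: tracks the next expected loose sequence per worker.
#[derive(Default)]
pub struct LooseIngress {
    next: HashMap<u32, u64>,
    pub accepted: Vec<SignedPayload>,
}

impl LooseIngress {
    /// Step 2: the latest ingress sequence for this worker (0 if none yet).
    pub fn next_sequence(&self, worker: u32) -> u64 {
        self.next.get(&worker).copied().unwrap_or(0)
    }

    /// Step 5: accept the payload only if the signature checks out and the
    /// sequence is the expected one. Ordering is fixed from this point on.
    pub fn submit(&mut self, msg: SignedPayload) -> Result<(), Rejected> {
        if toy_sign(msg.worker, msg.sequence, &msg.ciphertext) != msg.signature {
            return Err(Rejected::BadSignature);
        }
        let expected = self.next_sequence(msg.worker);
        if msg.sequence != expected {
            return Err(Rejected::StaleSequence { expected });
        }
        self.next.insert(msg.worker, expected + 1);
        self.accepted.push(msg);
        Ok(())
    }
}
```

In this sketch a replayed or out-of-date payload fails with `StaleSequence`, so the sender can fetch the new sequence, have pruntime seal the data again, and retry.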

This process is not guaranteed to be deterministic. If it fails, we can just redo it a bit later. Thinking one step further, we could probably let not only pherry but any component initiate a non-deterministic message to the mq. Examples are:

  • A simple query can result in some data being fed back to the blockchain
  • An on-chain command can trigger an async action (e.g. sending an HTTP request), and once the action is finished, it can send the result back to the blockchain.
  • Of course, the above example can be slightly modified to originate from pruntime itself rather than pherry.

Other thoughts?

Contributor

@h4x3rotab h4x3rotab left a comment

Besides all the in-line comments, I found that geolocation records never get removed. Let's consider adding a TTL to each report and letting the client refresh the data before it expires.

Summary:

  1. TTL of the geolocation records
  2. Admin permission to read the raw data (or remove the ability to read raw data completely)
  3. Non-deterministic message submission
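
To make item 1 concrete, here is a minimal sketch of a TTL-bounded record store. All names (`GeoStore`, `GeoRecord`, `report`, `prune`, `live_count`) are hypothetical and not taken from the contract in this PR. Time is passed in as plain seconds; in a contract it would presumably come from a deterministic source such as the block timestamp, which is an assumption here.

```rust
use std::collections::HashMap;

/// A stored report plus the time (in seconds) at which it expires.
#[derive(Clone, Debug, PartialEq)]
pub struct GeoRecord {
    pub city: String,
    pub expires_at: u64,
}

/// Keeps at most one record per worker, each valid for `ttl_secs`.
pub struct GeoStore {
    ttl_secs: u64,
    records: HashMap<u32, GeoRecord>, // key: stand-in for the worker public key
}

impl GeoStore {
    pub fn new(ttl_secs: u64) -> Self {
        GeoStore { ttl_secs, records: HashMap::new() }
    }

    /// Inserts a new report or refreshes an existing one, restarting its TTL.
    pub fn report(&mut self, worker: u32, city: &str, now: u64) {
        let record = GeoRecord {
            city: city.to_string(),
            expires_at: now.saturating_add(self.ttl_secs),
        };
        self.records.insert(worker, record);
    }

    /// Drops every record whose TTL has elapsed and returns how many were removed.
    pub fn prune(&mut self, now: u64) -> usize {
        let before = self.records.len();
        self.records.retain(|_, r| r.expires_at > now);
        before - self.records.len()
    }

    /// Counts records that are still valid at `now`, without mutating the store.
    pub fn live_count(&self, now: u64) -> usize {
        self.records.values().filter(|r| r.expires_at > now).count()
    }
}
```

A worker that keeps reporting before `expires_at` stays in the store; one that goes offline disappears after the next `prune`.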

crates/phactory/src/contracts/geolocation.rs (5 outdated, resolved review threads)
@@ -0,0 +1,3 @@
Download GeoLite-City.mmdb database at https://drive.google.com/file/d/1UDKHuZ2KQSaDvy34LMKlUPKgToZPoNUk/view?usp=sharing
Contributor

We need to figure out the license of the database. Then maybe we can redistribute it on our end.

Contributor

No worries. As I see it, we are fine if we meet a few criteria:

  1. Auto-update (we can do it later)
  2. Not used to identify individuals or households (we only output aggregated data!)

In any case, we may consider buying a commercial license in the future.
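
As a small illustration of what "only output aggregated data" could look like, here is a sketch that counts workers per city. The function name `aggregate_by_city` and the `min_group` threshold are assumptions made for this example; the thread does not say that the contract applies any minimum group size.

```rust
use std::collections::BTreeMap;

/// Counts workers per city and drops cities with fewer than `min_group`
/// workers, so that very small groups are not exposed in the output.
/// `min_group` is a hypothetical parameter, not part of the PR.
pub fn aggregate_by_city(cities: &[&str], min_group: usize) -> BTreeMap<String, usize> {
    let mut counts: BTreeMap<String, usize> = BTreeMap::new();
    for city in cities {
        *counts.entry((*city).to_string()).or_insert(0) += 1;
    }
    counts.retain(|_, n| *n >= min_group);
    counts
}
```

With `min_group` set to 1 this reduces to a plain per-city count; larger values trade detail for stronger protection of individual workers.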

standalone/pherry/README.md (1 outdated, resolved review thread)
standalone/pherry/src/main.rs (3 outdated, resolved review threads)
@h4x3rotab h4x3rotab linked an issue Sep 4, 2021 that may be closed by this pull request
2 tasks
@Soptq
Contributor Author

Soptq commented Sep 4, 2021

The geolocation report TTL has been added to pherry. Workers now update their geolocation record at startup and then refresh it every 8 hours.
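
The refresh rule described here (report at startup, then again every 8 hours) can be sketched as a small predicate. This is not the code from pherry; `REFRESH_INTERVAL` and `report_due` are hypothetical names used only for illustration.

```rust
use std::time::{Duration, Instant};

/// Refresh period from the comment above: 8 hours.
pub const REFRESH_INTERVAL: Duration = Duration::from_secs(8 * 60 * 60);

/// Returns true when a geolocation report should be sent:
/// either nothing has been sent yet (startup), or at least `interval`
/// has passed since the last report.
pub fn report_due(last_report: Option<Instant>, now: Instant, interval: Duration) -> bool {
    match last_report {
        None => true,
        Some(last) => now.saturating_duration_since(last) >= interval,
    }
}
```

A main loop could call `report_due` on each iteration and record `Instant::now()` after a successful submission, so a failed attempt is simply retried on the next pass.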

@h4x3rotab
Contributor

I'm ok with this PR. A few follow-ups as discussed:

  1. Move the geocoding from pherry to pRuntime: We have not only pherry but also the next-generation version, PRB. By moving the logic to pRuntime, we reduce the overall complexity.

    This relies on access to HTTP and the file system. We have both, but they are not enabled by default. I will check with @kvinwang.

  2. Enable geolocation submission in pRuntime.

    This relies on the design and API of Side Inputs. We have already discussed the design for a few rounds and are now approaching the first implementation.

  3. Smoke test. We can set up a local dev environment with the geolocation contract running, then register a worker. Once the geolocation gets into the contract, we can query the contract to get the aggregated result.

  4. Unit tests in the geolocation contract.

@h4x3rotab h4x3rotab enabled auto-merge September 9, 2021 21:54
@h4x3rotab h4x3rotab merged commit 7d8347b into Phala-Network:master Sep 9, 2021
h4x3rotab pushed a commit to Phala-Network/prpc-protos that referenced this pull request Sep 29, 2021
SendCoordinateInfo RPC Sends geolocation data to pruntime to initialize a secret message channel toward geolocation contract.
Echo RPC can be used to measure network RTT.

Relevant to Phala-Network/phala-blockchain#445
Linked issue that may be closed by this pull request: Geolocation Probes the simple appraoch
3 participants