Add Uber db type to gdaScore() #33

Open
yoid2000 opened this issue Jan 28, 2019 · 10 comments

@yoid2000
Contributor

yoid2000 commented Jan 28, 2019

Once the Uber server is working on db001, I'd like to add an interface to the Uber server to the class gdaAttack() which can be found in common/gdaScore.py.

Currently there are two interfaces, postgres and aircloak. I want to add a third, called uber_dp. This will result in changes deep within gdaAttack(), upon which pretty much everything runs, so we need to be very careful here and validate that all of the examples in common/examples and attacks/examples work after the changes.

Note that the type of interface is specified in the common/config/master.json config file, as "type". Here is where a database is configured as uber_dp.

When gdaAttack() is called, it is handed a dict containing various parameters. An example config file for these parameters is attacks/dumbList_Infer.py.json. For uber_dp, we need to add two additional parameters, "budget" and "epsilon".
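As a rough sketch of how the two extra parameters could be checked before any query is sent, consider the snippet below. The helper name checkUberParams and the sample values are hypothetical and not part of gdaScore.py. The rule that both values must be positive numbers is an assumption as well. Only the key names "budget" and "epsilon" come from the description above.

```python
def checkUberParams(params):
    """Return a list of problems found in an uber_dp parameter dict.

    Hypothetical helper for illustration; it does not exist in gdaScore.py.
    """
    problems = []
    for key in ("budget", "epsilon"):
        value = params.get(key)
        if value is None:
            problems.append(f"missing '{key}'")
        elif (isinstance(value, bool)
              or not isinstance(value, (int, float))
              or value <= 0):
            # Assumption: both parameters must be positive numbers.
            problems.append(f"'{key}' must be a positive number")
    return problems


# Sample values are made up for illustration.
print(checkUberParams({"budget": 10.0, "epsilon": 0.5}))
print(checkUberParams({"budget": 10}))
```

Returning a list of problems instead of raising lets the caller report every missing parameter at once.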

The tricky part will be establishing the connection itself and making the queries. This all happens in a method called _dbWorker(), which runs as a thread. _dbWorker() calls _processQuery(), which is the thing that makes the query and returns the answer.

_dbWorker() sets up its connection with this code:

        # Establish connection to database
        connStr = str(f"host={d['host']} port={d['port']} dbname={d['dbname']} user={d['user']} password={d['password']}")
        if self._vb: print(f"    {me}: Connect to DB with DSN '{connStr}'")
        conn = psycopg2.connect(connStr)
        cur = conn.cursor()

You'll need to add an if/else here, like if d['type'] == 'uber_dp': ... else:, and put your connection setup there. (Note that both aircloak and postgres use the same underlying interface, which is why there isn't an if/else currently.)
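One possible shape for that branch is sketched below. The function openConnection and its two injected callables are hypothetical. They are passed in as arguments so the sketch runs without a database. In the real _dbWorker() the postgres branch would call psycopg2.connect, and the uber_dp branch would hold whatever setup the Uber server needs, which is not specified here.

```python
def openConnection(d, pgConnect, uberConnect):
    """Choose the connection setup based on d['type'].

    pgConnect and uberConnect are hypothetical stand-ins; in gdaScore.py
    pgConnect would be psycopg2.connect.
    """
    if d['type'] == 'uber_dp':
        # Assumption: the uber_dp setup only needs the parameter dict.
        return uberConnect(d)
    else:
        # postgres and aircloak share the same DSN-based interface.
        connStr = (f"host={d['host']} port={d['port']} dbname={d['dbname']} "
                   f"user={d['user']} password={d['password']}")
        return pgConnect(connStr)


# Fake connectors so the example runs anywhere.
def fakePg(dsn):
    return ("pg", dsn)


def fakeUber(d):
    return ("uber", d["host"])


pgParams = dict(type="postgres", host="h", port=5432,
                dbname="db", user="u", password="p")
print(openConnection(pgParams, fakePg, fakeUber))
print(openConnection(dict(type="uber_dp", host="h"), fakePg, fakeUber))
```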

Note also that there is an interface to a cache sqlite database, with handles connInsert, curInsert, connRead, curRead. These will continue to run as is, so the new interface doesn't affect that.

In _processQuery(), the following code executes the query:

            try:
                cur.execute(query['sql'])
            except psycopg2.Error as e:
                reply = dict(error=e.pgerror)
            else:
                ans = cur.fetchall()
                numCells = self._computeNumCells(ans)
                reply = dict(answer=ans,cells=numCells)

You will need to add an if/else to do the uber_dp query instead. Note in particular that the cur.fetchall() call returns a data structure that is a list of lists like this:

[
[[a1],[b1],[c1],...],
[[a2],[b2],[c2],...],
....
[[aN],[bN],[cN],...]
]

where a, b, c, etc. are the columns returned by the query, and 1, 2, ..., N are the rows returned by the query. Your interface must replicate this structure. When the uber server returns an error, or returns an out-of-budget message, this should be encoded in the error reply (i.e. reply = dict(error=e.pgerror)).
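A minimal sketch of that mapping follows. The response format of the uber server is not described in this issue, so the dict keys "rows" and "error" are assumptions, and the names uberToReply and countCells are hypothetical. countCells stands in for self._computeNumCells() and is assumed to count every cell. Rows are converted to tuples because psycopg2's fetchall() returns a list of row tuples.

```python
def countCells(rows):
    # Stand-in for self._computeNumCells(); assumed to count every cell.
    return sum(len(row) for row in rows)


def uberToReply(response):
    """Map a hypothetical uber server response onto the reply dict
    that _processQuery() builds for postgres and aircloak."""
    if "error" in response:
        # Errors and out-of-budget messages both go into the error field.
        return dict(error=response["error"])
    # Mirror the fetchall() shape: one tuple per row, one item per column.
    rows = [tuple(row) for row in response.get("rows", [])]
    return dict(answer=rows, cells=countCells(rows))


print(uberToReply({"rows": [[1, "a"], [2, "b"]]}))
print(uberToReply({"error": "budget exceeded"}))
```

Keeping the conversion in one function means the code above _processQuery() only ever sees the answer/cells or error dict, regardless of the interface type.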

If this is done right, then all the code running above this should work unchanged.

Please regard this particular issue as a kind of master issue. You should make specific smaller issues that we can test one at a time as you go. Each of the smaller issues will have an associated push, where the example functions are all verified as running.

Note that there are a number of helper methods that currently work with both postgres and aircloak interfaces. These include getColNamesAndTypes() and getTableNames(). I don't expect these to work with uber_dp, so you don't need to worry about that.

As always, let me know if you have questions.

@yoid2000
Contributor Author

@rbh-93 please start on this issue.

Be sure to create a new branch for this. Don't write to master.

@rbh-93
Contributor

rbh-93 commented May 28, 2019

Hello,
I have been working through the workflow in the gdaScore class, and you mentioned:
Note that the type of interface is specified in the common/config/master.json config file, as "type". Here is where a database is configured as uber_dp.
Am I supposed to create a new "type" or should I use the existing "postgres" type?

@yoid2000
Contributor Author

yoid2000 commented May 28, 2019 via email

@rbh-93
Contributor

rbh-93 commented May 28, 2019

Another question: will _dbWorker() send the parameters (query, epsilon, budget) to the Python simpleServer, which then writes the query to a file for the UberTool to read and send back the result? The parameters would be sent to simpleServer.py in the following part of gdaScore:

        # Establish connection to database
        connStr = str(f"host={d['host']} port={d['port']} dbname={d['dbname']} user={d['user']} password={d['password']}")
        if self._vb: print(f"    {me}: Connect to DB with DSN '{connStr}'")
        conn = psycopg2.connect(connStr)
        cur = conn.cursor()

Is that correct?

@rbh-93
Contributor

rbh-93 commented May 28, 2019

Right now the UberTool connects to the database like this:
val con_str = "jdbc:postgresql://db001.gda-score.org:5432/" + dbName + "?ssl=true&sslfactory=org.postgresql.ssl.NonValidatingFactory&user=<username>&password=<password>"
So the UberTool is connecting to db001.gda-score.org:5432.

@yoid2000
Contributor Author

yoid2000 commented May 28, 2019 via email

@yoid2000
Contributor Author

yoid2000 commented May 28, 2019 via email

@fraboeni fraboeni self-assigned this Sep 19, 2019
fraboeni added a commit that referenced this issue Sep 25, 2019
…into #33-uber_interface

Conflicts:
	gdascore/gdaScore.py
@fraboeni
Collaborator

fraboeni commented Feb 9, 2020

Hi @yoid2000, I am currently working on the uber_interface branch (https://github.com/gda-score/code/tree/uber_interface). I pushed my current state even though it is not working yet. I am facing two issues while trying to test it and get it working, and could benefit from your input.

  1. Could you tell me the address of the server where uber_dp is running?
  2. Could you give me a hint on what I need to run in order to test my changes in the project? I cannot figure out how I would initialize a process in the code that would run the gdaAttack.

Thank you for your help.

@yoid2000
Contributor Author

Could you tell me the address of the server where uber_dp is running?

In this directory:

https://github.com/gda-score/anonymization-mechanisms/tree/master/uber/examples

you can find a file config.py that contains the URL of the uber DP service. It is:

https://db001.gda-score.org/ubertool

@yoid2000
Contributor Author

Could you give me a hint on what I need to run in order to test my changes in the project? I cannot figure out how I would initialize a process in the code that would run the gdaAttack.

This is unfortunately rather complex.

This file:

https://github.com/gda-score/code/blob/master/gdascore/global_config/master.json

is a kind of master configuration. It contains all of the services, databases, and anonymization types.

Up to now, all anonymization types could be reached through 2 services, postgres and aircloak:

        "services": {
            "postgres": {
                "host": "db001.gda-score.org",
                "port": 5432,
                "type": "postgres"
            },
            "aircloak": {
                "host": "attack.aircloak.com",
                "port": 9432,
                "type": "aircloak"
            }
        },

You need to add a new service, which could be called uber_dp or something like that. The master config would be updated with the new service and in other places where we link the anonymization scheme with the service, etc. I could help you with that.
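To illustrate, here is a sketch of the services section with a third entry added, written as a Python dict and printed as JSON. The "uber_dp" entry is hypothetical: the key name "url" is an assumption, and the real entry may need host and port fields like the other two services. The URL itself is the one found in config.py of the uber examples.

```python
import json

# The two existing services, as they appear in master.json.
services = {
    "postgres": {
        "host": "db001.gda-score.org",
        "port": 5432,
        "type": "postgres",
    },
    "aircloak": {
        "host": "attack.aircloak.com",
        "port": 9432,
        "type": "aircloak",
    },
}

# Hypothetical new entry; the "url" key is an assumption.
services["uber_dp"] = {
    "url": "https://db001.gda-score.org/ubertool",
    "type": "uber_dp",
}

print(json.dumps({"services": services}, indent=4))
```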

Then, when you want to run a test, you could do something like you find here:

https://github.com/gda-score/attacks/blob/master/examples/testSinglingOut.py

In that example, you can find a config structure that tells the code what to get from the master config to run the attack (which ultimately generates queries to the service). The config is here:

https://github.com/gda-score/attacks/blob/master/examples/testSinglingOut.py#L25-L33
