Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added new elements to delete event #392

Merged
merged 35 commits into from
Aug 27, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
35 commits
Select commit Hold shift + click to select a range
1095e6e
Added canonical_facts to delete method and HostEvent class
aleccohan Aug 2, 2019
79c48c5
Small Changes to a few files for local development
aleccohan Aug 5, 2019
610227f
changed hack_client.py to not upload multiple chunks
aleccohan Aug 5, 2019
5038067
More small changes to try and setup the local dev enviroment
aleccohan Aug 5, 2019
b0d8808
added insights_id to delete event and all canonical facts to HostEven…
aleccohan Aug 7, 2019
4c95079
deleted unecessary comments
aleccohan Aug 7, 2019
c046158
removed utils folder containing unecessary tests
aleccohan Aug 7, 2019
08a275c
Added instructions about adding kafka to hosts file to README
aleccohan Aug 7, 2019
addc78e
Changed original host_id_to_delete.append() into a dictionary & added…
aleccohan Aug 8, 2019
40f6db1
added additional conditions to DeleteHostsTestCase to test changes to…
aleccohan Aug 8, 2019
e6de4f2
Fine tuned tests to work with mac_addresses and ip_addresses
aleccohan Aug 8, 2019
a789023
added account to delete method to help with callback to legacy api
aleccohan Aug 9, 2019
bc5b827
Merge branch 'new_delete_conditions'
aleccohan Aug 9, 2019
d6f8476
tested changes with mock data and removed mock data from events.py
aleccohan Aug 12, 2019
8937269
Merge branch 'master' of https://github.com/RedHatInsights/insights-h…
aleccohan Aug 13, 2019
37ae521
Merge remote-tracking branch 'origin' into added_delete_tests
aleccohan Aug 13, 2019
5595299
Merge branch 'master' of https://github.com/RedHatInsights/insights-h…
aleccohan Aug 14, 2019
8a76897
made suggested changes, used serialization method for not defined can…
aleccohan Aug 13, 2019
4612bf4
ran changes through pre-commit screening & implemented suggestions
aleccohan Aug 14, 2019
db47368
Merge branch 'master' of https://github.com/RedHatInsights/insights-h…
aleccohan Aug 14, 2019
b2299f0
Merge branch 'master' into new_delete_conditions
aleccohan Aug 14, 2019
a698d7f
applied suggestions made, added test for timestamp and type, generate…
aleccohan Aug 14, 2019
1bbf69c
Merged with master
aleccohan Aug 14, 2019
d1b46a2
made changes based on feedback
aleccohan Aug 15, 2019
44b99d3
merged added_delete_tests with new_delete_conditions
aleccohan Aug 15, 2019
1af93ce
made suggested changes to test_api.py
aleccohan Aug 15, 2019
8e1892e
Updated changes based on feedback
aleccohan Aug 2, 2019
2dc997b
Merge branch 'new_delete_conditions' of https://github.com/aleccohan/…
aleccohan Aug 15, 2019
05367ff
removed validations from HostEvent(Schema)
aleccohan Aug 16, 2019
355ba89
Added validate back in to match create/update schema
aleccohan Aug 21, 2019
fd38720
Fix 500 ERROR and adapt ObjectDeletedError to new code
aleccohan Aug 22, 2019
60025fb
Added request_id to the delete event
aleccohan Aug 23, 2019
a2b0c4d
removing validations from Schema
aleccohan Aug 23, 2019
c4c01ef
simplified validations to include specified attributes
aleccohan Aug 26, 2019
b586b70
fixed Delete event tests in test_api.py
aleccohan Aug 26, 2019
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 0 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -92,7 +92,6 @@ _prometheus_multiproc_dir_ environment variable. This is done automatically.
python run_gunicorn.py
```


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change shouldn’t be here, but let it be.

## Configuration environment variables

```
Expand Down
26 changes: 13 additions & 13 deletions api/host.py
Original file line number Diff line number Diff line change
Expand Up @@ -283,29 +283,29 @@ def find_hosts_by_hostname_or_id(account_number, hostname):
def delete_by_id(host_id_list):
query = _get_host_list_by_id_list(current_identity.account_number, host_id_list)

host_ids_to_delete = []
hosts_to_delete = []
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better! 😀

for host in query.all():
try:
host_ids_to_delete.append(host.id)
except sqlalchemy.orm.exc.ObjectDeletedError:
logger.exception(
"Encountered sqlalchemy.orm.exc.ObjectDeletedError exception during delete_by_id operation. Host was "
"already deleted."
)
hosts_to_delete.append(host)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

query.all() actually returns a pure list, so we can do hosts_to_delete = query.all() directly. But as I noted in the next comment, we are not checking now in any way whether the host has been already deleted or not.

I guess we need to figure out how to eagerly load the host here, essentially doing manually what happens on the attribute access. I would not eagerly load all of them at once, just do something like object identity check here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we’re not going to eagerly load in this PR, please abbreviate this to

Suggested change
hosts_to_delete.append(host)
hosts_to_delete = query.all()


if not host_ids_to_delete:
if not hosts_to_delete:
return flask.abort(status.HTTP_404_NOT_FOUND)

with metrics.delete_host_processing_time.time():
query.delete(synchronize_session="fetch")
db.session.commit()

metrics.delete_host_count.inc(len(host_ids_to_delete))
metrics.delete_host_count.inc(len(hosts_to_delete))

logger.debug("Deleted hosts: %s", host_ids_to_delete)
logger.debug("Deleted hosts: %s", hosts_to_delete)
Copy link
Collaborator

@Glutexo Glutexo Aug 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works only thanks to the fact that logger has some own exception handling mechanism. This logging call actually calls _Host._repr for every host, even for those that are already deleted. This function internally calls Host.id that triggers the lazy loading and bang, an uncaught exception occurs.

Hopefully, this is internally caught by the logger output to stderr and the execution continues normally. This is probably for proper logging calls not to cause unexpected termination. But if logger’s interpolation weren’t used (f"Deleted hosts: {hosts_to_delete}"), this would explode spectacularly.

So, sorry, @dehort. I understand that you don’t want to have this pending any longer, but we’re really on a thin ice here. The logging doesn’t crash just because it behaves unexpectedly wise, and catching the error there when emitting an event, that’s just randomly putting a try-expect on a first side-effect that triggers the lazy loading.

If you really need to have this merged in the current state. Please, at least, let @aleccohan to fix this in a subsequent PR, having that properly in a Sprint. This is not something we want in our codebase. It can go well with writing automatic test for this.


for deleted_host_id in host_ids_to_delete:
emit_event(events.delete(deleted_host_id))
for deleted_host in hosts_to_delete:
try:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change actually modifies the behavior. Before the change, if a host has been already deleted, it is not added to the hosts_to_delete list. That caused aborting with 404 if no hosts were actually deleted, and also those not actually deleted hosts not to be logged. Now, all hosts to delete are logged even though they have been already deleted and raise the ObjectDeletedError here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might change the behavior a bit but i'm not sure how big of a change it really is.

With this change, the only difference I see is that we might not send back a 404 in the case where all the hosts that were requested to be deleted, were already deleted. (This situation is likely pretty rare anyhow.) I suspect with the try/catch added around the emit_event(), then we shouldn't produce a delete event in the case where a host was deleted already (that would be the correct behavior in my opinion).

In either case (200 return code vs 404 return code), the host would be deleted and on event would be logged. I'm not sure that's a big difference.

emit_event(events.delete(deleted_host))
except sqlalchemy.orm.exc.ObjectDeletedError:
logger.exception(
"Encountered sqlalchemy.orm.exc.ObjectDeletedError exception during delete_by_id operation. Host was "
"already deleted."
)


@api_operation
Expand Down
24 changes: 22 additions & 2 deletions app/events.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,14 +4,34 @@
from marshmallow import fields
from marshmallow import Schema

from app.logging import threadctx
from app.models import CanonicalFacts
from app.validators import verify_uuid_format

logger = logging.getLogger(__name__)


class HostEvent(Schema):
id = fields.UUID()
timestamp = fields.DateTime(format="iso8601")
type = fields.Str()
account = fields.Str(required=True)
insights_id = fields.Str(validate=verify_uuid_format)
request_id = fields.Str()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the request ID has a fixed format? Even if so, we probably don’t want to specify it here as we’re only blindly forwarding it.



def delete(id):
return HostEvent(strict=True).dumps({"id": id, "timestamp": datetime.utcnow(), "type": "delete"}).data
def delete(host):
return (
HostEvent()
Copy link
Collaborator

@Glutexo Glutexo Aug 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually need to use the Schema to format the message? It’s our code that composes it, so it is inherently trusted. The format can (and should) be tested by our tests. Using Schema for validation makes sense for data from the outside.

Host serialization currently doesn’t use Schema and it results in a discrepancy in how DateTime is formatted. In the HTTP output, the created and updated are formatted manually as 2019-08-15T11:36:28.105816Z. Marshmallow formats it as 2019-08-15T11:36:28.105816+00:00. Both are valid ISO 8601 formats, but they are different.

But maybe a good solution would be to use a Schema for the Host output too instead of removing it from here. This looks like a new issue. #399

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of using the Schemas for defining what our input and output should look like. I think it makes it easy to point someone to the schema so that they can see what is expected for input/output. Otherwise, it can be kind of difficult to determine what the expected input/output is.

I did just think of something that we need to watch out for though. I think we should remove the validation checks from this schema. As @Glutexo mentioned, we are the ones producing the data...so there shouldn't be a need to validate it on the way out. Plus if someone tries to delete an "old" host that has data that doesn't match our current rules, then it would fail to produce a delete event for that object.

.dumps(
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to be missing the id field.

Copy link
Collaborator

@Glutexo Glutexo Aug 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really! It probably wasn’t in my proposed code snippet and nobody realized. Good catch!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good thing is that the current tests catch this. 🙂

"timestamp": datetime.utcnow(),
"id": host.id,
**CanonicalFacts.to_json(host.canonical_facts),
"account": host.account,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"account": host.account,
"id": host.id,
"account": host.account,

"request_id": threadctx.request_id,
"type": "delete",
}
)
.data
)
29 changes: 24 additions & 5 deletions test_api.py
Original file line number Diff line number Diff line change
Expand Up @@ -1107,10 +1107,18 @@ def create_hosts(self):

for host in hosts_to_create:
host_wrapper = HostWrapper()
host_wrapper.id = generate_uuid()
host_wrapper.account = ACCOUNT
host_wrapper.display_name = host[0]
host_wrapper.insights_id = host[1]
host_wrapper.insights_id = generate_uuid()
host_wrapper.rhel_machine_id = generate_uuid()
host_wrapper.subscription_manager_id = generate_uuid()
host_wrapper.satellite_id = generate_uuid()
host_wrapper.bios_uuid = generate_uuid()
host_wrapper.ip_addresses = ["10.0.0.2"]
host_wrapper.fqdn = host[2]
host_wrapper.mac_addresses = ["aa:bb:cc:dd:ee:ff"]
host_wrapper.external_id = generate_uuid()
host_wrapper.facts = [{"namespace": "ns1", "facts": {"key1": "value1"}}]

response_data = self.post(HOST_URL, [host_wrapper.data()], 207)
Expand Down Expand Up @@ -1208,10 +1216,10 @@ def test_invalid_host_id(self):


class DeleteHostsTestCase(PreCreatedHostsBaseTestCase):
def test_create_then_delete(self):
original_id = self.added_hosts[0].id
@unittest.mock.patch("app.events.datetime", **{"utcnow.return_value": datetime.utcnow()})
def test_create_then_delete(self, datetime_mock):

url = HOST_URL + "/" + original_id
url = HOST_URL + "/" + self.added_hosts[0].id

# Get the host
self.get(url, 200)
Expand All @@ -1226,7 +1234,18 @@ def __call__(self, e):
# Delete the host
with unittest.mock.patch("api.host.emit_event", new=MockEmitEvent()) as m:
self.delete(url, 200, return_response_as_json=False)
assert original_id in m.events[0]
event = json.loads(m.events[0])
timestamp_iso = datetime_mock.utcnow.return_value.isoformat()

self.assertIsInstance(event, dict)
expected_keys = {"timestamp", "type", "id", "account", "insights_id", "request_id"}
self.assertEqual(set(event.keys()), expected_keys)

self.assertEqual(f"{timestamp_iso}+00:00", event["timestamp"])
self.assertEqual("delete", event["type"])
self.assertEqual(self.added_hosts[0].id, event["id"])
self.assertEqual(self.added_hosts[0].insights_id, event["insights_id"])
self.assertEqual("-1", event["request_id"])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we test a real forwarded request ID too along with this default unknown one?


# Try to get the host again
response = self.get(url, 200)
Expand Down