Question about abilities #124
Comments
That's an interesting use case - I am not sure Skyline supports that in
particular, as all the timeseries are intended to be compared to
themselves, not to other timeseries.
Perhaps as a hacky fix, you could make a composite timeseries, consisting
of the average of all the machines? It's not exactly what you need but it
could get you partially there.
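A minimal sketch of the composite idea, assuming each server's datapoints are available as a list of `(timestamp, value)` tuples already aligned to the same resolution. The function name `composite_mean` and the server names are hypothetical and not part of Skyline; how the resulting series would be fed back into Skyline is left open here.

```python
from collections import defaultdict
from statistics import mean


def composite_mean(series_by_host):
    """Average several (timestamp, value) series into one composite series.

    Assumes timestamps are already aligned across hosts; a timestamp that is
    missing for some host is averaged over the hosts that do report it.
    """
    buckets = defaultdict(list)
    for points in series_by_host.values():
        for timestamp, value in points:
            buckets[timestamp].append(value)
    return [(ts, mean(values)) for ts, values in sorted(buckets.items())]


# Hypothetical CPU readings (percent) for three servers.
servers = {
    "server-a": [(0, 30.0), (60, 31.0), (120, 29.0)],
    "server-b": [(0, 28.0), (60, 30.0), (120, 31.0)],
    "server-c": [(0, 32.0), (60, 41.0), (120, 42.0)],
}
composite = composite_mean(servers)
```

Note that a single busy server also pulls the average up, which is one reason this only gets you partially there.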
On Wed, Feb 7, 2018 at 10:18 AM, asafcombo ***@***.***> wrote:
I intend to test Skyline to monitor anomalous behavior of CPU usage across
several instances hosted in the company's DC.
I fully understand how to configure Skyline to find anomalous behavior of a
server's CPU against its own timeseries.
What I want to achieve is to find anomalous behavior of one instance vs
the rest of the servers as a function of time. This would help us flag
unwanted behavior of a distributed system that allocates more work (or
harder computation) to that instance.
For example: the mean CPU for all servers is 30%, and one instance is now at
40% for 1 hour.
If this CPU behavior is checked only against the instance's own
timeseries, then after 1 hour of that usage it will be flagged as normal.
In my case, because the 40% is compared against 30%, it would definitely
be flagged as anomalous for the entire cycle.
--
Abe Stanway
abe.is
|
On the other hand, you could also write a custom algorithm and hard-code
the other timeseries that you want to compare the current one to. That
should do the trick.
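As a rough sketch of what such a comparison could look like: the function below checks the latest value of one series against the latest values of a set of peer series, using the median and the median absolute deviation (MAD) of the peers. Everything here is an assumption for illustration: the name `deviates_from_peers`, the `tolerance` and `min_spread` parameters, and the `(timestamp, value)` tuple layout. It is not Skyline's algorithm interface, and it would still need to be wired into wherever your custom algorithms live.

```python
from statistics import median


def deviates_from_peers(target, peers, tolerance=3.0, min_spread=1.0):
    """Return True if target's latest value is far from its peers' latest values.

    target: list of (timestamp, value) tuples for the series under test.
    peers: list of such series for the other servers.
    tolerance: how many spreads away from the peer median counts as anomalous.
    min_spread: floor for the spread, so near-identical peers do not make
        every tiny difference look anomalous.
    """
    peer_values = [series[-1][1] for series in peers if series]
    if not target or len(peer_values) < 2:
        return False  # not enough data to compare against
    latest = target[-1][1]
    center = median(peer_values)
    mad = median(abs(value - center) for value in peer_values)
    spread = max(mad, min_spread)
    return abs(latest - center) / spread > tolerance


# Hypothetical data: peers hover around 30% CPU, one instance sits at 40%.
peers = [[(0, 29.0)], [(0, 30.0)], [(0, 31.0)], [(0, 30.0)]]
busy_flagged = deviates_from_peers([(0, 40.0)], peers)
normal_flagged = deviates_from_peers([(0, 31.0)], peers)
```

Because the comparison is made against the peers at the same point in time, the instance keeps being flagged for as long as it stays away from the group, rather than becoming "normal" once its own history catches up.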
--
Abe Stanway
abe.is
|
Hard-coding won't suffice, as the server IDs could be changed by the resource manager. I think that what I could do is change If I take several |
@asafcombo, @astanway is correct: it could be done with some customisation. Rather than hard-coding the server metric names, you could match on the namespace, and then server ID changes are not an issue, as long as you have a common namespace for servers, e.g.
In terms of the cost of checking the metrics, if done properly the penalty incurred should not be too steep, especially if it is only 40 metrics you are talking about compositing. That said, even though all the servers may be the same, you may find that their metrics are normally somewhat different at times.
As @astanway said, this is an interesting use case. I have been thinking about it myself for some time in terms of metric clustering, but I feel it would work better if Skyline learnt related metrics, clustered the namespaces and did it all (or most) by itself :) Although I can tell you that it is probably not as easy as it sounds.
Skyline Mirage and Ionosphere may be able to help you out now in the interim, until you or I or someone else does that. I maintain an unforked version at https://github.com/earthgecko/skyline, the additional functionality is outlined here - #123 (comment)
I shall definitely be looking at adding something similar to what you have outlined here in the not too .... future. I am currently adding an autocorrelations module, and with Skyline learning using autocorrelations and user-defined correlations, adding another module to analyse the metric in terms of clustered or composite metric medians, etc. is one of the next logical steps, as you nicely outlined here. I am not certain how well it will work, but it will definitely work in some way :)
Good luck with your endeavours with Skyline. |
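To illustrate the namespace-matching idea above, here is a small sketch that selects metrics by a glob pattern instead of by hard-coded names and then takes the median of their latest values. The metric names, the pattern `servers.*.cpu.user`, and both function names are hypothetical, chosen only for this example; your real namespace will differ.

```python
from fnmatch import fnmatchcase
from statistics import median


def select_namespace(metrics, pattern):
    """Return only the metrics whose name matches the glob pattern."""
    return {
        name: points
        for name, points in metrics.items()
        if fnmatchcase(name, pattern)
    }


def latest_median(selected):
    """Median of the most recent value of each selected, non-empty series."""
    return median(points[-1][1] for points in selected.values() if points)


# Hypothetical metric store: name -> list of (timestamp, value) tuples.
metrics = {
    "servers.web-01.cpu.user": [(0, 30.0)],
    "servers.web-02.cpu.user": [(0, 31.0)],
    "servers.web-01.mem.used": [(0, 55.0)],
    "servers.web-07.cpu.user": [(0, 40.0)],
}
cpu_metrics = select_namespace(metrics, "servers.*.cpu.user")
cpu_median = latest_median(cpu_metrics)
```

Since the selection is driven by the pattern, a server that appears under a new ID is picked up automatically as long as it reports under the same namespace.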
@earthgecko thanks. |
I intend to test Skyline to monitor anomalous behavior of CPU usage across several instances hosted in the company's DC.
I fully understand how to configure Skyline to find anomalous behavior of a server's CPU against its own timeseries.
What I want to achieve is to find anomalous behavior of one instance vs the rest of the servers as a function of time. This would help us flag unwanted behavior of a distributed system that allocates more work (or harder computation) to that instance.
For example: the mean CPU for all servers is 30%, and one instance is now at 40% for 1 hour.
If this CPU behavior is checked only against the instance's own timeseries, then after 1 hour of that usage it will be flagged as normal.
In my case, because the 40% is compared against 30%, it would definitely be flagged as anomalous for the entire cycle.