'Interesting' behaviour in EDM-X when data has a small SD #22

richdrich · 2016-05-02T04:05:58Z

I found some unusual behavior as the standard deviation of some test data (on either side of a step change) drops.

When the SD is less than 1, the detected change point becomes inaccurate, and in a very well-defined manner. [EDIT: Note that the '1' is a coincidence: the knee moves as the data range changes, as you might expect.]

See the code below. My data actually changes at point 500. With an SD above 1, EDM-X finds the change to within two intervals; with an SD below 1, it is out by 50 intervals.

I'd be interested in any comments on this...

library(BreakoutDetection)
library(zoo)  # provides zoo() and its rbind/time methods used below

# Try EDM-X on SDs over a (log) range
logSds <- seq(from=-0.2, to=0.2, by=.05)
sds <- 10 ^ logSds
errs <- numeric(length(sds))

for(k in seq_along(sds)) {
  sigma <- sds[k]  # named sigma to avoid masking stats::sd

  set.seed(123)
  # construct datasets: 500 hourly points at mean 100, then 400 at mean 110
  s1 <- zoo(rnorm(500,mean=100,sd=sigma), seq.POSIXt(as.POSIXlt("2016-01-01"), by=3600,length.out=500))
  s2 <- zoo(rnorm(400,mean=110,sd=sigma), seq.POSIXt(as.POSIXlt("2016-01-21 20:00:00"), by=3600,length.out=400))

  st <- rbind(s1, s2)

  zdata <- data.frame(timestamp=time(st), count=as.vector(st))

  br <- breakout(zdata,min.size=100, method='amoc', plot=TRUE)

  # absolute error of the detected location against the true change at point 500
  errs[k] <- abs(br$loc - 500)
}

plot(sds, errs)


richdrich · 2016-05-03T04:23:10Z

Further to this, I think the issue arises when the SD and the number of observations are such that the two medians tend to 1 and 0 for all data ranges (values of tau2 and tau1). That leads to the medians being ignored and the algorithm converging at a point determined by tau2 and tau1 (and hence the sizes of the two datasets), which isn't the actual breakout. Or something like that.

I'm thinking this won't be too much of a problem with real data (I found it with a naive test case), but I would be interested in any comments.
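
The hypothesis above can be checked roughly without calling the package. The sketch below is not the BreakoutDetection implementation: it assumes the series is min-max scaled to [0, 1] before the statistic is computed, and it simply reports the median pairwise distances within and between the two segments at the true split. The function name median_distances and its parameters are made up for this illustration.

```r
# Illustrative sketch only: NOT the EDM-X code from BreakoutDetection.
# Assumption: the series is min-max scaled to [0, 1] before medians are taken.
median_distances <- function(sigma, n1 = 500, n2 = 400, seed = 123) {
  set.seed(seed)
  # same shape as the test data: a step from mean 100 to mean 110
  x <- c(rnorm(n1, mean = 100, sd = sigma),
         rnorm(n2, mean = 110, sd = sigma))

  # min-max scale to [0, 1]
  z <- (x - min(x)) / (max(x) - min(x))
  left  <- z[seq_len(n1)]
  right <- z[n1 + seq_len(n2)]

  c(
    # median |a - b| over all pairs inside each segment
    within_left  = median(as.vector(dist(left))),
    within_right = median(as.vector(dist(right))),
    # median |a - b| over all pairs that straddle the split
    between      = median(abs(outer(left, right, "-")))
  )
}

# Compare a small SD with the largest SD used in the test above
print(median_distances(0.1))
print(median_distances(10 ^ 0.2))
```

If the within-segment medians sit near 0 and the between-segment median near 1 as the SD shrinks, that would be consistent with the medians carrying very little information about where the split actually is.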
