
Anomaly 2 demo
Figure 1: Example frame from the Street Scene dataset and an example anomaly detection (red tinted pixels) found by our algorithm (a jaywalker). The blue square represents the ground truth labeled anomaly.

We describe two variations of a novel algorithm for video anomaly detection, which we evaluate along with two previously published algorithms on the Street Scene dataset (described later). The new algorithm is very straightforward and is based on dividing the video into spatio-temporal regions which we call video patches, storing a set of exemplars to represent the variety of video patches occurring in each region, and then using the distance from a testing video patch to the nearest neighbor exemplar as the anomaly score.

First, each video is divided into a grid of spatio-temporal regions of size H x W x T pixels with spatial step size s and temporal step size 1 frame. In the experiments we choose H = 40 pixels, W = 40 pixels, T = 4 or 7 frames, and s = 20 pixels. See Figure 2 for an illustration.

Figure 2: Illustration of a grid of regions partitioning a video frame and a video patch encompassing 4 frames. This figure shows nonoverlapping regions, but in our experiments we use overlapping regions.

The baseline algorithm has two phases: a training or model-building phase and a testing or anomaly detection phase. In the model-building phase, the training (normal) videos are used to find, for each spatial region, a set of video patches (represented by feature vectors described later) that represent the variety of activity in that spatial region. We call these representative video patches exemplars. In the anomaly detection phase, the testing video is split into the same regions used during training and, for each testing video patch, the nearest exemplar from its spatial region is found. The distance to the nearest exemplar serves as the anomaly score.

The only differences between the two variations are the feature vector used to represent each video patch and the distance function used to compare two feature vectors. The foreground (FG) mask variation uses blurred FG masks for each frame in a video patch as a feature vector. The FG masks are computed using a background (BG) model that is updated as the video is processed (see Figure 3). The BG model used in the experiments is a very simple mean color value per pixel.

Figure 3: Example blurred FG masks which are concatenated and vectorized into a feature vector. a) and c) show two video patches consisting of 7 frames cropped around a spatial region. b) and d) show the corresponding blurred FG masks.

The flow-based variation uses optical flow fields computed between consecutive frames in place of FG masks. The flow fields within the region of each video patch frame are concatenated and then vectorized to yield a feature vector twice the length of the feature vector from the FG mask baseline (due to the dx and dy components of the flow field). In our experiments we use the optical flow algorithm of Kroeger et al.

In the model-building phase, a distinct set of exemplars is selected to represent normal activity in each spatial region. Our exemplar selection method is straightforward. For a particular spatial region, the exemplar set is initialized to the empty set. We slide a spatio-temporal window (with step size equal to one frame) along the temporal dimension of each training video to give a series of video patches, which we represent by either a FG-mask-based feature vector or a flow-based feature vector, depending on the algorithm variation as described above. For each video patch, we compare it to the current set of exemplars for that spatial region. If the distance to the nearest exemplar is less than a threshold, we discard that video patch; otherwise we add it to the set of exemplars.

The distance function used to compare two exemplars depends on the feature vector. For blurred FG mask feature vectors, we use L2 distance. For flow-field feature vectors, we use normalized L1 distance.

Given a model of normal video which consists of a different set of exemplars for each spatial region of the video, anomaly detection is simply a series of nearest neighbor lookups. For each spatial region in a sequence of T frames of a testing video, compute the feature vector representing the video patch and then find the nearest neighbor in that region's exemplar set. The distance to the closest exemplar is the anomaly score for that video patch. This yields an anomaly score per overlapping video patch. These are used to create a per-pixel anomaly score matrix for each frame.

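The grid of overlapping spatio-temporal regions described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `extract_video_patches` and the representation of a video as a single grayscale numpy array are our own assumptions.

```python
import numpy as np

def extract_video_patches(video, H=40, W=40, T=4, s=20):
    """Slide an H x W x T window over a video (array of shape
    (num_frames, height, width)) with spatial step s and temporal step
    1 frame. Returns a dict mapping each spatial region's top-left
    corner (y, x) to a list of vectorized video patches."""
    F, Ht, Wd = video.shape
    patches = {}
    for y in range(0, Ht - H + 1, s):
        for x in range(0, Wd - W + 1, s):
            region = []
            for t in range(F - T + 1):  # temporal step size of 1 frame
                patch = video[t:t + T, y:y + H, x:x + W]
                region.append(patch.reshape(-1).astype(np.float64))
            patches[(y, x)] = region
    return patches

# Tiny example: 6 frames of an 80 x 80 video with the paper's settings
video = np.zeros((6, 80, 80))
patches = extract_video_patches(video, H=40, W=40, T=4, s=20)
```

With these sizes there are 3 x 3 overlapping spatial regions (top-left corners at 0, 20, 40 in each dimension) and 3 temporal positions per region, each giving a 4*40*40 = 6400-dimensional vector.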


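The FG-mask feature described above can be sketched as follows. The text only says the BG model is a per-pixel mean updated as the video is processed and that the masks are blurred, so the exponential running-mean update rate, the difference threshold, and the box-blur kernel here are all our assumptions.

```python
import numpy as np

def box_blur(img, k=3):
    """Simple k x k box blur (a stand-in for the unspecified blur
    applied to the FG masks)."""
    pad = k // 2
    padded = np.pad(img, pad, mode='edge')
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def fg_mask_features(frames, alpha=0.05, thresh=20.0):
    """Per-pixel running-mean BG model; the FG mask of each frame is
    |frame - mean| > thresh, blurred. alpha and thresh are assumed
    values, not taken from the text."""
    bg = frames[0].astype(np.float64)
    masks = []
    for f in frames:
        f = f.astype(np.float64)
        mask = (np.abs(f - bg) > thresh).astype(np.float64)
        masks.append(box_blur(mask))
        bg = (1 - alpha) * bg + alpha * f  # update BG model as video plays
    return masks

# First frame matches the initial BG model, so its mask is empty;
# a sudden全-bright second frame is entirely foreground.
frames = [np.zeros((8, 8)), np.full((8, 8), 255.0)]
masks = fg_mask_features(frames)
```

The blurred masks of the T frames in a video patch would then be concatenated and vectorized into the feature vector, as in Figure 3.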


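The exemplar selection procedure described above (start from the empty set; keep a patch only if it is at least a threshold away from every current exemplar) can be sketched directly; the threshold value here is illustrative.

```python
import numpy as np

def l2(a, b):
    return float(np.sqrt(np.sum((a - b) ** 2)))

def select_exemplars(feature_vectors, dist=l2, thresh=1.0):
    """Greedy exemplar selection for one spatial region: a video patch
    is discarded if its distance to the nearest current exemplar is
    below `thresh`; otherwise it is added to the exemplar set."""
    exemplars = []
    for v in feature_vectors:
        if not exemplars or min(dist(v, e) for e in exemplars) >= thresh:
            exemplars.append(v)
    return exemplars

# The second vector is within the threshold of the first, so only
# two exemplars survive.
vectors = [np.array([0.0, 0.0]), np.array([0.1, 0.0]), np.array([5.0, 0.0])]
exemplars = select_exemplars(vectors, thresh=1.0)
```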


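The two distance functions named above might look like this. L2 on FG-mask vectors is unambiguous; the exact normalization of the L1 distance for flow vectors is not spelled out in the text, so dividing by the vector length is one plausible reading, flagged as an assumption.

```python
import numpy as np

def l2_distance(a, b):
    """L2 (Euclidean) distance, used for blurred FG mask feature vectors."""
    return float(np.sqrt(np.sum((a - b) ** 2)))

def normalized_l1_distance(a, b):
    """L1 distance normalized by dimensionality, used here for
    flow-field feature vectors (normalization scheme is assumed)."""
    return float(np.sum(np.abs(a - b)) / a.size)

a = np.zeros(4)
b = np.ones(4)
d2 = l2_distance(a, b)            # sqrt(4) = 2.0
d1 = normalized_l1_distance(a, b) # 4 / 4 = 1.0
```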


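Finally, the testing phase reduces to nearest-neighbor lookups plus an aggregation of overlapping patch scores into a per-pixel score matrix. The text does not say how overlapping patch scores are combined per pixel; taking the maximum over covering patches is an assumption made for this sketch.

```python
import numpy as np

def patch_score(feature, exemplars, dist):
    """Anomaly score of one testing video patch: distance to the
    nearest exemplar of its spatial region."""
    return min(dist(feature, e) for e in exemplars)

def per_pixel_scores(frame_shape, patch_scores, H=40, W=40):
    """Combine per-patch anomaly scores into a per-pixel score matrix
    for one frame. patch_scores maps a region's top-left (y, x) corner
    to its score; each pixel takes the max over covering patches
    (the combination rule is assumed, not stated in the text)."""
    scores = np.zeros(frame_shape)
    for (y, x), s in patch_scores.items():
        region = scores[y:y + H, x:x + W]
        np.maximum(region, s, out=region)
    return scores

# Two vertically overlapping 40 x 40 patches on an 80 x 80 frame
scores = per_pixel_scores((80, 80), {(0, 0): 1.0, (20, 0): 2.0})
```

Pixels covered only by the first patch score 1.0, pixels in the overlap take the larger score 2.0, and uncovered pixels stay at 0.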