| |
Similarity between complex objects is a central notion in
data mining. Computing similarity between complex objects
has security-related applications, e.g., determining whether
two potential terrorists are in fact the same person by
analyzing their traces. However, traditional similarity measures
are often inadequate for these applications, especially
for categorical data where there is no natural numeric notion
of distance. For example, in criminal databases, two suspects
may have the same behavior, but how do we discover this
similarity automatically? Similarity problems between other
types of complex objects such as time-series data are equally
interesting. For example, we want to quickly and automatically
infer rules such as "if there is increased cell phone
activity of potential suspects in city X, then it is likely
that there will be a major crime threat of a particular
kind". In our research, we use the notion of context to
determine similarity between complex objects. We have developed
similarity models that are more sophisticated than traditional
Euclidean distance models. We are working to apply these
techniques extensively in the security domain.
|