Suppose we want to cluster a data stream of unknown number of clusters, and estimate them using particle filters. With particle filters, we need to know P(Xt | Xt−1) and P(Zt | Xt) (where z refer to the data (reports), and x refer to the estimated states (hypothesis)).

Then my question is: how can we define P(Xt | Xt−1) (transition probability) and P(Zt | Xt) (observation/likelihood probability) in this context ?

asked 21 May '12, 11:36

ShN's gravatar image


edited 15 Jun '12, 05:23

EllenL, it's strange, I don't see how to create a new question on the forum you provided (reddit.com) ! Even stranger is that I don't see that comment anywhere here, although an email arrived saying you posted it. And on reddit.com/r/aiclass your question is #1 but it's just a link back to here. Most mysterious.
(30 May '12, 15:52) EllenL EllenL's gravatar image
IIRC reddit is different from Aiqus in that the headline is an external link and the reddit comment thread for the link is accessed by clicking on comment just below the headline.
(30 May '12, 23:47) rseiter ♦ rseiter's gravatar image
@EllenL I deleted the comment after I understood that reddit was just a list of links to other forums that you can post, it's not a forum.
(31 May '12, 12:06) ShN ShN's gravatar image

Access to the paper does not require IEEE membership or a subscription -- see comment below.

Here's the abstract (reformatted for readability):

In this paper we develop a particle filtering approach for grouping observations into an unspecified number of clusters.

Each cluster corresponds to a potential target from which the observations originate. A potential clustering with a specified number of clusters is represented by an association hypothesis. Whenever a new report arrives, a posterior distribution over all hypotheses is iteratively calculated from a prior distribution, an update model and a likelihood function.

  • The update model is based on an association probability for clusters given the probability of false detection and a derived probability of an unobserved target.
  • The likelihood of each hypothesis is derived from a cost value of associating the current report with its corresponding cluster according to the hypothesis.

A set of hypotheses is maintained by Monte Carlo sampling. In this case, the state-space, i.e., the space of all hypotheses, is discrete with a linearly growing dimensionality over time. To lower the complexity further, hypotheses are combined if their clusters are close to each other in the observation space.

Finally, for each time-step, the posterior distribution is projected into a distribution over the number of clusters.

Compared to earlier information theoretic approaches for finding the number of clusters this approach does not require a large number of trial clusterings, since it maintains an estimate of the number of clusters along with the cluster configuration.


answered 21 May '12, 20:49

EllenL's gravatar image


edited 28 May '12, 18:10

I think you can also download it without subscription from: http://citeseerx.ist.psu.edu/viewdoc/summary?doi= (see link "cached" at the right) or directly from http://www.foi.se/fusion/fusion44.pdf I don't understand what is the probability of false detection and the probability of probability of an unobserved target, on which the update model ( i.e. P(Xt | Xt−1) ) is defined. It seems that the likelihood ( i.e. P(Zt | Xt) ) is defined as a Gaussian based on the std deviation and the mean distance of data-points to their cluster representative. But I'm not sure, and it's not clear.
(22 May '12, 04:56) ShN ShN's gravatar image
Got it, thanks. Reading it will take a bit longer, though....
(22 May '12, 16:19) EllenL EllenL's gravatar image
Ok, I am waiting for your answer then, thanks.
(28 May '12, 16:55) ShN ShN's gravatar image
I hope you're not expecting an expert opinion, because I'm just another student here. But I've printed out the article and I'm working my way through it bit by bit, looking for answers to your questions. Perhaps rephrasing the question or giving more details would attract more attention. (That's what I tried to do when I posted the abstract, but no luck so far.) Back when the 2011 AI class was covering particle filters, this question would have found a better audience. Have you tried posting it in another forum, such as http://www.reddit.com/r/aiclass/?
(28 May '12, 18:08) EllenL EllenL's gravatar image
Speaking of particle filters and the AI class, one of the Berkeley Pac-Man PAs covered particle filters (the Stanford version left this out) if you are interested in playing with it: http://www-inst.eecs.berkeley.edu/~cs188/pacman/projects/tracking/busters.html
(29 May '12, 00:10) rseiter ♦ rseiter's gravatar image
Your answer
toggle preview

Follow this Question via Email

Once you sign in you will be able to subscribe for any updates here

Q&A Editor Basics

  • to upload an image into your question or answer hit
  • to create bulleted or numbered lists hit or
  • to add a title or header hit
  • to section your text hit
  • to make a link clickable, surround it with <a> and </a> (for example, <a>www.google.com</a>)
  • basic HTML tags are also supported (for those who know a bit of HTML)
  • To insert an EQUATION you can use LaTeX. (backslash \ has to be escaped, so in your LaTeX code you have to replace \ with \\). You can see more examples and info here