Basic Structure of Markovian Population Models
When considering Markov systems, we are often interested in the behaviour of the population of individuals traversing through the graph rather than that of an individual. For example, we might be interested in how crowds move through a shopping mall when entering from different entrances. In such a situation, we would be interested in the crowd density that can be attributed to a particular entrance at some store and time.
In Dynamic Probabilistic Systems: Markov Models (Volume 1), this is described by Markovian Population Models in Chapter 8.
Here we consider a graph, $G = \{V, E\}$, where $V$ and $E$ are sets of nodes and edges, respectively. There are $N$ nodes. The transition probability between nodes $i$ and $j$ over a duration $n$ is given by $\phi_{ij}(n)$.
Now suppose, individuals are injected into $G$ at discrete time intervals and traverse $G$ independently of each other. The number of individuals injected at node $i$ at time $m$ is given by $f_i(m)$.
Let the probability that there are $k$ individuals at node $j$ at time $n$ if $f_i(m)$ individuals are injected from node $i$ at time $m$ be $h_{ij}(k ~|~ m, n)$. Since the individuals traverse the graph independently, one can think of $h_{ij}(k ~|~ m, n)$ as an $n$ trial Binomial Distribution. Each individual either makes it to node $j$ after duration $(n - m)$ or it does not. The probability that an individual makes it to node $j$ at time $n$, is $\phi_{ij}(n - m)$.
$$ h_{ij}(k ~|~ m, n) = \left(\begin{matrix}f_i(m) \ k\end{matrix}\right)\left[\phi_{ij}(n-m)\right]^k\left[1 - \phi_{ij}(n-m)\right]^{(f_i(m) - k)} $$
The above expression considers the case where individuals are injected at time $m$. If we simply want to ask what is the probability that there will be $k$ individuals at time $n$, regardless of time of injection, we need to consider all injection times, $m$. On top of that, we need to consider that the arrivals at node $j$ at time $n$ as to sum to $k$. It is not that at each injection time, $m$, there should be $k$ individuals arriving at node $j$ at time $n$.
In order to do so, we need to take a slight detour and consider the convolution of probability distributions. Suppose we have 3 random variables, $X$, $Y$ and $Z = X + Y$, then
$$ \textrm{Pr}(Z=z) = \sum^\infty_{k=-\infty}\textrm{Pr}(X=k) \textrm{Pr}(Y=z-k) $$
Notice first that the above expression is a convolution of the two constituent distributions. Also, notice that $X=k$ and $Y=z-k$. This is because to find the probability that $Z=z$, one has to consider all potential values of $X$ and the corresponding values of $Y$.
Now let's get back on track. For the problem at hand, we might consider that $k_0$ individuals reach node $j$ from $f_i(0)$ for instance. This would mean that in order for there to be $k$ individuals at node $j$ at time $n$, the number of individuals that arrive at subsequent injection times is $k-k_0$. We can repeat this process for subsequent injection time points. We introduce here the notation that $h_{ij}(k~|~m^+, n)$ to be the probability that $k$ individuals arrive at node $j$ when injected from node $i$ for all injection time points greater or equal to $m$ (i.e. $m^+$).
$$ \begin{aligned} h_{ij}\left(\cdot~|~ n\right) =& \sum_{k_0=0}^k h_{ij}\left(k_0~|~0, n\right)h_{ij}\left(k-k_0~|~1^+, n\right) \\ =& \sum_{k_0=0}^k h_{ij}\left(k_0~|~0, n\right)\left[\sum_{k_1=0}^{k-k_0} h_{ij}\left(k_1~|~0, n\right)h_{ij}\left(k-k_0-k_1~|~2^+, n\right)\right] \\ =& \cdots \\ =& h_{ij}\left(~\cdot~|~0, n\right) * h_{ij}\left(~\cdot~|~1, n\right) * h_{ij}\left(~\cdot~|~2, n\right) * ~\cdots~ h_{ij}\left(~\cdot~|~n, n\right) \\ =& \bigotimes_{m=0}^n h_{ij}\left(\cdot~|~m, n\right), \end{aligned} $$
where $\bigotimes$ is a manifold or multiple convolution and $h_{ij}(\cdot~|~n)$ is a function to be evaluated at some particular value $k$.
Thus the probability that there will be $k$ individuals, that are injected from node $i$, at node $j$ at time $n$ is given by,
$$ h_{ij}(k~|~n) = \left[\bigotimes_{m=0}^n h_{ij}(\cdot~|~m, n)\right]\left(k\right), $$
where $0\leq k \leq \sum_{m=0}^n f_i(m)$.
Similarly if there are multiple injection nodes then,
$$ h_j(k~|~n) = \left[\bigotimes_{i=1}^N h_{ij}(\cdot ~|~ n) \right] (k), $$
which gives the probability that there are $k$ individuals at node $j$ at time $n$.
One can also calculate the actual population at each node, $r_j(n)$, which is the number of individuals at node $j$ at time $n$. Recall that the population follows a Binomial distribution and the mean and variance are given by $np$ and $npq$, respectively. The average population at node $j$ is given by,
$$ \bar r_j\left(n\right) = \sum^N_{i=1}\sum^n_{m=0} f_i\left(m\right) \phi_{ij}\left(n-m\right). $$
The variance is given by,
$$ \textrm{Var}\left(r_j\left(n\right)\right) = \sum^N_{i=1}\sum^n_{m=0} f_i\left(m\right) \phi_{ij}\left(n-m\right) \left(1 - \phi_{ij}\left(n-m\right)\right) $$
The most interesting thing to me here is the calculation of the probability distribution of the population. If I hadn't worked it out by hand, I wouldn't have understood it.