Counts are sampled from the Poisson distribution, and so the
best way to assess the quality of model fits is to use the
product of individual Poisson probabilities computed in each
bin i, or the
likelihood L:
L = (product)_i [ M(i)^D(i)/D(i)! ] * exp[-M(i)]
where M(i) = S(i) + B(i) is
the sum of source and background model amplitudes, and
D(i) is the number of observed counts,
in bin i.
The cash statistic (Cash 1979, ApJ 228, 939) is derived by (1)
taking the logarithm of the likelihood function, (2) changing
its sign, (3) dropping the factorial term (which remains
constant during fits to the same dataset), and (4) multiplying
by two:
C = 2 * (sum)_i [ M(i) - D(i) log M(i) ]
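
As an illustration only, the formula above can be evaluated directly with NumPy. This is a minimal sketch, not Sherpa's implementation: the function name cash_statistic and the example amplitudes and counts are hypothetical, and the model amplitudes are assumed to be strictly positive so that the logarithm is defined.

```python
import numpy as np

def cash_statistic(model, data):
    """Return C = 2 * sum_i [ M(i) - D(i) log M(i) ].

    Hypothetical helper for illustration; not part of Sherpa.
    """
    model = np.asarray(model, dtype=float)
    data = np.asarray(data, dtype=float)
    if np.any(model <= 0):
        # log M(i) is undefined for non-positive amplitudes.
        raise ValueError("model amplitudes must be positive")
    return 2.0 * float(np.sum(model - data * np.log(model)))

# Made-up example values: source and background model amplitudes
# and observed (unsubtracted) counts in three bins.
source = np.array([1.5, 4.0, 2.5])      # S(i)
background = np.array([0.5, 1.0, 0.5])  # B(i)
counts = np.array([3, 4, 2])            # D(i)

# M(i) = S(i) + B(i): the background is modeled, not subtracted.
c_value = cash_statistic(source + background, counts)
```

Because the factorial term has been dropped, the value of C can be negative, as it is for these example numbers.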
The factor of two exists so that the change in cash statistic
from one model fit to the next, (Delta)C, is distributed
approximately as (Delta)chi-square when the number of counts in
each bin is high (> 5). One can then in principle use (Delta)C
instead of (Delta)chi-square in certain model comparison tests.
However, unlike chi-square, the cash statistic may be used
regardless of the number of counts in each bin.
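
One way such a comparison might look is sketched below, under stated assumptions. It compares a constant model with a fixed amplitude against the same model with the amplitude left free, for which the best-fit amplitude is simply the mean of the counts. (Delta)C is then referred to a chi-square distribution with one degree of freedom, whose tail probability is erfc(sqrt(x/2)). The function names and the counts are hypothetical, and the approximation is only expected to hold when the counts per bin are high.

```python
import math
import numpy as np

def cash_statistic(model, data):
    """C = 2 * sum_i [ M(i) - D(i) log M(i) ] (hypothetical helper)."""
    model = np.asarray(model, dtype=float)
    data = np.asarray(data, dtype=float)
    return 2.0 * float(np.sum(model - data * np.log(model)))

def delta_cash_constant(counts, fixed_amplitude):
    """Compare a fixed-amplitude constant model with a free one.

    Returns (Delta)C and the approximate chi-square tail
    probability for one degree of freedom.
    """
    counts = np.asarray(counts, dtype=float)
    # For a constant model, C is minimized at the mean of the counts.
    free_amplitude = counts.mean()
    c_fixed = cash_statistic(np.full(counts.size, fixed_amplitude), counts)
    c_free = cash_statistic(np.full(counts.size, free_amplitude), counts)
    # Guard against tiny negative values caused by rounding.
    delta_c = max(c_fixed - c_free, 0.0)
    # Tail probability of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(delta_c / 2.0))
    return delta_c, p_value

# Made-up counts for illustration.
delta_c, p_value = delta_cash_constant([3, 4, 2, 6, 5, 3, 4, 7], 4.0)
```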
The magnitude of the cash statistic depends upon the number of
bins included in the fit and the values of the data
themselves. Hence one cannot analytically assign a
goodness-of-fit measure to a given value of the cash statistic.
Such a measure can, in principle, be computed by performing
Monte Carlo simulations. One would repeatedly sample new
datasets from the best-fit model, fit them, and note where the
observed cash statistic lies within the derived distribution
of cash statistics. (The ability to perform Monte Carlo
simulations is a feature that will be included in a future
version of Sherpa.)
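
The simulation procedure described above can be sketched outside Sherpa as follows. This is an illustrative sketch under simplifying assumptions, not a Sherpa feature: the model is a constant, so each "fit" reduces to taking the mean of the counts, and all function names, the number of simulations, and the seed are hypothetical choices.

```python
import numpy as np

def cash_statistic(model, data):
    """C = 2 * sum_i [ M(i) - D(i) log M(i) ] (hypothetical helper)."""
    return 2.0 * float(np.sum(model - data * np.log(model)))

def fit_constant(data):
    """Best-fit amplitude of a constant model: the mean of the counts.

    Clipped to a tiny positive value so log M(i) stays defined
    when a simulated dataset contains only zeros.
    """
    return max(float(np.mean(data)), 1e-12)

def simulated_cash_fraction(data, n_sim=1000, seed=0):
    """Locate the observed C within a simulated distribution.

    Returns the observed statistic and the fraction of simulated
    datasets whose fitted statistic is at least as large.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n_bins = data.size
    best = fit_constant(data)
    c_obs = cash_statistic(np.full(n_bins, best), data)
    c_sim = np.empty(n_sim)
    for k in range(n_sim):
        # Sample a new dataset from the best-fit model, then refit it.
        fake = rng.poisson(best, size=n_bins).astype(float)
        refit = fit_constant(fake)
        c_sim[k] = cash_statistic(np.full(n_bins, refit), fake)
    return c_obs, float(np.mean(c_sim >= c_obs))

# Made-up counts for illustration.
c_obs, fraction = simulated_cash_fraction([3, 4, 2, 6, 5, 3, 4, 7])
```

A fraction close to 0 or 1 would place the observed statistic in a tail of the simulated distribution. For a realistic model, fit_constant would be replaced by a full minimization of the cash statistic.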
The background should not be subtracted from the data when
this statistic is used. It should be modeled simultaneously
with the source.