Synopsis
A maximum likelihood function.
Description
Counts are sampled from the Poisson distribution, and so the best way to assess the quality of model fits is to use the product of individual Poisson probabilities computed in each bin i, or the likelihood L:
L = (product)_i [ M(i)^D(i)/D(i)! ] * exp[-M(i)]
where M(i) = S(i) + B(i) is the sum of source and background model amplitudes, and D(i) is the number of observed counts, in bin i.
The cash statistic (Cash 1979, ApJ 228, 939) is derived by (1) taking the logarithm of the likelihood function, (2) changing its sign, (3) dropping the factorial term (which remains constant during fits to the same dataset), and (4) multiplying by two:
C = 2 * (sum)_i [ M(i) - D(i) log M(i) ]
The factor of two exists so that the change in cash statistic from one model fit to the next, (Delta)C, is distributed approximately as (Delta)chi-square when the number of counts in each bin is high (> 5). One can then in principle use (Delta)C instead of (Delta)chi-square in certain model comparison tests. However, unlike chi-square, the cash statistic may be used regardless of the number of counts in each bin.
The magnitude of the cash statistic depends upon the number of bins included in the fit and the values of the data themselves. Hence one cannot analytically assign a goodness-of-fit measure to a given value of the cash statistic. Such a measure can, in principle, be computed by performing Monte Carlo simulations. One would repeatedly sample new datasets from the best-fit model, fit them, and note where the observed cash statistic lies within the derived distribution of cash statistics. (The ability to perform Monte Carlo simulations is a feature that will be included in a future version of Sherpa.)
Background Subtraction
The background should not be subtracted from the data when this statistic is used. It should be modeled simultaneously with the source.
Zero and negative value numbers
The Cash statistic function evaluates the logarithm of each data point. If the number of counts is zero or negative, it's not possible to take the log of that number. The behavior in this case is controlled by the truncate and trunc_value settings in the .sherpa.rc file; see "ahelp sherparc" for details on this file.
If truncate is set to True (the default), then log(<trunc_value>) is substituted into the equation, and the statistics calculation proceeds. The default trunc_value is 1.0e-25.
If truncate is set to False, Cash returns an error and stops the calculation when the number of counts in a bin is zero or negative. The trunc_value setting is not used.
Example
sherpa> set_stat("cash")
sherpa> show_stat()
Statistic: CashSet the fitting statistic and then confirm the new value.
Bugs
See the bugs pages on the Sherpa website for an up-to-date listing of known bugs.
See Also
- info
- list_stats
- statistics
- chi2constvar, chi2datavar, chi2gehrels, chi2modvar, chi2xspecvar, chisquare, cstat, get_prior, leastsq, list_priors, set_prior, set_sampler, set_stat, wstat