Simultaneously Fitting Two Datasets

Overview

Last Update: 1 Dec 2006 - reviewed for CIAO 3.4: no changes

Synopsis:

The Sherpa syntax easily allows the modelling of multiple independent datasets. In this example, we have a source observed on two different occasions. If we assume that the source is constant, then the only change will be in the instrument response. We account for this by defining each spectrum independently and fitting with the same model. This allows us to increase the significance of our fit, without making assumptions inherent in the process of averaging spectra and responses.

This thread may also be used to fit source and background data which are independently grouped. Note that the "standard method" of doing this is described in the Fitting Spectral Data: With/Without Independent Background Responses thread.

While the sample datafiles used in this thread are available as sherpa.tar.gz, note also that they are generated with the Extract ACIS Spectra for Pointlike Sources and Make RMFs and ARFs thread.

Proceed to the HTML or hardcopy (PDF: A4 | letter) version of the thread.

Getting Started
Statistical Issues: Background Subtraction
Simultaneously Fit Two Independent Datasets
Summary
History
Images
- Two spectra, fit with same model

Getting Started

Please follow the "Sherpa Threads: Getting Started" thread.

Statistical Issues: Background Subtraction

A typical dataset may contain multiple spectra, one of which contains contributions from the source of interest and the background, and one or more others which contain background counts alone. (The background itself may contain contributions from the cosmic X-ray background, the particle background, etc., but we'll ignore this complication.) The proper way to treat background data is to model them!

However, many X-ray astronomers subtract background data from the raw data.

Why should one not subtract background?

It reduces the amount of statistical information in the analysis - the final fit parameter values will be a less accurate estimate of the true values.
The background subtracted data are not Poisson-distributed - one cannot fit them with the Poisson likelihood (or the Cash statistic), even in the low-count limit. For example, subtracting a background can give negative counts; this is definitely not Poissonian!
Fluctuations (particularly in the vicinity of localized features) can adversely affect analysis.

Simultaneously Fit Two Independent Datasets

Simultaneous fitting is identical to the basic method of fitting data, except for the specification of the "dataset number". Most Sherpa commands have the option of specifying the number of the dataset, on which to invoke the command. By default, this number is assumed to be 1; colon-delimited ranges (e.g. 1:3) and comma-delimited sets (e.g. 1,3) are also possible.

A. Reading FITS Data

Here we wish to fit spectral data from the FITS datafiles pi2278.fits and pi2286.fits These datasets are input into Sherpa with the DATA command:

sherpa> DATA 1 pi2278.fits
The inferred file type is PHA.  If this is not what you want, please 
specify the type explicitly in the data command.
WARNING: statistical errors specified in the PHA file.
         These are currently IGNORED.  To use them, type:
         READ ERRORS "<filename>[cols CHANNEL,STAT_ERR]" fitsbin

Note that the DATA command is a synonym for READ DATA. Sherpa will automatically read in any background spectra or response matrices, if they are defined in the header of the PI file (keywords "BACKFILE", "RESPFILE", and "ANCRFILE", respectively). In this case, we define the response files separately.

The second dataset is loaded similarly, specifying the data be stored as dataset number 2:

sherpa> DATA 2 pi2286.fits
The inferred file type is PHA.  If this is not what you want, please 
specify the type explicitly in the data command.
WARNING: statistical errors specified in the PHA file.
         These are currently IGNORED.  To use them, type:
         READ ERRORS "<filename>[cols CHANNEL,STAT_ERR]" fitsbin

B. Defining Instrument Responses

Next we define the instrument response files. In this example, the response files were not defined automatically, so we set them up like so:

sherpa> INSTRUMENT SOURCE 1 = RSP[r1]("rmf2278.fits", "arf2278.fits")
The inferred file type is ARF.  If this is not what you want, please 
specify the type explicitly in the data command.

sherpa> INSTRUMENT SOURCE 2 = RSP[r2]("rmf2286.fits", "arf2286.fits")
The inferred file type is ARF.  If this is not what you want, please 
specify the type explicitly in the data command.

sherpa> SHOW INSTRUMENT 1
Instrument 1: rsp1d[r1]
    Param   Type  Value
    -----   ----  -----
 1    rmf string: "rmf2278.fits" (N_E=410,N_PHA=685)
 2    arf string: "arf2278.fits" (N_E=410)

sherpa> SHOW INSTRUMENT 2
Instrument 2: rsp1d[r2]
    Param   Type  Value
    -----   ----  -----
 1    rmf string: "rmf2286.fits" (N_E=410,N_PHA=685)
 2    arf string: "arf2286.fits" (N_E=410)

C. Defining And Fitting Source Models

We now need to define model components for our data. We would like to fit each spectrum with an absorbed powerlaw, so we need to define a powerlaw component and an absorbing component. We also ignore data above 5.5 keV for both datasets, to cut out energies with no data (not really necessary, but produces a nicer plot). Ignoring the higher energies may also be justified for ACIS-S data, where the high-energy background is stronger.

sherpa> PARAMPROMPT OFF
Model parameter prompting is off
sherpa> XSWABS[abs]
sherpa> POW[pl]
sherpa> IGNORE 1:2 ENERGY 5.5:

and we define the source models like so:

sherpa> SOURCE 1 = abs * pl
sherpa> SOURCE 2 = abs * pl

The example could be more complicated if we don't assume that the sources and the data extraction methods were identical. For example, if the two datasets were extracted using regions of different area, then we need to use a scaling factor in the source model definition. Use the SHOW DATA command to find the "Background Scale" value; the ratio of any two "Background Scale" values is equal to the ratio of their areas.

We are now ready to fit:

sherpa> FIT 1:2
 LVMQT: V2.0
 LVMQT: initial statistic value = 22.4273
 LVMQT: final statistic value = 5.73184 at iteration 6
           abs.nH  1.2103  10^22/cm^2  
            pl.gamma  2.24011     
            pl.ampl  4.32178e-05     

sherpa> LPLOT 2 FIT 1 FIT 2
sherpa> PRINT POSTFILE simfit.ps

Executing these commands produces the plot of the fit shown here .

Alternatively, if the second spectrum were similar, but not identical, to the first spectrum, we could use a different model component in one of the source definitions. Parameter linking can be used to fix parameter values to others. For example, if the intensity of the source (but not the powerlaw slope) is expected to change, we could define a new powerlaw component and link the slopes together. The result is one more free parameter, the normalization of the second powerlaw component. We might do this like so:

sherpa> POW[pl2]
sherpa> SOURCE 2 = abs * pl2
sherpa> pl2.gamma => pl.gamma
sherpa> FIT
 LVMQT: V2.0
 LVMQT: initial statistic value = 27.4143
 LVMQT: final statistic value = 5.7289 at iteration 7
           abs.nH  1.20699  10^22/cm^2  
            pl.gamma  2.23635     
            pl.ampl  4.27239e-05     
           pl2.ampl  4.3316e-05

As we can see, only the amplitude of the new component is fit. In this case, the amplitudes of the two power laws are nearly identical.

Summary

This thread is complete, so we can exit the Sherpa session:

sherpa> EXIT
Goodbye.

History

14 Jan 2005	reviewed for CIAO 3.2: no changes
21 Dec 2005	reviewed for CIAO 3.3: no changes
01 Dec 2006	reviewed for CIAO 3.4: no changes