Jump to: Description · Examples · Changes in CIAO 4.13 · Bugs · See Also

AHELP for CIAO 4.16 Sherpa

datastack

Context: data

Synopsis

Sherpa extension package for modeling stacks of data.

Syntax

from sherpa.astro import datastack

Description

Datastack is a Sherpa extension package for manipulating a stack of related datasets and simultaneously fitting a model to them. It provides stack-enabled (i.e. vectorized) versions of the key Sherpa commands used to load data, set source models, get and set parameters, fit, and plot. The current implementation is based on Tom Aldcroft's contrib package (https://cxc.harvard.edu/contrib/datastack/) wrapping Sherpa functions. The addition of the CIAO DM stack syntax allows reading and filtering the stacks of data using standard DM filtering syntax.

For X-ray spectral analysis, a datastack means that a group of related datasets, for instance 10 observations of the same source taken at different times, can be analyzed as if they were a single merged dataset. This has the important advantage of using the appropriate individual response functions while hiding the complexity (or tedium) of defining models and parameters for all the datasets.

Loading the Routines

This package needs to be imported to a Sherpa session, for example:

from sherpa.astro import datastack

Datastack File Input

The file stack syntax (e.g. '@file.lis') is available for Sherpa load_* commands. The details of the DM stack concept can be found in the "stack" ahelp file.

The following load_* functions support file stack syntax:

load_pha
load_ascii
load_data
load_bkg

The datastack-specific load_* calls are slightly different than the default Sherpa calls, as they do not take a data ID. The "@" character entered with the filename signals a stack file and all the files listed in the "@file.lis" are added to the datastack. Individual files can be added to the datastack later. You can also use multiple datastack identifiers, providing a way to treat several datastacks in one Sherpa session.

sherpa> from sherpa.astro.datastack import *
sherpa> load_data("@file.lis")
read ARF file obs1/q9407.warf
read RMF file obs1/q9407.wrmf
read ARF (background) file obs1/q9407_bkg.warf
read RMF (background) file obs1/q9407_bkg.wrmf
read background file obs1/q9407_bkg.pi
read ARF file obs2/q9708.warf
read RMF file obs2/q9708.wrmf

After importing the datastack package, the Sherpa load_data command loads the PHA files listed in the file.lis stack, as well as the associated response files pointed to by the headers of those PHA files.

sherpa> show_stack()
1: obs1/q9407.pi OBS_ID: 9407 MJD_OBS: 54437.6143274
2: obs2/q9708.pi OBS_ID: 9708 MJD_OBS: 54445.5404309

unix% more file.lis
obs1/q9407.pi
obs2/q9708.pi

Note that after importing datastack with "from sherpa.astro.datastack import *", datastacks are used by default to manage Sherpa data IDs. New files can either be added to the existing datastack, or input as a separate file. The default datastack is referred to as [].

The following syntax is allowed:

load_pha("@pha.lis") # the files listed in "pha.lis" are added to the default datastack
load_pha("myid","file.pi") # the file is not added to the datastack, it is assigned an id="myid"
load_pha("file.pi") # in this case the file is added to the datastack
load_pha([],"file.pi") # the file is added to the default datastack
ds = DataStack() # create a new datastack
load_pha(ds,"@pha.lis") # adds files in pha.lis to the datastack called "ds"
load_pha(ds,"file.pi") # adds file.pi to the ds datastack

An example of an unsupported syntax is:

load_pha("myid","@pha.lis") # this will result in an error, as the file stack syntax is not accepted when setting the ID

load_arrays

The datastack version of load_arrays has a slightly different interface than the one in standard Sherpa. In standard Sherpa, load_arrays must always get an ID as the first argument.

load_arrays accepts an array of arrays as the first argument when no datastack identifier is provided, as in the example below:

x1 = ... an array ...
y1 = ... an array ...
x2 = ... an array ...
y2 = ... an array ...
x3 = ... an array ...
y3 = ... an array ...

sherpa> load_arrays([[x1,y1], [x2,y2], [x3,y3]])

Datastack Identifier

By default, the datastack is identified by '[]'. In general, it can also be:

a datastack instance reference
an array (any iterable) of dataset IDs
a datastack instance subset

For example:

sherpa> ds = DataStack()

'ds' is both a datastack instance and identifier, and another valid identifier is:

sherpa> ds[1,2]

meaning ds[1,2] is an identifier for datasets 1 and 2 from the datastack 'ds'. If picking an identifier from the default datastack, the syntax:

sherpa> [3,4]

may be used, where [3,4] is the datastack identifier for datasets 3 and 4 from the default datastack.

Datastack Functions

The following functions are compatible with datastacks:

clear_models
clear_stack
conf
fit
fit_bkg
freeze
get_arf
get_bkg
get_bkg_arf
get_bkg_model
get_bkg_rmf
get_bkg_scale
get_model
get_par
get_response
get_rmf
get_source
get_stack_ids
group_adapt
group_adapt_snr
group_bins
group_counts
group_snr
group_width
ids
ignore
link
load_arf
load_arrays
load_ascii
load_bkg
load_bkg_arf
load_bkg_rmf
load_data
load_filter
load_grouping
load_pha
load_rmf
notice
plot_arf
plot_bkg
plot_bkg_chisqr
plot_bkg_delchi
plot_bkg_fit
plot_bkg_fit_delchi
plot_bkg_fit_resid
plot_bkg_model
plot_bkg_ratio
plot_bkg_resid
plot_bkg_source
plot_chisqr
plot_data
plot_delchi
plot_fit
plot_fit_delchi
plot_fit_resid
plot_model
plot_order
plot_psf
plot_ratio
plot_resid
plot_savefig
plot_set_xscale
plot_set_yscale
plot_source
plot_title
plot_xlabel
plot_xlim
plot_ylabel
plot_ylim
query
query_by_header_keyword
query_by_obsid
savefig
set_bkg_full_model
set_bkg_model
set_full_model
set_model
set_par
set_source
show_stack
subtract
thaw
ungroup
unlink
unsubtract

In order for the function to operate on a datastack, the datastack identifier has to be included as the first argument of that function.

Setting Models for a Datastack

The following model function can be used for assigning a model to a datastack:

set_source
set_model
set_bkg_model
set_full_model
set_bkg_full_model

They can be used to assign a model to all the datasets in a datastack or to a dataset subset with the data indicator '__ID'.

Examples

Example 1

sherpa> from sherpa.astro.datastack import *
sherpa> load_data("@file.lis")

Import the datastack package into Sherpa and use the load_data command to load the PHA files listed in file.lis into the default datastack instance.

Example 2

sherpa> set_source([],'xsphabs.abs1*xsapec.kt1')

Here the same model is assigned to both datasets in the default datastack, identified by '[]'.

sherpa> show_source()
Model: 1
(xsphabs.abs1 * xsapec.kt1)
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   abs1.nH      thawed            1            0       100000 10^22 atoms / cm^2
   kt1.kT       thawed            1        0.008           64        keV
   kt1.Abundanc frozen            1            0            5           
   kt1.redshift frozen            0       -0.999           10           
   kt1.norm     thawed            1            0        1e+24           

Model: 2
(xsphabs.abs1 * xsapec.kt1)
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   abs1.nH      thawed            1            0       100000 10^22 atoms / cm^2
   kt1.kT       thawed            1        0.008           64        keV
   kt1.Abundanc frozen            1            0            5           
   kt1.redshift frozen            0       -0.999           10           
   kt1.norm     thawed            1            0        1e+24

These data will be simultaneously fit, assuming this model.

Example 3

sherpa> set_source([1,2], "powlaw1d.p__ID")
sherpa> set_source([3,4], "brokenpowerlaw.bp__ID")

These commands set different models for different subsets of the default datastack.

sherpa> show_source()
Model: 1
powlaw1d.p1
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   p1.gamma     thawed            1          -10           10           
   p1.ref       frozen            1 -3.40282e+38  3.40282e+38           
   p1.ampl      thawed            1            0  3.40282e+38           

Model: 2
powlaw1d.p2
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   p2.gamma     thawed            1          -10           10           
   p2.ref       frozen            1 -3.40282e+38  3.40282e+38           
   p2.ampl      thawed            1            0  3.40282e+38           

Model: 3
brokenpowerlaw.bp3
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   bp3.refer    frozen         5000  1.17549e-38  3.40282e+38  angstroms
   bp3.ampl     thawed            1  1.17549e-38  3.40282e+38  angstroms
   bp3.index1   thawed          0.1          -10           10           
   bp3.index2   thawed         -0.1          -10           10           

Model: 4
brokenpowerlaw.bp4
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   bp4.refer    frozen         5000  1.17549e-38  3.40282e+38  angstroms
   bp4.ampl     thawed            1  1.17549e-38  3.40282e+38  angstroms
   bp4.index1   thawed          0.1          -10           10           
   bp4.index2   thawed         -0.1          -10           10

The use of the '__ID' data indicator assigns a model to all the datasets in a datastack of subset beginning with the model component name preceding the data indicator. The model parameters are assigned names with the string '__ID' replaced by 1,2,3,... as appropriate, as seen in the above examples.

Example 4

sherpa> freeze([], 'abs1')
sherpa> thaw(ds, 'poly.c0')
sherpa> thaw([1,3], 'poly.c1')

These are examples of different ways to freeze and thaw parameters using the various datastack instances.

Utility Functions for Datastacks

These functions operate on datstack instances.

clear_models - remove all models from the datastack
clear_stack - remove all datasets belonging to a datastack
show_stack - print all datasets in the stack, with basic metadata

The show_stack function lists the datasets currently in the stack and it displays the contents of the OBS_ID and MJD_OBS cards in each file header, if available.

Querying Datastacks

Datastack instances can be queried to find which datasets in the stack match some given requirements. The two main methods to do so are:

query_by_header_keyword(keyword, value)
query(func)

query_by_header_keyword is useful for any datasets that have a 'header' attribute whose elements can be accessed, for example as a Python dictionary (e.g. header['XTENSION']), like a DataPHA.

query(func) is a general approach, and accepts any function that is defined according to the following signature:

func(dataset) returns Boolean

The query function will return a list of all dataset IDs in the stack for which 'func' returns True.

From these two methods, one can derive any number of more specific methods. The output of the query functions can be used as a stack identifier to perform some actions, e.g.:

sherpa> f = query_by_header_keyword('INSTRUME', 'ACIS')
sherpa> set_source(f, "powlaw1d.p__ID")

or

sherpa> set_source(ds[f], "powlaw1d.p__ID")

A third call is implemented by default:

query_by_obsid(obsid)

This query returns the datasets(s) matching an OBS_ID, by using the OBS_ID keyword in the file header, if available.

Cleaning a Session

The datastack module offers the clean() function, that clears all models, datasets, and invokes Sherpa's clean() function.

Object Oriented Approach

In this approach, the user does not need to "import *", and can just use the objects defined in the datastack module, as shown below:

sherpa> from sherpa.astro import datastack as dsmod
sherpa> ds = dsmod.DataStack()
sherpa> dsmod.set_template_id("ID")
sherpa> ds.load_pha("@pha.lis")
sherpa> ds.show_stack()
sherpa> f = ds.query_by_header_keyword('INSTRUME', 'ACIS')
sherpa> ds[f].set_source("powlaw1d.pID")

Changes in CIAO 4.13

The ungroup and unsubtract commands have been added to the datastack module.

Bugs

See the bugs pages on the Sherpa website for an up-to-date listing of known bugs.