Last modified: 7 November 2022

URL: https://cxc.cfa.harvard.edu/ciao/caveats/multi_component_subspace.html

Files with Multiple GTI Subspace Components


Caveat: Files with Multiple GTI Subspace Components

The CXC Datamodel keeps track of all the filters that have been applied to a file, and various tools use that information when performing analysis. For example, the dmextract tool uses information about any prior spatial filtering to correct the BACKSCAL keyword when extracting a spectrum.

Each filtering step in the user's analysis is recorded in the file's subspace; overlapping filters are intersected to produce the most concise representation of the filtering. For example:

unix% dmcopy "evt.fits[energy=500:1000]" broad.fits
unix% dmcopy "broad.fits[energy=700:1200]" soft.fits
unix% dmlist soft.fits subspace
...
      12 energy               Real4                                  700.0:  1000.0 
...

In this example we see that the subspace of the file soft.fits shows the intersection of the original energy filter with the second one.
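The intersection recorded in the subspace can be illustrated with a few lines of Python. This is only a sketch of the arithmetic for a single closed range; the function name `intersect_range` is hypothetical and is not part of CIAO.

```python
from typing import Optional, Tuple

Range = Tuple[float, float]


def intersect_range(a: Range, b: Range) -> Optional[Range]:
    """Return the overlap of two closed ranges, or None if they are disjoint."""
    lo = max(a[0], b[0])
    hi = min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


# The two energy filters from the dmcopy commands above.
first_filter = (500.0, 1000.0)
second_filter = (700.0, 1200.0)

energy_subspace = intersect_range(first_filter, second_filter)
print(energy_subspace)  # (700.0, 1000.0), matching the dmlist output
```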

When data from multiple observations are combined, or when complex filters are used, the filters may not reduce to a single intersection, and the file ends up with multiple subspace components. This can lead to unexpected results, because most tools expect a single subspace component and may not account for several.

ACIS EXPNO column

Consider the following example, based on the "Manually reprojecting the data" part of the Merging Data from Multiple Imaging Observations thread.

unix% download_chandra_obsid 4939,1631 
unix% chandra_repro 4939,1631 ./repro
unix% cd repro/
unix% reproject_events in=acisf01631_repro_evt2.fits out=1631_match_4939.evt match=acisf04939_repro_evt2.fits
unix% dmmerge "1631_match_4939.evt[cols -phas],acisf04939_repro_evt2.fits[cols -phas]" merged.fits
BTIMDRFT values are different...FAIL...
...(various header merging warnings are expected)...

Displaying the merged data in ds9 looks as expected, and a cursory look at the header keywords looks reasonable. Specifically, if we check the ONTIME keywords, we see that the merged ONTIME is, as expected, the sum of the ONTIME values of the two input files.

unix% dmlist 1631_match_4939.evt,acisf04939_repro_evt2.fits,merged.fits header,clean | grep ONTIME
ONTIME                   11197.5999583010 [s]       Sum of GTIs
ONTIME7                  11197.5999583010 [s]       Sum of GTIs
ONTIME                   50177.5597276390 [s]       Sum of GTIs
ONTIME7                  50177.5597276390 [s]       Sum of GTIs
ONTIME                   61375.159685940 [s]        Sum of GTIs
ONTIME7                  61375.159685940 [s]        Sum of GTIs

However, if we look at the file in more detail we will see that it now has multiple GTI7 blocks.

unix% dmlist merged.fits blocks
 
--------------------------------------------------------------------------------
Dataset: merged.fits
--------------------------------------------------------------------------------
 
     Block Name                          Type         Dimensions
--------------------------------------------------------------------------------
Block    1: PRIMARY                        Null        
Block    2: EVENTS                         Table        15 cols x 77014    rows
Block    3: GTI7                           Table         2 cols x 1        rows
Block    4: GTI7_CPT2                      Table         2 cols x 2        rows

For ACIS data, we expect multiple GTIx extensions (where x is a value between 0 and 9) when multiple CCDs are in use. However, both of these observations used only a single chip, ACIS-7. The full subspace shows why there are two GTI7 blocks:

unix% dmlist merged.fits subspace
 
--------------------------------------------------------------------------------
Data subspace for block EVENTS: Components: 2 Descriptors: 16 
--------------------------------------------------------------------------------
 
 --- Component 1 --- 
   1 time                 Real8               TABLE GTI7
                                              
                                              106838774.1579459906:106849971.7579042912
   2 ccd_id               Int2                7:7 
   3 node_id              Int2                0:3 
   4 expno                Int4                200:14790 
....
 --- Component 2 --- 
   1 time                 Real8               TABLE GTI7_CPT2
                                              
                                              218303022.0314603448:218326570.978171140
                                              218326571.8191912174:218353200.4322080612
   2 ccd_id               Int2                7:7 
   3 node_id              Int2                0:3 
   4 expno                Int4                3:28224,28226:59888 
...

We see that both subspace components are related to CCD_ID=7; however, the expno (exposure number) has different ranges in the two components. Because two subspace descriptors differ (time and expno), the overall subspace cannot be collapsed into a single component and the datamodel must keep them separate.
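One way to picture why the two components stay separate is the toy model below. It assumes, as the paragraph above implies, that two components can be folded into one only when they differ in at most one descriptor. This is an illustrative assumption, not the datamodel's actual algorithm; `differing_descriptors`, `can_collapse`, and the dictionaries are hypothetical.

```python
def differing_descriptors(comp_a: dict, comp_b: dict) -> list:
    """Names of descriptors whose ranges differ between two components."""
    names = sorted(set(comp_a) | set(comp_b))
    return [n for n in names if comp_a.get(n) != comp_b.get(n)]


def can_collapse(comp_a: dict, comp_b: dict) -> bool:
    """Toy rule (an assumption): collapse only if at most one descriptor differs."""
    return len(differing_descriptors(comp_a, comp_b)) <= 1


# Descriptor values copied from the dmlist subspace output above.
component_1 = {
    "time": "GTI7",
    "ccd_id": [(7, 7)],
    "node_id": [(0, 3)],
    "expno": [(200, 14790)],
}
component_2 = {
    "time": "GTI7_CPT2",
    "ccd_id": [(7, 7)],
    "node_id": [(0, 3)],
    "expno": [(3, 28224), (28226, 59888)],
}

print(differing_descriptors(component_1, component_2))  # ['expno', 'time']
print(can_collapse(component_1, component_2))           # False

# Without the expno descriptor, only time differs between the components.
no_expno_1 = {k: v for k, v in component_1.items() if k != "expno"}
no_expno_2 = {k: v for k, v in component_2.items() if k != "expno"}
print(can_collapse(no_expno_1, no_expno_2))             # True
```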

Since this file now has multiple GTI components, any tool that expects only one will produce unexpected results. For example, any additional filtering of the file causes the ONTIME and related keywords to be recomputed. [This is to keep them up to date with any possible time filters.]

unix% dmcopy "merged.fits[sky=region(ds9.reg)]" mysrc.fits
unix% dmlist mysrc.fits header,clean | grep ONTIME
ONTIME                   11197.5999583010 [s]       Sum of GTIs
ONTIME7                  11197.5999583010 [s]       Sum of GTIs

After the spatial filter is applied, the ONTIME is recomputed. The CXC convention is that ONTIME comes from the first GTI, which in this case is the GTI belonging to the shorter observation's subspace component. Continuing analysis with this file can result in incorrect count rate and flux values. See the caveat about extracting spectra from merged event files.
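The size of the discrepancy can be checked by hand from the GTI rows listed in the subspace. The sketch below sums the intervals of the first component only and of both components; the helper name `gti_sum` is hypothetical, and the interval values are copied from the dmlist output above.

```python
# GTI intervals (start, stop) in seconds, copied from the dmlist subspace output.
gti7 = [(106838774.1579459906, 106849971.7579042912)]
gti7_cpt2 = [
    (218303022.0314603448, 218326570.978171140),
    (218326571.8191912174, 218353200.4322080612),
]


def gti_sum(intervals):
    """Total good time: the sum of (stop - start) over the intervals."""
    return sum(stop - start for start, stop in intervals)


first_component = gti_sum(gti7)
all_components = gti_sum(gti7) + gti_sum(gti7_cpt2)

print(f"first GTI only: {first_component:.2f} s")  # about 11197.60
print(f"all GTIs:       {all_components:.2f} s")   # about 61375.16
print(f"ratio:          {all_components / first_component:.2f}")
```

If events from both observations are divided by the first value rather than the second, a count rate would be too high by roughly the printed ratio.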

In the case of the ACIS expno column, users not interested in timing analysis using exposure number may simply delete the expno subspace. The merge_obs script automatically removes the expno subspace prior to merging.

unix% dmcopy "merged.fits[sky=region(ds9.reg)][subspace -expno]" mysrc_no_expno.fits clob+
unix% dmlist mysrc_no_expno.fits header,clean | grep ONTIME
ONTIME                   61375.159685940 [s]        Sum of GTIs
ONTIME7                  61375.159685940 [s]        Sum of GTIs

Note: In general the subspace information about a file is internally updated when the file is written out, not when read in. Users should therefore remove the subspace before running the tool that would be affected by it.

Why is the ONTIME from dmmerge different? dmmerge uses a set of merging rules input via the lookupTab parameter. The default rule for ONTIME is calcGTI, which tells dmmerge to compute the ONTIME from the sum of the non-overlapping ranges from all GTIs. This is different from what the datamodel does and leads to the inconsistency.
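The calcGTI behaviour as described above (summing the non-overlapping ranges from all GTIs) can be sketched as the length of an interval union. This illustrates that description only and is not dmmerge's actual implementation; `union_length` is a hypothetical helper, and the overlapping ranges at the end are made-up values.

```python
def union_length(intervals):
    """Total length covered by the intervals, counting overlaps only once."""
    total = 0.0
    current_start = current_stop = None
    for start, stop in sorted(intervals):
        if current_stop is None or start > current_stop:
            # Disjoint from the running interval: bank it and start a new one.
            if current_stop is not None:
                total += current_stop - current_start
            current_start, current_stop = start, stop
        else:
            # Overlapping or touching: extend the running interval.
            current_stop = max(current_stop, stop)
    if current_stop is not None:
        total += current_stop - current_start
    return total


# All GTI rows from both subspace components of merged.fits.
all_gtis = [
    (106838774.1579459906, 106849971.7579042912),
    (218303022.0314603448, 218326570.978171140),
    (218326571.8191912174, 218353200.4322080612),
]
print(f"{union_length(all_gtis):.2f}")  # about 61375.16, the dmmerge ONTIME

# Hypothetical overlapping ranges: 0-10 and 5-15 cover 15 s, not 20 s.
print(union_length([(0.0, 10.0), (5.0, 15.0)]))  # 15.0
```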

HRC processing versions

A similar situation has arisen for HRC, as noted in the HRC subspace caveat page. Earlier versions of standard data processing had incorrectly set the subspace ranges for various Level 0 engineering columns. This was corrected with DS8.5. When a user attempts to merge files processed with software versions before and after DS8.5, the result can have multiple GTI components and the same problems with ONTIME noted above.

unix% download_chandra_obsid 6202,13231 evt2
unix% mv {6202,13231}/primary/*evt2.fits.gz ./
unix% reproject_events hrcf06202N004_evt2.fits.gz 6202_reprj.fits match=hrcf13231N003_evt2.fits.gz clob+
unix% dmmerge 6202_reprj.fits,hrcf13231N003_evt2.fits.gz merge.fits clob+
... normal warnings ...
unix% dmlist merge.fits blocks 
 
--------------------------------------------------------------------------------
Dataset: merge.fits
--------------------------------------------------------------------------------
 
     Block Name                          Type         Dimensions
--------------------------------------------------------------------------------
Block    1: PRIMARY                        Null        
Block    2: EVENTS                         Table         9 cols x 2071847  rows
Block    3: GTI                            Table         2 cols x 1        rows
Block    4: GTI_CPT2                       Table         2 cols x 1        rows

The list of subspace columns that need to be removed for HRC is different:

unix% dmcopy "hrc_evt2.fits[subspace -av1,-au1,-mjf,-mnf,-endmnf,-sub_mjf,-clkticks]" evt2.fits

Complex filters

The issue with multiple GTI components is not restricted to merged datasets. Users who apply complex filtering logic may also cause multiple GTI components to be created. See the Compound Filters section of the DM filtering help file.

unix% dmcopy "acisf04939_repro_evt2.fits[time=218303022:218325000,sky=circle(4096,4906,10)||time=218326571:218353200,sky=box(4096,4096,10,10)]" multi.fits
unix% dmlist multi.fits subspace
 
--------------------------------------------------------------------------------
Data subspace for block EVENTS: Components: 2 Descriptors: 16 
--------------------------------------------------------------------------------
 
 --- Component 1 --- 
   1 time                 Real8               TABLE GTI7
                                              
                                              218303022.0314603448:218325000.0
....
   8 sky                  Real4               Circle(4096,4906,10)
   8 sky                  Real4               Field area = 6.71089e+07 Region area = 314.159

   8 sky                  [ 1] x                   4086.0:     4106.0 
   8 sky                  [ 2] y                   4896.0:     4916.0 
....
 --- Component 2 --- 
   1 time                 Real8               TABLE GTI7_CPT2
                                              
                                              218326571.8191912174:218353200.0
....
   8 sky                  Real4               Box(4096,4096,10,10)
   8 sky                  Real4               Field area = 6.71089e+07 Region area = 100

   8 sky                  [ 1] x                   4091.0:     4101.0 
   8 sky                  [ 2] y                   4091.0:     4101.0 
....
unix% dmlist multi.fits header,clean | grep ONTIME
ONTIME                   21977.9685396550 [s]       Sum of GTIs
ONTIME7                  21977.9685396550 [s]       Sum of GTIs

Using the compound logic operation, ||, is similar to filtering the file multiple times and then merging the results with dmmerge.

unix% dmcopy "acisf04939_repro_evt2.fits[time=218303022:218325000,sky=circle(4096,4906,10)]" aa.fits
unix% dmcopy "acisf04939_repro_evt2.fits[time=218326571:218353200,sky=box(4096,4096,10,10)]" bb.fits
unix% dmmerge aa.fits,bb.fits cc.fits
unix% dmlist cc.fits  blocks
 
--------------------------------------------------------------------------------
Dataset: cc.fits
--------------------------------------------------------------------------------
 
     Block Name                          Type         Dimensions
--------------------------------------------------------------------------------
Block    1: PRIMARY                        Null        
Block    2: EVENTS                         Table        16 cols x 9        rows
Block    3: GTI7                           Table         2 cols x 1        rows
Block    4: GTI7_CPT2                      Table         2 cols x 1        rows
unix% dmlist cc.fits header,clean | grep ONTIME
ONTIME                   48606.1493484380 [s]       Sum of GTIs
ONTIME7                  48606.1493484380 [s]       Sum of GTIs

We see a difference in the ONTIME values because of the dmmerge lookupTab rule. As soon as any additional filtering is done, the value will be recomputed from the first GTI and will change back to the 21977.9685 seen above.

All of these examples have dealt with multiple TIME subspaces, aka GTIs. The exact same kind of problem can happen when there are multiple spatial filters that cause multiple components to be created. In that case, values such as dmextract's BACKSCAL keyword may be incorrect, as they are taken from the first subspace component.
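As a rough illustration of the spatial case, the sketch below uses the region and field areas reported in the multi.fits subspace listing. It assumes the area scaling can be modelled as region area divided by field area; that formula, and adding the two region areas together, are assumptions made for illustration and not a statement of how dmextract computes BACKSCAL. The two regions also apply to different time ranges, so no single number fully describes the file.

```python
# Areas in sky pixels^2, copied from the dmlist subspace output for multi.fits.
FIELD_AREA = 6.71089e07
region_areas = {
    "component 1 (circle)": 314.159,
    "component 2 (box)": 100.0,
}

# ASSUMPTION: the scaling is modelled as region area / field area.
first_only = region_areas["component 1 (circle)"] / FIELD_AREA

# ASSUMPTION: the two regions do not overlap, so their areas simply add.
both = sum(region_areas.values()) / FIELD_AREA

print(f"first component only: {first_only:.3e}")
print(f"both components:      {both:.3e}")
print(f"fraction of area missed: {1 - first_only / both:.3f}")
```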