Last modified: December 2013

URL: https://cxc.cfa.harvard.edu/ciao/ahelp/dm.html


AHELP for CIAO 4.16

dm

Context: dm

Synopsis

CIAO Data Model: syntax for filtering and binning files

Description

The CIAO Data Model (DM) is a versatile interface used by CIAO to examine and manipulate standard format datafiles (e.g. FITS, ASCII). The DM enables powerful filtering and binning of datafiles. This document is an introduction to the DM syntax used by the CIAO tools.

Related help files contain information and examples illustrating the capabilities of the DM. A list of these files can also be obtained from the CIAO command line with "about dm" or "ahelp -k dm".

Detailed technical information is available from the Introduction to the Data Model memo.

1. DM Syntax and Virtual Files

The Data Model offers an easy and powerful means of filtering data. The filtered file can be directly input to a tool without writing it to disk first; this is known as a "virtual file." The virtual file, which can also be referred to as a subspace, is simply a means of defining a subset of interest in the dataset.

The basic syntax of a virtual file is:

filename[block][filter][binning][option][rename]
filename[block][filter][columns][option][rename]

filename: the input filename. All CIAO tools accept FITS file input, and many also accept ASCII files. Some tools only work on event files, while others require an input image. Refer to the individual tool help files for any restrictions.

[block]: the extension of the file to use, e.g. the name of the image or table. For FITS files, the block corresponds to an HDU and may be identified by name ("[EVENTS]") or number ("[2]"). If the block is not specified, the first "interesting" block is used (e.g. [EVENTS] for an event file). To view the blocks in a file, use "dmlist file.fits blocks".

[filter]: the filter to apply to the data. It indicates, for instance, which time period, energy range, or spatial region to use (e.g. "[time=1522012:1522320,1522400:1522600]"). Refer to "ahelp dmfiltering" for a full discussion of filtering.

[binning]: the binning specification for creating an image from an event file (e.g. "[bin x=10:100:1,y=1:100:1]"). Refer to "ahelp dmbinning" for a full discussion of binning.

[columns]: the names of the columns to include ("[cols time,energy]") or exclude ("[cols -phas]"). The syntax "[cols !phas]" may also be used, but the "!" symbol needs to be written as "\!" in the Unix shell, making the "-" syntax more convenient.

[option]: advanced options for the DM, such as specifying what the NULL character should be or how much memory to allow a tool to use. Refer to "ahelp dmopt" for a list of the available options.

[rename]: the name for the block in the output file. The default behavior is for the output to have the same block name, unless a file is binned to create an image; in that case, "_IMAGE" is added to the block name. (For information on renaming columns, refer to a later section in this file.)
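
As a rough illustration of how the pieces of the syntax fit together, the Python sketch below assembles a virtual file string from its components. The helper name virtual_file and its keyword arguments are hypothetical (they are not part of CIAO); the function only concatenates the bracketed pieces in the order given above and does not validate them.

```python
def virtual_file(filename, block=None, filt=None, binning=None,
                 columns=None, option=None, rename=None):
    """Assemble a DM virtual file string (hypothetical helper, not a CIAO API).

    Each argument is the text that goes inside one pair of brackets,
    e.g. filt="time=10:20" or columns="cols time,energy".
    """
    if binning and columns:
        raise ValueError("give either a binning or a column specification")
    # Order follows the syntax: [block][filter][binning or columns][option][rename]
    parts = [block, filt, binning or columns, option, rename]
    return filename + "".join(f"[{p}]" for p in parts if p)


spec = virtual_file(
    "evt.fits",
    block="EVENTS",
    filt="time=1522012:1522320,1522400:1522600",
    binning="bin x=10:100:1,y=1:100:1",
)
print(spec)
```

The resulting string would be passed, in quotes, wherever a tool expects an input file name.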

2. Virtual Columns

A file may contain virtual columns whose values are calculated by applying a mathematical transform to an existing column. Virtual columns - such as EQPOS(RA,DEC) - do not physically exist in the event file; they are defined by the WCS information attached to another column, e.g. SKY.

The transformation is listed in the output of "dmlist evt2.fits cols":

1:    EQPOS(RA ) = (+278.3860) +TAN[(-0.000136667)* (sky(x)-(+4096.50))]
           (DEC)   (-10.5899 )      (+0.000136667)  (   (y) (+4096.50)) 

For most applications, these columns may be used the same as non-virtual columns in the file. It is possible to list, filter, and bin on virtual columns.

However, filtering and binning do not work reliably on virtual columns derived from non-monotonic coordinate transforms (e.g. MSC(THETA,PHI), or EQPOS near the poles; see "ahelp coords" for more information on these coordinate systems).

3. Renaming and Reordering Columns

It is possible to rename a column or change the order of the columns within a file. Note that certain CIAO tools require particular column names (e.g. time, energy), but none of the tools make assumptions about the order of the columns within a file.

To rename a column, run dmcopy with the column syntax "newname=oldname". Multiple columns may be renamed in the same command.

dmcopy "pi.fits[cols rate=count_rate]" pi_rate.fits
dmcopy "pi.fits[cols rate=count_rate, rate_err=count_rate_err,*]" \
       pi_rate_all.fits

In the first command, the "count_rate" column in pi.fits is renamed to "rate" and is the only column in pi_rate.fits. In the second command, two columns are renamed, and the "*" operator indicates that all other columns should be copied unchanged to the output file.

The columns will appear in the output file in the order in which they are specified. So in the renaming case, "rate" will be the first column in the output. The "cols" syntax can be used to reorder columns without modifying them as well:

dmcopy "pi.fits[cols energy,time,pi,count_rate, *]" reorder.fits
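
The ordering rules can be mimicked with a small Python toy model. This is only a sketch of the behavior described here, not the DM parser: the function apply_cols is hypothetical, it understands only plain names, "new=old" renames, and "*", and the column list used for pi.fits is an assumption made for illustration.

```python
def apply_cols(input_columns, spec):
    """Toy model of a "[cols ...]" specification (not the DM parser).

    Supports plain names, "new=old" renames, and "*" for every column
    that is not named elsewhere in the specification.
    Returns the output column names in order.
    """
    items = [item.strip() for item in spec.split(",") if item.strip()]
    # Original names of the columns that are mentioned explicitly.
    named = {item.split("=")[-1].strip() for item in items if item != "*"}
    output = []
    for item in items:
        if item == "*":
            # All remaining columns, in their original order.
            output.extend(c for c in input_columns if c not in named)
        else:
            new, _, old = item.rpartition("=")
            output.append(new.strip() or old.strip())
    return output


# Assumed column layout of pi.fits, for illustration only.
pi_columns = ["time", "energy", "pi", "count_rate", "count_rate_err"]

print(apply_cols(pi_columns, "rate=count_rate"))
print(apply_cols(pi_columns, "rate=count_rate, rate_err=count_rate_err,*"))
print(apply_cols(pi_columns, "energy,time,pi,count_rate, *"))
```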

Note that for a vector column sky(x,y),

"evt.fits[cols x,y]"

will retain the information that (x,y) is a vector column called "sky". Any of the following, however, will separate the vector components and lose the vector-dependent coordinate systems like RA and Dec:

[cols x]
[cols y,x]
[cols x,pha,y]

Examples

Example 1

acisf01843N002_evt2.fits[EVENTS][cols #1,#2,#3]
acisf01843N002_evt2.fits[cols time,ccd_id,node_id]

Select three columns of the EVENTS block by number or by name.

Example 2

acisf01843N002_evt2.fits[#row=1:4]

Select rows 1-4 from a FITS file.

Example 3

dmlist "evt.fits[events][pha=30:200,time=10:20,50:60]" data

Use the tool dmlist to print specific data values from the file to the screen. A filter is applied to the "events" block in the file evt.fits. The filter selects rows in the table for which the value of the pha column is >= 30 and < 200, and for which the time is either >= 10 and < 20 or >= 50 and < 60. Both the pha and time filters must be satisfied for a row to pass the filter.
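
A minimal Python sketch of the selection logic for the filter in the command above (pha=30:200 and time=10:20,50:60), assuming the range semantics stated in this description (>= lower bound, < upper bound). The rows are made-up sample data and the function names are illustrative only.

```python
def in_range(value, lo, hi):
    """Range test as described in the text: lo <= value < hi."""
    return lo <= value < hi


def passes(row):
    # Ranges on the same column are ORed; different columns are ANDed.
    pha_ok = in_range(row["pha"], 30, 200)
    time_ok = in_range(row["time"], 10, 20) or in_range(row["time"], 50, 60)
    return pha_ok and time_ok


# Made-up rows, for illustration only.
rows = [
    {"pha": 30, "time": 10.0},    # passes: both lower bounds are included
    {"pha": 150, "time": 55.0},   # passes: second time interval
    {"pha": 150, "time": 30.0},   # fails: time falls between the intervals
    {"pha": 250, "time": 15.0},   # fails: pha is out of range
]
selected = [row for row in rows if passes(row)]
print(len(selected))
```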

Example 4

acisf01843N002_evt2.fits[EVENTS][bin x=3200:4800:4,y=3200:4800:4]

Bin an event file into an image with this input to the tool dmcopy.

Example 5

acisf01843N002_evt2.fits[EVENTS][bin pi=1:1024:1]

Use this specification as input to the tool dmextract to bin an event file into a PI spectrum.

Example 6

dmcopy "evt.fits[cols -status]" evt_new.fits

Remove the status column from evt.fits.


Bugs

General

Linear transforms have an extra 0.5 bin shift applied.

When creating a file with a linear WCS, the CXCDM adjusts the transform parameters to force a half-bin offset. While mathematically consistent with the input values, the result may be confusing, especially when the transform is well known, e.g. °C to °F.

-------------------------------------------------------------------------------
World Coord Transforms for Columns in Table Block simple2.dat
--------------------------------------------------------------------------------

ColNo Name
2: tempF = +32.90 [degree F] +1.80 * (tempC -0.50)

which is mathematically correct, but more commonly written as just

   tempF = +32 + 1.8 * tempC
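
Expanding the DM form gives 32.90 + 1.80 * (tempC - 0.50) = 32.90 - 0.90 + 1.80 * tempC = 32 + 1.8 * tempC. The short Python check below, with illustrative function names, confirms numerically that the two forms agree.

```python
import math


def temp_f_dm(temp_c):
    # Transform as printed by the DM: +32.90 + 1.80 * (tempC - 0.50)
    return 32.90 + 1.80 * (temp_c - 0.50)


def temp_f_usual(temp_c):
    # Conventional form of the same conversion.
    return 32.0 + 1.8 * temp_c


for c in (-40.0, 0.0, 37.0, 100.0):
    # The two forms differ only by floating-point rounding.
    assert math.isclose(temp_f_dm(c), temp_f_usual(c), abs_tol=1e-9)
```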

Incorrect values from CAR transform.

The WCS library that the DM uses has a problem computing coordinate transforms that involve the CAR transform.

You may get a seg fault if you try to create a very large image. What constitutes "very large" depends on the data type, but for long and float images, 8192x8192 pixels seems to be the threshold.

(The image doesn't have to be square; it just needs to have 8192^2 pixels.)
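
Since the limit is on the total number of pixels rather than on the shape, a quick check might look like the Python sketch below. The function name is hypothetical, and the threshold is the approximate value quoted above, not a documented limit.

```python
# Approximate threshold quoted above for long and float images.
THRESHOLD_PIXELS = 8192 ** 2


def reaches_threshold(nx, ny):
    """True if an nx-by-ny image has at least 8192^2 pixels."""
    return nx * ny >= THRESHOLD_PIXELS


print(reaches_threshold(8192, 8192))    # square image at the threshold
print(reaches_threshold(4096, 16384))   # non-square, same pixel count
print(reaches_threshold(4096, 4096))    # well below the threshold
```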

This condition may be met when the "update=no" option is used. Normally, when you filter a dataset, the data subspace (which describes the boundaries of each column's data and therefore is the intersection of the initial minima and maxima with any subsequent filters) gets updated to reflect the filtering. However, when you give the "update=no" option, you instruct the DM not to update the subspace to reflect the current filter. Therefore, the full ranges for x and y are used in the binning, and you get an 8192x8192 image (and a seg fault, for the reason described above).

TNULL raw header keywords are not copied to the output file  (19 Aug 2011)

DM support for FITS TC* keywords

There are three issues with the generic use of TC* keywords in FITS files read and written by DM tools.

The first issue: TC*n[A-Z]. The DM function that composes this keyword does not strip off the 'P' before adding the number and letter to the end.

Second issue: TCTY* keys are not recognized and therefore are not processed.

Third issue: DM is not always retaining the T*NAM information when it stores information on the DM descriptor, resulting in output keywords T*TYP instead.

EXTNAME is forced to be the same as HDUNAME on output

When a FITS file is copied, the EXTNAME is forced on output to be the same as the HDUNAME. Typical Chandra/CIAO files require these to be the same; however, data from other missions and projects may not have the same requirement.

Updating file in place

Various CIAO tools, including dmappend, dmhedit, dmreadpar, acis_clear_status_bits, and dmgti, attempt to modify a file's contents in place. In addition, users of crates may write scripts that try to open a file with read+write access.

Attempting to write to a file that is unwriteable does not generate any warning or error message.

Files may not be writeable for several reasons. The normal UNIX file permissions may not allow a particular user to modify a file.

Files that are gzipped are never writeable, regardless of whether the file itself has file-write permission.

Standard in and standard out, accessed by the "-" (no quotes) file name, are not writeable.

Individual blocks of data that have filters on the data are not writeable. So

my_evt.fits[sky=region(ds9.reg)]

is not writeable because it is actually filtering the data, whereas

my_evt.fits[gti3]

is writeable since the expression only selects which block of the file to use.
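
Because no warning is issued, a script may want to check these conditions itself before asking a tool to update a file in place. The Python sketch below is a heuristic based only on the cases listed above; the function name is hypothetical, and the rule "a bracket containing '=' is a data filter" is an assumption that will not cover every DM expression.

```python
import os
import re


def unwriteable_reason(spec):
    """Return why `spec` is probably not writeable in place, or None."""
    if spec == "-":
        return "stdin/stdout"
    filename = spec.split("[", 1)[0]
    if filename.endswith(".gz"):
        return "gzipped"
    # Assumption: a bracket containing "=" filters the data, while a
    # bare name such as [gti3] only selects a block.
    brackets = re.findall(r"\[([^\]]*)\]", spec)
    if any("=" in b for b in brackets):
        return "data filter"
    # Ordinary UNIX permissions, checked only if the file exists.
    if os.path.exists(filename) and not os.access(filename, os.W_OK):
        return "file permissions"
    return None


print(unwriteable_reason("my_evt.fits[sky=region(ds9.reg)]"))
print(unwriteable_reason("my_evt.fits[gti3]"))
```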

Filtering Data

Incorrect results when filtering an image using exclude syntax with full option.

This bug is triggered when filtering an image with the [exclude ] syntax with a region together with the [opt full] directive to retain the original image size.

unix% dmcopy "img.fits[exclude sky=region(ciao.reg)][opt full]" filt_img.fits

The pixels outside a box that bounds the region are also filtered out (i.e. set to 0).

Workaround:

In many cases users can easily invert the logic in their region files to avoid needing to use the [exclude ] directive. This can be done any time the region only contains included shapes. For example:

unix% cat ciao.reg
circle(...A...)
circle(...B...)
circle(...C...)
...

unix% cat ciao.reg | awk ' BEGIN {print "field()"} {print "-"$0}' > exclude_ciao.reg
unix% cat exclude_ciao.reg
field()
-circle(...A...)
-circle(...B...)
-circle(...C...)
...

This file can then be used without needing the [exclude ] syntax:

unix% dmcopy "img.fits[sky=region(exclude_ciao.reg)][opt full]" filt_img.fits

This has the advantage of also generally being much faster.
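
The same inversion can be sketched in Python. The function name invert_region is hypothetical; unlike the awk one-liner, this version leaves comment and blank lines untouched and refuses regions that already contain excluded shapes, since the inversion is only valid when every shape is included.

```python
def invert_region(lines):
    """Turn a list of included shapes into "field()" minus those shapes."""
    out = []
    field_added = False
    for line in lines:
        text = line.strip()
        if not text or text.startswith("#"):
            out.append(text)  # keep comments and blank lines as they are
            continue
        if text.startswith("-"):
            raise ValueError("region already excludes a shape: " + text)
        if not field_added:
            out.append("field()")  # include the whole field first
            field_added = True
        out.append("-" + text.lstrip("+"))
    return out


region = [
    "# Region file format: CIAO version 1.0",
    "circle(1635.5,4113.5,135.11408)",
    "circle(3975,4233,20)",
]
print("\n".join(invert_region(region)))
```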

Region filtering images without specifying axis name(s).

DM region filtering with unnamed axes may fail for complex regions, e.g.

dmlist my_img.fits[circle(10,10,10)-box(0,0,30,30)]

The workaround is to explicitly use the axis name, e.g.

dmlist my_img.fits[sky=circle(10,10,10)-box(0,0,30,30)]

Filtering on some WCS columns produces incorrect results  (08 Oct 2012)

When filtering on WCS columns, the range is computed by converting the range of the parent columns and using the results as the limits of the WCS columns. When the transform is highly non-linear, e.g. the TAN-P transform used to go from DETX,DETY to THETA,PHI, this can lead to incorrect limits and incorrect filters. Users who want to filter on WCS columns should give explicit ranges and not rely on the computed minima and maxima.

bad%  dmcopy "evt.fits[theta=:1]" 
good% dmcopy "evt.fits[theta=0:1]"
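
A script that builds filters could flag open-ended ranges before running a tool. The Python sketch below is a hypothetical helper that handles only simple comma-separated range filters (it does not understand region expressions, which contain commas of their own).

```python
def open_ended_ranges(filter_expr):
    """List (column, range) pairs whose lower or upper limit is omitted."""
    flagged = []
    column = None
    for item in filter_expr.split(","):
        item = item.strip()
        if "=" in item:
            # A new column starts here; later bare ranges belong to it.
            column, _, item = item.partition("=")
            column = column.strip()
        if ":" in item:
            lo, _, hi = item.partition(":")
            if not lo.strip() or not hi.strip():
                flagged.append((column, item))
    return flagged


print(open_ended_ranges("theta=:1"))
print(open_ended_ranges("theta=0:1"))
```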

Creating a vector on-the-fly when region filtering

When region-filtering images, you can create a vector on the fly from any two axes by using a filter like "(#1,#3)=circle(...)". Although the image is filtered correctly with a temporary vector, the region filter isn't recorded in the subspace. Hence, tools that use the filtered file don't know that pixels outside the filter region are invalid. As a result, dmstat reports no nulls in the filtered image (unless you explicitly tell the DM to set pixels outside the filter to null by using "opt null=...").

Applying a bit-filter expression to an integer column does not work, nor does it cause an error.

Using incorrect syntax with the rectangle shape does not fail when filtering.

For example, setting xmin > xmax and/or ymin > ymax does not produce an error; instead, it appears that the Data Model simply swaps the min and max values.

Filtering an image on logical coordinates causes problems when the short cut of omitting a number (i.e. to indicate the default value) is used.

The exit status of dmcopy is also incorrectly set to 0 (success):

unix% dmcopy "image.fits[#1=1:20,#2=:]" delme.fits
# DMCOPY (CIAO): [ftColRead]: FITS error 308 bad first element number in 
dataset image.fits Block 1 PRIMARY

unix% echo $status
0

Workarounds:

  1. Omit the "#2=:" from the filter

  2. Specify a range for both elements: [#1=1:20,#2=1:20]

Trying to exclude a region filter with update=no will cause the image to be filtered by the region's bounding box.

For example:

unix% dmcopy "acis_img.fits[exclude sky=region(src.fits)][opt full,update=no]" filtered.fits 

The regions are correctly excluded; however, the image is also clipped at the bounding box around all the excluded shapes, so the corners of a few chips are removed.

Workarounds:

  1. Remove update=no. In this case, the Data Model internally inverts all exclude filters to be an inclusive filter, and correctly filters the image.

    Be aware that this process is much slower if the region is large. In that case, it will also add a large region keyword to the file's header, noticeably slowing down any operation on that file.

  2. For ASCII region files, it is also possible to manually invert the filter in the file. The "field()" region syntax is used to include the entire field, then remove the undesired sources. For instance,

    # Region file format: CIAO version 1.0
    circle(1635.5,4113.5,135.11408)
    circle(3975,4233,20)
    circle(2565.5,4129.5,40)
    circle(2129.5,4007.5,40)
    

    would become

    # Region file format: CIAO version 1.0
    field()
    -circle(1635.5,4113.5,135.11408)
    -circle(3975,4233,20)
    -circle(2565.5,4129.5,40)
    -circle(2129.5,4007.5,40)
    

    and the dmcopy filtering command would be

    unix% dmcopy "acis_img.fits[sky=region(src.ascii)][opt full,update=no]" filtered.fits 
    
The "or" syntax ("||") doesn't work inside a clause  (15 Apr 2011)

This example command does not work:

unix% dmlist "evt2.fits[(ccd_id=5||ccd_id=7),pha=2500:3500]" blocks 

Workaround:

Rewrite to include the filter conditions in each part of the conditional.

unix% dmlist "evt2.fits[(ccd_id=7,pha=2500:3500)||(ccd_id=5,pha=2500:3500)]" blocks 
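
When such filters are generated by a script, the rewrite can be automated. The Python sketch below uses a hypothetical helper, distribute_or, that repeats the shared conditions inside each alternative; it only manipulates strings and does not check that the result is a valid DM filter.

```python
def distribute_or(alternatives, common):
    """Repeat the common conditions inside every "||" alternative."""
    clauses = ["(" + ",".join([alt] + list(common)) + ")"
               for alt in alternatives]
    return "||".join(clauses)


expr = distribute_or(["ccd_id=7", "ccd_id=5"], ["pha=2500:3500"])
print("evt2.fits[" + expr + "]")
```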

Using one column from a vector column in region filter

Trying to use one column that is part of a vector column in a region expression can lead to a crash:

% dmlist hrcf04482N003_evt2.fits"[(tg_lam,tg_d)=field()]" counts
4381489
# 24359: Received error signal SIGSEGV-segmentation violation.
# 24359: An invalid memory reference was made.
# 24359: segmentation fault: DMLIST (1) is:
exit_upon_error->NULL   

where tg_d is part of the rd = (tg_r,tg_d) vector column.

The only workaround is to create a temporary file that dismantles the vector column by removing the other column:

% dmcopy hrcf04482N003_evt2.fits"[cols -tg_r]" tmp_evt
% dmlist tmp_evt"[(tg_lam,tg_d)=field()]" counts
4381489

Filtering on array columns is undefined.

Filtering on an array column is not supported. The tool will run; however, the results are unpredictable.

ASCII Kernel

Error when reading SIMPLE text format with 0 rows.

SIMPLE ASCII files cannot have 0 rows. If there are 0 rows, then the column header definition row is incorrectly read. For example:

unix% echo "#foo" > bar
unix% dmlist bar cols
 
--------------------------------------------------------------------------------
Columns for Table Block bar
--------------------------------------------------------------------------------
 
ColNo  Name                 Unit        Type             Range
   1   foo                               String[4]                  

unix% dmlist bar data
 
--------------------------------------------------------------------------------
Data for Table Block bar
--------------------------------------------------------------------------------
 
ROW    foo
 
     1 #foo


unix% dmlist bar counts
1

The SIMPLE ASCII file should have 0 rows, but the output of the data option shows that the column definition row has been read in as data.
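
One way to guard against this case is to count the data rows before handing the file to a DM tool. The Python sketch below is a hypothetical pre-check; it assumes that, as in the example above, header lines start with "#" and every other non-blank line is a data row.

```python
def count_data_rows(text):
    """Count non-blank lines that do not start with "#"."""
    return sum(
        1
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    )


print(count_data_rows("#foo\n"))          # header only: no data rows
print(count_data_rows("#foo\n1\n2\n"))    # header plus two data rows
```

A file for which the count is 0 would trigger the behavior shown above and should be skipped or handled separately.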

Virtual columns are not supported.  (11 Dec 2007)

In the DM, you can normally do

unix% dmlist evt.fits"[cols ra,dec]"  data

even though RA and Dec are just coordinate systems defined on the X and Y columns in the file; the DM applies the transform on the fly. This doesn't work yet for ASCII files.

DTF-FIXED keyword comments may be truncated.

DTF-FIXED header lines may be up to 1024 characters long. However, if the keyword is longer than the FITS standard, the comment is truncated.

unix% dmcopy input.txt output.dtf'[opt kernel=text/dtf-fixed]'

In input.txt:

TTYPE14 = 'Class' / LV Class Exo: M = missile [B = tactical ballistic  
missile (except Redstone) apo=80:200]
R = research rocket O = orbital LV V = RTV Y = Exo weather rocket X =  
Big test rocket D= Deep space launch

In output.dtf:

TTYPE14 = "Class " / LV Class Exo: M = missile [B = tactical  
ballistic missile (except Redstone) apo

Long column descriptions

ASCII files may contain keyword and column descriptions longer than 80 characters. This can cause random failures in the DM (unable to open columns, wrong data type, etc).

Binning & Rebinning Images

Rebinning an image with different values for the two axes causes the coordinate information to be lost

For example:

unix% dmcopy acis.img"[bin x=::5,y=::6]" acis5x6.img

Using the same value for both axes works correctly:

unix% dmcopy acis.img"[bin (x,y)=::5]" acis5.img

Running a DM tool on an image where one of the axes has been filtered results in an error.  (17 Jul 2009)

unix% dmextract
Input event file (ccd3.sky4.fits[y=3767:][bin sky=annulus(3786,3767,0:380:4)]):
Enter output file name (rprof.fits):
# dmextract (CIAO 4.0 Beta 2): WARNING: Input file, "ccd3.sky4.fits[y=3767:]", 
has no rows in it.

Bus error

Images without a physical coordinate system

For images without a physical coordinate system, the DM will internally create one. This system is not written to the output file, which may lead to errors if the image size changes due to spatial filtering, because the output file then has a different physical WCS from the input file.

See Also

calibration
caldb
chandra
coords, level, pileup, times
concept
autoname, ciao, ciao-install, history, parameter, stack, subspace
dm
dmascii, dmbinning, dmfiltering, dmmasks, dmopt, dmregions
paramio
paramio
tools::coordinates
dmcoords
tools::core
dmcopy, dmextract, dmlist
tools::statistics
dmstat