| AHELP for CIAO 4.5 | dm |
Context: dm |
Synopsis
CIAO Data Model: syntax for filtering and binning files
Description
The CIAO Data Model (DM) is a versatile interface used by CIAO to examine and manipulate standard format datafiles (e.g. FITS, ASCII). The DM enables powerful filtering and binning of datafiles. This document is an introduction to the DM syntax used by the CIAO tools.
Table of Contents
- 1. DM Syntax and Virtual Files
- 2. Virtual Columns
- 3. Renaming and Reordering Columns
Related help files contain information and examples illustrating the capabilities of the DM. A list of these files can also be obtained from the CIAO command line with "about dm" or "ahelp -k dm".
- ahelp dmascii: using text files in CIAO
- ahelp dmbinning: creating images from event files/tables
- ahelp dmfiltering: table and image filtering
- ahelp dmregions: the CIAO region-filtering syntax
- ahelp dmopt: controlling internal DM options (setting NULL characters, providing more memory to a tool)
- ahelp coords: a discussion of the Chandra coordinate systems
- ahelp subspace: how files keep a history of the filters applied to them
Detailed technical information is available from the Introduction to the Data Model memo.
1. DM Syntax and Virtual Files
The Data Model offers an easy and powerful means of filtering data. The filtered file can be directly input to a tool without writing it to disk first; this is known as a "virtual file." The virtual file, which can also be referred to as a subspace, is simply a means of defining a subset of interest in the dataset.
The basic syntax of a virtual file is:
filename[block][filter][binning][option][rename]
filename[block][filter][columns][option][rename]
filename: the input filename. All CIAO tools accept FITS file input, and many also accept ASCII files. Some tools only work on event files, while others require an input image. Refer to the individual tool help files for any restrictions.
[block]: the extension of the file to use, e.g. the name of the image or table. For FITS files, the block corresponds to an HDU and may be identified by name ("[EVENTS]") or number ("[2]"). If the block is not specified, the first "interesting" block is used (e.g. [EVENTS] for an event file). To view the blocks in a file, use "dmlist file.fits blocks".
[filter]: the filter to apply to the data. It indicates, for instance, which time period, energy range, or spatial region to use (e.g. "[time=1522012:1522320,1522400:1522600]"). Refer to "ahelp dmfiltering" for a full discussion of filtering.
[binning]: the binning specification for creating an image from an event file (e.g. "[bin x=10:100:1,y=1:100:1]"). Refer to "ahelp dmbinning" for a full discussion of binning.
[columns]: the names of the columns to include ("[cols time,energy]") or exclude ("[cols -phas]"). The syntax "[cols !phas]" may also be used, but the "!" symbol needs to be written as "\!" in the Unix shell, making the "-" syntax more convenient.
[option]: advanced options for the DM, such as specifying what the NULL character should be or how much memory to allow a tool to use. Refer to "ahelp dmopt" for a list of the available options.
[rename]: the name for the block in the output file. The default behavior is for the output to have the same block name, unless a file is binned to create an image; in that case, "_IMAGE" is added to the block name. (For information on renaming columns, refer to a later section in this file.)
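As a rough illustration of how these pieces combine, the following Python sketch assembles a virtual file specification from its parts. The helper name `build_vfile` and its parameters are hypothetical and are not part of CIAO. The sketch only concatenates strings in the order shown in the basic syntax above, and it assumes the binning and columns forms are alternatives, as the two syntax lines suggest.

```python
def build_vfile(filename, block=None, filt=None, binning=None,
                columns=None, option=None, rename=None):
    """Assemble a DM virtual file string (hypothetical helper, not a CIAO API)."""
    # Assumption: the two syntax forms are alternatives, so a spec has
    # either a binning part or a columns part, not both.
    if binning and columns:
        raise ValueError("give either a binning or a columns specification")
    parts = [block, filt]
    if binning:
        parts.append("bin " + binning)
    if columns:
        parts.append("cols " + columns)
    if option:
        parts.append("opt " + option)
    parts.append(rename)
    # Skip the components that were not supplied.
    return filename + "".join("[" + p + "]" for p in parts if p)


image_spec = build_vfile("evt.fits", block="EVENTS",
                         filt="time=1522012:1522320",
                         binning="x=10:100:1,y=1:100:1")
table_spec = build_vfile("evt.fits", block="EVENTS", columns="time,energy")
print(image_spec)
print(table_spec)
```

On the Unix command line the resulting string would normally be enclosed in quotes so that the shell does not interpret the brackets.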
2. Virtual Columns
A file may contain virtual columns whose values are calculated by applying a mathematical transform to an existing column. Virtual columns - such as EQPOS(RA,DEC) - do not physically exist in the event file; they are defined by the WCS information attached to another column, e.g. SKY.
The transformation is listed in the output of "dmlist evt2.fits cols":
1: EQPOS(RA ) = (+278.3860) +TAN[(-0.000136667)* (sky(x)-(+4096.50))]
(DEC) (-10.5899 ) (+0.000136667) ( (y) (+4096.50))
For most applications, these columns may be used the same as non-virtual columns in the file. It is possible to list, filter, and bin on virtual columns.
However, filtering and binning do not work reliably on virtual columns derived from non-monotonic coordinate transforms (e.g. MSC(THETA,PHI), or EQPOS near the poles; see "ahelp coords" for more information on these coordinate systems).
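To make the idea of a virtual column concrete, the sketch below evaluates a tangent-plane (TAN) transform in plain Python using the numbers from the dmlist output above. It uses the standard gnomonic deprojection formulas and is not the WCS library code that the DM itself calls. The function name `sky_to_eqpos` is an assumption introduced for illustration.

```python
import math

# Values read from the "dmlist evt2.fits cols" output shown above.
RA0, DEC0 = 278.3860, -10.5899                 # reference sky position, degrees
CDELT_X, CDELT_Y = -0.000136667, 0.000136667   # degrees per sky pixel
CRPIX_X, CRPIX_Y = 4096.50, 4096.50            # reference pixel


def sky_to_eqpos(x, y):
    """Convert sky (x, y) pixels to (RA, Dec) in degrees via a TAN projection."""
    # Offsets from the reference pixel, in radians on the tangent plane.
    xi = math.radians(CDELT_X * (x - CRPIX_X))
    eta = math.radians(CDELT_Y * (y - CRPIX_Y))
    dec0 = math.radians(DEC0)
    denom = math.cos(dec0) - eta * math.sin(dec0)
    ra = RA0 + math.degrees(math.atan2(xi, denom))
    dec = math.degrees(math.atan2(math.sin(dec0) + eta * math.cos(dec0),
                                  math.hypot(xi, denom)))
    return ra, dec


# At the reference pixel the transform returns the reference position.
ra, dec = sky_to_eqpos(4096.5, 4096.5)
print(ra, dec)
```

Because the x scale is negative, RA decreases as sky x increases, which is the usual convention for images with east to the left.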
3. Renaming and Reordering Columns
It is possible to rename a column or change the order of the columns within a file. Note that certain CIAO tools require particular column names (e.g. time, energy), but none of the tools make assumptions about the order of the columns within a file.
To rename a column, run dmcopy with the column syntax "newname=oldname". Multiple columns may be renamed in the same command.
dmcopy "pi.fits[cols rate=count_rate]" pi_rate.fits
dmcopy "pi.fits[cols rate=count_rate, rate_err=count_rate_err,*]" \
pi_rate_all.fits
The "count_rate" column in pi.fits is renamed to "rate" in pi_rate.fits. With the first command, "rate" will be the only column in the output file. In the second command, two columns are renamed and the "*" operator indicates that all other columns should be copied unchanged to the output file.
The columns will appear in the output file in the order in which they are specified. So in the renaming case, "rate" will be the first column in the output. The "cols" syntax can be used to reorder columns without modifying them as well:
dmcopy "pi.fits[cols energy,time,pi,count_rate, *]" reorder.fits
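The renaming and ordering rules described above can be modelled in a few lines of Python. This is only an illustrative model, not how the DM is implemented. The function `apply_cols` is hypothetical, and the assumption that "*" expands to the remaining columns in their original order is mine.

```python
def apply_cols(columns, spec):
    """Return the output column names for a 'cols' specification (toy model)."""
    pairs = []
    for item in (s.strip() for s in spec.split(",")):
        if not item:
            continue
        if item == "*":
            pairs.append(("*", "*"))
        elif "=" in item:
            # "newname=oldname" renames a column.
            new, old = (p.strip() for p in item.split("=", 1))
            pairs.append((new, old))
        else:
            pairs.append((item, item))
    named = {old for _, old in pairs if old != "*"}
    missing = named - set(columns)
    if missing:
        raise KeyError(f"unknown column(s): {sorted(missing)}")
    out = []
    for new, old in pairs:
        if old == "*":
            # Assumption: "*" keeps the untouched columns in file order.
            out.extend(c for c in columns if c not in named)
        else:
            out.append(new)
    return out


cols = ["time", "energy", "pi", "count_rate", "count_rate_err"]
print(apply_cols(cols, "rate=count_rate"))
print(apply_cols(cols, "rate=count_rate, rate_err=count_rate_err,*"))
print(apply_cols(cols, "energy,time,pi,count_rate, *"))
```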
Note that for a vector column sky(x,y),
"evt.fits[cols x,y]"
will retain the information that (x,y) is a vector column called "sky". Any of the following, however, will separate the vector components and lose the vector-dependent coordinate systems like RA and Dec:
[cols x]
[cols y,x]
[cols x,pha,y]
Example 1
acisf01843N002_evt2.fits[EVENTS][cols #1,#2,#3]
acisf01843N002_evt2.fits[cols time,ccd_id,node_id]
Select three columns of the EVENTS block by number or by name.
Example 2
acisf01843N002_evt2.fits[#row=1:4]
Select rows 1-4 from a FITS file.
Example 3
dmlist "evt.fits[events][pha=30:200,time=10:20,50:60]" data
Use the tool dmlist to print specific data values from the file to the screen. A filter is applied to the "events" block in the file evt.fits. The filter selects rows in the table for which the value of the pha column is >= 30 and < 200, and for which the time is either >= 10 and < 20 or >= 50 and < 60. Both the pha and time filters must be satisfied for a row to pass the filter.
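The logic of this filter can be mimicked in plain Python to show how the pieces combine: ranges on the same column are alternatives, while filters on different columns must all hold. The sketch takes the pha range from the filter expression (30:200) and follows the boundary convention stated in the description above. The function names and the sample rows are made up for illustration.

```python
def in_ranges(value, ranges):
    """True if lo <= value < hi for at least one (lo, hi) pair."""
    return any(lo <= value < hi for lo, hi in ranges)


def passes(row):
    """Model of [pha=30:200,time=10:20,50:60]: both column filters must hold."""
    return (in_ranges(row["pha"], [(30, 200)])
            and in_ranges(row["time"], [(10, 20), (50, 60)]))


# Invented sample rows, not taken from a real event file.
rows = [
    {"pha": 30, "time": 10.0},    # passes: both lower limits are included
    {"pha": 150, "time": 55.5},   # passes: second time range
    {"pha": 150, "time": 30.0},   # fails: time is between the two ranges
    {"pha": 250, "time": 15.0},   # fails: pha is out of range
    {"pha": 100, "time": 20.0},   # fails: time must be < 20
]
selected = [r for r in rows if passes(r)]
print(len(selected))
```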
Example 4
acisf01843N002_evt2.fits[EVENTS][bin x=3200:4800:4,y=3200:4800:4]
Bin an event file into an image with this input to the tool dmcopy.
Example 5
acisf01843N002_evt2.fits[EVENTS][bin pi=1:1024:1]
Use this specification as input to the tool dmextract to bin an event file into a PI spectrum.
Example 6
dmcopy "evt.fits[cols -status]" evt_new.fits
Remove the status column from evt.fits and write the result to evt_new.fits.
Bugs
General
The WCS library that the DM uses has a problem computing coordinate transforms that involve the CAR transform.
(The image doesn't have to be square; it just needs to have 8192^2 pixels.)
This condition may be met when the "update=no" option is used. Normally, when you filter a dataset, the data subspace (which describes the boundaries of each column's data and therefore is the intersection of the initial minima and maxima with any subsequent filters) gets updated to reflect the filtering. However, when you give the "update=no" option, you instruct the DM not to update the subspace to reflect the current filter. Therefore, the full ranges for x and y are used in the binning, and you get an 8192x8192 image (and a seg fault, for the reason described above).
(02 May 2013)
There are three issues with the generic use of TC* keywords in FITS files read and written by DM tools.
The first issue: TC*n[A-Z]. The DM function that composes this keyword does not strip off the 'P' from the before adding number and letter to the end.
Second issue: TCTY* keys are not recognized and therefore are not processed.
Third issue: DM is not always retaining the T*NAM information when it stores information on the DM descriptor, resulting in output keywords T*TYP instead.
Filtering Data
When filtering on WCS columns, the range is computed by converting the range of the parent columns and using the results as the limits of the WCS columns. When the transform is highly non-linear, e.g. the TAN-P transform used to go from DETX,DETY to THETA,PHI, this can lead to incorrect limits and incorrect filters. Users who want to filter on WCS columns should give explicit ranges and not rely on the computed min/maxes.
bad% dmcopy "evt.fits[theta=:1]"
good% dmcopy "evt.fits[theta=0:1]"
"col=foo" is okay, but "col=foo,bar" isn't.
Workaround:
Use "col=foo,col=bar" instead.
When region-filtering images, you can create a vector on the fly from any two axes by using a filter like "(#1,#3)=circle(...)". Although the image is filtered correctly with a temporary vector, the region filter isn't recorded in the subspace. Hence, tools that use the filtered file don't know that pixels outside the filter region are invalid. As a result, dmstat reports no nulls in the filtered image (unless you explicitly tell the DM to set pixels outside the filter to null by using "opt null=...").
For example, setting xmax > xmin and/or ymax > ymin. Instead it appears that the Data Model simply swaps the min and max values.
The exit status of dmcopy is also incorrectly set to 0 (success):
unix% dmcopy "image.fits[#1=1:20,#2=:]" delme.fits
# DMCOPY (CIAO): [ftColRead]: FITS error 308 bad first element number in dataset image.fits Block 1 PRIMARY
unix% echo $status
0
Workarounds:
Omit the "#2=:" from the filter
Specify a range for both elements: [#1=1:20,#2=1:20]
For example:
unix% dmcopy "acis_img.fits[exclude sky=region(src.fits)][opt full,update=no]" filtered.fits
The regions are correctly excluded; however, the image is also clipped at the bounding box around all the excluded shapes, so the corners of a few chips are removed.
Workarounds:
- Remove update=no. In this case, the Data Model internally inverts all exclude filters to be an inclusive filter, and correctly filters the image.
Be aware that this process is much slower if the region is large. In that case, it will also add a large region keyword to the file's header, noticeably slowing down any operation on that file.
- For ASCII region files, it is also possible to manually invert the filter in the file. The "field()" region syntax is used to include the entire field, then remove the undesired sources. For instance,
# Region file format: CIAO version 1.0
circle(1635.5,4113.5,135.11408)
circle(3975,4233,20)
circle(2565.5,4129.5,40)
circle(2129.5,4007.5,40)
would become
# Region file format: CIAO version 1.0
field()
-circle(1635.5,4113.5,135.11408)
-circle(3975,4233,20)
-circle(2565.5,4129.5,40)
-circle(2129.5,4007.5,40)
and the dmcopy filtering command would be
unix% dmcopy "acis_img.fits[sky=region(src.ascii)][opt full,update=no]" filtered.fits
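For region files with many shapes, the manual inversion described above could be scripted. The following Python sketch is one possible approach, assuming the file contains only comment lines and simple included shapes, one per line. The function name `invert_region_text` is hypothetical and is not a CIAO tool.

```python
def invert_region_text(text):
    """Rewrite a list of included shapes as field() minus those shapes."""
    out = []
    field_added = False
    for line in text.splitlines():
        stripped = line.strip()
        # Keep comments and blank lines unchanged.
        if not stripped or stripped.startswith("#"):
            out.append(line)
            continue
        # Insert field() once, just before the first shape.
        if not field_added:
            out.append("field()")
            field_added = True
        out.append("-" + stripped)
    return "\n".join(out) + "\n"


original = (
    "# Region file format: CIAO version 1.0\n"
    "circle(1635.5,4113.5,135.11408)\n"
    "circle(3975,4233,20)\n"
)
inverted = invert_region_text(original)
print(inverted, end="")
```

The inverted text would then be saved to a new ASCII region file and used with an inclusive "sky=region(...)" filter, as in the dmcopy command above.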
For example, the following commands both fail:
unix% dmlist "region.fits[shape!=Annulus]" data
unix% dmlist catalog.fits"[COMMENT!='weak'][cols COMMENT]" data
For example, this command does not find all instances of "PMterm" in the selected columns:
unix% dmlist stat.fits"[src=PMterm||det=PMterm||mst=PMterm]" counts
13
Compare to
unix% dmlist stat.fits"[cols det,src,mst]" data,clean | grep PMterm | wc -l
27
This example command does not work:
unix% dmlist "evt2.fits[(ccd_id=5||ccd_id=7),pha=2500:3500]" blocks
Workaround:
Rewrite to include the filter conditions in each part of the conditional.
unix% dmlist "evt2.fits[(ccd_id=7,pha=2500:3500)||(ccd_id=5,pha=2500:3500)]" blocks
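The rewrite in this workaround is the usual distributive law: "(a OR b) AND c" selects the same rows as "(a AND c) OR (b AND c)". A small Python sketch can generate the expanded filter string and confirm the equivalence for every combination of truth values. The helper `distribute_filter` is hypothetical and only builds text.

```python
from itertools import product


def distribute_filter(alternatives, common):
    """Attach the shared condition to each OR-ed alternative."""
    return "||".join(f"({alt},{common})" for alt in alternatives)


spec = distribute_filter(["ccd_id=7", "ccd_id=5"], "pha=2500:3500")
print(spec)

# Check the Boolean identity behind the rewrite for all eight cases.
equivalent = all(
    ((a or b) and c) == ((a and c) or (b and c))
    for a, b, c in product([False, True], repeat=3)
)
print(equivalent)
```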
ASCII Kernel
In the DM, you can normally do
unix% dmlist evt.fits"[cols ra,dec]" data
even though RA and Dec are just coordinate systems defined on the X and Y columns in the file; the DM applies the transform on the fly. This doesn't work yet for ASCII files.
DTF-FIXED header lines may be up to 1024 characters long. However, if the keyword is longer than the FITS standard, the comment is truncated.
unix% input.txt output.dtf'[opt kernel=text/dtf-fixed]'
In input.txt:
TTYPE14 = 'Class' / LV Class Exo: M = missile [B = tactical ballistic missile (except Redstone) apo=80:200] R = research rocket O = orbital LV V = RTV Y = Exo weather rocket X = Big test rocket D= Deep space launch
In output.dtf:
TTYPE14 = "Class " / LV Class Exo: M = missile [B = tactical ballistic missile (except Redstone) apo
(02 May 2013) The text/tsv ASCII parser does not recognize columns with a G format used for floating-point, real-valued data such as src_rate_aper_*. The exact behavior will vary based on the other columns returned, and in what order, but generally all columns after the G format will be unreadable by DM tools.
Workaround:
Users can work around the problem by editing the .tsv file and changing any (Gx.y) values to (Fx.y). For example:
Original file:
#Column src_rate_aper_b (G9.5) Aperture-corrected net count rate in...
Modified file:
#Column src_rate_aper_b (F9.5) Aperture-corrected net count rate in...
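When many columns are affected, the edit could be automated. The Python sketch below is one way to do it, assuming the format codes appear only on header lines that begin with "#Column" and always have the form (Gw.d). The function name `fix_tsv_header` is hypothetical.

```python
import re

# Matches a format code such as (G9.5) and captures the "9.5" part.
_G_FORMAT = re.compile(r"\(G(\d+\.\d+)\)")


def fix_tsv_header(text):
    """Change (Gx.y) to (Fx.y) on '#Column' header lines only."""
    fixed = []
    for line in text.splitlines(keepends=True):
        if line.startswith("#Column"):
            line = _G_FORMAT.sub(r"(F\1)", line)
        fixed.append(line)
    return "".join(fixed)


header = "#Column src_rate_aper_b (G9.5) Aperture-corrected net count rate in...\n"
print(fix_tsv_header(header), end="")
```

Writing the result to a new file rather than overwriting the original keeps the unmodified .tsv file available for comparison.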
Binning & Rebinning Images
For example:
unix% dmcopy acis.img"[bin x=::5,y=::6]" acis5x6.img
Using the same value for both axes works correctly:
unix% dmcopy acis.img"[bin (x,y)=::5]" acis5.img
unix% dmextract
Input event file (ccd3.sky4.fits[y=3767:][bin sky=annulus(3786,3767,0:380:4)]):
Enter output file name (rprof.fits):
# dmextract (CIAO 4.0 Beta 2): WARNING: Input file, "ccd3.sky4.fits[y=3767:]", has no rows in it.
Bus error
