Last modified: December 2023

URL: https://cxc.cfa.harvard.edu/ciao/ahelp/dmascii.html
AHELP for CIAO 4.17

dmascii

Context: dm

Synopsis

Using the Data Model with text files

Description

CIAO users are familiar with the flexible filtering and binning capability that the Data Model tools provide with FITS files. Since CIAO 4.0, the same tools also work on ASCII (text) files, using the 'ASCII kernel'.

This kernel enables easy text file manipulation by DM-specific tools such as dmlist, dmcopy, dmstat, and dmtcalc. However, other CIAO tools (e.g. aconvolve) are not guaranteed to work smoothly with ASCII files.

For example, if we have a raw text file containing a table with three columns and four rows:

unix% cat sample.dat
     21.0  41.3  21.8
     22.0  41.1  20.2
     23.0  43.8  17.3
     24.0  12.3  11.1

then we can use this file with many CIAO tools; for instance

unix% dmlist sample.dat cols
 
--------------------------------------------------------------------------------
Columns for Table Block sample.dat
--------------------------------------------------------------------------------
 
ColNo  Name                 Unit        Type             Range
   1   col1                              Real8          -Inf:+Inf            
   2   col2                              Real8          -Inf:+Inf            
   3   col3                              Real8          -Inf:+Inf            

You can use the full CIAO virtual-file syntax to filter these files; for example, to display only those rows of the last two columns where the third column is between 11 and 20, you could say:

unix% dmlist "sample.dat[col3=11:20][cols col2,col3]" data,clean
#  col2                 col3
                43.80                17.30
                12.30                11.10

This command may be repeated with dmcopy to create an output file of the filtered data in FITS or text format:

unix% pset dmcopy infile="sample.dat[col3=11:20][cols col2,col3]"
unix% dmcopy outfile=filtered.fits
unix% dmcopy outfile="filtered.txt[opt kernel=text/simple]"

Additional examples are included later in this document. Note that the DM creates FITS format output by default; the kernel option must be specified every time to produce a text file.
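To make the filter used above concrete, the following Python sketch mimics what "[col3=11:20][cols col2,col3]" selects from sample.dat. It is only an illustration and not how the DM implements filtering: the helper name filter_rows is hypothetical, and the range is assumed to include both end points.

```python
SAMPLE = """\
     21.0  41.3  21.8
     22.0  41.1  20.2
     23.0  43.8  17.3
     24.0  12.3  11.1
"""

def filter_rows(text, lo, hi):
    """Keep (col2, col3) for rows whose col3 lies in [lo, hi] (assumed inclusive)."""
    kept = []
    for line in text.splitlines():
        fields = line.split()  # whitespace-separated fields, as in text/raw
        if not fields:
            continue
        col1, col2, col3 = (float(f) for f in fields)
        if lo <= col3 <= hi:
            kept.append((col2, col3))
    return kept

rows = filter_rows(SAMPLE, 11, 20)
for col2, col3 in rows:
    print(f"{col2:21.2f}{col3:21.2f}")
```

Only the last two rows of the sample survive the filter, matching the dmlist output shown earlier.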

Supported Text Formats

There are currently four formats recognized by the ASCII kernel:

text/raw

Raw text table format consists of free-format columns with no header information. There are only two supported datatypes for values in this format: dmDOUBLE and dmTEXT. All columns are scalar and are given the default names "col1", "col2", etc.

text/simple

The simple format is compatible with the SM plotting program. It is similar to raw, but allows header information to be included as a series of comment lines. Each line of the header must begin with the comment character (see the "comment" option). Either the first or the last line of this header block may be used to specify the table column names (see the "colnames" option). In its briefest form, the header consists of a single line of column names.

text/dtf or text/dtf-fixed

Data Text Format (DTF) is a pseudo-FITS format with support for the full list of datatypes, header keywords and data subspaces. Free format is the default, but fixed-format fields are also supported. This format MUST be used in order to define an image.

text/tsv

Generic TSV format files are recognized, as well as the extended header detail provided by the Chandra Source Catalog (CSC) output format. The TSV flavor cannot be auto-determined; it must be specified either with the kernel syntax or by putting the TSV flavor specification line at the top of the file using the syntax:

#TEXT/TSV

On input, the particular ASCII format will be auto-determined. The user may override this by including the "kernel" option on the input file name. Since the default output kernel is FITS, the user MUST specify the output ASCII flavor by including the "kernel" option in the output file name.
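As an illustration of the text/simple layout in its briefest form (a single comment line of column names followed by the data rows), here is a minimal Python sketch that builds such text. The function name format_simple_table and its arguments are hypothetical, and the output has not been checked against CIAO here.

```python
def format_simple_table(colnames, rows, comment="#", sep=" "):
    """Return text laid out as described for text/simple:
    one comment line of column names, then one line per data row."""
    lines = [comment + sep.join(colnames)]
    for row in rows:
        lines.append(sep.join(str(value) for value in row))
    return "\n".join(lines) + "\n"

text = format_simple_table(
    ["name", "tx", "ty"],
    [("alpha", 319, 434), ("beta", 219, 532)],
)
print(text, end="")
```

The header names are delimited by the same character as the data, which is what the "colnames" option described below expects.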

ASCII Kernel Options

There are several options that may be used to tailor the interface for a particular text file. You can use these options to read text files with a slightly different structure from the default, for example by skipping header lines or changing the field separator. Multiple options should be provided as a comma-separated list.

[opt colnames={value}]

Specifies which header line defines the column names. This line begins with a comment character, followed by a list of names delimited by the same character as the data. Supported values are "first", "last", and "none". If the "none" value is given, columns are defined from the data and auto-named "col1", "col2", etc.

Column names may contain alphanumeric characters as well as the hash (#), underscore (_) or dash (-) character, although the latter should be avoided if possible. One may define array and vector columns by using the following name syntax:

Syntax            Description              Example
name(cpt1,cpt2)   Vector of 2 components   POS(X,Y)
name[size]        Array of length n        PHAS[3]
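The name syntax above can be illustrated with a short Python sketch that classifies a column name as scalar, vector, or array. This is not the DM parser: the function parse_colname, the regular expressions, and the returned dictionary layout are assumptions made for the example.

```python
import re

# name(cpt1,cpt2) -> vector column; name[size] -> array column.
# Names may use alphanumerics, underscore, hash, or dash.
_VECTOR = re.compile(r"^([\w#-]+)\(([^()]+)\)$")
_ARRAY = re.compile(r"^([\w#-]+)\[(\d+)\]$")

def parse_colname(spec):
    """Classify one column-name token from a header line."""
    m = _VECTOR.match(spec)
    if m:
        parts = [c.strip() for c in m.group(2).split(",")]
        return {"name": m.group(1), "kind": "vector", "components": parts}
    m = _ARRAY.match(spec)
    if m:
        return {"name": m.group(1), "kind": "array", "size": int(m.group(2))}
    return {"name": spec, "kind": "scalar"}

for token in ["time", "POS(X,Y)", "PHAS[3]"]:
    print(parse_colname(token))
```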

[opt comment={value}]

Lines that begin with the given character (e.g. "#") will be treated as comments. For text/raw format files, all comment lines are ignored. For text/simple format, comment lines which occur prior to the first data line will be retained, while any occurring within the data segment will be ignored. There is one special comment line which the "colnames" option controls, as described above.

[opt nullstr={value}]

On input, this value specifies an arbitrary string which should be interpreted as representing a NULL value. This is in addition to the 'default' NULL values for each datatype:

Type      Values
INTEGER   {empty}, {tnull}, -, INDEF, INF
REAL      {empty}, -, NaN, INDEF
STRING    {empty}

On output, this string will be used to represent all NULL values.
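The input-side rule for REAL columns can be sketched in Python as follows. This only illustrates the table above and is not the kernel's own code: the function parse_real is hypothetical, NULL is represented here by a floating-point NaN, and token matching is assumed to be case-sensitive.

```python
import math

# Default NULL tokens for REAL columns, taken from the table above;
# the empty string stands for {empty}.
REAL_NULLS = {"", "-", "NaN", "INDEF"}

def parse_real(token, nullstr=None):
    """Return a float, or NaN when the token represents a NULL value."""
    token = token.strip()
    if token in REAL_NULLS or (nullstr is not None and token == nullstr):
        return math.nan
    return float(token)

# "N/A" plays the role of a user-supplied nullstr value.
values = [parse_real(t, nullstr="N/A") for t in ["41.3", "INDEF", "N/A", "-"]]
print(values)
```

The user-supplied string is accepted in addition to the defaults, so "INDEF" and "-" are still read as NULL when nullstr is set.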

[opt skip={value}]

Skip the given number of lines at the beginning of the file. This helps handle some formats with fixed headers. For example, '[opt skip=3]' will skip the first three (3) lines of the file.

[opt sep={value}]

Define the given character(s) to be the separator for data fields. Any printing ASCII character from space (' ') to tilde ('~') may be a separator character, except the single quote ('), double quote ("), and backslash (\) characters. In addition, the non-printing tab character (HT) may be used (specified as '\t'). If more than one character is to be used, or if the space or comma character is used, the list of separators must be enclosed by quotes. Each instance of the character is interpreted as a new field.

Examples

Syntax            Description
[opt sep=:]       colon delimited values
[opt sep=" "]     space delimited values
[opt sep=":;"]    values delimited by EITHER colon or semi-colon

[white] qualifier

If the "white" qualifier is included, the separator is treated as whitespace. This means that if you have multiple separator characters next to each other, they only count as one separator.
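The difference the "white" qualifier makes can be shown with a small Python sketch. It is an illustration under stated assumptions rather than the kernel's implementation: the function split_fields is hypothetical, and dropping leading and trailing separator runs in whitespace mode is assumed from the way the indented rows of sample.dat are read as three columns.

```python
import re

def split_fields(line, sep=" \t\r", white=True):
    """Split one data line on any single character listed in `sep`."""
    char_class = "[" + re.escape(sep) + "]"
    if white:
        # Adjacent separators count as one; empty fields produced by
        # leading or trailing separator runs are dropped (assumption).
        return [f for f in re.split(char_class + "+", line) if f != ""]
    # Otherwise every separator instance starts a new field,
    # so adjacent separators yield empty fields.
    return re.split(char_class, line)

print(split_fields("     21.0  41.3  21.8"))
print(split_fields("a::b", sep=":", white=False))
print(split_fields("a::b", sep=":", white=True))
print(split_fields("a:b;c", sep=":;", white=False))
```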

The defaults for each output format are:

Option     Raw       Simple    DTF
colnames   none      first     n/a
comment    '#'       '#'       n/a
nullstr    "",NaN    "",NaN    "",NaN
skip       0         0         0
sep        ' \t\r'   ' \t\r'   ' \t\r'
white      on        on        on

Limitations

The ASCII kernel was developed to allow CIAO users to use the familiar DM syntax in manipulating and filtering text files; it is not intended as a replacement for the FITS kernel in pipelines. The following are some limitations of the kernel:


Examples

Example 1

unix% dmlist "input.txt[time=100:1000,energy=10:20]" data

Filter a text file on time and energy, printing the filtered data to the screen.

Example 2

unix% dmcopy input.fits "output.txt[opt kernel=text/simple]"

Copy a FITS file to a simple text output file that may be used by other code.

Example 3

unix% dmcopy input.txt output.fits

Create a FITS file from a text file.

Example 4

unix% dmcopy input.txt "output.txt[opt kernel=text/dtf-fixed]"

Convert a simple text table to a fixed-format DTF table.

Example 5

unix% dmcopy "data.txt[time=100:200][opt sep=@]" "filtered.out[opt kernel=text,sep=&]"

Copy the filtered input data to the output file, changing the separator character from "@" to "&". The output could then be used to create a table in LaTeX.

Example 6

unix% dmextract "event.fits[bin pi]" "pha.txt[opt kernel=text/dtf]" opt=pha1
unix% text_spectrum_program pha.txt > new.pha.txt
unix% dmsort new.pha.txt sort.pha.fits

Create a Type 1 PHA file in DTF text format, propagating the full header. Use an external text-based tool, then run dmsort. The final file is again in FITS format.

Example 7

unix% dmtcalc LaunchData.txt"[cols length,diameter,launch_mass]" calc_results.txt"[opt kernel=text/simple]" expr="result=diameter*length*(diameter/launch_mass)"

Run dmtcalc on a fixed-format text file, creating a new column named "result".

Example 8

unix% dmcopy "events.txt[bin energy=500:700:10]" "myout.img[opt kernel=text/dtf]"

Bin an ASCII table file into an ASCII image file.


Sample files

Below are basic samples of each of the ASCII formats.

Table: text/raw

alpha 167413425.5456684232 319 434 2119.869 4193.105
beta  167413425.5456684232 219 532 2020.054 4096.020
gamma 167413425.5867084265 607 483 3448.656 4140.115
delta 167413427.1456484795 420 331 4306.355 4289.419

Table: text/simple

#name time tx ty sky(x,y)
alpha 167413425.5456684232 319 434 2119.869 4193.105
beta  167413425.5456684232 219 532 2020.054 4096.020
gamma 167413425.5867084265 607 483 3448.656 4140.115
delta 167413427.1456484795 420 331 4306.355 4289.419

Table: text/dtf

XTENSION='TABLE'
HDUNAME = "MyINFO"
TFIELDS = 8
TTYPE1  = "name    "          
TFORM1  = "10A     "           / data format of field.
TTYPE2  = "time    "           / Time of event
TFORM2  = "1D      "           / data format of field.
TTYPE3  = "tx      "           / Tile position - X
TFORM3  = "1I      "           / data format of field.
TTYPE4  = "ty      "           / Tile position - Y
TFORM4  = "1I      "           / data format of field.
TTYPE5  = "x       "           / Sky position - X
TFORM5  = "1E      "           / data format of field.
TTYPE6  = "y       "           / Sky position - Y
TFORM6  = "1E      "           / data format of field.
TTYPE7  = "useFlag "           / Record usable? 
TFORM7  = "1L      "           / data format of field.
TTYPE8  = "status  "           / Event status bits
TFORM8  = "3X      "           / data format of field.
MTYPE1 = sky
MFORM1 = x,y
END

alpha 167413425.5456684232 319 434 2119.869 4193.105 T 001
beta  167413425.5456684232 219 532 2020.054 4096.020 T 010
gamma 167413425.5867084265 607 483 3448.656 4140.115 F 011
delta 167413427.1456484795 420 331 4306.355 4289.419 T 100

Image: text/dtf

XTENSION='IMAGE'
HDUNAME = "MyIMG"
BITPIX  =                   16
NAXIS   =                    2
NAXIS1  =                    4
NAXIS2  =                    3
MTYPE1  = "sky     "
MFORM1  = "x,y     "
END

1  2  3  4 
5  6  7  8 
9 10 11 12

Bugs

See the bugs page for the Data Model library on the CIAO website for an up-to-date listing of known bugs.


See Also

calibration
    caldb
chandra
    coords, level, pileup, times
concept
    autoname, ciao, ciao-install, history, parameter, stack, subspace
dm
    dm, dmbinning, dmfiltering, dmmasks, dmopt, dmregions
paramio
    paramio
tools::coordinates
    dmcoords
tools::core
    dmcopy, dmextract, dmlist
tools::statistics
    dmstat