Reading and writing files in Python
The CIAO Data Model can be
used to read and write FITS and ASCII files in Python,
using the crates module.
This provides a high-level interface to data; the
cxcdm module is also provided for those cases
where low-level access is required and the user is familiar
with the
Data Model
documentation.
The AstroPy
I/O module can be used to read and write files with
Python, but the CIAO helpdesk does not provide support for this
module. The AstroPy I/O modules do not recognize Data Model syntax, such as column selection,
filtering, or binning.
The following imports are assumed below:
Many Crates routines can be used either
as a function, such as get_colvals, or as a
method on an object, in this case the get_column method of
the TABLECrate object returned by
read_file. This guide prefers the method
option, but either can be used (although note that sometimes there
are differences in what the calls return).
For most examples the following event file will be used:
% dmlist acisf13770_repro_evt2.fits blocks
--------------------------------------------------------------------------------
Dataset: acisf13770_repro_evt2.fits
--------------------------------------------------------------------------------
Block Name Type Dimensions
--------------------------------------------------------------------------------
Block 1: PRIMARY Null
Block 2: EVENTS Table 16 cols x 85650 rows
Block 3: GTI3 Table 2 cols x 1 rows
Block 4: GTI2 Table 2 cols x 1 rows
Block 5: GTI1 Table 2 cols x 2 rows
Block 6: GTI0 Table 2 cols x 1 rows
Block 7: GTI6 Table 2 cols x 1 rows
Note that the CIAO contributed package includes the
crates_contrib.utils
and
crates_contrib.images
modules that may provide useful routines.
Reading a single block from a FITS file
The read_file function will read in
the "most interesting" block (in this case the
"EVENTS block), returning a
"crate" object:
Crate Type: <TABLECrate>
Crate Name: EVENTS
Ncols: 16
Nrows: 85650
acisf13770_repro_evt2.fits
['time', 'expno', 'ccd_id', 'node_id', 'chip(chipx, chipy)',
'tdet(tdetx, tdety)', 'det(detx, dety)', 'sky(x, y)', 'phas',
'pha', 'pha_ro', 'energy', 'pi', 'fltgrade', 'grade', 'status']
There are a number of routines in the crates
module for accessing data,
such as
get_col_names
and
get_colvals,
as well as methods on the crate object, such as
get_colnames and get_column (use
the Python help routine for help).
Some of the routines will return the data as a
cratedata object:
Name: time
Shape: (85650,)
Datatype: float64
Nsets: 85650
Unit: s
Desc: S/C TT corresponding to mid-exposure
Eltype: Scalar
Range:
Min: 455646659.46568
Max: 455677834.84234
<class 'pycrates.cratedata.CrateData'>
name=time units=s desc=S/C TT corresponding to mid-exposure
<class 'numpy.ndarray'>
[4.55647616e+08 4.55647616e+08 4.55647616e+08 4.55647616e+08]
The data can be filtered, for instance to remove the
low and high energy values:
Reading an ASCII file
The same routine (read_file)
can also be used to read in an ASCII file. First we
use the dmcopy tool to copy
a subset of the columns from the event file to an
ASCII file:
% dmcopy "pcadf13770_001N001_asol1.fits[cols time,ra,dec]" "asol.dat[opt kernel=text/simple]"
% file asol.dat
asol.dat: ASCII text
% head -4 asol.dat
#TEXT/SIMPLE
# time ra dec
4.5564730880322e+08 286.7639309292 4.524806469711
4.5564730905947e+08 286.7640088001 4.524777927964
This can then be used with read_file,
which returns a TABLECrate object,
just as the FITS version
did above:
Crate Type: <TABLECrate>
Crate Name: asol.dat
Ncols: 3
Nrows: 113138
As an example, the aspect solution can be plotted:
Note
The ASCII format does not have all the metadata that the FITS
format has, as can be seen in the lack of extra information -
such as a description, units, or a data range - in the
CrateData object:
Name: time
Shape: (113138,)
Datatype: float64
Nsets: 113138
Unit:
Desc:
Eltype: Scalar
Range:
Min: -1.7976931348623157e+308
Max: 1.7976931348623157e+308
This is partly because the "simple" ASCII format was used when
creating the asol.dat file. If
opt kernel=text/dtf
were used in the dmcopy call above then more metadata
would be available, but the file itself would be harder to parse
by code outside of CIAO.
Reading in all the blocks in a file
All the blocks can be accessed using the
read_dataset routine:
<class 'pycrates.cratedataset.CrateDataset'>
Crate Dataset:
File Name: acisf13770_repro_evt2.fits
Read-Write Mode: rw
Number of Crates: 7
1) Crate Type: <IMAGECrate>
Crate Name: PRIMARY
2) Crate Type: <TABLECrate>
Crate Name: EVENTS
Ncols: 16
Nrows: 85650
3) Crate Type: <TABLECrate>
Crate Name: GTI3
Ncols: 2
Nrows: 1
4) Crate Type: <TABLECrate>
Crate Name: GTI2
Ncols: 2
Nrows: 1
5) Crate Type: <TABLECrate>
Crate Name: GTI1
Ncols: 2
Nrows: 2
6) Crate Type: <TABLECrate>
Crate Name: GTI0
Ncols: 2
Nrows: 1
7) Crate Type: <TABLECrate>
Crate Name: GTI6
Ncols: 2
Nrows: 1
Note
Although read_dataset has read in
enough of the data file to recognize the different blocks, and
the columns in these blocks, it will not read in the actual
data values until a method like get_column or
routines like
get_colvals and
get_piximg
are called.
The keyword values can be found either as a
cratedata object
Name: EXPOSURE
Value: 28309.22660496
Unit: s
Desc: Exposure time
Groups: ['BASIC', 'EVENT', 'O_1', 'O_B', 'O_E']
or as a scalar
Note
The CIAO Data Model is not
the same as the FITS system, and so some FITS header keys,
particularly those that represent how column and image data
are stored (such as TUNIT1 and TTYPE1), are
not available as keywords.
The values are available through other
means - such as accessing the list of column names in a
table crate, or the data type stored in the values
field of a CrateData object.
Some of the metadata is accessible with the
Subspace
and
History and Comments
routines of the crates module.
Writing out a FITS block
Although it is possible to create a crate
directly - with TABLECrate or IMAGECrate - it is
easiest to read in a file, edit it, and then write it out.
This can be done with the write_file
routine or with the write method of the Crate class.
As an example, the ASCII file from above can have
- a units field added to the TIME column;
- a new column, set to the time written as an offset from the
first time and stored in kiloseconds;
- and converted to FITS format.
This creates the file:
% dmlist asol.fits cols
--------------------------------------------------------------------------------
Columns for Table Block asol.dat
--------------------------------------------------------------------------------
ColNo Name Unit Type Range
1 time s Real8 -Inf:+Inf
2 ra Real8 -Inf:+Inf
3 dec Real8 -Inf:+Inf
4 dt ks Real8 -Inf:+Inf Offset time
Writing out an ASCII file
The DataModel opt argument is used
to specify the output file format in a write method call,
write_file call, or
write_dataset call.
Since this uses the
CIAO Data Text format,
which uses a FITS-like encoding in ASCII, the output includes much of
the metadata, such as column units and descriptions, as the FITS
format does. Using "opt kernel=text/simple" would produce
an ASCII output file with no metadata.
% head asol_copy.dat
#TEXT/DTF
XTENSION= "TABLE "
HDUNAME = "asol.dat"
EXTNAME = "asol.dat"
TFIELDS = 4
TTYPE1 = "time "
TFORM1 = "1D " / data format of field.
TUNIT1 = "s " / physical unit of field.
TTYPE2 = "ra "
TFORM2 = "1D " / data format of field.
% dmlist asol_copy.dat cols
--------------------------------------------------------------------------------
Columns for Table Block asol.dat
--------------------------------------------------------------------------------
ColNo Name Unit Type Range
1 time s Real8 -Inf:+Inf
2 ra Real8 -Inf:+Inf
3 dec Real8 -Inf:+Inf
4 dt ks Real8 -Inf:+Inf Offset time
Writing out all the FITS blocks
The CrateDataset object
also has a write method, or write_dataset call, that will write all the blocks out. For example
Create a column
A column is created by adding a CrateData object
to a TABLECrate with the
add_col routine or the
add_column method of the TABLECrate
object.
Name: zF
Shape: (3,)
Datatype: float64
Nsets: 3
Unit:
Desc: The amazing zF value
Eltype: Scalar
Range:
Min: None
Max: None
The object can be added to a new crate, or
to an existing one,
as shown earlier:
The output is a FITS file (as no DM option
was included in the output file name). The output file includes the
explitly-set metadata - such as the block and column names, the number
of columns and rows, and the data - as well as automaticlly-created ones
(e.g. the use of Real8 rather than Real4 for the
data type of the column).
% dmlist test.fits blocks
--------------------------------------------------------------------------------
Dataset: test.fits
--------------------------------------------------------------------------------
Block Name Type Dimensions
--------------------------------------------------------------------------------
Block 1: PRIMARY Null
Block 2: TESTDATA Table 1 cols x 3 rows
% dmlist test.fits cols
--------------------------------------------------------------------------------
Columns for Table Block TESTDATA
--------------------------------------------------------------------------------
ColNo Name Unit Type Range
1 zF Real8 -Inf:+Inf The amazing zf value
% dmlist test.fits data,clean
# zF
1.0
3.0
-2.30
Note
Columns of different lengths can be added to a crate, as the
crate will be re-sized to the maximum length and the other columns
set to a default value (e.g. 0 or the empty string)
for the extra rows.
Creating a vector column
The CIAO data model allows pairs of columns to be given a
name, such as SKY representing the X and Y
columns:
% dmlist acisf13770_repro_evt2.fits"[cols sky]" cols
--------------------------------------------------------------------------------
Columns for Table Block EVENTS
--------------------------------------------------------------------------------
ColNo Name Unit Type Range
1 sky(x,y) pixel Real4 0.50: 8192.50 sky coordinates
--------------------------------------------------------------------------------
World Coord Transforms for Columns in Table Block EVENTS
--------------------------------------------------------------------------------
ColNo Name
1: EQPOS(RA ) = (+286.7680)[deg] +TAN[(-0.000136667)* (sky(x)-(+4096.50))]
(DEC) (+4.5239 ) (+0.000136667) ( (y) (+4096.50))
The pycrates create_vector_column
routine is used to create such a vector column. It
returns a single CrateData object which represents the
two components of the vector, in this case COMBO(XC, YC):
Name: COMBO
Shape: (3, 2)
Datatype: int64
Nsets: 3
Unit:
Desc:
Eltype: Vector
NumCpts: 2
Cpts: ['XC', 'YC']
Range:
Min: None
Max: None
Note that the data is arranged in pairs, that is (12, 2) then (14, 1) and
ending with (14, 3), rather than (12, 14, 15) and then (2, 1, 3). There
are several ways to achieve this with NumPy, such as the approach
above of using the
transpose
of the
vertical stack.
% dmlist vtest.fits cols
--------------------------------------------------------------------------------
Columns for Table Block TESTDATA
--------------------------------------------------------------------------------
ColNo Name Unit Type Range
1 zF Real8 -Inf:+Inf The amazing zf value
2 COMBO(XC,YC) Int4 -
% dmlist vtest.fits data
--------------------------------------------------------------------------------
Data for Table Block TESTDATA
--------------------------------------------------------------------------------
ROW zF COMBO(XC,YC)
1 1.0 (12,2)
2 3.0 (14,1)
3 -2.30 (15,4)
Removing a column
Columns can be removed from a crate with the
delete_col routine
or the delete_column method of the parent crate:
Create an image
Am image is also created with a CrateData object,
but this time added to an
IMAGECrate with the
add_piximg routine or the
add_image method of the IMAGECrate
object.
Name: ivals
Shape: (3, 4)
Datatype: int64
Nsets: 3
Unit: m
Desc: A superlative image
Eltype: Array
Ndim: 1
Dimarr: (4,)
Range:
Min: None
Max: None
Crate Type: <IMAGECrate>
Crate Name: IMGDATA
Rather than write out a single block, we use a CrateDataset
to store both the previous blocks (out1 and out2),
with the image block added first to avoid creating an empty
PRIMARY block:
Crate Dataset:
File Name:
Read-Write Mode: rw
Number of Crates: 2
1) Crate Type: <IMAGECrate>
Crate Name: IMGDATA
2) Crate Type: <TABLECrate>
Crate Name: TESTDATA
Ncols: 1
Nrows: 3
The FITS format doesn't support all the metadata (e.g. the
cd2.name field):
% dmlist ds.fits blocks
--------------------------------------------------------------------------------
Dataset: ds.fits
--------------------------------------------------------------------------------
Block Name Type Dimensions
--------------------------------------------------------------------------------
Block 1: IMGDATA Image Int4(4x3)
Block 2: TESTDATA Table 1 cols x 3 rows
% dmlist ds.fits cols
--------------------------------------------------------------------------------
Columns for Image Block IMGDATA
--------------------------------------------------------------------------------
ColNo Name Unit Type Range
1 IMGDATA[4,3] m Int4(4x3) -
--------------------------------------------------------------------------------
Physical Axis Transforms for Image Block IMGDATA
--------------------------------------------------------------------------------
Group# Axis#
1 1 X = #1
2 2 Y = #2
--------------------------------------------------------------------------------
World Coordinate Axis Transforms for Image Block IMGDATA
--------------------------------------------------------------------------------
Group# Axis#
Note
Please be careful with the axis ordering of the data. Here a
NumPy array of shape (3, 4) - which uses C ordering,
so y then x -
is written out as a 4 x 3 FITS array - which uses FORTRAN
ordering (x then y).