# Gallery: Histograms

## Examples

- Displaying (x,y) data points as a histogram
- Displaying (xlow,xhigh,y) data points as a histogram
- A histogram which shows the bin edges
- A histogram showing the full range of options
- Filling a histogram with a pattern
- Comparing two histograms

## 1) Displaying (x,y) data points as a histogram

Histograms are used to plot binned one-dimensional data, of the form
(`x_mid`, `y`) or (`x_low`, `x_high`, `y`), with optional error bars on the Y values. A later example shows how to display error bars on the points.

    add_histogram("spectrum.fits[fit][cols x,y]")

In this example the data is stored as a set of (`x`, `y`)
points, where the x values give the center of each bin. Unlike
curves,
histograms default to only being drawn by a solid line; the
`symbol.style` attribute is set to `none`.
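
To illustrate what a set of bin centers implies about the bin edges, here is a minimal NumPy sketch that reconstructs low and high edge arrays from the centers, assuming the bins are contiguous. The helper name `edges_from_midpoints` is hypothetical and is not part of ChIPS; placing each edge halfway between neighboring centers is an assumption of this sketch, not a description of how ChIPS draws the bins internally.

```python
import numpy as np

def edges_from_midpoints(xmid):
    """Return (xlo, xhi) for contiguous bins centered on xmid.

    Hypothetical helper: interior edges sit halfway between
    neighboring centers, and the two outer edges are mirrored
    about the first and last centers.
    """
    xmid = np.asarray(xmid, dtype=float)
    if xmid.size < 2:
        raise ValueError("need at least two bin centers")

    # Edges shared by adjacent bins
    inner = 0.5 * (xmid[:-1] + xmid[1:])

    # Outer edges: same half-width as the neighboring inner edge
    first = xmid[0] - (inner[0] - xmid[0])
    last = xmid[-1] + (xmid[-1] - inner[-1])

    edges = np.concatenate(([first], inner, [last]))
    return edges[:-1], edges[1:]

xlo, xhi = edges_from_midpoints([1.0, 2.0, 3.0, 4.0])
print(xlo)  # [0.5 1.5 2.5 3.5]
print(xhi)  # [1.5 2.5 3.5 4.5]
```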

The preferences for histograms can be found by using the
`get_preference` call:

    chips> get_preference("histogram")
    histogram.stem          : hist
    histogram.depth         : default
    histogram.line.color    : default
    histogram.line.thickness: 1
    histogram.line.style    : solid
    histogram.symbol.color  : default
    histogram.symbol.style  : none
    histogram.symbol.size   : 5
    histogram.symbol.angle  : 0
    histogram.symbol.fill   : false
    histogram.err.color     : default
    histogram.err.thickness : 1
    histogram.err.style     : line
    histogram.err.up        : on
    histogram.err.down      : on
    histogram.err.caplength : 10
    histogram.dropline      : off
    histogram.fill.color    : default
    histogram.fill.opacity  : 1
    histogram.fill.style    : nofill

The settings for the
current histogram
can be found by using the
`get_histogram` routine:

    chips> get_histogram()
    depth = 100
    dropline = False
    err.caplength = 10
    err.color = default
    err.down = False
    err.style = line
    err.thickness = 1.0
    err.up = False
    fill.color = default
    fill.opacity = 1.0
    fill.style = 0
    id = None
    line.color = default
    line.style = 1
    line.thickness = 1.0
    stem = None
    symbol.angle = 0.0
    symbol.color = default
    symbol.fill = False
    symbol.size = 5
    symbol.style = 0

## 2) Displaying (xlow,xhigh,y) data points as a histogram

In the first example, the binned data was given using the mid-point of each bin. In this example we show how histograms can be plotted by giving the low and high edges of each bin.

    tbl = read_file("spectrum.fits[fit]")
    xlo = copy_colvals(tbl,"xlo")
    xhi = copy_colvals(tbl,"xhi")
    y = copy_colvals(tbl,"y")

    add_histogram(xlo,xhi,y,["line.color","red"])
    log_scale(X_AXIS)

The bin edges cannot be specified by passing a file name to `add_histogram`,
so we have to read in the arrays using
crates
routines and
then plot them. We use
`read_file`
to read in the file,
`copy_colvals`
to get the column values,
and then `add_histogram` to plot them.

## 3) A histogram which shows the bin edges

When using `x_mid` values, the bins are assumed
to be contiguous. This is not the case when the bin edges, namely
`x_low` and `x_high`, are given. Here we plot data from a histogram with non-contiguous bins, setting the
`dropline` attribute so that all the bin edges are drawn (this attribute can also be used with histograms like the one in the first example).

    tbl = read_file("histogram.fits")
    xlo = copy_colvals(tbl,"xlo")
    xhi = copy_colvals(tbl,"xhi")
    y = copy_colvals(tbl,"y")

    add_histogram(xlo,xhi,y,["line.style","longdash","dropline",True])
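
Whether a set of bins is contiguous can be checked directly on the edge arrays before plotting. The following is a minimal NumPy sketch; `bins_are_contiguous` is a hypothetical helper, not a ChIPS or crates routine, and the sample arrays are made up for illustration.

```python
import numpy as np

def bins_are_contiguous(xlo, xhi):
    """True when each bin's upper edge matches the next bin's lower edge.

    Hypothetical helper for illustration; uses np.allclose so that
    small floating-point differences are not reported as gaps.
    """
    xlo = np.asarray(xlo, dtype=float)
    xhi = np.asarray(xhi, dtype=float)
    if xlo.shape != xhi.shape:
        raise ValueError("xlo and xhi must have the same shape")
    return bool(np.allclose(xhi[:-1], xlo[1:]))

# Bins that share edges
print(bins_are_contiguous([0, 1, 2], [1, 2, 3]))   # True

# Bins separated by gaps
print(bins_are_contiguous([0, 2, 4], [1, 3, 5]))   # False
```

When the check returns `False`, the gaps between bins are part of the data, which is the situation where drawing every bin edge is most useful.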

## 4) A histogram showing the full range of options

In this example we change most of the attributes of a histogram. The "Filling a histogram with a pattern" example shows how you can change the fill style of a histogram from solid to a pattern.

    tbl = read_file("histogram.fits")
    xlo = copy_colvals(tbl,"xlo")
    xhi = copy_colvals(tbl,"xhi")
    y = copy_colvals(tbl,"y")
    dylo = copy_colvals(tbl,"dylo")
    dyhi = copy_colvals(tbl,"dyhi")

    hist = ChipsHistogram()
    hist.dropline = True
    hist.line.color = "red"
    hist.symbol.style = "diamond"
    hist.symbol.size = 4
    hist.symbol.fill = True
    hist.symbol.color = "orange"
    hist.err.color = "green"
    hist.fill.style = "solid"
    hist.fill.opacity = 0.2
    hist.fill.color = "blue"

    add_histogram(xlo,xhi,y,dylo,dyhi,hist)

    # Move the histogram behind the axes so that the tick marks are not hidden
    shuffle_back(chips_histogram)

#### Opacity and Postscript output

Note that the postscript output created by
`print_window` does not
support partially transparent region or histogram fills; instead the opacity is taken to be `1`.
The relative depth of the objects can be changed - by altering the
`depth` attribute or using the various "shuffle commands"
(`shuffle`,
`shuffle_back`,
`shuffle_front`,
`shuffle_backward`,
`shuffle_forward`,
and the set of `shuffle_<object>` routines)
so that overlapping objects are
not completely obscured if desired.

#### Solid fill and Postscript output

When using a solid fill, the off-screen output may show a pattern of lines within histograms or regions for postscript outputs, depending on what display program you are using. They should not appear when printed out.

## 5) Filling a histogram with a pattern

In this example we fill the histogram using a pattern, rather than the solid fill used in the "A histogram showing the full range of options" example.

    add_histogram("spectrum.fits[fit][cols x,y]")
    set_histogram(["fill.style","crisscross"])
    set_histogram(["fill.color","green","line.color","red"])

The
`fill.style` attribute
of histograms is used to determine
how the region is filled. Here we use the value
"`crisscross`", rather than "`solid`", to fill the
histogram with crossed lines. These lines can be colored
independently of the histogram boundary.

## 6) Comparing two histograms

Multiple histograms can be added to a plot. Here we use a combination of the opacity setting and careful bin placement to allow the data to be compared.

The idea of this figure is to compare the Normal and Poisson distributions,
calculated using the routines from the
`np.random`
module.

    # np is numpy; import it if the session has not already done so
    import numpy as np

    def compare(mu, npts=10000):
        """Compare the Poisson and Normal distributions for an
        expected value of mu, using npts points.

        Draws histograms displaying the probability density function."""

        ns = np.random.normal(mu, np.sqrt(mu), npts)
        ps = np.random.poisson(mu, npts)

        # Calculate the range of the histogram (using
        # a bin width of 1). Since np.histogram needs
        # the upper edge of the last bin we need edges
        # to start at xmin and end at xmax+1.
        xmin = np.floor(min(ns.min(), ps.min()))
        xmax = np.ceil(max(ns.max(), ps.max()))
        edges = np.arange(xmin, xmax+2)
        xlo = edges[:-1]
        xhi = edges[1:]

        # Calculate the histograms (np.histogram returns
        # the y values and then the edges)
        h1 = np.histogram(ns, bins=edges, normed=True)
        h2 = np.histogram(ps, bins=edges, normed=True)

        # Set up preferences for the histograms
        hprop = ChipsHistogram()
        hprop.dropline = True
        hprop.fill.style = "solid"
        hprop.fill.opacity = 0.6

        add_window(8, 6, 'inches')
        split(2, 1, 0.01)

        # In the top plot we overlay the two histograms,
        # relying on the opacity to show the overlaps
        hprop.fill.color = "blue"
        hprop.line.color = "steelblue"
        add_histogram(xlo, xhi, h1[0], hprop)

        hprop.fill.color = "seagreen"
        hprop.line.color = "lime"
        add_histogram(xlo, xhi, h2[0], hprop)

        # Start the Y axis at 0, autoscale the maximum value
        limits(Y_AXIS, 0, AUTO)

        # Annotate the plot
        set_plot_title(r"\mu = {}".format(mu))
        hide_axis('ax1')
        set_yaxis(['majorgrid.visible', True])

        # Add regions to the title indicating the histogram type
        xr = np.asarray([0, 0.05, 0.05, 0])
        yr = [1.02, 1.02, 1.1, 1.1]
        ropts = {'coordsys': PLOT_NORM, 'fill.color': 'blue', 'edge.color': 'steelblue'}
        lopts = {'coordsys': PLOT_NORM, 'size': 16, 'valign': 0.5, 'color': 'blue'}
        add_region(xr, yr, ropts)
        add_label(0.07, 1.06, 'Normal', lopts)

        ropts['fill.color'] = 'seagreen'
        ropts['edge.color'] = 'lime'
        lopts['color'] = 'seagreen'
        lopts['halign'] = 1
        add_region(xr+0.95, yr, ropts)
        add_label(0.93, 1.06, 'Poisson', lopts)

        current_plot('plot2')

        # In the bottom plot we separate the two histograms,
        # so that they each cover half the bin width
        hprop.fill.color = "blue"
        hprop.line.color = "steelblue"
        add_histogram(xlo, xlo+0.5, h1[0], hprop)

        hprop.fill.color = "seagreen"
        hprop.line.color = "lime"
        add_histogram(xlo+0.5, xhi, h2[0], hprop)

        set_yaxis(['majorgrid.visible', True])
        set_xaxis(['majortick.style', 'outside', 'minortick.style', 'outside'])
        limits(Y_AXIS, 0, AUTO)

        bind_axes('plot1', 'ax1', 'plot2', 'ax1')
        limits(X_AXIS, xmin, xmax)

    compare(7)

The code is written as a routine, which takes a single required argument,
`mu`, the expected value for the two distributions. An optional argument
(`npts`) allows the number of points used to create each distribution
to be changed; it is left at its default value of 10000 when we call the routine
with `mu=7` at the end of the script.

The calls to
`np.random.normal`
and
`np.random.poisson`
create 10000 random numbers each, drawn from the given
distribution (we set the sigma of the normal distribution
to be the square root of the expected value).
These arrays are used to determine the minimum and
maximum values of the histogram range (rounded outward to
integers) using the array
`min` and
`max` methods together with the
`np.floor`
and
`np.ceil`
routines from numpy.
From these values we can calculate the edges array used
to create the histogram; see the
`np.histogram` documentation
for an explanation of the `bins` and `normed` arguments.
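
The binning step can be tried on its own, without any ChIPS calls. The sketch below is a minimal NumPy-only version under two assumptions that are not in the script above: a fixed seed (`np.random.RandomState(42)`) is used so the run is repeatable, and the `density` keyword is used, which recent NumPy releases provide for this normalization in place of `normed`.

```python
import numpy as np

# Assumption: fixed seed, only so that the sketch is repeatable
rng = np.random.RandomState(42)

mu = 7
npts = 10000
ns = rng.normal(mu, np.sqrt(mu), npts)
ps = rng.poisson(mu, npts)

# Unit-width bins covering both samples; the final edge is xmax+1
# because np.histogram needs the upper edge of the last bin
xmin = np.floor(min(ns.min(), ps.min()))
xmax = np.ceil(max(ns.max(), ps.max()))
edges = np.arange(xmin, xmax + 2)

# np.histogram returns the y values and then the edges
y_norm, _ = np.histogram(ns, bins=edges, density=True)
y_pois, _ = np.histogram(ps, bins=edges, density=True)

# With a density normalization, sum(y * bin width) is 1
widths = np.diff(edges)
print(len(edges), len(y_norm))
print(np.sum(y_norm * widths), np.sum(y_pois * widths))
```

There is one more edge than there are bins, which is why the plotting code slices the edges array into separate low and high arrays.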

The visualization consists of two plots: the first overlays the two histograms, and the second splits each bin evenly between them, so that each histogram covers half the bin width.
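
The half-width placement used for the second plot can be sketched with NumPy alone. The helper `split_bins` is hypothetical (it is not a ChIPS routine); it generalizes the `xlo+0.5` arithmetic, which relies on a bin width of 1, to bins of any width.

```python
import numpy as np

def split_bins(xlo, xhi):
    """Split each bin at its mid-point.

    Hypothetical helper: returns ((lo, hi), (lo, hi)) edge pairs for
    the left and right halves of every bin.
    """
    xlo = np.asarray(xlo, dtype=float)
    xhi = np.asarray(xhi, dtype=float)
    xsplit = xlo + 0.5 * (xhi - xlo)
    return (xlo, xsplit), (xsplit, xhi)

edges = np.arange(0, 5)          # unit-width bins: 0-1, 1-2, 2-3, 3-4
left, right = split_bins(edges[:-1], edges[1:])

print(left[1])    # [0.5 1.5 2.5 3.5]
print(right[0])   # [0.5 1.5 2.5 3.5]
```

The first pair of arrays would be used for one histogram and the second pair for the other, with the y values left unchanged, so the bar heights stay comparable even though each bar is drawn at half width.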

Annotation is added to label the visualization; in particular
regions and labels are added to the left and right of the title
area (by using the
plot-normalized coordinate system
and taking advantage of the
`halign` attribute to right-align the "`Poisson`"
label).

Most of the examples set attributes using the "list" approach
but here we either use a dictionary, where the key is the attribute
name, or use the `ChipsXXX` object. The following would all
produce a curve with no symbols and a green line:

    lopts1 = ['line.color', 'green', 'symbol.style', 'none']
    lopts2 = {'line.color': 'green', 'symbol.style': 'none'}
    lopts3 = ChipsCurve()
    lopts3.line.color = 'green'
    lopts3.symbol.style = 'none'

    add_curve(x, y, lopts1)
    add_curve(x, y, lopts2)
    add_curve(x, y, lopts3)
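
The list and dictionary forms carry the same information: the list alternates attribute names and values, while the dictionary pairs them explicitly. As a plain-Python sketch of that correspondence (the helper `list_to_dict` is hypothetical and not part of ChIPS):

```python
def list_to_dict(opts):
    """Convert ['name1', value1, 'name2', value2, ...] into a dict.

    Hypothetical helper for illustration only.
    """
    if len(opts) % 2 != 0:
        raise ValueError("expected alternating name/value entries")
    # Even positions hold the names, odd positions hold the values
    return dict(zip(opts[::2], opts[1::2]))

lopts1 = ['line.color', 'green', 'symbol.style', 'none']
lopts2 = {'line.color': 'green', 'symbol.style': 'none'}

print(list_to_dict(lopts1) == lopts2)   # True
```

The dictionary form is convenient when options are built up or modified in steps, as with `ropts` and `lopts` in the `compare` routine.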
