Last modified: 6 Dec 2023

URL: https://cxc.cfa.harvard.edu/sherpa/threads/user_model/

Sherpa User Models

Sherpa Threads (CIAO 4.16 Sherpa)


Overview

Synopsis:

It may be necessary to fit data to a function that is not pre-packaged as a Sherpa model; to do so, we may write that function as a Python, C or C++, or Fortran function and assign it to a Sherpa 'user model'. User models are those for which the user specifies the parameters and model calculation function. 'Table models' can also be defined in Sherpa these contain data read from a file, and have only one parameter, normalization. When the table model is "calculated", the model returns the data—or more generally, some calculated values based on those data values. In this thread, examples of user models are shown; for table models, see the thread "Sherpa Table Models."

Last Update: 6 Dec 2023 - reviewed for CIAO 4.16; made additional revisions to the Python user-model section.


User Models

Assigning a Python Function to a User Model

1D example | 2D example

Here, we consider the most simple example: a 1D user model that contains the equation for a line, \(y = mx + b\). To implement such a model in Python, we would write the function as follows:

def myline(pars, x):
    return pars[0] * x + pars[1]

This function would be written to a file, such as "myline.py". The name of the file is arbitrary, and does not have to match the name of the function it contains. Indeed, we could write several such user functions to a single file.

Next, we must import the new user function into a Python fitting session. A simple example script might look like this:

sherpa> from sherpa.astro.ui import * ##line only necessary in native (i)Python, unnecessary in a Sherpa environment
sherpa> from myline import *                                                          

sherpa> load_data(1, "foo.dat")
sherpa> plot_data()

sherpa> load_user_model(myline, "myl")
sherpa> add_user_pars("myl", ["m","b"])

sherpa> set_model(myl)

sherpa> myl.m = 30
sherpa> myl.b = 20

sherpa> show_model()
Model: 1
usermodel.myl
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   myl.m        thawed           30 -3.40282e+38  3.40282e+38           
   myl.b        thawed           20 -3.40282e+38  3.40282e+38           

sherpa> fit()
Dataset               = 1
Method                = levmar
Statistic             = chi2
Initial fit statistic = 825874
Final fit statistic   = 2.6905 at function evaluation 6
Data points           = 5
Degrees of freedom    = 3
Probability [Q-value] = 0.441845
Reduced statistic     = 0.896832
Change in statistic   = 825871
   myl.m          1.18921      +/- 0.942778    
   myl.b          -3.64869     +/- 3.9652      

sherpa> plot_fit()

A plot of the user model fit to the data contained in 'foo.dat' might appear as shown Figure 1.

Figure 1: Plot of user model fit to the data contained in 'foo.dat'

[Plot of user model fit to the data contained in 'foo.dat']
[Print media version: Plot of user model fit to the data contained in 'foo.dat']

Figure 1: Plot of user model fit to the data contained in 'foo.dat'

After the Sherpa package is imported, we must first import the user function stored in the file "myline.py". But this only makes the function accessible to Python; it must then be attached to an instance of the new user model class. This is done by the call to the function load_user_model. In load_user_model, an instance of the user model class is created and named "myl". The model can then be referred to as "myl" for the rest of the fitting session (as is done when it is passed as an argument to set_model). The user function "myline" is also assigned to the object "myl".

The load_user_model function has two required arguments: a reference to the user function, and the name of the instance of the user model class.

load_user_model(func, modelname)

func      = Reference to user function        (reference)
modelname = Name of model                     (string)  

A second function, add_user_pars, is then used to attach model parameters to this instance of the user model class. The number of user model parameters, their names, values and ranges, are all completely arbitrary—we can assign any names and values we wish. The only required arguments are the name of the model, "myl" in this case, and a list of model parameter names. Lists of initial values and ranges can also be passed in as additional optional arguments.

In this example, two parameters are created with add_user_pars: "m", and "b", the slope and y-intercept of the line function. After the calls to load_user_model and add_user_pars, we can refer to "myl.m" and "myl.b" just as we refer to model parameters pre-packaged with Sherpa.

[WARNING]
Warning

We must ensure that when calling add_user_pars, the parameters "m" and "b" are given in the same order that the function "myline" expects. That is, in the "myline" function, pars[0] is clearly the slope, and pars[1] the y-intercept. Therefore, in the call to add_user_pars, the names of the parameters should be given as ["m", "b"], and not as ["b", "m"].

The add_user_pars function has two required arguments: the name of the instance of the user model class, and a list of the model parameter names. It is also possible to pass initial parameter values and ranges as optional arguments:

sherpa> add_user_pars(modelname, parnames, parvals, parmins, parmaxs, parunits, parfrozen)

sherpa> modelname = Name of model                     (string)
sherpa> parnames  = List of names of model parameters (list of strings)
sherpa> parvals   = List of initial parameter values  (list of numbers)
sherpa> parmins   = List of soft parameter minima     (list of numbers)
sherpa> parmaxs   = List of soft parameter maxima     (list of numbers)
sherpa> parunits  = List parameter units              (list of strings)
sherpa> parfrozen = List of thaw flags                (list of True or False values)

Only the first two arguments to add_user_pars are required. To assign initial values, units, and thaw flags to this example, we could call add_user_pars as:

sherpa> add_user_pars("myl", ["m","b"], [30,20], parunits=["cm/s","s"], parfrozen=[True, False])

and we can subsequently see how the parameters are set compared to the bare settings shown earlier:

sherpa> set_model(myl)

sherpa> show_model()
Model: 1
usermodel.myl
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   myl.m        frozen           30 -3.40282e+38  3.40282e+38       cm/s
   myl.b        thawed           20 -3.40282e+38  3.40282e+38          s

Once the user function has been imported, the user model created with load_user_model, and parameters added to it with add_user_pars, then the model can be used just as any pre-defined Sherpa model.

The function add_model is available as an alternative to using load_user_model together with add_user_pars for explicitly defining user models as Python classes for use in Sherpa. The add_model command registers a user-defined Sherpa model class as a Sherpa model type, to allow users to create instances of models that are recognized automatically in Sherpa model expressions.

sherpa> from sherpa.models import PowLaw1D
sherpa> class MyPowLaw1D(PowLaw1D): pass
sherpa> add_model(MyPowLaw1D)
sherpa> set_model(xswabs.abs1*mypowlaw1d.p1)

In the example above, the Python class 'MyPowLaw1D' is defined, which inherits from the Sherpa 'Powlaw1D' model class. 'MyPowLaw1D' is registered as a Sherpa model type for use in Sherpa model expressions using add_model. The model expression for data set 1 is then set in the usual way, using the newly defined model type.


2D User Model Example

In the example above, we stepped through the details of defining a simple 1D function and assigning it to a user model in Sherpa; in the next example, we show that in a similar way, it is also possible to define 2D user models in Sherpa.

First, we define a 2D function in an external file; in this example, we define a simple Gaussian function in a file called mygauss2d.py (which we could also choose to define directly in a Sherpa Python fitting session):

import numpy as np

def mygauss2d(pars,x,y):
    (sigma, ampl, xpos, ypos) = pars
    r2 = (x - xpos)**2 + (y - ypos)**2
    return ampl*np.exp(-r2/(sigma*sigma))

Then, we write a simple script, like the one shown below, to import the user-defined Gaussian function into a Sherpa Python session, assign the function to a user model with the load_user_model and add_user_pars commands, and then fit the newly defined Gaussian user model to a loaded 2D data set:

sherpa> from sherpa.astro.ui import * ##line only necessary in native (i)Python, unnecessary in a Sherpa environment
sherpa> from mygauss2d import *

sherpa> load_user_model(mygauss2d, "g2d")
sherpa> add_user_pars("g2d",["sigma", "ampl", "xpos", "ypos"],[5,1,1,1],[0.1,0,0,0])

sherpa> print(g2d)
usermodel.g2d
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   g2d.sigma    thawed            5          0.1  3.40282e+38           
   g2d.ampl     thawed            1            0  3.40282e+38           
   g2d.xpos     thawed            1            0  3.40282e+38           
   g2d.ypos     thawed            1            0  3.40282e+38           

sherpa> load_data("image.fits")
sherpa> set_source(g2d)

sherpa> image_fit()

In this script, the optional third and fourth arguments of add_user_pars() are used to set up both the starting and minimum values of the parameters.


Assigning a Fortran Function to a User Model

Above, we showed examples of a user-written functions in Python. It is also possible to write functions in a compiled language, compile the code, and then wrap the functions so that they are accessible from Python. We have added support so that functions written in C or Fortran can be attached to an instance of the user model class. First, we will show how this is done for a Fortran model function.

Sherpa is built with the 'numpy' module, which adds support for arrays to Python. NumPy also provides a tool called "f2py"; given a file containing a function written in Fortran, f2py will compile the code, put it in a shared library, and then automatically write a Python module that can import that shared library into a Python session.

Consider the equation for a line, this time written in Fortran:

      integer function myline(p, x, y, xsize) 

           integer i, xsize
           real*8 p(*), x(xsize), y(xsize)

           myline = 0

           do i=1,xsize,1
              y(i) = p(1) * x(i) + p(2)
           enddo

           return
           end

Suppose the file "myline.f" contains this code. The first step is to use f2py to compile the code and provide a Python interface:

% setenv F77 /usr/bin/g77
% setenv CC /usr/bin/gcc
% f2py -m myline -c myline.f

Setting the environment variables F77 and CC is to ensure that the desired versions of the Fortran and C compilers are found by f2py. In the call to f2py, two options are used: "-c", to compile the Fortran code and generate the Python module from which it will be called; and "-m", to give that module the desired name (without this option, the resulting module is given the name "untitled") with a ".so" extension.

This enables us to call the raw Fortran function from Python. However, the function called from the user model has to have a certain signature—that is, certain arguments given in a particular order—and the function defined in myline.f does not have that signature. Fortunately, it is easy to write a Python "wrapper" function with the proper signature.

To continue with this example, create a Python file called "mypyline.py" containing the following code:

import numpy

import myline as ml

def mypyline(pars, x):
    y = numpy.zeros(x.shape, type(x[0]))
    ml.myline(pars, x, y, len(x))
    return y

In this Python code, we begin by importing numpy and myline. We then define the wrapper function "mypyline". For a 1-D function, mypyline must accept an array of parameters and an array of x-values, create a y-array of equal size to the x-array, and then return the y-array. The function "numpy.zeros" is used to create an empty y-array, of the same shape and size as the x-array. The "myline" module is imported, and given the name "ml" in this example. This is just to distinguish the name of the module from the function that the module contains; in this case we named the function myline as well.

At this point, we have compiled the Fortran function and written a Python interface the user model can accept. All that remains, then, is to create a user model and attach this function to it:

from sherpa.astro.ui import * ##line only necessary in native (i)Python, unnecessary in a Sherpa environment
import mypyline as mpl

load_data(1, "foo.dat")
plot_data()

load_user_model(mpl.mypyline, "myl")
add_user_pars("myl", ["m","b"])

As shown, in the Sherpa session we import the wrapper function we defined, "mypyline"; this provides a reference to a model function with the correct arguments in the correct order. The "mypyline" function takes care of importing the Fortran code when needed and calling the user-written function.

We may run f2py on an arbitrary number of Fortran files at the command-line. Thus, the module created by f2py can provide access to any number of Fortran functions; we need only import that module once to access all the functions.


Assigning a C/C++ Function to a User Model

see also Sherpa User Models in C

The code to be compiled may have been written in C or C++ instead of Fortran. The procedure for compiling such a function and attaching it to a user model is similar to the procedure for attaching a Fortran function. However, NumPy does not provide a nice tool like "f2py" for compiling the C code and providing a Python interface. The reason for this is that there are other tools to automatically generate Python extensions for C code. The most well-known of these is SWIG. So instead of providing his own tool to do this, the NumPy author provides an instruction file which SWIG uses to create an interface between Python and C that understands NumPy.

The Sherpa team has taken a different approach. Instead of using SWIG or a similar tool, we provide our own interface code for users to incorporate into a C extension containing the user function. The reason is that SWIG generates a lot of overhead which is not necessary when "wrapping" a single C function for Python access. So the Sherpa team provides the interface code and uses the Python distutils module to compile the C code and create a new Python module from which to call it. Please download the tar file usermodel.tar.gz (9.5 KB tarred & gzipped; 87 KB unpacked) for the Python/C interface code we provide for C user models. The package contains the following files:

array.hh
glue.hh
setup.py
userfuncs.cc

The files "array.hh" and "glue.hh" need not be touched at all; these files define the interface for passing NumPy arrays from Python to C code, in the order that Sherpa expects for model functions. The file "userfuncs.cc" is the file that we do need to touch, but only to add the function signature of our C function; we will arrive at this step momentarily.

First, we add our C model function in a new file, named as we please. Let's consider the line function again; a new file, "myline.c", is added, with the following function:

#include <stdlib.h>
#include <math.h>
#include <float.h>

int myline(const double* p, const double* x, const int xsize,
           double* y, const int ysize) {

  int i;
  for (i=0; i < xsize; i++) {
    y[i] = p[0] * x[i] + p[1];
  }

  return EXIT_SUCCESS;
}

int myline_int(const double* p, const double* xlo, const int xlosize,
               const double* xhi, const int xhisize,
               double* y, const int ysize) {

  // If not implemented, leave this function empty and return success
  return EXIT_SUCCESS;

}

Here, we define two function signatures: myline, which is passed an array of x-values, and myline_int, which is passed arrays of bin boundaries (low and high bounds of the bins in x). Thus we can create integrated and non-integrated forms of the same function; signatures for BOTH forms must be provided. If we are not interested in implementing one or the other function, we can just have it return EXIT_SUCCESS. In this example, we will concentrate on using the non-integrated form in the function myline.

Having written this C function, the first step is to include the proper function prototype in userfuncs.cc. Since userfuncs.cc is a C++ file, we must include information that identifies myline as a plain C function. This is done using extern in userfuncs.cc:

extern "C" {
  int myline(const double* p, const double* x, const int xsize,
             double* y, const int ysize);
  int myline_int(const double* p, const double* xlo, const int xlosize,
                 const double* xhi, const int xhisize,
                 double* y, const int ysize);
}

(If the function had actually been written in C++, then the 'extern "C"' block would not be used; we would simply include the C++ function prototype here instead.)

In the next section, we simply add the name of the function to the array of function definitions Python will know about when this module is imported into Python:

static PyMethodDef UserFuncs[] = {
  USERMODELFCT1D( myline, 2 ),
  { NULL, NULL, 0, NULL }  
};

The first argument to USERMODELFCT1D is the name of the function, myline; the second argument is the number of model parameters, which is two for the equation for a line. At this point, we are done touching C and C++ files.

At this point, we need only edit "setup.py" and add the name of the C file to the list of files to be compiled. To add myline.c, we edit this line of the file:

ext_modules=[Extension('userfuncs',
                             ['userfuncs.cc'],

so that it now reads:

ext_modules=[Extension('userfuncs',
                             ['userfuncs.cc','myline.c'],

We are now ready to use the Python distutils module to create a Python module for us, to provide access to the myline function. This C extension module will always be named "userfuncs", as noted above by the first argument to the "Extension" function. From the Unix command-line, do:

% setenv CC /usr/bin/gcc
% setenv CXX /usr/bin/g++
% python setup.py install --prefix=.
% setenv PYTHONPATH ${PYTHONPATH}:${PWD}/lib/python2.7/site-packages

This does several things. First, the paths to the C and C++ compilers are set by the CC and CXX environment variables. Next, the module userfuncs will be created and installed in the current working directory. Finally, PYTHONPATH is set to look in the current working directory so that the userfuncs module can be found. (If the module should be installed elsewhere, simply set the --prefix flag and PYTHONPATH to the desired directory.)

Now the user-written function exists in a Python module, and has already been given a function signature that Sherpa understands. The function can be attached to a user model:

from sherpa.astro.ui import * ##line only necessary in native (i)Python, unnecessary in a Sherpa environment
import userfuncs

load_data(1, "foo.dat")
plot_data()

load_user_model(userfuncs.myline, "myl")
add_user_pars("myl", ["m","b"])

Please note that for simplicity, we have used a single function—the equation for a line—in this example. We may add as many user-written functions to the userfuncs module as we wish. The functions simply need to be listed in userfuncs.cc and setup.py. If we were to add ten functions to the userfuncs module, they would all be available after importing the userfuncs module once (we would need to call load_user_model ten different times, to attach a different user function to ten different instances of the user model). Please see the thread Sherpa User Models in C for additional comprehensive examples of how to write your own Sherpa parameterized models in C, and include them in a Sherpa Python fitting session.


History

12 Jan 2009 original version
09 Sep 2009 "Table Models" section moved to a separate thread.
17 Dec 2009 updated for CIAO 4.2
26 May 2010 updated to include an example of a 2D user model
30 Jun 2010 updated for CIAO 4.2 Sherpa v2: new function add_model available for explicitly defining user models as Python classes for use in Sherpa. S-Lang version of thread removed.
06 Jan 2012 reviewed for CIAO 4.4: Figure 1 updated (error bars are now plotted by default)
04 Dec 2013 reviewed for CIAO 4.6: no changes
03 Apr 2014 added comment about sherpa.astro.ui line only necessary for Python environment.
08 Dec 2015 fixed typos.
04 Apr 2022 updates to the Python user-model section for CIAO 4.14.
06 Dec 2023 reviewed for CIAO 4.16; made additional revisions to the Python user-model section.