------------------------------------------ lines 6-547 of file: speed/capture_xam.cpp ------------------------------------------ {xrst_begin capture_xam.cpp} {xrst_spell alloc logit ndebug preprocessor rng royle } A Capture Example and Speed Test ################################ Syntax ****** | ``build/speed/capture_xam`` \\ | |tab| *random_seed* \\ | |tab| *number_random* \\ | |tab| *quasi_fixed* \\ | |tab| *trace_optimize_fixed* \\ | |tab| *ipopt_solve* \\ | |tab| *bool_sparsity* \\ | |tab| *hold_memory* \\ | |tab| *derivative_test* \\ | |tab| *start_near_solution* \\ | |tab| *number_fixed_samples* \\ | |tab| *number_locations* \\ | |tab| *max_population* \\ | |tab| *mean_population* \\ | |tab| *mean_logit_probability* \\ | |tab| *std_logit_probability* \\ | |tab| *random_constraint* Reference ********* J. Andrew Royle, Biometrics 60, 108-115 March 2004, *N-Mixture Models for Estimating Population Size* *from Spatially Replicated Counts.* Command Arguments ***************** random_seed =========== This is a non-negative integer equal to the seed for the random number generator, to be specific, :ref:`manage_gsl_rng@new_gsl_rng@s_in` used during the call to ``new_gsl_rng`` . number_random ============= This is a positive integer equal to the number of random effects. This is also equal to the number of times at which the measurements are made and is denoted by :math:`T` below. quasi_fixed =========== This is either ``yes`` or ``no`` and is the value of :ref:`derived_ctor@quasi_fixed` in the ``cppad_mixed`` derived class constructor. The amount of memory used by the :ref:`derived_ctor@mixed_derived` object, after the information matrix is computed, will be similar to after the initialization when *quasi_fixed* is no. trace_optimize_fixed ==================== This is either ``yes`` or ``no`` . If it is yes, a *print_level* = 5 :ref:`trace` of the fixed effects optimization is included in the program output. Otherwise the ipopt *print_level* is zero and no such trace is printed. ipopt_solve =========== This is either ``yes`` or ``no`` . If it is yes, the ``CppAD::ipopt::solve`` routine is used for optimizing the random effects, otherwise ``CppAD::mixed::ipopt_random`` is used; see :ref:`optimize_random@options@evaluation_method` . bool_sparsity ============= This is either ``yes`` or ``no`` . If it is yes, boolean sparsity patterns are used for this computation, otherwise set sparsity patterns are used. hold_memory =========== The CppAD memory allocator has a hold memory option will be set by ``CppAD::thread_alloc::hold_memory`` ( *hold_memory* ); where *hold_memory* is either ``yes`` or ``no`` . derivative_test =============== This is either ``yes`` or ``no`` . If it is yes, the derivatives of functions used in the optimization of the fixed effects are checked for correctness. (This requires extra time). start_near_solution =================== This is either ``yes`` or ``no`` . If it is yes, the initial point for the optimization is the value of the fixed effects used to simulate the data. Otherwise, the initial point is significantly different from this value. number_fixed_samples ==================== This is a positive integer equal to the number of samples simulated from the posterior distribution for the fixed effects using :ref:`sample_fixed-name` . The samples are used to approximation the standard deviation for the optimal fixed effects. One should use a large number of samples (at least 100) for this purpose. number_locations ================ This is a positive integer equal to the number of locations at which the measurements are made; i.e. :math:`R` in the reference. Increasing this value increases the amount for each function evaluation, but does not change the number of fixed or random effects. max_population ============== This is a positive integer equal to the maximum value in the finite summation with respect to population size; i.e., :math:`K` in the reference. This must be greater than any of the simulated number of captures at any location and time; i.e., and :math:`y_{i,t}`. Also note that *max_population* does not affect the simulated data :math:`y_{i,t}`. A suggested value is five times *mean_population* . The value of *max_population* is large enough if increasing it takes more time but does not make a difference in the optimal fixed effects (use the same value for the other arguments and same actual seed). mean_population =============== This is a positive floating point value equal to the mean of the Poisson distribution for the population used to simulate data values; :math:`\lambda` in reference. mean_logit_probability ====================== This is a positive floating point value equal to the mean of the logit of the capture probability (independent of the random effects) used to simulate data values. The is also equal to the mean of the random effects. std_logit_probability ===================== This is a positive floating point value equal to the standard deviation of the logit of the capture probability (independent of the random effects) used to simulate data values. The is also equal to the standard deviation of the random effects. random_constraint ================= This is either ``yes`` or ``no`` . If it is ``no`` , there is no :ref:`random constraint` for this example. If it is ``yes`` , the random constraint is .. math:: 0 = \hat{u}_0 ( \theta ) + \cdots + \hat{u}_{T-1} ( \theta ) where :math:`\hat{u} ( \theta )` is the :ref:`optimal random effects` . The corresponding :ref:`random constraint matrix` :math:`A` is the row vector of size :math:`T` with all ones; i.e., .. math:: A = [ 1 , \cdots , 1 ] \in \B{R}^{1 \times T} Output ****** Each output name, value pair is written in as *name* = *value* where the amount of spaces surrounding the equal sign is not specified. All of the pairs listed above are output. In addition, the following name value pairs are also output. cppad_mixed_version =================== The ``cppad_mixed`` version number. ldlt_cholmod ============ is the ``bin/run_cmake.sh`` configuration option :ref:`run_cmake.sh@ldlt_cholmod` . optimize_cppad_function ======================= is the ``bin/run_cmake.sh`` configuration option :ref:`run_cmake.sh@optimize_cppad_function` . ndebug_defined ============== is the ``NDEBUG`` preprocessor symbol defined. This should be yes (no) if the ``bin/run_cmake.sh`` configuration option :ref:`run_cmake.sh@build_type` is ``release`` (``debug`` ). actual_seed =========== If *random_seed* is zero, the system clock, instead of *random_seed* , is used to seed the random number generator. The actual random seed *actual_seed* is printed so that you can reproduce results when *random_seed* is zero. initialize_bytes ================ Is the amount of heap memory, in bytes, added to the program during its :ref:`initialize-name` call. Note that more temporary memory may have been used during this call. In addition, only memory allocated using ``CppAD::thread_alloc`` is included. initialize_seconds ================== Is the number of seconds used by the derived class :ref:`initialize-name` call. optimize_fixed_seconds ====================== Is the number of seconds used by the call to :ref:`optimize_fixed-name` that is used to compute the optimal fixed effects. optimize_random_seconds ======================= Is the number of seconds used by a single call to :ref:`optimize_random-name` that is used to compute the optimal random effects. information_mat_seconds ======================= Is the number of seconds used by the call to :ref:`information_mat-name` that computes the observed information matrix. sample_fixed_seconds ==================== Is the number of seconds used by the call to :ref:`sample_fixed-name` that computes the :ref:`number_sample_fixed` samples for the fixed effects. final_bytes =========== Is final amount of heap memory, in bytes, added and retained by the program. Only memory allocated using ``CppAD::thread_alloc`` is included. sum_random_effects ================== Is the sum of the optimal random effects. mean_population_estimate ======================== Is the estimate for the :ref:`capture_xam.cpp@Command Arguments@mean_population` computed by :ref:`optimize_fixed-name` . mean_logit_probability_estimate =============================== Is the optimal estimate for the :ref:`capture_xam.cpp@Command Arguments@mean_logit_probability` (computed by :ref:`optimize_fixed-name` ). std_logit_probability_estimate ============================== Is the optimal estimate for the :ref:`capture_xam.cpp@Command Arguments@std_logit_probability` . mean_population_std =================== Is the sample standard deviation of *mean_population_estimate* (corresponding to the sample computed by :ref:`sample_fixed-name` ). mean_logit_probability_std ========================== Is the sample standard deviation of *mean_logit_probability_estimate* . std_logit_probability_std ========================= Is the sample standard deviation of *std_logit_probability_estimate* . mean_population_ratio ===================== | ( *mean_population_estimate* ``-`` *mean_population* ) / | |tab| *mean_population_std* mean_logit_probability_ratio ============================ | ( *mean_logit_probability_estimate* ``-`` *mean_logit_probability* ) / | |tab| *mean_logit_probability_std* std_probability_ratio ===================== | ( *std_probability_estimate* ``-`` *std_probability* ) / | |tab| *std_probability_std* capture_xam_ok ============== The following conditions are checked. If they are all true, *capture_xam_ok* is yes. Otherwise it is no. #. *sum_random_effects* < 1 ``e-8`` || ( *random_constraint* == ``no`` ) #. *mean_population_ratio* < 5.0 #. *mean_logit_probability_ratio* < 5.0 #. *std_logit_probability_ratio* < 5.0 If *capture_xam_ok* is yes, the program return value is ``0`` (no error condition). Otherwise it is ``1`` (error condition). {xrst_toc_hidden bin/capture_xam.sh } Example ******* The file :ref:`capture_xam.sh-name` is an example using this program. Notation ******** .. list-table:: :widths: auto * - :math:`R` - number of sampling locations; i.e., *number_locations* . * - :math:`T` - number of sampling times; i.e., *number_random* . * - :math:`K` - maximum population in truncation of infinite summation; i.e., *max_population* * - :math:`\theta_0` - mean for :math:`N_i` given :math:`\theta` (:math:`\lambda` in reference); i.e., *mean_population* . * - :math:`\theta_1` - mean of logit of capture probability; i.e., *mean_logit_probability* . * - :math:`\theta_2` - standard deviation of logit of capture probability; i.e., *std_logit_probability* . * - :math:`N_i` - size of the population at *i*-th location * - :math:`y_{i,t}` - number of captures at location :math:`i` and time :math:`t` (:math:`n_{i,t}` in reference) * - :math:`y_i` - is the vector of captures at location :math:`i` :math:`( y_{i,0} , \ldots , y_{i, T-1} )`. * - :math:`M_i` - maximum of captures at *i*-th location (:math:`\max_t n_{i,t}` in reference) * - :math:`q_t` - capture probability at :math:`t` same for all locations (:math:`p_{i,t}` in reference) * - :math:`u_t` - random effect for each sampling time p(y_it|N_i,q_t) *************** We use a binomial distribution to model the probability of :math:`y_{i,t}` given :math:`N_i` and :math:`q_t`; i.e, .. math:: \B{p} ( y_{i,t} | N_i , q_t ) = \left( \begin{array}{c} N_i \\ y_{i,t} \end{array} \right) q_t^{y(i,t)} \left( 1 - q_t \right)^{y(i,t)} Furthermore, we assume that this probability is independent for each :math:`(i, t)`. p(N_i|theta) ************ We use a Poisson distribution to model the probability of :math:`N_i` given :math:`\theta_0`; i.e., .. math:: \B{p} ( N_i | \theta ) = \theta_0^{N(i)} \frac{ \exp[ - \theta_0 ] }{ N_i ! } We assume these this probability is independent for each :math:`i`. q_t(theta,u) ************ Section 2.4 of the :ref:`capture_xam.cpp@Reference` suggests a covariate model for the probability of capture. We use a similar model defined by .. math:: \R{logit} ( q_t ) = u_t + \theta_1 It follows that .. math:: q_t( \theta , u) = [ 1 + \exp(- u_t - \theta_1 ) ]^{-1} M_i *** We define the vector of maximum measurement for each location by .. math:: M_i = \max \left\{ y_{i,0} , \cdots , y_{i, T-1} \right\} p(y_i|theta,u) ************** The probability for :math:`y_i` (the captures at location :math:`i`) given :math:`N_i`, :math:`\theta`, and :math:`u` is .. math:: \B{p}( y_i | N_i, \theta , u ) = \prod_{t=0}^{T-1} \left( \begin{array}{c} {N(i)} \\ y_{i,t} \end{array} \right) q_t ( \theta , u)^{y(i,t)} \left( 1 - q_t( \theta , u) \right)^{y(i,t)} We do not know the population at each location :math:`N_i`, but instead have a Poisson prior for :math:`N_i`. We sum with respect to the possible values for :math:`N_i` to get the probability of :math:`y_i` given :math:`\theta` and :math:`u`. .. math:: \B{p}( y_i | \theta , u ) = \sum_{k=0}^K \B{p}( y_i | N_i=k, \theta , u ) \B{p}( N_i=k | \theta ) where :math:`k` is the possible values for :math:`N_i`. Note that :math:`K` should be plus infinity, but we use a fixed finite value for :math:`K` as an approximation for the infinite sum. p(y|theta,u) ************ Our model for the likelihood of the data at all the locations, given the fixed and random effects, is .. math:: \B{p}( y | \theta , u ) = \prod_{i=0}^{R-1} \B{p}( y_i | \theta , u ) Expressed in terms of fixed effects :math:`\theta`, the random effects :math:`u`, and the data :math:`y`, this is .. math:: \B{p}( y | \theta , u ) = \prod_{i=0}^{R-1} \left[ \sum_{k=0}^{K-1} \theta_0^k \frac{ \exp[ - \theta_0 ] }{ k ! } \prod_{t=0}^{T-1} \left( \begin{array}{c} {k} \\ y_{i,t} \end{array} \right) q_t ( \theta , u)^{y(i,t)} \left( 1 - q_t( \theta , u) \right)^{y(i,t)} \right] In ``cppad_mixed`` notation, this specifies the :ref:`random data density` . p(u|theta) ********** We use a normal distribution, with mean zero and standard deviation :math:`\theta_2`, for the distribution of the random effects :math:`u` given the fixed effects :math:`\theta`; i.e., .. math:: \B{p} ( u | \theta ) = \prod_{t=0}^{T-1} \frac{1}{ \theta_2 \sqrt{ 2 \pi } } \exp \left[ - \frac{1}{2} \frac{ u_t^2 }{ \theta_2^2 } \right] In ``cppad_mixed`` notation, this specifies the :ref:`random prior density` . p(theta) ******** For this example there is no :ref:`fixed prior density` :math:`\B{p}(\theta)`. p(z|theta) ********** For this example there is no :ref:`fixed data density` :math:`\B{p}(z | \theta)`. c(theta) ******** For this example there is no :ref:`fixed constraint function` :math:`c( \theta )`. Source Code *********** {xrst_literal // BEGIN C++ // END C++ } {xrst_end capture_xam.cpp}