sfametafrontier estimates a stochastic metafrontier model
for cross-sectional or pooled data. The function follows the theoretical
frameworks of Battese, Rao, and O'Donnell (2004) and O'Donnell, Rao, and
Battese (2008), and additionally implements the two-stage stochastic approach
of Huang, Huang, and Liu (2014). Three types of group-level frontier models
are supported: standard stochastic frontier analysis
(sfacross), sample selection stochastic frontier
analysis (sfaselectioncross), and latent class stochastic
frontier analysis (sfalcmcross).
Usage
sfametafrontier(
formula,
muhet,
uhet,
vhet,
thet,
logDepVar = TRUE,
data,
subset,
weights,
wscale = TRUE,
group = NULL,
S = 1L,
udist = "hnormal",
start = NULL,
scaling = FALSE,
modelType = "greene10",
groupType = "sfacross",
metaMethod = "lp",
sfaApproach = "ordonnell",
selectionF = NULL,
lcmClasses = 2L,
whichStart = 2L,
initAlg = "nm",
initIter = 100L,
lType = "ghermite",
Nsub = 100L,
uBound = Inf,
intol = 1e-06,
method = "bfgs",
hessianType = NULL,
simType = "halton",
Nsim = 100L,
prime = 2L,
burn = 10L,
antithetics = FALSE,
seed = 12345L,
itermax = 2000L,
printInfo = FALSE,
tol = 1e-12,
gradtol = 1e-06,
stepmax = 0.1,
qac = "marquardt",
...
)
# S3 method for class 'sfametafrontier'
print(x, ...)
Arguments
- formula
A symbolic description of the frontier model to be estimated, based on the generic function formula. For groupType = "sfaselectioncross", this argument specifies the frontier (outcome) equation and must be a standard formula whose left-hand side is the output (or cost) variable and whose right-hand side contains the frontier regressors (see also selectionF).
- muhet
A one-part formula to account for heterogeneity in the mean of the pre-truncated normal distribution. Applicable only when groupType = "sfacross" and udist = "tnormal". The variables specified model the conditional mean \(\mu_i = \bm{\omega}'\mathbf{Z}_{\mu}\) of the truncated normal inefficiency distribution (see section ‘Details’).
- uhet
A one-part formula to account for heteroscedasticity in the one-sided error variance. Applicable for all three model types. The variance of the inefficiency term is modelled as \(\sigma^2_u = \exp(\bm{\delta}'\mathbf{Z}_u)\), where \(\mathbf{Z}_u\) are the inefficiency drivers and \(\bm{\delta}\) the associated coefficients (see section ‘Details’).
- vhet
A one-part formula to account for heteroscedasticity in the two-sided error variance. Applicable for all three model types. The variance of the noise term is modelled as \(\sigma^2_v = \exp(\bm{\phi}'\mathbf{Z}_v)\), where \(\mathbf{Z}_v\) are the heteroscedasticity variables and \(\bm{\phi}\) the coefficients (see section ‘Details’).
- thet
A one-part formula to account for technological heterogeneity in the construction of the latent classes. Applicable only when groupType = "sfalcmcross". The variables specified enter the logit formulation that determines the prior class membership probabilities \(\pi(i,j)\) (see section ‘Details’).
- logDepVar
Logical. Indicates whether the dependent variable is logged (TRUE) or not (FALSE). Default TRUE. Must match the transformation applied to the left-hand side of formula.
- data
A data frame containing all variables referenced in formula, selectionF, muhet, uhet, vhet, thet, and group.
- subset
An optional vector specifying a subset of observations to be used in the estimation process.
- weights
An optional vector of weights to be used for weighted log-likelihood estimation. Should be NULL or a numeric vector with strictly positive values. When NULL (default), all observations receive equal weight.
- wscale
Logical. When weights is not NULL, a scaling transformation is applied such that the weights sum to the sample size: $$w_{\mathrm{new}} = n \times \frac{w_{\mathrm{old}}}{\sum w_{\mathrm{old}}}$$ Default TRUE. When FALSE, the raw weights are used without scaling.
- group
Character string. The name of the column in data identifying the technology group of each observation. The column is coerced to a factor internally and must have at least two unique values. When groupType = "sfalcmcross" and group is NULL, a single pooled latent class model is estimated and class assignments serve as groups (see section ‘Details’).
- S
Integer. Frontier orientation.
  - S = 1 (default): production or profit frontier, \(\varepsilon_i = v_i - u_i\).
  - S = -1: cost frontier, \(\varepsilon_i = v_i + u_i\).
- udist
Character string. Distribution for the one-sided error term \(u_i \ge 0\). The following distributions are available for groupType = "sfacross":
  - "hnormal" (default): half-normal distribution (Aigner et al., 1977; Meeusen and van den Broeck, 1977).
  - "exponential": exponential distribution.
  - "tnormal": truncated normal distribution (Stevenson, 1980).
  - "rayleigh": Rayleigh distribution (Hajargasht, 2015).
  - "uniform": uniform distribution (Li, 1996; Nguyen, 2010).
  - "gamma": Gamma distribution, estimated by maximum simulated likelihood (Greene, 2003).
  - "lognormal": log-normal distribution, estimated by maximum simulated likelihood (Migon and Medici, 2001; Wang and Ye, 2020).
  - "weibull": Weibull distribution, estimated by maximum simulated likelihood (Tsionas, 2007).
  - "genexponential": generalised exponential distribution (Papadopoulos, 2020).
  - "tslaplace": truncated skewed Laplace distribution (Wang, 2012).
For groupType = "sfaselectioncross" and "sfalcmcross", only "hnormal" is currently supported.
- start
Numeric vector. Optional starting values for the maximum likelihood (ML) or maximum simulated likelihood (MSL) estimation of the group-level frontier models. When NULL (default), starting values are computed automatically. For groupType = "sfacross", they are derived from OLS residuals. For groupType = "sfalcmcross", they depend on whichStart.
- scaling
Logical. Applicable only when groupType = "sfacross" and udist = "tnormal". When TRUE, the scaling property model (Wang and Schmidt, 2002) is estimated, whereby \(u_i = h(\mathbf{Z}_u, \bm{\delta}) u^*_i\) and \(u^*_i\) follows a truncated normal distribution \(N^+(\tau, \exp(c_u))\). Default FALSE.
- modelType
Character string. Applicable only when groupType = "sfaselectioncross". Specifies the model used to correct for selection bias. Currently, only "greene10" (default) is supported, corresponding to the two-step approach of Greene (2010): a probit model is estimated for the selection equation, and its inverse Mills ratio is included as a correction term in the stochastic frontier second step.
- groupType
Character string. Type of frontier model estimated for each technology group. Three options are available:
  - "sfacross" (default): standard cross-sectional stochastic frontier analysis (sfacross). Groups are defined by the group variable. All 10 distributions for udist are supported, along with heteroscedasticity in both error components (uhet, vhet), heterogeneity in the truncated mean (muhet), and the scaling property.
  - "sfaselectioncross": sample selection stochastic frontier analysis (sfaselectioncross). Corrects for sample selection bias via the generalised Heckman approach (Greene, 2010). Requires selectionF. Only observations for which the selection indicator equals one enter the frontier and metafrontier; efficiency estimates for non-selected observations are NA. Only udist = "hnormal" is supported.
  - "sfalcmcross": latent class stochastic frontier analysis (sfalcmcross). Estimates a finite mixture of frontier models with the number of classes determined by lcmClasses. When group is supplied, a separate latent class model is estimated per group-stratum and combined for the metafrontier. When group is omitted, a single pooled model is estimated and class assignments serve as technology groups. Supports thet for class-membership covariates and uhet, vhet for within-class heteroscedasticity. Only udist = "hnormal" is supported.
- metaMethod
Character string. Method for estimating the global metafrontier that envelops all group frontiers. Three options are available:
  - "lp" (default): deterministic linear programming envelope. Finds the parameter vector \(\bm{\beta}^*\) minimising \(\sum_i |\ln \hat{f}(x_i, \bm{\beta}^*) - \ln \hat{f}(x_i, \hat{\bm{\beta}}_{(g)})|\) subject to \(\ln \hat{f}(x_i, \bm{\beta}^*) \ge \ln \hat{f}(x_i, \hat{\bm{\beta}}_{(g)})\) for all observations and all groups (Battese et al., 2004).
  - "qp": deterministic quadratic programming envelope. Minimises the sum of squared deviations under the same envelope constraint.
  - "sfa": stochastic metafrontier estimated by a second-stage pooled SFA. The specific construction of the dependent variable is determined by sfaApproach.
- sfaApproach
Character string. Applicable only when metaMethod = "sfa". Determines how the second-stage SFA is constructed:
  - "ordonnell" (default): the column-wise maximum of all group-fitted frontier values (the deterministic LP envelope) is used as the dependent variable in the second-stage SFA, following O'Donnell, Rao, and Battese (2008). The second-stage SFA thus directly targets the global technology envelope.
  - "huang": the group-specific fitted frontier value \(\ln \hat{y}^g_i\) for each observation is used as the dependent variable in a pooled cross-sectional SFA (Huang, Huang, and Liu, 2014). The technology gap \(U_i \ge 0\) and second-stage noise \(V_i\) are estimated directly by the SFA procedure.
- selectionF
A two-sided formula specifying the sample selection equation, e.g., selected ~ z1 + z2. The left-hand side must be a binary (0/1) indicator already present in data: 1 means the observation participates in the frontier and metafrontier; 0 means it is excluded (efficiency estimates will be NA). Alternatively, a named list of formulas, one per group level, may be supplied to allow group-specific selection equations. Required when groupType = "sfaselectioncross"; ignored otherwise.
- lcmClasses
Integer. Number of latent classes to be estimated per group when groupType = "sfalcmcross". Must be between 2 and 5 (default 2). The optimal number of classes can be selected based on information criteria (see ic).
- whichStart
Integer. Strategy for obtaining starting values in the latent class model (groupType = "sfalcmcross"):
  - 1: starting values are obtained from the method of moments.
  - 2 (default): the model is initialised by first solving a homoscedastic pooled cross-sectional SFA using the algorithm specified by initAlg for at most initIter iterations.
- initAlg
Character string. Optimisation algorithm used during the initialisation of the latent class model when whichStart = 2. Only algorithms from the maxLik package are supported. Default "nm".
- initIter
Integer. Maximum number of iterations for the initialisation algorithm when whichStart = 2 and groupType = "sfalcmcross". Default 100.
- lType
Character string. Specifies how the likelihood is evaluated for the selection model (groupType = "sfaselectioncross"). Five options are available:
  - "ghermite" (default): Gauss-Hermite quadrature (see gaussHermiteData).
  - "kronrod": Gauss-Kronrod quadrature (see integrate).
  - "hcubature": adaptive integration over hypercubes (see hcubature).
  - "pcubature": p-adaptive cubature (see pcubature).
  - "msl": maximum simulated likelihood (controlled by simType, Nsim, prime, burn, antithetics, and seed).
- Nsub
Integer. Number of quadrature nodes or integration subdivisions when lType is "ghermite", "kronrod", "hcubature", or "pcubature". Applicable only when groupType = "sfaselectioncross". Default 100.
- uBound
Numeric. Upper bound for the numerical integration of the inefficiency component when lType is "kronrod", "hcubature", or "pcubature". For Gauss-Hermite the bound is automatically infinite. Applicable only when groupType = "sfaselectioncross". Default Inf.
- intol
Numeric. Integration tolerance for the quadrature approaches "kronrod", "hcubature", and "pcubature". Applicable only when groupType = "sfaselectioncross". Default 1e-6.
- method
Character string. Optimisation algorithm for the main ML/MSL estimation of each group-level frontier model. Default "bfgs". Eleven algorithms are available:
  - "bfgs": Broyden-Fletcher-Goldfarb-Shanno (see maxBFGS).
  - "bhhh": Berndt-Hall-Hall-Hausman (see maxBHHH).
  - "nr": Newton-Raphson (see maxNR).
  - "nm": Nelder-Mead (see maxNM).
  - "cg": Conjugate Gradient (see maxCG).
  - "sann": Simulated Annealing (see maxSANN).
  - "ucminf": quasi-Newton optimisation with BFGS updating of the inverse Hessian and soft line search (see ucminf).
  - "mla": Marquardt-Levenberg algorithm (see mla).
  - "sr1": Symmetric Rank 1 trust-region method (see trust.optim).
  - "sparse": trust-region method with sparse Hessian (see trust.optim).
  - "nlminb": PORT routines optimisation (see nlminb).
- hessianType
Integer. Specifies which Hessian is returned for the group-level frontier estimation. The accepted values match those of the underlying sfaR function for each groupType:
  - For groupType = "sfacross": if 1 (default), the analytic Hessian is returned; if 2, the BHHH Hessian \(\mathbf{G}'\mathbf{G}\) is estimated.
  - For groupType = "sfalcmcross": if 1 (default), the analytic Hessian is returned; if 2, the BHHH Hessian is estimated.
  - For groupType = "sfaselectioncross": if 1, the analytic Hessian is returned; if 2 (default), the BHHH Hessian \(\mathbf{G}'\mathbf{G}\) is estimated. The BHHH default reflects the two-step nature of the selection estimator.
When NULL (the package default), each group-level model uses the natural default of the corresponding sfaR function, ensuring that standard errors computed by sfametafrontier are identical to those from a standalone sfaR call on the same group subset.
- simType
Character string. Simulation method for maximum simulated likelihood (MSL). Applicable to groupType = "sfacross" when udist is "gamma", "lognormal", or "weibull", and to groupType = "sfaselectioncross" when lType = "msl":
  - "halton" (default): Halton quasi-random sequences.
  - "ghalton": Generalised-Halton sequences.
  - "sobol": Sobol low-discrepancy sequences.
  - "uniform": pseudo-random uniform draws.
- Nsim
Integer. Number of simulation draws for MSL. Default 100.
- prime
Integer. Prime number used to construct Halton or Generalised-Halton sequences. Default 2.
- burn
Integer. Number of leading draws discarded from the Halton sequence to reduce serial correlation. Default 10.
- antithetics
Logical. If TRUE, antithetic draws are added: the first Nsim/2 draws are taken, and the remaining Nsim/2 are \(1 - \text{draw}\). Default FALSE.
- seed
Integer. Random seed for simulation draws, ensuring reproducibility of MSL estimates. Default 12345.
- itermax
Integer. Maximum number of iterations for the main optimisation. Default 2000. For method = "sann", it is recommended to increase this substantially (e.g., itermax = 20000).
- printInfo
Logical. If TRUE, optimisation progress is printed during estimation of each group-level model. Default FALSE.
- tol
Numeric. Convergence tolerance. The algorithm is considered converged when the change in the log-likelihood between successive iterations is smaller than tol in absolute value. Default 1e-12.
- gradtol
Numeric. Gradient convergence tolerance. The algorithm is considered converged when the Euclidean norm of the gradient is smaller than gradtol. Default 1e-6.
- stepmax
Numeric. Maximum step length used by the "ucminf" algorithm. Default 0.1.
- qac
Character string. Quadratic Approximation Correction for the "bhhh" and "nr" algorithms when the Hessian is not negative definite:
  - "marquardt" (default): step length is decreased while also shifting closer to the gradient direction.
  - "stephalving": step length is halved, preserving the current direction.
- ...
Additional arguments passed through to the second-stage SFA call when metaMethod = "sfa".
- x
An object of class "sfametafrontier", as returned by sfametafrontier, for use with the print method.
Value
sfametafrontier returns an object of class
"sfametafrontier", which is a list containing:
- call
The matched call.
- groupModels
A named list of fitted group-level frontier objects, one per technology group. Each element is of class "sfacross", "sfaselectioncross", or "sfalcmcross", depending on groupType.
- metaSfaObj
The fitted metafrontier object. For metaMethod = "sfa", an object of class "sfacross" from the second-stage SFA. The dependent variable column in metaSfaObj$dataTable is named according to the approach used: "lp_envelope" when sfaApproach = "ordonnell" (the column-wise maximum of all group-evaluated frontier values is the dependent variable) and "group_fitted_values" when sfaApproach = "huang" (each observation's own-group fitted frontier value is the dependent variable). For metaMethod = "lp" or "qp", a list containing the optimisation result and the estimated envelope coefficients.
- metaRes
Estimated metafrontier coefficients (with standard errors, z-values, and p-values for metaMethod = "sfa", or the plain coefficient vector for deterministic envelopes).
- formula
The formula supplied to the call.
- metaMethod
The metafrontier estimation method used.
- sfaApproach
The second-stage SFA approach; NA when metaMethod is not "sfa".
- groupType
The type of group-level frontier model estimated.
- group
The name of the grouping variable.
- groups
Character vector of unique group labels.
- S
The frontier orientation (1 or -1).
- dataTable
The data used in estimation, augmented with .mf_yhat_group (group-specific fitted frontier values) and .mf_yhat_meta (metafrontier fitted values).
- lcmNoGroup
Logical. TRUE when groupType = "sfalcmcross" and group was not supplied.
- lcmObj
When lcmNoGroup = TRUE, the pooled sfalcmcross object.
Details
Standard stochastic frontier (groupType = "sfacross")
The stochastic frontier model is defined as: $$y_i = \alpha + \mathbf{x}_i'\bm{\beta} + v_i - Su_i$$ where \(y\) is the output (cost, revenue, or profit), \(\mathbf{x}\) is the vector of frontier regressors, \(u_i \ge 0\) is the one-sided inefficiency term with variance \(\sigma^2_u\), and \(v_i\) is the symmetric noise term with variance \(\sigma^2_v\).
Estimation is by ML for all distributions except "gamma",
"lognormal", and "weibull", for which MSL is used with
Halton, Generalised-Halton, Sobol, or uniform draws. Antithetic draws are
available for the uniform case.
To account for heteroscedasticity, the variances are modelled as \(\sigma^2_u = \exp(\bm{\delta}'\mathbf{Z}_u)\) and \(\sigma^2_v = \exp(\bm{\phi}'\mathbf{Z}_v)\). For the truncated normal distribution, heterogeneity in the pre-truncation mean is modelled as \(\mu_i = \bm{\omega}'\mathbf{Z}_{\mu}\). The scaling property (Wang and Schmidt, 2002) can also be imposed for the truncated normal.
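The composed-error structure and the exponential variance specification above can be illustrated with a small base-R simulation. This is only a sketch: the sample size, the single regressor `x`, the single inefficiency driver `zu`, and all parameter values are assumptions chosen for illustration, and the code does not call the package.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
n  <- 500
S  <- 1                           # 1 = production frontier, -1 = cost frontier
x  <- rnorm(n)                    # one frontier regressor (illustrative)
zu <- rnorm(n)                    # one inefficiency driver (illustrative)

# Assumed parameter values (not estimates from any dataset)
alpha <- 1
beta  <- 0.5
delta <- c(-1, 0.4)               # sigma_u^2 = exp(delta[1] + delta[2] * zu)
phi   <- -2                       # homoscedastic noise: sigma_v^2 = exp(phi)

sigma_u <- sqrt(exp(delta[1] + delta[2] * zu))
sigma_v <- sqrt(exp(phi))

u <- abs(rnorm(n, sd = sigma_u))  # half-normal inefficiency, u >= 0
v <- rnorm(n, sd = sigma_v)       # symmetric noise
y <- alpha + beta * x + v - S * u # y_i = alpha + x_i' beta + v_i - S u_i

eps <- y - alpha - beta * x       # composed error v - S u
mean(eps)                         # negative on average when S = 1
```

Because the variances enter through `exp()`, they remain strictly positive for any value of the coefficients, which is why the coefficients can be estimated without sign constraints.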
Sample selection frontier (groupType = "sfaselectioncross")
This model extends the Heckman (1979) selection framework to the
stochastic frontier setting (Greene, 2010; Dakpo et al., 2021).
The selection and frontier equations are:
$$y_{1i}^* = \mathbf{Z}_{si}'\bm{\gamma} + w_i, \quad
w_i \sim \mathcal{N}(0,1)$$
$$y_{2i}^* = \mathbf{x}_i'\bm{\beta} + v_i - Su_i$$
where \(y_{1i} = \mathbf{1}(y_{1i}^* > 0)\) is the binary selection
indicator and \(y_{2i} = y_{2i}^*\) is observed only when
\(y_{1i} = 1\). Selection bias arises from
\(\rho = \mathrm{Corr}(w_i, v_i) \ne 0\). Only selected observations
enter the frontier and metafrontier estimation; efficiency estimates for
non-selected observations are NA.
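To see why a non-zero \(\rho\) matters, the following base-R sketch simulates correlated selection and noise errors and compares the mean of \(v_i\) in the full sample with its mean among selected observations. The selection coefficient (0.5), the value of `rho`, and the sample size are assumptions made for this illustration; the code does not reproduce the package's estimator.

```r
set.seed(42)                      # arbitrary seed
n   <- 5000
rho <- 0.6                        # assumed Corr(w, v), illustration only
z   <- rnorm(n)                   # one selection covariate
w   <- rnorm(n)                   # selection error, N(0, 1)

# Noise term with unit variance and correlation rho with w
v <- rho * w + sqrt(1 - rho^2) * rnorm(n)

# Selection rule y_1i = 1(y*_1i > 0), with an assumed coefficient of 0.5
selected <- as.integer(0.5 * z + w > 0)

mean(v)                           # close to zero in the full sample
mean(v[selected == 1])            # shifted upward among selected observations
```

In the selected subsample the noise term no longer has mean zero, so a frontier fitted to selected observations only, without a correction, would be biased.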
Latent class frontier (groupType = "sfalcmcross")
The latent class model (Orea and Kumbhakar, 2004) estimates a finite
mixture of \(J\) frontier models:
$$y_i = \alpha_j + \mathbf{x}_i'\bm{\beta}_j + v_{i|j} - Su_{i|j}$$
The prior class probability follows a logit specification:
$$\pi(i,j) = \frac{\exp(\bm{\theta}_j'\mathbf{Z}_{hi})}
{\sum_{m=1}^{J}\exp(\bm{\theta}_m'\mathbf{Z}_{hi})}$$
Class assignment is based on the maximum posterior probability computed via
Bayes' rule. When group is omitted, a single pooled model is
estimated and class assignments serve as technology groups.
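The logit prior and the Bayes-rule class assignment can be sketched for a single observation as follows. All numbers are hypothetical, and setting the last class's coefficients to zero for identification is an assumption of this sketch rather than something stated above.

```r
# One observation, J = 2 classes
z_h   <- c(1, 0.3)                      # intercept and one covariate (hypothetical)
theta <- cbind(c(0.5, -1.2), c(0, 0))   # columns = classes; last class set to 0

eta   <- drop(crossprod(theta, z_h))    # linear index theta_j' z_h for each class
prior <- exp(eta) / sum(exp(eta))       # logit prior pi(i, j)

# Hypothetical class-specific likelihood contributions for this observation
lik <- c(0.20, 0.05)

# Bayes' rule: posterior is proportional to prior times likelihood
posterior <- prior * lik / sum(prior * lik)
assigned  <- which.max(posterior)       # class with the highest posterior
```

With these illustrative values the observation is assigned to the first class, because that class has both the larger prior and the larger likelihood contribution.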
Metafrontier estimation
The global metafrontier \(f(x_i, \bm{\beta}^*)\) envelops all
group frontiers. With LP (Battese et al., 2004),
\(\bm{\beta}^*\) minimises
\(\sum_i |\ln \hat{f}(x_i, \bm{\beta}^*) - \ln \hat{f}(x_i,
\hat{\bm{\beta}}_{(g)})|\)
subject to \(\ln \hat{f}(x_i, \bm{\beta}^*) \ge \ln \hat{f}(x_i,
\hat{\bm{\beta}}_{(g)})\). QP minimises the squared analogue. The
stochastic approaches (Huang et al., 2014; O'Donnell et al.,
2008) treat the technology gap \(U_i\) as a one-sided error in a
second-stage SFA. Group and metafrontier efficiencies are:
$$TE_i^g = \exp(-u_i^g), \quad
MTR_i = \exp(-U_i), \quad
TE_i^* = TE_i^g \times MTR_i$$
Both Jondrow et al. (1982) and Battese and Coelli (1988) estimators
are provided for each measure. See efficiencies for details.
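The decomposition \(TE_i^* = TE_i^g \times MTR_i\) can be checked numerically. The sketch below assumes a production frontier (S = 1) and a deterministic envelope, so that the technology gap is the difference between the metafrontier and own-group fitted values; all numbers are hypothetical and the code does not use the package's efficiencies method.

```r
# Hypothetical fitted log-frontier values for three observations
yhat_group <- c(2.10, 1.85, 2.40)  # own-group frontier
yhat_meta  <- c(2.25, 1.85, 2.60)  # metafrontier, never below the group frontier
u_group    <- c(0.10, 0.30, 0.05)  # hypothetical group-level inefficiency

U       <- yhat_meta - yhat_group  # technology gap, U >= 0
TE_g    <- exp(-u_group)           # group technical efficiency
MTR     <- exp(-U)                 # metatechnology ratio, in (0, 1]
TE_meta <- TE_g * MTR              # efficiency relative to the metafrontier

data.frame(TE_g, MTR, TE_meta)
```

The second observation lies on a group frontier that touches the metafrontier, so its MTR equals 1 and its metafrontier efficiency equals its group efficiency.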
References
Aigner, D. J., Lovell, C. A. K., and Schmidt, P. 1977. Formulation and estimation of stochastic frontier production function models. Journal of Econometrics, 6(1), 21–37. https://doi.org/10.1016/0304-4076(77)90052-5
Battese, G. E., and Coelli, T. J. 1988. Prediction of firm-level technical efficiencies with a generalized frontier production function and panel data. Journal of Econometrics, 38(3), 387–399. https://doi.org/10.1016/0304-4076(88)90053-X
Battese, G. E., Rao, D. S. P., and O'Donnell, C. J. 2004. A metafrontier production function for estimation of technical efficiencies and technology gaps for firms operating under different technologies. Journal of Productivity Analysis, 21(1), 91–103. https://doi.org/10.1023/B:PROD.0000012454.06094.29
Dakpo, K. H., Desjeux, Y., Latruffe, L., and Jeanneaux, P. 2021. Modelling pollution-generating technologies in performance benchmarking. Omega, 102, 102347. https://doi.org/10.1016/j.omega.2020.102347
Greene, W. 2003. Simulated likelihood estimation of the normal-gamma stochastic frontier function. Journal of Productivity Analysis, 19(2-3), 179–190. https://doi.org/10.1023/A:1022853416499
Greene, W. 2010. A stochastic frontier model with correction for sample selection. Journal of Productivity Analysis, 34(1), 15–24. https://doi.org/10.1007/s11123-009-0159-1
Hajargasht, G. 2015. Stochastic frontiers with a Rayleigh distribution. Journal of Productivity Analysis, 44(2), 199–208. https://doi.org/10.1007/s11123-014-0417-8
Heckman, J. J. 1979. Sample selection bias as a specification error. Econometrica, 47(1), 153–161. https://doi.org/10.2307/1912352
Huang, C. J., Huang, T.-H., and Liu, N.-H. 2014. A new approach to estimating the metafrontier production function based on a stochastic frontier framework. Journal of Productivity Analysis, 42(3), 241–254. https://doi.org/10.1007/s11123-014-0402-2
Jondrow, J., Lovell, C. A. K., Materov, I. S., and Schmidt, P. 1982. On the estimation of technical inefficiency in the stochastic frontier production function model. Journal of Econometrics, 19(2-3), 233–238. https://doi.org/10.1016/0304-4076(82)90004-5
Li, Q. 1996. Estimating a stochastic production frontier when the adjusted error is skewed. Economics Letters, 52(3), 221–228. https://doi.org/10.1016/S0165-1765(96)00857-9
Meeusen, W., and van den Broeck, J. 1977. Efficiency estimation from Cobb-Douglas production functions with composed error. International Economic Review, 18(2), 435–444. https://doi.org/10.2307/2525757
Migon, H. S., and Medici, E. 2001. Bayesian inference for generalised exponential models. Working paper, Universidade Federal do Rio de Janeiro.
Nguyen, N. B. 2010. Estimation of technical efficiency in stochastic frontier analysis. PhD thesis, Bowling Green State University.
O'Donnell, C. J., Rao, D. S. P., and Battese, G. E. 2008. Metafrontier frameworks for the study of firm-level efficiencies and technology ratios. Empirical Economics, 34(2), 231–255. https://doi.org/10.1007/s00181-007-0119-4
Orea, L., and Kumbhakar, S. C. 2004. Efficiency measurement using a latent class stochastic frontier model. Empirical Economics, 29(1), 169–183. https://doi.org/10.1007/s00181-003-0184-2
Papadopoulos, A. 2020. The half-normal specification for the two-tier stochastic frontier model. Journal of Productivity Analysis, 56(1), 1–14. https://doi.org/10.1007/s11123-021-00611-8
Stevenson, R. E. 1980. Likelihood functions for generalised stochastic frontier estimation. Journal of Econometrics, 13(1), 57–66. https://doi.org/10.1016/0304-4076(80)90042-1
Tsionas, E. G. 2007. Efficiency measurement with the Weibull stochastic frontier. Oxford Bulletin of Economics and Statistics, 69(5), 693–706. https://doi.org/10.1111/j.1468-0084.2007.00475.x
Wang, H.-J. 2012. Stochastic frontier models. In A Companion to Theoretical Econometrics, ed. B. H. Baltagi, Blackwell, Oxford.
Wang, H.-J., and Schmidt, P. 2002. One-step and two-step estimation of the effects of exogenous variables on technical efficiency levels. Journal of Productivity Analysis, 18(2), 129–144. https://doi.org/10.1023/A:1016565719882
Wang, W., and Ye, F. 2020. Estimation of the stochastic frontier model with a log-normal composite error. Journal of Productivity Analysis, 54(1), 1–13. https://doi.org/10.1007/s11123-020-00579-x
Dakpo, K. H., Desjeux, Y., and Latruffe, L. 2023. sfaR: Stochastic Frontier Analysis using R. R package version 1.0.1. https://CRAN.R-project.org/package=sfaR
Examples
if (FALSE) { # \dontrun{
###########################################################################
## -------- SECTION 1: Standard SFA Group Frontier ----------------------##
## Using the rice production dataset (ricephil) from Battese et al. ##
## Groups are formed based on farm area terciles (small/medium/large). ##
###########################################################################
data("ricephil", package = "sfaR")
ricephil$group <- cut(ricephil$AREA,
breaks = quantile(ricephil$AREA, probs = c(0, 1 / 3, 2 / 3, 1), na.rm = TRUE),
labels = c("small", "medium", "large"),
include.lowest = TRUE
)
## 1a. sfacross groups + LP metafrontier
## Deterministic envelope via linear programming (Battese et al., 2004).
meta_sfacross_lp <- sfametafrontier(
formula = log(PROD) ~ log(AREA) + log(LABOR) + log(NPK),
data = ricephil,
group = "group",
S = 1,
udist = "hnormal",
groupType = "sfacross",
metaMethod = "lp"
)
summary(meta_sfacross_lp)
# Retrieve individual efficiency and metatechnology ratio estimates:
ef_lp <- efficiencies(meta_sfacross_lp)
head(ef_lp)
## 1b. sfacross groups + QP metafrontier
## Deterministic envelope via quadratic programming.
meta_sfacross_qp <- sfametafrontier(
formula = log(PROD) ~ log(AREA) + log(LABOR) + log(NPK),
data = ricephil,
group = "group",
S = 1,
udist = "hnormal",
groupType = "sfacross",
metaMethod = "qp"
)
summary(meta_sfacross_qp)
## 1c. sfacross groups + Two-stage SFA metafrontier (Huang et al., 2014)
## The group-specific fitted frontier values serve as the dependent
## variable in the second-stage SFA, yielding a stochastic technology gap.
meta_sfacross_huang <- sfametafrontier(
formula = log(PROD) ~ log(AREA) + log(LABOR) + log(NPK),
data = ricephil,
group = "group",
S = 1,
udist = "hnormal",
groupType = "sfacross",
metaMethod = "sfa",
sfaApproach = "huang"
)
summary(meta_sfacross_huang)
ef_huang <- efficiencies(meta_sfacross_huang)
## 1d. sfacross groups + O'Donnell et al. (2008) stochastic metafrontier
## The LP deterministic envelope is used as the second-stage dependent
## variable: the metafrontier is estimated stochastically around the
## envelope.
meta_sfacross_odonnell <- sfametafrontier(
formula = log(PROD) ~ log(AREA) + log(LABOR) + log(NPK),
data = ricephil,
group = "group",
S = 1,
udist = "hnormal",
groupType = "sfacross",
metaMethod = "sfa",
sfaApproach = "ordonnell"
)
summary(meta_sfacross_odonnell)
###########################################################################
## -------- SECTION 2: Latent Class (LCM) Group Frontier ---------------##
## No observed group variable: a pooled sfalcmcross model assigns ##
## observations to 2 latent technology classes; these classes become the ##
## technology groups for the metafrontier. ##
###########################################################################
data("utility", package = "sfaR")
## 2a. sfalcmcross (pooled, 2 classes) + LP metafrontier
meta_lcm_lp <- sfametafrontier(
formula = log(tc / wf) ~ log(y) + log(wl / wf) + log(wk / wf),
data = utility,
S = -1,
groupType = "sfalcmcross",
lcmClasses = 2,
metaMethod = "lp"
)
summary(meta_lcm_lp)
ef_lcm_lp <- efficiencies(meta_lcm_lp)
# Per-class posterior probabilities and class-specific efficiencies are
# included alongside group and metafrontier efficiencies.
## 2b. sfalcmcross (pooled, 2 classes) + QP metafrontier
meta_lcm_qp <- sfametafrontier(
formula = log(tc / wf) ~ log(y) + log(wl / wf) + log(wk / wf),
data = utility,
S = -1,
groupType = "sfalcmcross",
lcmClasses = 2,
metaMethod = "qp"
)
summary(meta_lcm_qp)
## 2c. sfalcmcross (pooled, 2 classes) + Two-stage SFA metafrontier
## (Huang et al., 2014)
meta_lcm_huang <- sfametafrontier(
formula = log(tc / wf) ~ log(y) + log(wl / wf) + log(wk / wf),
data = utility,
S = -1,
groupType = "sfalcmcross",
lcmClasses = 2,
metaMethod = "sfa",
sfaApproach = "huang"
)
summary(meta_lcm_huang)
ef_lcm_huang <- efficiencies(meta_lcm_huang)
## 2d. sfalcmcross (pooled, 2 classes) + O'Donnell et al. (2008)
meta_lcm_odonnell <- sfametafrontier(
formula = log(tc / wf) ~ log(y) + log(wl / wf) + log(wk / wf),
data = utility,
S = -1,
groupType = "sfalcmcross",
lcmClasses = 2,
metaMethod = "sfa",
sfaApproach = "ordonnell"
)
summary(meta_lcm_odonnell)
###########################################################################
## -------- SECTION 3: Sample Selection SFA Group Frontier -------------##
## Simulated dataset with a Heckman selection mechanism. Only selected ##
## observations (d == 1) participate in the frontier and metafrontier. ##
## Efficiency estimates for non-selected observations are NA. ##
###########################################################################
N <- 2000
set.seed(12345)
z1 <- rnorm(N)
z2 <- rnorm(N)
v1 <- rnorm(N)
v2 <- rnorm(N)
g <- rnorm(N)
e1 <- v1
e2 <- 0.7071 * (v1 + v2)
ds <- z1 + z2 + e1
d <- ifelse(ds > 0, 1, 0) # binary selection indicator
group <- ifelse(g > 0, 1, 0) # two technology groups (0 and 1)
u <- abs(rnorm(N))
x1 <- rnorm(N)
x2 <- rnorm(N)
y <- x1 + x2 + e2 - u
dat <- as.data.frame(cbind(y = y, x1 = x1, x2 = x2, z1 = z1, z2 = z2, d = d, group = group))
## 3a. sfaselectioncross + LP metafrontier
## Selection bias is corrected via the Greene (2010) two-step probit
## approach. The LP envelope bounds both groups' selected-sample
## frontier fitted values.
meta_sel_lp <- sfametafrontier(
formula = y ~ x1 + x2,
selectionF = d ~ z1 + z2,
data = dat,
group = "group",
S = 1L,
udist = "hnormal",
groupType = "sfaselectioncross",
modelType = "greene10",
lType = "kronrod",
Nsub = 100,
uBound = Inf,
method = "bfgs",
itermax = 2000,
metaMethod = "lp"
)
summary(meta_sel_lp)
ef_sel_lp <- efficiencies(meta_sel_lp)
## 3b. sfaselectioncross + QP metafrontier
meta_sel_qp <- sfametafrontier(
formula = y ~ x1 + x2,
selectionF = d ~ z1 + z2,
data = dat,
group = "group",
S = 1L,
udist = "hnormal",
groupType = "sfaselectioncross",
modelType = "greene10",
lType = "kronrod",
Nsub = 100,
uBound = Inf,
method = "bfgs",
itermax = 2000,
metaMethod = "qp"
)
summary(meta_sel_qp)
## 3c. sfaselectioncross + Two-stage SFA metafrontier (Huang et al., 2014)
meta_sel_huang <- sfametafrontier(
formula = y ~ x1 + x2,
selectionF = d ~ z1 + z2,
data = dat,
group = "group",
S = 1L,
udist = "hnormal",
groupType = "sfaselectioncross",
modelType = "greene10",
lType = "kronrod",
Nsub = 100,
uBound = Inf,
simType = "halton",
Nsim = 300,
prime = 2L,
burn = 10,
antithetics = FALSE,
seed = 12345,
method = "bfgs",
itermax = 2000,
metaMethod = "sfa",
sfaApproach = "huang"
)
summary(meta_sel_huang)
ef_sel_huang <- efficiencies(meta_sel_huang)
## 3d. sfaselectioncross + O'Donnell et al. (2008) stochastic metafrontier
meta_sel_odonnell <- sfametafrontier(
formula = y ~ x1 + x2,
selectionF = d ~ z1 + z2,
data = dat,
group = "group",
S = 1L,
udist = "hnormal",
groupType = "sfaselectioncross",
modelType = "greene10",
lType = "kronrod",
Nsub = 100,
uBound = Inf,
method = "bfgs",
itermax = 2000,
metaMethod = "sfa",
sfaApproach = "ordonnell"
)
summary(meta_sel_odonnell)
} # }
