Apollo package (2): Mixed Logit model

What if we want to complement our analysis with accounting for preference heterogeneity? That is, what if we suspect individuals in our sample have different preferences towards the explanatory variables we want to consider?

The figure below (which I produced for my environmental valuation lectures) summarizes how preferences for a good may vary across individuals and follow a distribution. Suppose that the average person is indifferent towards a good. Individual 1 does not like the good, so (s)he would be on the left side of this spectrum (see figure). On the other hand, individual 2 really appreciates the good, so (s)he is on the right side of this spectrum.

If we were to ignore preference heterogeneity, we would capture the average person’s preference (i.e. indifference) and find a zero impact of the good on utility. We would conclude people do not care about this good. Accounting for preference heterogeneity would instead yield an average effect of zero but a statistically significant estimate of the standard deviation of this distribution. We would conclude instead that, while most people are indifferent, some people care a lot about the good, while others dislike it.
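To make this concrete, here is a quick simulation in R (purely illustrative, hypothetical numbers rather than the survey data): a taste coefficient that is centred on zero but varies across individuals shows a mean effect of (roughly) zero, yet a clearly non-zero standard deviation.

```r
set.seed(123)
# Hypothetical individual-level coefficients: indifferent on average,
# but with substantial heterogeneity around that average
beta_i <- rnorm(1000, mean = 0, sd = 1)

mean(beta_i)  # close to 0: the "average person" looks indifferent
sd(beta_i)    # close to 1: but preferences differ widely
```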

In this blog post, I estimate a mixed logit model using data from a stated preference survey on mode of transport choice in Switzerland, described in Axhausen et al. (2008). Two alternatives were presented in each choice card, differing in terms of cost, travel time (congested and uncongested), headway time and number of changes (for public transportation).

I use the Apollo package, which I already covered in this blog post. I will skip the explanation of each part of the code, since I covered much of it in my previous blog post. There are a few differences between running a mixed logit and a conditional logit model in Apollo. The first is that mixing has to be set to TRUE, and nCores has to be set to a number (which determines how many cores are used and hence the speed of the optimization).

The second alteration is the inclusion of standard deviation parameters for the ASCs, time and cost (e.g. sigma_cost). I also had to define the apollo_draws list, in which I specify the type of draws (e.g. Halton) and their number. The number and type of draws affect how reliably the optimum is found. I can also choose different types of draws depending on the distribution I want each coefficient to follow. Uniform or normal draws can be specified directly, and other distributions (such as triangular or log-normal) can be obtained by transforming the draws in the randcoeff list, where each random coefficient is defined as a function of its mean and standard deviation parameters.
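As a sketch of what such a transformation looks like (the parameter and draw names mirror those used later in this post), a coefficient that should always be negative can be given a negative log-normal distribution by exponentiating a normal draw inside randcoeff:

```r
# Fragment of the randcoeff list inside apollo_randCoeff:
# a normally distributed time coefficient ...
randcoeff[["d_time"]] = beta_time + sigma_time * draws_time
# ... versus a negative log-normal cost coefficient, guaranteed < 0
randcoeff[["d_cost"]] = -exp(beta_cost + sigma_cost * draws_cost)
```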


apollo_control = list(
  modelName       = "MNL_SP",
  modelDescr      = "MXL model",
  indivID         = "ID",
  mixing          = TRUE,
  nCores          = 4,
  outputDirectory = "output"
)

apollo_beta = c(asc_alt1       = 0,
                asc_alt2       = 0.009726,
                beta_cost      = -5,
                beta_time      = -0.121072,
                beta_hw        = 0.466756,
                beta_ch        = -1.561858,
                sigma_asc_alt1 = -0.408566,
                sigma_asc_alt2 = -0.061177,
                sigma_cost     = -1.129939,
                sigma_time     = -0.050321)

apollo_fixed = c("asc_alt1")

apollo_draws = list(
  interDrawsType = "halton",
  interNDraws    = 5000,
  interUnifDraws = c(),
  interNormDraws = c("draws_alt1", "draws_alt2", "draws_time", "draws_cost"),
  intraDrawsType = "mlhs",
  intraNDraws    = 0,
  intraUnifDraws = c(),
  intraNormDraws = c()
)

apollo_randCoeff = function(apollo_beta, apollo_inputs){
  randcoeff = list()
  randcoeff[["a_alt1"]] = asc_alt1  + sigma_asc_alt1 * draws_alt1
  randcoeff[["a_alt2"]] = asc_alt2  + sigma_asc_alt2 * draws_alt2
  randcoeff[["d_time"]] = beta_time + sigma_time     * draws_time
  # negative log-normal: the cost coefficient is constrained to be negative
  randcoeff[["d_cost"]] = -exp(beta_cost + sigma_cost * draws_cost)
  return(randcoeff)
}

If all the inputs are correctly defined, then the following command should not yield any error messages:

> apollo_inputs = apollo_validateInputs()
apollo_draws and apollo_randCoeff were found, so apollo_control$mixing was set to TRUE
Several observations per individual detected based on the value of ID. Setting panelData in apollo_control set to TRUE.
All checks on apollo_control completed.
All checks on database completed.
Generating inter-individual draws .... Done

The second part of the code is again similar to that of a conditional logit model, with a few changes. Instead of depending on fixed parameters, the utility of each mode of transport depends on the random coefficients defined above. I also have to add a command to average over the draws, which accounts for the fact that this is a mixed logit model: P = apollo_avgInterDraws(P, apollo_inputs, functionality).

apollo_probabilities = function(apollo_beta, apollo_inputs, functionality = "estimate"){
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  P = list()
  V = list()
  V[["alt1"]] = a_alt1 + tc1 * d_cost + tt1 * d_time + hw1 * beta_hw * d_time + ch1 * beta_ch
  V[["alt2"]] = a_alt2 + tc2 * d_cost + tt2 * d_time + hw2 * beta_hw * d_time + ch2 * beta_ch

  mnl_settings = list(
    alternatives = c(alt1 = 1, alt2 = 2),
    choiceVar    = choice,
    utilities    = V
  )
  P[["model"]] = apollo_mnl(mnl_settings, functionality)
  P = apollo_panelProd(P, apollo_inputs, functionality)
  P = apollo_avgInterDraws(P, apollo_inputs, functionality)
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}

model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)

This code takes a while to run (around 15 minutes) because we are using draws in a mixed logit model. Finally, I can see the estimated parameters:

> apollo_modelOutput(model)
Model run by anafl using Apollo 0.2.7 on R 4.1.1 for Windows.

Model name                       : MNL_SP
Model description                : MXL model
Model run at                     : 2022-04-05 12:32:09
Estimation method                : bfgs
Model diagnosis                  : successful convergence 
Number of individuals            : 310
Number of rows in database       : 2790
Number of modelled outcomes      : 2790

Number of cores used             :  4 
Number of inter-individual draws : 5000 (halton)

LL(start)                        : -1388.58
LL(0)                            : -1933.88
LL(C)                            : -1933.86
LL(final)                        : -1209.05
Rho-square (0)                   :  0.3748 
Adj.Rho-square (0)               :  0.3702 
Rho-square (C)                   :  0.3748 
Adj.Rho-square (C)               :  0.3701 
AIC                              :  2436.1 
BIC                              :  2489.51 

Estimated parameters             :  9
Time taken (hh:mm:ss)            :  00:15:03.29 
     pre-estimation              :  00:03:21.43 
     estimation                  :  00:05:59.59 
     post-estimation             :  00:05:42.27 
Iterations                       :  34  
Min abs eigenvalue of Hessian    :  3.866723 

Unconstrained optimisation.

                  Estimate        s.e.   t.rat.(0)    Rob.s.e. Rob.t.rat.(0)
asc_alt1          0.000000          NA          NA          NA            NA
asc_alt2          0.009758    0.064605      0.1510    0.063797        0.1530
beta_cost        -1.418772    0.170848     -8.3043    0.195676       -7.2506
beta_time        -0.121074    0.009798    -12.3572    0.012796       -9.4622
beta_hw           0.466740    0.035100     13.2974    0.050871        9.1750
beta_ch          -1.561821    0.077975    -20.0297    0.106579      -14.6542
sigma_asc_alt1   -0.408136    0.151940     -2.6862    0.188017       -2.1707
sigma_asc_alt2   -0.062260    0.503149     -0.1237    0.050507       -1.2327
sigma_cost       -1.129896    0.119504     -9.4548    0.120414       -9.3834
sigma_time       -0.050320    0.007452     -6.7527    0.008173       -6.1570

The estimates conform to my expectations. The cost and travel time coefficients are estimated as distributions. Travel time was specified as normally distributed, with an estimated mean of -0.12 and an estimated standard deviation of 0.05. This is what the distribution of the travel time coefficient looks like:

> x <- seq(-0.3, 0.05, length=100)
> y <- dnorm(x, -0.121074,0.050320)
> plot(x,y, type = "l", lwd = 2, xlab = "", ylab = "")

It seems that the vast majority of respondents perceive travel time as something that decreases their utility (some more than others), but there is also a small fraction of people who may perceive travel time as something good (or at least something they are indifferent to).
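As a back-of-the-envelope check using the point estimates above, the implied share of the population with a positive travel-time coefficient can be computed directly from the estimated normal distribution:

```r
# P(time coefficient > 0) under N(mean = -0.121074, sd = 0.050320)
pnorm(0, mean = -0.121074, sd = 0.050320, lower.tail = FALSE)
# roughly 0.008, i.e. under 1% of the population
```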

The Apollo package contains other features and ways to refine mixed logit models to account for various aspects of decision-making. I highly recommend playing around with the package to take advantage of its many features.


Axhausen, K. W., Hess, S., König, A., Abay, G., Bates, J. J., & Bierlaire, M. (2008). Income and distance elasticities of values of travel time savings: New Swiss results. Transport Policy, 15(3), 173-185.

