Sometimes it is handy to know how to write log-likelihood functions from scratch. I have read many papers that show how a particular refinement of an econometric model conforms to economic theory and yields unbiased parameter estimates. To publish such a paper, you typically have to start from a “standard” log-likelihood function and make some type of adjustment.

For example, Englin and Shonkwiler (1995) want to correct for truncation and endogenous stratification in their data. To do so, they adjust the standard negative binomial model. They derive the density function that is most relevant for their data, which implies small changes in the corresponding log-likelihood function.

Other examples propose adjusting the log-likelihood to conform with theory (e.g. choice set considerations). The bottom line is: it is useful to know how to program these maximum likelihood routines in R so you can more readily apply different models. Many of these novel models are not programmed into any software, and the authors often do not share their code publicly (which might explain why these model developments are rarely implemented by other authors).

Which leads me to the question: how do you write a log-likelihood function in R?

It was useful to remind myself what maximum likelihood is all about. The maximum likelihood estimator is the one that maximizes the log-likelihood function (Wooldridge, 2010). The likelihood function itself is simply the product of the assumed density functions of all N observations in the sample:
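In symbols, writing β for the parameter vector and f for the assumed conditional density of observation i, the likelihood of the sample is:

```
L(\beta) = \prod_{i=1}^{N} f(y_i \mid x_i;\, \beta)
```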

So the researcher’s first step is to make an assumption about the conditional density function f of the variable y_i given the covariates x_i (Wooldridge, 2010). There are “standard” models with well-known density functions. For example, the probability of a success in the logit model is:
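Concretely, for covariates x_i and coefficient vector β, the logit success probability is the logistic function Λ evaluated at the linear index:

```
P(y_i = 1 \mid x_i) = \Lambda(x_i'\beta) = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)}
```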

The density function of observation i is:
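Writing Λ(x_i'β) for the success probability above, this is the Bernoulli density:

```
f(y_i \mid x_i;\, \beta) = \Lambda(x_i'\beta)^{\,y_i}\,\bigl[1 - \Lambda(x_i'\beta)\bigr]^{1 - y_i}, \qquad y_i \in \{0, 1\}
```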

If we replace this density function in the first equation, we get the likelihood function for the logit model:
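With Λ denoting the logit success probability as before, the product over observations becomes:

```
L(\beta) = \prod_{i=1}^{N} \Lambda(x_i'\beta)^{\,y_i}\,\bigl[1 - \Lambda(x_i'\beta)\bigr]^{1 - y_i}
```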

Maximizing the likelihood is equivalent to maximizing the log-likelihood, which is why we take the log of the equation above. Taking the log turns the product over N observations into a sum over N observations, which is also much better behaved numerically.
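For the logit model, the resulting log-likelihood is the sum:

```
\ell(\beta) = \sum_{i=1}^{N} \Bigl\{ y_i \ln \Lambda(x_i'\beta) + (1 - y_i)\,\ln\bigl[1 - \Lambda(x_i'\beta)\bigr] \Bigr\}
```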

For illustration purposes, I will use the Poisson model. I want to write the Poisson log-likelihood in R as a *function* that depends on my dataframe (called *df*) and on a vector of parameters (called *par*). So I write the log-likelihood as a function of Y (the dependent variable), X_{1} and X_{2}, as well as three parameters (par[1], par[2] and par[3]). The function returns the log-likelihood value.
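For the Poisson model with conditional mean μ_i = exp(β_1 + β_2 X_{1i} + β_3 X_{2i}), the density and log-likelihood that the function below implements are:

```
f(y_i \mid x_i;\, \beta) = \frac{e^{-\mu_i}\,\mu_i^{y_i}}{y_i!}

\ell(\beta) = \sum_{i=1}^{N} \bigl[ y_i \ln \mu_i - \mu_i - \ln(y_i!) \bigr]
            = \sum_{i=1}^{N} \bigl[ y_i(\beta_1 + \beta_2 X_{1i} + \beta_3 X_{2i}) - e^{\beta_1 + \beta_2 X_{1i} + \beta_3 X_{2i}} - \ln(y_i!) \bigr]
```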

```
poisson.lik <- function(df, par){
  # Linear index: par[1] is the intercept, par[2] and par[3] are the slopes
  xb <- par[1] + par[2]*df$X_1 + par[3]*df$X_2
  # Poisson log-likelihood: y*xb - exp(xb) - ln(y!), summed over observations
  log_likelihood <- sum(df$y*xb - exp(xb) - lfactorial(df$y))
  return(log_likelihood)
}
```

Now I need to find the three parameter values that maximize this function. To do so, I use the *optim* function, which takes a vector of starting values (three in our case), the function to be optimized, and any additional arguments that function needs (here, the dataframe).

`optim(par=c(1,1,1), poisson.lik, df=data, control=list(fnscale=-1))`

Since we want to maximize (rather than minimize, which is the default in *optim*), I add *control=list(fnscale=-1)* to the list of options.

I illustrate below how this function can estimate the parameters of interest. I first generate a sample of 100 observations that follow a Poisson distribution conditional on X_{1} and X_{2}. The true parameters are 0.1, -0.05 and 0.5. Then I write the log-likelihood function and run the *optim* function. The code below can be copied and run without any trouble. *Note: the precise output will differ from run to run because the data are generated randomly each time; calling set.seed first would make it reproducible.*

```
# Generate 100 observations from a Poisson model with known parameters
x_1 <- runif(100, min = 0, max = 50)
x_2 <- runif(100, min = 0, max = 10)
beta_1 <- 0.1
beta_2 <- -0.05
beta_3 <- 0.5
mu <- exp(beta_1 + beta_2*x_1 + beta_3*x_2)
y <- rpois(n = 100, lambda = mu)
data <- data.frame(ID = 1:100, y, x_1, x_2)

# Poisson log-likelihood as a function of the dataframe and the parameter vector
poisson.lik <- function(df, par){
  xb <- par[1] + par[2]*df$x_1 + par[3]*df$x_2
  log_likelihood <- sum(df$y*xb - exp(xb) - lfactorial(df$y))
  return(log_likelihood)
}

# Maximize (fnscale = -1 flips optim's default minimization)
optim(par = c(1, -1, 1), poisson.lik, df = data, control = list(fnscale = -1))
```

I get the following output:

```
$par
[1] 0.26442288 -0.04681021 0.46706506
$value
[1] -207.9706
$counts
function gradient
154 NA
$convergence
[1] 0
$message
NULL
```

My log-likelihood function estimated 0.26442288, -0.04681021 and 0.46706506 as the parameters of interest. These values are very close to the true parameters (that I defined myself). The output also indicates that -207.97 is the maximum value that the log-likelihood function takes.
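One thing the output above does not give is standard errors. As a sketch (my addition, not part of the original routine), asking *optim* for the Hessian and inverting its negative at the maximum yields the usual maximum likelihood variance estimates; I fix a seed here so the sketch is reproducible:

```r
set.seed(123)                       # fixed seed so this sketch is reproducible
x_1 <- runif(100, min = 0, max = 50)
x_2 <- runif(100, min = 0, max = 10)
mu  <- exp(0.1 - 0.05*x_1 + 0.5*x_2)
data <- data.frame(y = rpois(100, mu), x_1, x_2)

poisson.lik <- function(df, par){
  xb <- par[1] + par[2]*df$x_1 + par[3]*df$x_2
  sum(df$y*xb - exp(xb) - lfactorial(df$y))
}

# hessian = TRUE asks optim to return the numerically differentiated Hessian
fit <- optim(par = c(1, -1, 1), poisson.lik, df = data,
             control = list(fnscale = -1, maxit = 2000), hessian = TRUE)

# Invert the negative Hessian of the log-likelihood at the maximum to get the
# estimated variance-covariance matrix, then take square roots of the diagonal
vcov_hat <- solve(-fit$hessian)
se <- sqrt(diag(vcov_hat))
cbind(estimate = fit$par, std.error = se)
```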

When I use the glm function to estimate a Poisson model, I get the following output:

```
> summary(glm(y ~ x_1 + x_2, family="poisson", data=data))
Call:
glm(formula = y ~ x_1 + x_2, family = "poisson", data = data)
Deviance Residuals:
Min 1Q Median 3Q Max
-2.4597 -0.7073 -0.1575 0.7335 2.6595
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.124948 0.146268 0.854 0.393
x_1 -0.057629 0.002568 -22.439 <2e-16 ***
x_2 0.504665 0.016912 29.841 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 2418.230 on 99 degrees of freedom
Residual deviance: 98.362 on 97 degrees of freedom
AIC: 427.98
Number of Fisher Scoring iterations: 4
```

The estimated parameters are again very close to the truth. Note that the log-likelihood implied by the glm fit, (2·3 − 427.98)/2 ≈ −210.99 from the AIC, is lower than the −207.97 above, but the two numbers come from different random draws of the data (no seed was fixed), so they are not directly comparable. Hence, writing the log-likelihood function from scratch and using R's built-in command yield almost the same results.
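As a sketch of a cleaner comparison (my addition, using a fixed seed that the run above does not), fitting both estimators to the same draw of data shows that the coefficients and maximized log-likelihoods essentially coincide:

```r
set.seed(42)                        # same data for both estimators
x_1 <- runif(100, min = 0, max = 50)
x_2 <- runif(100, min = 0, max = 10)
mu  <- exp(0.1 - 0.05*x_1 + 0.5*x_2)
data <- data.frame(y = rpois(100, mu), x_1, x_2)

poisson.lik <- function(df, par){
  xb <- par[1] + par[2]*df$x_1 + par[3]*df$x_2
  sum(df$y*xb - exp(xb) - lfactorial(df$y))
}

# Hand-rolled maximum likelihood versus the built-in glm fit
hand    <- optim(par = c(1, -1, 1), poisson.lik, df = data,
                 control = list(fnscale = -1, maxit = 2000))
builtin <- glm(y ~ x_1 + x_2, family = "poisson", data = data)

# Coefficients and maximized log-likelihoods, side by side
rbind(optim = hand$par, glm = as.numeric(coef(builtin)))
c(optim = hand$value, glm = as.numeric(logLik(builtin)))
```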
