iaa.pta {UCS}                                                R Documentation

Inter-Annotator Agreement: Estimates for the Proportion of True Agreement (iaa)


Description

Compute confidence interval estimates for the proportion of true agreement between two annotators on a binary variable, as described by Krenn, Evert & Zinsmeister (2004). iaa.pta.conservative computes a conservative estimate that is rarely useful, while iaa.pta.homogeneous relies on additional assumptions. The data can either be given in the form of a 2-by-2 contingency table or as two parallel annotation vectors.


Usage

iaa.pta.conservative(x, y=NULL, conf.level=0.95, debug=FALSE)

iaa.pta.homogeneous(x, y=NULL, conf.level=0.95, debug=FALSE)


Arguments

x            either a 2-by-2 contingency table in matrix form, or a vector of logicals
y            a vector of logicals; ignored if x is a matrix
conf.level   confidence level of the returned confidence interval (default: 0.95, corresponding to 95% confidence)
debug        if TRUE, show which divisions of the data are considered when computing the confidence interval (see Krenn, Evert & Zinsmeister, 2004)
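
The two input forms carry the same information. As a rough sketch using base R only, two parallel logical vectors can be cross-tabulated into a 2-by-2 table. The data, the variable names (ann1, ann2, ct.matrix) and the cell ordering (TRUE before FALSE) are illustrative assumptions; this page does not say which cell layout the functions expect for a matrix argument.

```r
# Hypothetical annotations: one logical entry per candidate and annotator
ann1 <- c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE)
ann2 <- c(TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE, FALSE)

# Cross-tabulate with fixed factor levels so that all four cells exist
# even if some combination never occurs in the data
lev <- c(TRUE, FALSE)
ct <- table(factor(ann1, levels = lev), factor(ann2, levels = lev))

# Convert the table to a plain 2-by-2 matrix (rows: ann1, columns: ann2)
ct.matrix <- matrix(as.vector(ct), nrow = 2,
                    dimnames = list(ann1 = c("TRUE", "FALSE"),
                                    ann2 = c("TRUE", "FALSE")))
print(ct.matrix)

# Surface agreement: share of candidates on the diagonal
surface.agreement <- sum(diag(ct.matrix)) / sum(ct.matrix)
print(surface.agreement)
```

Going by the argument descriptions above, either the pair of vectors would be passed as x and y, or the matrix alone as x.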


Details

This approach to measuring intercoder agreement is based on the assumption that the observed surface agreement in the data can be divided into true agreement (i.e. candidates where both annotators make the same choice for the same reasons) and chance agreement (i.e. candidates on which the annotators agree purely by coincidence). The goal is to estimate the proportion of candidates for which there is true agreement between the annotators, referred to as PTA.
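
To make the decomposition concrete, here is a small worked calculation in base R. It rests on a simplifying assumption that is not part of the definition above: on candidates without true agreement, the two annotators decide independently and each marks a candidate as positive with the same probability p. The values of pta and p are made up for illustration.

```r
pta <- 0.7   # assumed proportion of true agreement
p   <- 0.1   # assumed probability of a positive annotation

# Probability that two independent annotators happen to coincide
# (both positive or both negative)
chance.rate <- p^2 + (1 - p)^2

# Share of all candidates that agree only by chance
chance.agreement <- (1 - pta) * chance.rate

# Observed surface agreement = true agreement + chance agreement
surface.agreement <- pta + chance.agreement

cat("chance agreement  =", chance.agreement, "\n")
cat("surface agreement =", surface.agreement, "\n")
```

Under these assumptions the surface agreement (about 0.95) is far higher than the PTA of 0.7, which is why the raw agreement rate alone says little about true agreement.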

The two functions differ in how they compute this estimate. iaa.pta.conservative considers all possible divisions of the observed data into true and chance agreement, leading to a conservative confidence interval. This interval is almost always too wide to be of any practical value.

iaa.pta.homogeneous makes the additional assumption that the average proportion of true positives is the same for the part of the data where the annotators reach true agreement and for the part where they agree only by chance. Note that there is no a priori reason why this should be the case. Interestingly, the confidence intervals obtained in this way for the PTA correspond closely to those for Cohen's kappa statistic (iaa.kappa).
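
One way to see why such a correspondence is plausible: if both annotators have the same positive rate p on all parts of the data, the expected chance agreement is pe = p^2 + (1-p)^2 and the observed agreement is po = PTA + (1 - PTA) * pe, so the kappa formula (po - pe) / (1 - pe) reduces to PTA. The sketch below checks this on simulated data with a hand-computed point estimate of Cohen's kappa. It uses base R only and is not the implementation of iaa.kappa; the sample sizes, seed and variable names are arbitrary choices for illustration.

```r
set.seed(42)          # arbitrary seed, for reproducibility only
n.true   <- 7000      # candidates with true agreement (PTA = 0.7)
n.chance <- 3000      # candidates annotated independently
p        <- 0.1       # same positive rate in both parts (homogeneity)

z <- runif(n.true) < p              # shared decisions (true agreement)
x <- c(z, runif(n.chance) < p)      # annotator 1
y <- c(z, runif(n.chance) < p)      # annotator 2

# Cohen's kappa point estimate from observed and expected agreement
po <- mean(x == y)
pe <- mean(x) * mean(y) + (1 - mean(x)) * (1 - mean(y))
kappa.hat <- (po - pe) / (1 - pe)

cat("simulated PTA =", n.true / (n.true + n.chance), "\n")
cat("kappa         =", kappa.hat, "\n")
```

With samples of this size the kappa estimate should land close to the simulated PTA, though it varies from run to run if the seed is changed.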


Value

A numeric vector giving the lower and upper bound of a confidence interval for the proportion of true agreement (both in the range [0,1]).


Note

iaa.pta.conservative is a computationally expensive operation based on Fisher's exact test. (It doesn't use fisher.test, though. If it did, it would be even slower than it is now.) In most circumstances, you will want to use iaa.pta.homogeneous instead.


References

Krenn, Brigitte; Evert, Stefan; Zinsmeister, Heike (2004). Determining intercoder agreement for a collocation identification task. In preparation.

See Also

iaa.kappa


Examples

## how well do the confidence intervals match the true PTA?
true.agreement <- 700             # 700 cases of true agreement
chance <- 300                     # 300 cases where annotations are independent
p <- 0.1                          # average proportion of true positives
z <- runif(true.agreement) < p    # candidates with true agreement
x.r <- runif(chance) < p          # randomly annotated candidates
y.r <- runif(chance) < p
x <- c(z, x.r)
y <- c(z, y.r)
cat("True PTA =", true.agreement / (true.agreement + chance), "\n")
iaa.pta.conservative(x, y)        # conservative estimate
iaa.pta.homogeneous(x, y)         # estimate with homogeneity assumption

[Package UCS version 0.5 Index]