Last edited on 1st Oct., 2005 by gs/ gs. - Home page:

Data set:

Short description

Length and width of sepal and petal for three North American species of iris. The data on Iris setosa canadensis and Iris versicolor were used by R. A. Fisher to illustrate linear discriminant analysis. The data on Iris virginica have been added as an extension.

Photos: Iris setosa, Iris versicolor, Iris virginica.

Photos with kind permission of The Species Iris Group of North America.

Length and width of sepal and petal are just convenient coordinates suggested by what can easily be measured.

Using sepal length*sepal width and petal length*petal width as crude approximations to the areas covered by sepal and petal gives a pair of coordinates which may have more botanical meaning. In these coordinates, all three species in this data set can be separated with just three misclassifications. Iris setosa and Iris versicolor are separated perfectly.
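As a minimal sketch of the area coordinates described above, the following Python fragment computes the two products for a few flowers. The names `Flower`, `areas` and `samples` are hypothetical, and the three measurement rows (in cm) are illustrative values only; the actual data file and its column layout are not assumed here. No classification rule is included, since none is given on this page.

```python
from typing import NamedTuple


class Flower(NamedTuple):
    """One flower's measurements in cm (hypothetical container for illustration)."""
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float


def areas(f: Flower) -> tuple[float, float]:
    """Return (sepal area, petal area), each approximated as length * width."""
    return f.sepal_length * f.sepal_width, f.petal_length * f.petal_width


# Illustrative measurements, one per species (assumed values, not read from the data file).
samples = {
    "Iris setosa": Flower(5.1, 3.5, 1.4, 0.2),
    "Iris versicolor": Flower(7.0, 3.2, 4.7, 1.4),
    "Iris virginica": Flower(6.3, 3.3, 6.0, 2.5),
}

for species, flower in samples.items():
    sepal_area, petal_area = areas(flower)
    print(f"{species}: sepal area {sepal_area:.2f}, petal area {petal_area:.2f}")
```

Plotting petal area against sepal area for the full data set would give the two-coordinate view discussed here. The rectangle approximation overstates the true area of each organ, but it only needs to preserve the relative ordering between species to be useful.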

Figure: petal area vs. sepal area

If you want to publish a classification procedure using the iris data set as an illustration, make sure it performs better than three misclassifications.


Fisher, R.A. (1936). The use of multiple measurements in taxonomic problems. Ann. Eugenics 7, Pt. II, 179-188.

See also: Andrews & Herzberg: Data


Tab-separated ASCII text



G.Sawitzki, StatLab Heidelberg

The data analysis was done using R.