UC Irvine
ML Repository

Iris

About

A small classic dataset from Fisher (1936), and one of the earliest datasets used in the literature on classification methods. It remains widely used in statistics and machine learning. The data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other two; the latter are not linearly separable from each other. Predicted attribute: class of iris plant. This is an exceedingly simple domain.

This data differs from the data presented in Fisher's article (identified by Steve Chadwick, spchadwick@espeedaz.net). The 35th sample should be 4.9,3.1,1.5,0.2,"Iris-setosa", where the error is in the fourth feature. The 38th sample should be 4.9,3.6,1.4,0.1,"Iris-setosa", where the errors are in the second and third features.
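The two corrections described above can be applied programmatically once the data is loaded. The following is a minimal sketch, not an official loader. It assumes the file is comma-separated with four numeric features followed by the class label, as the quoted sample lines suggest. The names `parse_rows`, `apply_corrections`, and `class_counts` are hypothetical helpers introduced for illustration; the corrected values are taken directly from the description.

```python
from collections import Counter

# Corrected rows as stated in the dataset description.
# Keys are 1-based sample numbers, matching "35th" and "38th" in the text.
CORRECTIONS = {
    35: (4.9, 3.1, 1.5, 0.2, "Iris-setosa"),
    38: (4.9, 3.6, 1.4, 0.1, "Iris-setosa"),
}


def parse_rows(text):
    """Parse comma-separated lines into (f1, f2, f3, f4, label) tuples.

    Assumption: four numeric features followed by the class label,
    with the label optionally wrapped in double quotes.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. a trailing newline
        *features, label = line.split(",")
        rows.append((*map(float, features), label.strip('"')))
    return rows


def apply_corrections(rows):
    """Return a copy of rows with samples 35 and 38 replaced."""
    if len(rows) < max(CORRECTIONS):
        raise ValueError("expected at least %d rows" % max(CORRECTIONS))
    fixed = list(rows)  # leave the caller's list untouched
    for sample_number, corrected in CORRECTIONS.items():
        fixed[sample_number - 1] = corrected  # convert to 0-based index
    return fixed


def class_counts(rows):
    """Count instances per class label (the last element of each row)."""
    return Counter(row[-1] for row in rows)
```

After loading, `class_counts(apply_corrections(rows))` should report 50 instances for each of the 3 classes if the file matches the description; a different result suggests the file layout differs from what this sketch assumes.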
Subject Area: Biology
Instances: 150
Features: 4
Data Types: Tabular
Tasks: Classification
Feature Types: Continuous

Features

Name | Role | Type | Units | Missing Values | Description

Introductory Paper

The Iris data set: In search of the source of virginica
A. Unwin, K. Kleinman. Significance, 2021.

Additional Metadata

Authors: R. A. Fisher
Year Created: 1936
License: CC BY 4.0