UC Irvine
ML Repository

Letter Recognition

Download (378.1 KB)

About

Database of character image features; try to identify the letter.

The objective is to identify each of a large number of black-and-white rectangular pixel displays as one of the 26 capital letters in the English alphabet. The character images were based on 20 different fonts, and each letter within these 20 fonts was randomly distorted to produce a file of 20,000 unique stimuli. Each stimulus was converted into 16 primitive numerical attributes (statistical moments and edge counts), which were then scaled to fit into a range of integer values from 0 through 15. We typically train on the first 16,000 items and then use the resulting model to predict the letter category for the remaining 4,000. See the article cited above for more details.
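The train/test convention described above (first 16,000 items for training, remaining 4,000 for evaluation) can be sketched as follows. This is a minimal illustration, not the repository's own tooling. It assumes each record is a comma-separated line with the capital letter first, followed by the 16 integer attributes; that layout is not stated on this page. The helper names `parse_rows`, `split_first_n`, and `make_synthetic` are hypothetical, and the demo uses randomly generated stand-in records rather than the real data.

```python
import csv
import io
import random
import string

N_FEATURES = 16      # attributes per stimulus, per the description
TRAIN_SIZE = 16_000  # conventional size of the training portion


def parse_rows(text):
    """Parse 'LETTER,f1,...,f16' lines (assumed layout) into (label, features) pairs."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue  # skip blank lines
        label = record[0]
        features = [int(v) for v in record[1:]]
        if len(label) != 1 or label not in string.ascii_uppercase:
            raise ValueError(f"unexpected label: {label!r}")
        if len(features) != N_FEATURES:
            raise ValueError(f"expected {N_FEATURES} features, got {len(features)}")
        if any(not 0 <= v <= 15 for v in features):
            raise ValueError("feature outside the 0 through 15 range")
        rows.append((label, features))
    return rows


def split_first_n(rows, n_train=TRAIN_SIZE):
    """Return (train, test): the first n_train rows and everything after them."""
    return rows[:n_train], rows[n_train:]


def make_synthetic(n, seed=0):
    """Generate n random stand-in records; NOT the real dataset."""
    rng = random.Random(seed)
    lines = []
    for _ in range(n):
        letter = rng.choice(string.ascii_uppercase)
        feats = [str(rng.randint(0, 15)) for _ in range(N_FEATURES)]
        lines.append(",".join([letter] + feats))
    return "\n".join(lines)


rows = parse_rows(make_synthetic(20_000))
train, test = split_first_n(rows)
print(len(train), len(test))  # 16000 4000
```

The split is positional rather than shuffled, so results stay comparable with other work that follows the same first-16,000 convention. To use the real file, replace the synthetic text with the contents of the downloaded data file, after checking that it matches the assumed layout.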
Subject Area: Computer Science
Instances: 20,000
Features: 16
Data Types: Multivariate
Tasks: Classification
Feature Types: Integer

Features

Name | Role | Type | Units | Missing Values | Description

Introductory Paper

Additional Metadata

Authors: David Slate
Year Created: 1991
License: CC BY 4.0