UC Irvine
ML Repository

Internet Advertisements

About

This dataset represents a set of possible advertisements on Internet pages. The features encode the geometry of the image (if available) as well as phrases occurring in the URL, the image's URL and alt text, the anchor text, and words occurring near the anchor text. The task is to predict whether an image is an advertisement ("ad") or not ("nonad").
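As a rough illustration of the task setup, the sketch below parses rows of numeric features followed by a class label into a feature matrix and a 0/1 target. The file layout is an assumption, not something this page states: comma-separated values, `?` for a missing value (such as unavailable image geometry), and a trailing label written as `ad.` or `nonad.`. The function name `parse_ad_rows` and the sample rows are hypothetical and made up for the example.

```python
import csv
import io


def parse_ad_rows(text, missing_token="?"):
    """Parse comma-separated rows into (features, labels).

    Assumes each row holds numeric features followed by a final label
    column. Missing values become None; the label becomes 1 for "ad"
    and 0 for anything else.
    """
    features, labels = [], []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        cells = [cell.strip() for cell in row]
        *values, label = cells
        features.append(
            [None if v == missing_token else float(v) for v in values]
        )
        # Tolerate a trailing period on the label (assumed format).
        labels.append(1 if label.rstrip(".") == "ad" else 0)
    return features, labels


# Made-up sample: height, width, aspect ratio, one binary phrase flag, label.
SAMPLE = """125, 125, 1.0, 1, ad.
   ?,   ?,   ?, 0, nonad.
"""

X, y = parse_ad_rows(SAMPLE)
print(X)
print(y)
```

Keeping missing geometry as `None` rather than substituting a number leaves the imputation choice to whatever classifier or preprocessing step comes next.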
Subject Area: Computer Science
Instances: 3,279
Features: 1,558
Data Types: Multivariate
Tasks: Classification
Feature Types: Categorical, Integer, Continuous

Features

Name | Role | Type | Units | Missing Values

Introductory Paper

–

Additional Metadata

Keywords: –
Authors: Nicholas Kushmerick
Year Created: 1999
License: CC BY 4.0