Browse Datasets

DeFungi

DeFungi is a dataset for direct mycological examination of microscopic fungi images. The images are from superficial fungal infections caused by yeasts, moulds, or dermatophyte fungi. They have been manually labelled into five classes and curated with the assistance of a subject matter expert. The images were cropped with automated algorithms to produce the final dataset.

NASA Flood Extent Detection

This dataset contains synthetic aperture radar (SAR) raster imagery for various flood events acquired from the European Space Agency's Sentinel-1A and Sentinel-1B missions, providing C-band dual-polarized imagery that spans geographical areas of interest in the United States and Bangladesh. The main emphasis was on labeling open water areas, where specular reflection of the radar signal off the relatively still, flat water surface results in reduced backscatter, low amplitude, and an overall darkened appearance within the image. The labels for the water surface reflectance are also provided in GeoTIFF raster format, in scenes aligned with the SAR source imagery.
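As a rough illustration of why open water stands out in SAR imagery, the sketch below marks low-backscatter pixels in a toy tile as water. The pixel values, the `WATER_THRESHOLD_DB` cutoff, and the `water_mask` helper are all assumptions made up for this example; they are not taken from the dataset, whose labels were not necessarily produced by simple thresholding.

```python
# Toy 4x4 tile of backscatter values in dB (made-up numbers, not from the dataset).
tile_db = [
    [-8.0, -9.5, -21.0, -23.5],
    [-7.5, -19.0, -22.0, -24.0],
    [-10.0, -11.0, -20.5, -9.0],
    [-6.5, -8.5, -9.5, -7.0],
]

# Assumed cutoff for illustration only; a real threshold depends on the scene.
WATER_THRESHOLD_DB = -18.0


def water_mask(tile, threshold_db):
    """Return a 0/1 mask with 1 where backscatter falls below the threshold."""
    return [[1 if value < threshold_db else 0 for value in row] for row in tile]


mask = water_mask(tile_db, WATER_THRESHOLD_DB)

# Fraction of pixels flagged as (dark) open water in this toy tile.
water_fraction = sum(map(sum, mask)) / (len(mask) * len(mask[0]))
```

A fixed global threshold is the simplest possible choice and tends to confuse water with other smooth, dark surfaces, which is one reason labelled rasters such as the ones in this dataset are useful.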

Land Mines

Detecting mines buried in the ground is very important for the safety of life and property. Many different methods have been used for this purpose; however, it has not yet been possible to achieve 100% success. The mine detection process consists of sensor design, data analysis, and decision algorithm phases. The magnetic anomaly method works by measuring the anomalies that an object produces when it disturbs the surrounding magnetic field; the data obtained are then used to determine properties such as motion and position. Parameters such as position, depth, or direction of motion have been determined using magnetic anomalies since 1970.

Multivariate Gait Data

Bilateral (left, right) joint angle (ankle, knee, hip) time series data collected from 10 healthy subjects under 3 walking conditions (unbraced, knee braced, ankle braced). For each condition, each subject’s data consists of 10 consecutive gait cycles.
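To make the structure of this description concrete, the sketch below enumerates one index entry per joint-angle curve implied by the factors listed above. The tuple ordering and the label strings are hypothetical; the actual file layout and column names may differ.

```python
from itertools import product

# Factors named in the dataset description; the label strings are assumptions.
SUBJECTS = range(1, 11)   # 10 healthy subjects
CONDITIONS = ("unbraced", "knee braced", "ankle braced")
CYCLES = range(1, 11)     # 10 consecutive gait cycles per condition
SIDES = ("left", "right")
JOINTS = ("ankle", "knee", "hip")

# One entry per joint-angle curve: (subject, condition, cycle, side, joint).
curve_index = list(product(SUBJECTS, CONDITIONS, CYCLES, SIDES, JOINTS))

# Curves recorded for a single subject across all conditions.
curves_per_subject = len(curve_index) // len(SUBJECTS)
```

Under these assumptions the index has 10 × 3 × 10 × 2 × 3 = 1800 entries, that is, 180 curves per subject.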

Glioma Grading Clinical and Mutation Features Dataset

Gliomas are the most common primary tumors of the brain. They can be graded as LGG (Lower-Grade Glioma) or GBM (Glioblastoma Multiforme) depending on histological and imaging criteria. Clinical and molecular/mutation factors are also crucial for the grading process, but the molecular tests that help diagnose glioma patients accurately are expensive. In this dataset, the 20 most frequently mutated genes and 3 clinical features are considered from the TCGA-LGG and TCGA-GBM brain glioma projects. The prediction task is to determine whether a patient is LGG or GBM given the clinical and molecular/mutation features. The main objective is to find the optimal subset of mutation genes and clinical features for the glioma grading process, to improve performance and reduce costs.

accelerometer_gyro_mobile_phone_dataset

Data collected in 2022 at King Saud University in Riyadh for recognizing human activities using mobile phone IMU sensors (accelerometer and gyroscope). Each activity is classified as standing (stop) or walking.

Dataset based on UWB for Clinical Establishments

The authors present a dataset acquired from an intelligent surveillance system based on ultra-wideband (UWB) technology. The system is proposed to track the movement of patients in and out of hospitals and other clinical establishments. The raw data is collected from UWB anchors and wearable tags placed in the clinical area. The proposed system avoids the time-consuming manual task of following up on records of patient arrivals and departures. The data described in the manuscript comes from a system implemented in an area of 12.5 m × 16.5 m inside a hospital premises.

Bosch CNC Machining Dataset

Manufacturing processes have undergone tremendous technological progress in recent decades. To meet the agile philosophy in industry, data-driven algorithms need to handle growing complexity, particularly in Computer Numerical Control (CNC) machining. To enhance the scalability of machine learning in real-world applications, this paper presents a benchmark dataset for process monitoring of brownfield milling machines based on acceleration data. The data was collected from a real-world production plant using a smart data collection system over a two-year period. The paper presents the edge-to-cloud setup, followed by an extensive description of the different normal and abnormal processes. An analysis of the dataset highlights the challenges that environmental and industrial factors pose for machine learning in industry. The new dataset is published with this paper and available at: https://github.com/boschresearch/CNC_Machining.

Similarity Prediction

Molecular similarity assessments by expert chemists. Useful for the prediction of molecular similarity evaluations by humans.

Sirtuin6 Small Molecules

The dataset includes 100 molecules with the 6 most relevant descriptors for identifying candidate inhibitors of a target protein, Sirtuin6. The molecules are grouped into low- and high-BFE classes.
