From: A systematic review of data mining and machine learning for air pollution epidemiology
Author | Year | Sub-field | Environmental agent | Data mining techniques | Objective |
---|---|---|---|---|---|
Kolehmainen et al. [60] | 2001 | Outdoor air pollution | NO2 | ANN | Comparing two Neural Nets for their suitability in forecasting Air Quality |
Kukkonen et al. [33] | 2003 | Outdoor air pollution | PM10, NO2 | ANN | Machine Learning Model comparison for forecasting NO2 and PM10 concentrations |
Niska et al. [59] | 2004 | Outdoor air pollution | NO2 | Genetic Algorithms, ANN | Investigate the use of GA to find a better ANN model to forecast air quality |
Ghanem et al. [69] | 2004 | Outdoor air pollution | SO2, C6H6, NO, NO2, O3 | Hierarchical clustering | Monitor chemicals and outline challenges related to collection and processing. |
Corani [68] | 2005 | Outdoor air pollution | Ozone, PM10 | ANN, Lazy Learning | Predict levels of air pollutants from meteorological and other local variables. |
Dominici et al. [67] | 2006 | Outdoor air pollution | PM2.5 | Bayesian Hierarchical Models | Assess the association of air pollution levels with the number of deaths per day |
Ma et al. [58] | 2008 | Outdoor air pollution | SO2, O3, NOx, C6H6 | k-means | Developing a distributed air pollution monitoring system & use data mining to find patterns of pollutant distribution |
Pegoretti et al. [62] | 2009 | Indoor air pollution | Rn | Geostatistical Models, KNN | Forecasting the indoor Radon concentrations |
Aquilina et al. [39] | 2010 | Outdoor air pollution | particle-associated PAH | DT, ANN | Predict personal exposure to particle-associated polycyclic aromatic hydrocarbons (PAH) |
Padula et al. [57] | 2012 | Outdoor air pollution | Traffic-related pollution | Targeted maximum likelihood estimation | Estimate the probability of low birth weight among full-term infants based on the mother’s exposure to traffic-related air pollution |
Zhu et al. [35] | 2012 | Urban outdoor air pollution | SO2, NO2, PM10, Respiratory diseases | ARM, GMDH | Forecasting the number of respiratory patients based on the seasonal effects of air pollution |
Singh et al. [24] | 2013 | Outdoor air pollution | AQI | PCA, Ensemble DT, SVM | Predicting the Air Quality and identifying major sources of air pollution |
Beckerman et al. [66] | 2013 | Outdoor air pollution | NO2, PM2.5 | GLM | Develop a better land use regression model using machine learning methods |
Pandy et al. [38] | 2013 | Outdoor air pollution | UFP, PM | DT, RF, etc. | Test machine learning classifiers for predicting air quality and assess the impact of weather and traffic related variables on UFP and PM. |
Philibert et al. [56] | 2013 | Setting | N2O | RF | Predict N2O emissions using variables related to chemical fertilizer treatments applied to agricultural plots. |
Chen et al. [54] | 2014 | Outdoor air pollution | Smog | ANN, Social Network Analysis | Predicting regions with smog-related health hazards |
Dias et al. [55] | 2014 | Outdoor air pollution | PM2.5 | Density-based Clustering | Quantification of human exposure to traffic related air pollution |
Lary et al. [4] | 2014 | Outdoor air pollution | PM2.5 | Ensemble Algorithms (RF, SVM, ANN) | Estimating the daily distributions of PM2.5 |
Jiang et al. [26] | 2015 | Outdoor air quality | AQI | Correlation Analysis | Monitoring the dynamics of air quality in large cities based on social media |
Wang et al. [27] | 2015 | Outdoor air pollution | Generic | Topic Models (LDA), NLP | Evaluating the use of social media data to estimate air pollution and public response |
Reid et al. [34] | 2015 | Outdoor air pollution | PM2.5 | Generalized boosting model, GAM, RF, SVM, KNN Regression, etc. | Predicting PM2.5 during wildfire |
Lary et al. [52] | 2015 | Outdoor air pollution | PM2.5 | Ensemble regression models | Estimating PM2.5 distribution and relationship of such air pollutants with mental health |
Lewis et al. [49] | 2016 | Outdoor air quality | NOx, O3, SO2, CO, VOCs, PM | Boosted regression DT, Gaussian process emulation | Improve the accuracy of common low cost air pollution sensors |
Hu et al. [65] | 2016 | In/Outdoor air pollution | Generic | RF | Understanding exposure to air pollution by predicting time-activity tracking of individuals |
Challoner et al. [61] | 2015 | Indoor air pollution | PM, NO2 | ANN | Predicting the indoor air quality from outdoor monitors |
Mirto et al. [48] | 2016 | Outdoor air pollution and climate | Generic | Spatial data mining, hot spot analysis | Finding correlations between diseases and air pollution due to climatic factors |
Xu et al. [30] | 2017 | Outdoor air pollution | PM, CO, O3, SO2, NO2 | SVM, Fuzzy Evaluation, Empirical Mode Decomposition | Air quality forecasting and evaluation |
Min et al. [43] | 2017 | Outdoor air pollution | PM2.5 | K-Means | Apply K-Means to identify potential new monitoring sites by considering a larger set of 313 variables in their models. Traffic and urbanicity are found to be useful to guide site selection |
Keller et al. [44] | 2017 | Outdoor air pollution | PM2.5 | Modified K-Means | A clustering method to assess exposure to air pollution in health-related studies. They consider the multivariate nature of the exposure and the spatial misalignment likely to occur between data from central monitoring stations and the actual location of the cases |
Liu et al. [47] | 2017 | Outdoor air pollution | PM, SO2, CO, NO2, O3 | SVM Regression | Apply support vector regression for air pollution forecasting using six criteria pollutants, five meteorological conditions and the Air Quality Index |