ISSN: 2182-2069 (printed) / ISSN: 2182-2077 (online)
An Approach towards Forecasting Time Series Air Pollution Data Using LSTM-based Auto-Encoders
Artificial Intelligence-based algorithms are used extensively to predict the concentration of different pollutants under various conditions. Recently, Long Short-Term Memory (LSTM) networks and their variants have become popular for predicting the Air Quality Index (AQI) in polluted cities. The accuracy of the prediction depends on how the input data are pre-processed. Here we present a study that combines Random Forest (RF) based regression for data pre-processing with a multivariate, multistep, multiwindow LSTM auto-encoder (encoder-decoder) model to predict pollutant concentrations in the urban area of Bengaluru. In this approach, the RF algorithm is used to impute the missing values of the input vector. We applied this technique to data collected by the Karnataka State Pollution Control Board (KSPCB) within the city limits of Bengaluru, covering four years from 2019 to 2022 and focusing on the six regions where pollution is highest. We found that our multistep multivariate LSTM-based auto-encoder performs better than a conventional LSTM based on a single-step pipeline model in terms of Recall and F-Score values.
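As an illustration of the multistep, multivariate framing mentioned above, the following is a minimal sketch of how a time series with several pollutant features might be cut into input windows and multistep targets before being fed to an encoder-decoder model. It is not the authors' implementation: the function name `make_windows`, the parameters `n_in`, `n_out` and `target_col`, and the toy values are all assumptions made for this example, and the RF imputation step is assumed to have already filled any missing values.

```python
from typing import List, Sequence, Tuple

Row = Sequence[float]  # one time step: values of all features at that step


def make_windows(
    series: Sequence[Row],
    n_in: int,
    n_out: int,
    target_col: int = 0,
) -> Tuple[List[List[List[float]]], List[List[float]]]:
    """Split a multivariate series into (input window, multistep target) pairs.

    Hypothetical helper: each input window holds n_in consecutive rows with
    all features; each target holds the next n_out values of one column.
    """
    if n_in <= 0 or n_out <= 0:
        raise ValueError("n_in and n_out must be positive")
    inputs: List[List[List[float]]] = []
    targets: List[List[float]] = []
    # Last start index that still leaves room for n_in inputs and n_out targets.
    last_start = len(series) - n_in - n_out
    for start in range(last_start + 1):
        past = series[start:start + n_in]
        future = series[start + n_in:start + n_in + n_out]
        inputs.append([list(row) for row in past])
        targets.append([row[target_col] for row in future])
    return inputs, targets


# Toy data (made up, not KSPCB measurements): 6 time steps, 2 features each.
toy = [[10.0, 1.0], [20.0, 2.0], [30.0, 3.0],
       [40.0, 4.0], [50.0, 5.0], [60.0, 6.0]]
X, y = make_windows(toy, n_in=3, n_out=2)
print(len(X), y)
```

Each element of `X` has shape (`n_in`, number of features) and each element of `y` has length `n_out`, which is the layout an encoder-decoder LSTM typically expects for multistep forecasting. A single-step model corresponds to `n_out=1`.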