GSA Connects 2023 Meeting in Pittsburgh, Pennsylvania

Paper No. 65-8
Presentation Time: 3:45 PM

PREDICTION MODELING OF ARSENIC CONTAMINATION IN GROUNDWATER USING MACHINE LEARNING AND GEOSPATIAL TECHNIQUES IN THE MOST AFFECTED DISTRICTS IN BANGLADESH


HOSSAIN, Jakir (1), MOURIN, Mahbuba Maliha (2), KAMAL, A.S.M. Maksud (3), RAHMAN, M. Zillur (3), CHOWDHURY, Mahabub Arefin (4), KHAN, Mahfuzur R. (5), HOSSAIN, Shakhawat (6), AHMED, Kazi Matin (7) and ZAHID, Anwar (8), (1)Groundwater Hydrology Division-2, Bangladesh Water Development Board, 72 Green Road, Dhaka 1205, Bangladesh, (2)Department of Computer Science and Engineering, University of Dhaka, Dhaka 1000, Bangladesh, (3)Department of Disaster Science and Climate Resilience, University of Dhaka, Dhaka 1000, Bangladesh, (4)Department of Civil and Environmental Engineering, Virginia Tech, Blacksburg, VA 24061, (5)Department of Geology, University of Dhaka, Dhaka, Bangladesh, (6)Department of Disaster Science and Climate Resilience, University of Dhaka, Dhaka 1000, Bangladesh, (7)Department of Geology, University of Dhaka, Dhaka 1000, Bangladesh, (8)Directorate of Groundwater Hydrology, Bangladesh Water Development Board (BWDB), Dhaka 1000, Bangladesh

Elevated arsenic (As) concentration in groundwater is a major challenge to supplying safe drinking water to the millions of people in Bangladesh who depend on groundwater. We used GIS-based machine learning models (Logistic Regression, Boosted Regression Tree, and Random Forest) to predict the risk of groundwater As contamination in the most affected districts of Bangladesh. Each of the three models was used to estimate the probability of groundwater As exceeding 10 μg/L (the WHO limit for drinking water) and 50 μg/L (the Bangladeshi standard for drinking water). The models were developed from field measurements of groundwater As and from spatially continuous predictor variables describing the geologic, geomorphologic, hydrologic, climatic, and soil parameters that influence the spatial distribution of As in groundwater. The Random Forest (RF) model outperformed the Boosted Regression Tree and Logistic Regression models and was used to prepare the final prediction map of As exceedance probability at 1 km spatial resolution. The model results show that the predicted distribution of As is strongly influenced by the local geology, the hydrogeology, and the percentage of groundwater-fed irrigated area in the study region. The central districts of the country (Chandpur, Munshiganj, Madaripur, Shariatpur, Gopalganj, Faridpur, and Comilla) are the most affected and are considered hot spots. The RF predictions suggest that about 78% of the study region's area and 23 million people are exposed to high groundwater As contamination. The model performed satisfactorily on the validation dataset, demonstrating its local-scale predictive capacity. The study could help identify areas of elevated groundwater As risk where water quality measurements are limited and support the development of an awareness system for public health safety.
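The model comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn and NumPy, uses fabricated synthetic data in place of the field measurements and GIS predictor layers, substitutes scikit-learn's GradientBoostingClassifier for the Boosted Regression Tree, and scores each model by ROC AUC on a held-out split, a metric the abstract does not name. The function names make_synthetic_wells and compare_models are hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

WHO_LIMIT = 10.0         # µg/L, WHO drinking-water limit cited in the abstract
BANGLADESH_LIMIT = 50.0  # µg/L, Bangladeshi standard cited in the abstract


def make_synthetic_wells(n=600, seed=0):
    """Fabricated stand-in data: five predictor columns and an As concentration.

    The log-linear relation below is an assumption made only so the sketch
    has something to fit; it carries no hydrogeological meaning.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 5))
    log_as = 3.0 + 1.2 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(scale=0.5, size=n)
    return X, np.exp(log_as)


def compare_models(X, as_ugl, threshold, seed=0):
    """Fit three classifiers to 'As exceeds threshold' and return held-out AUCs."""
    # Turn measured concentrations into a binary exceedance label.
    y = (as_ugl > threshold).astype(int)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y
    )
    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        # Stand-in for the Boosted Regression Tree named in the abstract.
        "boosted_trees": GradientBoostingClassifier(random_state=seed),
        "random_forest": RandomForestClassifier(n_estimators=200, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        # Column 1 of predict_proba is the estimated exceedance probability.
        prob = model.predict_proba(X_te)[:, 1]
        scores[name] = roc_auc_score(y_te, prob)
    return scores


if __name__ == "__main__":
    X, as_ugl = make_synthetic_wells()
    for limit in (WHO_LIMIT, BANGLADESH_LIMIT):
        print(f"threshold {limit} µg/L:", compare_models(X, as_ugl, limit))
```

In a mapping workflow of this kind, the best-scoring model's predict_proba output would be evaluated on every grid cell of the predictor rasters to produce the exceedance probability map; that step is omitted here because it depends on the specific raster data.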