

Bacterial prediction using internet of things (IoT) and machine learning

Hamza Khurshid; Rafia Mumtaz; Noor Alvi; Ayesha Haque; Sadaf Mumtaz; Faisal Shafait; Sheraz Ahmed; Muhammad Imran Malik; Andreas Dengel
In: Environmental Monitoring and Assessment, Vol. 194, Pages 1-20, Springer, 1/2022.


Water is a basic resource required to sustain life on Earth. Water quality is becoming increasingly important as pollution rises with industrialization and fresh water sources are depleted. Countries with weak control over water pollution are likely to continue to suffer poor public health. Moreover, the monitoring methods used in most developing countries are ineffective and rely more on human intervention than on automated, technological solutions. Typically, water samples and related data are tested in laboratories, which consumes time and effort and yields less reliable results. There is therefore a pressing need for a systematic approach to regularly monitor and manage the quality of water resources. The Internet of Things (IoT) offers an alternative to these traditional approaches, which are complex and ineffective, because it allows remote measurements to be taken in real time with minimal human involvement. The proposed system consists of water quality measuring nodes equipped with sensors for dissolved oxygen, turbidity, pH, water temperature, and total dissolved solids. These sensor nodes, deployed at various sites in the study area, transmit data via GSM modules to a server for processing and analysis. The data collected over a period of months is used for water quality classification using water quality indices and for bacterial prediction using machine learning algorithms. For data visualization, a Web portal was developed with a dashboard of Web services that displays heat maps and other infographics. The real-time water quality data is collected using the IoT nodes, and the historical data is acquired from the Rawal Lake Filtration Plant.
Several machine learning algorithms, including neural networks (NN), convolutional neural networks (CNN), ridge regression (RR), support vector machines (SVM), decision tree regression (DTR), Bayesian regression (BR), and an ensemble of all models, are trained for fecal coliform bacterial prediction; the SVM and Bayesian regression models performed best, with mean squared errors (MSE) of 0.35575 and 0.39566, respectively. The proposed system provides an alternative and more convenient solution for bacterial prediction, which is otherwise done manually in laboratories and is expensive and time-consuming. It also offers several other advantages, including remote monitoring, ease of scalability, real-time water quality status, and portable hardware.
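
The models above are compared by mean squared error, where a lower value indicates a better fit. As a minimal sketch of that comparison, the Python snippet below computes the MSE for several sets of predictions and picks the lowest. The function names, model labels, and numbers are hypothetical placeholders chosen for illustration; they are not the paper's code or data.

```python
from statistics import fmean


def mean_squared_error(y_true, y_pred):
    """Return the mean of squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    return fmean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def best_model(y_true, predictions):
    """Score each model's predictions by MSE and return (best_name, all_scores).

    `predictions` maps a model label to its list of predicted values.
    """
    scores = {
        name: mean_squared_error(y_true, y_pred)
        for name, y_pred in predictions.items()
    }
    best = min(scores, key=scores.get)  # lowest MSE wins
    return best, scores


if __name__ == "__main__":
    # Hypothetical targets and predictions, for illustration only.
    measured = [1.0, 2.0, 3.0]
    predicted = {
        "model_a": [1.0, 2.5, 3.0],
        "model_b": [2.0, 2.0, 2.0],
    }
    name, scores = best_model(measured, predicted)
    print(name, scores)
```

In practice the predictions would come from trained regressors evaluated on a held-out test set, so that the comparison reflects performance on unseen samples.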