Respiratory Monitoring System

Published: May 23, 2025

Group Leader, Principal Contributor
Undergraduate Final Year Project (FYP) (self-proposed), supervised by Prof. Jun ZHANG
Keywords: VAE, LSTM, Multi-Head Attention, Anomaly Detection, Time-Series Prediction, Sleep Apnea, ESP32, ESP-NOW
Coding & Environment: Python, PyTorch, C++
The detailed implementation can be found in the project's GitHub repository

Project Overview

Constructed multi-dimensional respiration datasets; designed and implemented multimodal AI models for time-series prediction and anomaly detection of respiratory diseases (sleep apnea); and built a wireless, end-to-end, real-time monitoring system.

My contributions to the project include:

AI model framework design, implementation, and optimization for time-series prediction and anomaly detection (VAE, LSTM, Multi-Head Attention)
Hardware sensor integration and construction of transmission protocol (ESP32 boards, ESP-NOW protocol)
Construction of multimodal respiration datasets (audio, temperature, and humidity signals)
Real-time GUI monitoring system

System Real-Time Demonstration

This video is an integrated demonstration of the project, showing the real-time GUI, a real test subject, and live inference for anomaly detection and time-series prediction.

System Architecture

We designed a system architecture that integrates the AI models, the hardware sensors, and the real-time GUI monitoring system. The architecture is shown in the following figure:

Hardware System Setup

The hardware system collects air humidity, air temperature, and audio signals from the patient's breathing. The audio signals are collected by a microphone (KY-028), and the air humidity and temperature are collected by an AHT10 sensor. Once the sensor signal buffers are full, the data are transmitted to the ESP32 board via the ESP-NOW protocol. The general structure of the hardware system is shown in the following figure:
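To make the buffer-then-transmit step concrete, here is a minimal Python sketch of how one full buffer could be packed into a single ESP-NOW frame and decoded on the receiving side. The packet layout (a sequence number, one temperature and one humidity reading, then 16-bit audio samples) and the names `pack_packet` and `unpack_packet` are assumptions made for illustration; they are not taken from the project's firmware, which is written in C++. The 250-byte limit is the maximum payload of a classic ESP-NOW frame.

```python
import struct

ESPNOW_MAX_PAYLOAD = 250  # maximum payload of a classic ESP-NOW frame, in bytes

# Hypothetical header: uint16 sequence number, float32 temperature, float32 humidity.
# "<" selects little-endian byte order with no padding, so the header is 10 bytes.
HEADER = struct.Struct("<Hff")


def pack_packet(seq, temperature_c, humidity_pct, audio):
    """Pack one full sensor buffer into a byte string that fits one frame."""
    body = struct.pack(f"<{len(audio)}H", *audio)  # audio as unsigned 16-bit samples
    payload = HEADER.pack(seq, temperature_c, humidity_pct) + body
    if len(payload) > ESPNOW_MAX_PAYLOAD:
        raise ValueError(f"payload is {len(payload)} bytes, limit is {ESPNOW_MAX_PAYLOAD}")
    return payload


def unpack_packet(payload):
    """Decode a frame produced by pack_packet."""
    seq, temperature_c, humidity_pct = HEADER.unpack_from(payload, 0)
    n_samples = (len(payload) - HEADER.size) // 2
    audio = list(struct.unpack_from(f"<{n_samples}H", payload, HEADER.size))
    return seq, temperature_c, humidity_pct, audio


samples = list(range(120))  # 120 samples * 2 bytes + 10-byte header = 250 bytes
frame = pack_packet(7, 25.5, 60.0, samples)
decoded = unpack_packet(frame)
```

Under this assumed layout, at most 120 audio samples fit into one frame, so the buffer size on the sender would have to be chosen with that ceiling in mind.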

Software System Setup

1 - Dataset Construction

The data collected by the hardware are processed into rolling windows, which are used for training and testing the AI models. The software system structure is shown in the following figure:
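As a rough illustration of the rolling-window step, the sketch below slices a one-dimensional signal into fixed-length, overlapping windows. The function name `rolling_windows` and the `stride` parameter are assumptions for this example; the project's actual window length and stride are not stated here.

```python
from typing import List, Sequence


def rolling_windows(signal: Sequence[float], win_len: int, stride: int = 1) -> List[List[float]]:
    """Split a 1-D signal into fixed-length windows taken every `stride` samples."""
    if win_len <= 0 or stride <= 0:
        raise ValueError("win_len and stride must be positive")
    last_start = len(signal) - win_len  # negative when the signal is too short
    return [list(signal[s:s + win_len]) for s in range(0, last_start + 1, stride)]


readings = [float(i) for i in range(10)]
windows = rolling_windows(readings, win_len=4, stride=2)
# Windows start at samples 0, 2, 4, and 6.
```

A stride smaller than the window length yields overlapping windows, which produces more training samples from the same recording at the cost of correlated examples.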

2 - Anomaly Detection Process

In this project, anomalous samples are identified by the deviation between the predicted values and the ground truth values. We first define the anomaly threshold based on the ground truth values. Then, we use the AI model to make predictions from the most recent data intervals. If the predicted value differs significantly from the ground truth value, the data interval is considered an anomaly.

The anomaly detection process is shown in the following figure:
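The procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: the threshold rule (a percentile of the absolute deviation of the ground truth from its mean), the error measure (mean absolute error per interval), the value `q=80.0`, and the function names are all hypothetical choices made for the example.

```python
from statistics import mean
from typing import List, Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """Linearly interpolated percentile of `values`, with q in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def anomaly_threshold(ground_truth: Sequence[float], q: float = 80.0) -> float:
    """Assumed rule: q-th percentile of |x - mean| over reference ground truth."""
    centre = mean(ground_truth)
    return percentile([abs(x - centre) for x in ground_truth], q)


def flag_anomalies(predicted, actual, threshold: float) -> List[bool]:
    """Flag each interval whose mean absolute prediction error exceeds the threshold."""
    flags = []
    for pred, true in zip(predicted, actual):
        error = mean(abs(p - t) for p, t in zip(pred, true))
        flags.append(error > threshold)
    return flags


reference = [0.0, 1.0, 2.0, 3.0, 4.0]
threshold = anomaly_threshold(reference)
predicted = [[1.0, 1.0], [5.0, 5.0]]  # stand-in for model outputs
actual = [[1.5, 0.5], [1.0, 2.0]]     # matching ground truth intervals
flags = flag_anomalies(predicted, actual, threshold)
```

Here the first interval has a small prediction error and is treated as normal, while the second deviates strongly from the ground truth and is flagged.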

3 - Time-Series Prediction with AI Models

In this project, we propose our AI model, MHA-VAE-LSTM, and test it on the respiration datasets we collected. We also compare its performance across varying input data lengths against two other time-series prediction models: LSTM (named pure-LSTM in this project) and VAE-LSTM.

Here is our MHA-VAE-LSTM model’s structure:

By introducing the Multi-Head Attention (MHA) mechanism and a residual connection, the model achieves better feature extraction and data reconstruction, especially compared with the VAE-LSTM model, which uses the VAE to extract local features and the LSTM to make future predictions.

Here are the data reconstruction results of the VAE-LSTM model:

Based on the above figure, the VAE-LSTM model tends to reconstruct the data using values close to the mean, which loses much of the signal's detail. The MHA-VAE-LSTM model reconstructs the data more accurately, as shown in the following figure:
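To give a feel for how these pieces could fit together, here is a small PyTorch sketch in which self-attention with a residual connection feeds a VAE-style encoder, and an LSTM forecasts from the latent sequence. The layer sizes, the placement of the residual connection, the linear decoder, and the class name `MHAVAELSTMSketch` are all assumptions for illustration; the actual architecture is the one in the project's figure and repository.

```python
import torch
from torch import nn


class MHAVAELSTMSketch(nn.Module):
    """Hypothetical layout: attention + residual -> VAE encoder -> LSTM forecaster."""

    def __init__(self, n_features=3, d_model=32, n_heads=4, latent_dim=8, hidden=32):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)
        self.to_mu = nn.Linear(d_model, latent_dim)
        self.to_logvar = nn.Linear(d_model, latent_dim)
        self.decoder = nn.Linear(latent_dim, n_features)
        self.lstm = nn.LSTM(latent_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_features)

    def forward(self, x):
        h = self.embed(x)                       # (batch, time, d_model)
        attended, _ = self.attn(h, h, h)        # self-attention across time steps
        h = self.norm(h + attended)             # residual connection
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        recon = self.decoder(z)                 # per-step reconstruction of the input
        out, _ = self.lstm(z)                   # LSTM over the latent sequence
        pred = self.head(out[:, -1])            # prediction from the last time step
        return recon, pred, mu, logvar


model = MHAVAELSTMSketch()
x = torch.randn(2, 50, 3)  # 2 windows, 50 time steps, 3 signals (audio, temperature, humidity)
recon, pred, mu, logvar = model(x)
```

In a setup like this, the reconstruction output would be trained with a reconstruction loss plus a KL term on `mu` and `logvar`, and the prediction output with a forecasting loss.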

The performance of these three models on the same collected dataset (collection duration $\approx$ 1 hour at sample rates of 20 to 30 Hz, data length $\approx$ 100,000 sensor readings) is shown in the following figures:

pure-LSTM model:

[Figure: pure_lstm_anomaly_detection_comparison_thre80-noseinout9-winlen800-lookback1]

VAE-LSTM model:

[Figure: vae-lstm_anomaly_detection_comparison_thre80-noseinout9-winlen800-lookback1]

Our MHA-VAE-LSTM model:

[Figure: mha_vae_lstm_win_len1000_look_back1_anomaly_detection_comparison_thre80]

4 - Model Performance Comparisons on Different Data Input Lengths

Input data length is a crucial factor in time-series prediction and anomaly detection. We tested the models on different input data lengths, with single-window lengths from 200 to 1800 and the number of look-back windows varying from 1 to 5.

We take the detection rate (recall) and the $F_{\beta}$ score as the primary performance metrics, with $\beta$ set to 2 to place more emphasis on the detection rate.
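For reference, the $F_{\beta}$ score combines precision $P$ and recall $R$ as $F_{\beta} = (1 + \beta^2) \cdot \frac{P R}{\beta^2 P + R}$. The short sketch below computes it from confusion counts; the function names and the example counts are hypothetical and are not results from this project.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def detection_metrics(tp: int, fp: int, fn: int, beta: float = 2.0):
    """Return (precision, recall, F-beta) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f_beta(precision, recall, beta)


# Made-up counts: 8 anomalies detected, 4 false alarms, 2 anomalies missed.
precision, recall, f2 = detection_metrics(tp=8, fp=4, fn=2)
f1 = f_beta(precision, recall, beta=1.0)
```

With these counts recall (0.8) is higher than precision (about 0.67), so the $F_2$ score comes out above the $F_1$ score, which is the effect of choosing $\beta = 2$.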

Comparisons on short input data lengths:

From the above figure, all three models achieve decent performance on short input data lengths.

Comparisons on long input data lengths:

With much longer input data lengths, the performance of the pure-LSTM model degrades, while the VAE-LSTM model and our MHA-VAE-LSTM model perform better.

Conclusion

We have successfully constructed a wearable, affordable, real-time, edge-compatible (deployable on a PC or embedded devices), and open-source monitoring system for sleep apnea detection, which can be used at home or in the hospital. We hope our work and algorithms can foster the development of sleep apnea detection and treatment in the future.