Tutorial

1. Prepare devices

You need a server machine with at least one GPU.
We recommend an NVIDIA GPU with the Ampere or a newer architecture (A100, H100, etc.).

2. Clone repository and run Docker containers

2-1. Clone repository

cd /path/to/your/working_dir
git clone https://github.com/yuakagi/Watcher.git

2-2. Configure settings

There is a file named .env.example in the root directory.
Rename this file to .env, then open it and set the required environment variables.

2-3. Run Docker containers

cd Watcher
docker compose up -d
Now, you can run scripts inside the container as follows:
docker exec -it watcher-pytorch python3 /code/mnt/path/to/script.py
For time-consuming jobs (e.g., model training), consider using the -d option instead of -it.
The watcher package is installed in the container, so you can import and use it in your scripts:
import watcher

3. Upload clinical records to database

Warning

  • Please note that this Python package performs only minimal data cleaning during preprocessing (e.g., dropping records with missing critical fields, removing duplicates).

  • Therefore, it is important to pre-clean your data before uploading (e.g., normalizing laboratory test results and units, mapping medical codes, etc.).

Note

  • You can use any medical coding system (e.g., ICD-10, LOINC, ATC, or custom codes like sequential numbers). What matters is the consistency of coding.

  • The database is exposed on port ${POSTGRES_PORT}, which you configure in the .env file.

  • You can connect directly to this database using external SQL clients for inspection or manual queries if needed.

Prepare your clinical dataset by referring to Clinical Records.
The docker-compose.yml automatically launches a PostgreSQL server container.
Upload your clinical records into this database using watcher.db.init_db_with_csv().
This database will serve as the source for both model training and evaluation.
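As a sketch, an upload script run inside the watcher-pytorch container might look like the following. Only the function name init_db_with_csv() comes from this tutorial; the csv_dir argument is an assumption, so check the API reference for the actual signature.

```python
def upload_records(csv_dir: str) -> None:
    """Sketch: load pre-cleaned clinical CSV files into the PostgreSQL
    container. The csv_dir keyword argument is an assumption; the real
    signature of init_db_with_csv() may differ."""
    # Imported inside the function so this file can be inspected or
    # imported outside the container, where watcher may not be installed.
    from watcher.db import init_db_with_csv

    init_db_with_csv(csv_dir=csv_dir)  # hypothetical argument name
```

Run it like any other script, e.g. with docker exec -it watcher-pytorch python3 followed by the script path under /code/mnt.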

4. Create dataset

Note

  • If you plan to fine-tune the model later on an update dataset, set the update_period argument appropriately when creating the dataset.

Create the dataset using watcher.preprocess.create_dataset().
Once the dataset is created, it can be used for model training.
You can also retrieve patient IDs used in the dataset (for training, validation, or testing) using watcher.preprocess.get_patient_ids().
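The two calls above can be sketched together as follows. Apart from update_period, which this tutorial names explicitly, every argument below (including the split selector) is an assumption; consult the API reference for the real parameters.

```python
def build_dataset(update_period: int) -> dict:
    """Sketch: create the dataset, then collect the patient IDs for each
    split. The 'split' keyword is a hypothetical selector; the actual
    interface of get_patient_ids() may differ."""
    from watcher import preprocess

    # Set update_period now if you plan to fine-tune on an update
    # dataset later (see step 7).
    preprocess.create_dataset(update_period=update_period)

    return {
        split: preprocess.get_patient_ids(split=split)  # hypothetical arg
        for split in ("train", "val", "test")
    }
```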

5. Pretrain models

Train the model using watcher.training.train_watcher().
Training saves snapshots (model checkpoints) periodically. Inside the snapshots you can find directories named watcher_blueprint.
A blueprint is a bundle of files that contains the model weights and everything else needed to instantiate the model.
The watcher_blueprint directories are therefore the main product of model training; use them for inference and simulations.
The status of training is recorded in main_training_report.json. Once the training is complete, the best epoch is recorded in this JSON file.
Please consider using the watcher_blueprint recorded at the best epoch as the final model.
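Since main_training_report.json records the best epoch, a small helper can resolve the final blueprint path after training. The JSON key name ("best_epoch") and the "epoch_<n>" snapshot directory layout below are assumptions; adapt them to the files train_watcher() actually writes.

```python
import json
import os


def best_blueprint(snapshot_dir: str) -> str:
    """Sketch: read main_training_report.json and return the path of the
    watcher_blueprint saved at the best epoch. The 'best_epoch' key and
    the 'epoch_<n>' directory naming are assumptions."""
    with open(os.path.join(snapshot_dir, "main_training_report.json")) as f:
        report = json.load(f)
    best_epoch = report["best_epoch"]  # hypothetical key name
    return os.path.join(snapshot_dir, f"epoch_{best_epoch}", "watcher_blueprint")
```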

6. Perform simulations

Perform simulations using watcher.models.Simulator.simulate().
The details of the simulation results (pandas DataFrame) are available at Simulation Results.
You can list the test patient IDs using watcher.preprocess.get_patient_ids() for model evaluation if you need them.
If needed, you may also directly connect to the PostgreSQL database and perform SQL queries to extract patient subsets of interest.
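Putting this step together, a simulation run might be sketched as below. Only Simulator.simulate() and get_patient_ids() are named in this tutorial; every argument name here is an assumption, so verify them against the API reference.

```python
def simulate_test_patients(blueprint_path: str):
    """Sketch: run simulations for the test split. The 'split',
    'blueprint', and 'patient_ids' keywords are all assumptions."""
    from watcher import preprocess
    from watcher.models import Simulator

    test_ids = preprocess.get_patient_ids(split="test")  # hypothetical arg
    simulator = Simulator(blueprint=blueprint_path)  # hypothetical arg
    # simulate() returns a pandas DataFrame (see Simulation Results).
    return simulator.simulate(patient_ids=test_ids)  # hypothetical arg
```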

7. [OPTIONAL] Fine-tune models

The model learns from all training data without weighting. Such training may be suboptimal because medical practices shift over time.
Therefore, this package allows fine-tuning the model using only the latest data.
Fine-tune the model using watcher.training.train_watcher().
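A fine-tuning run could be sketched as follows. train_watcher() is named in this tutorial, but how a pretrained blueprint is passed in is not shown, so the argument below is purely an assumption.

```python
def fine_tune_on_latest(blueprint_path: str) -> None:
    """Sketch: fine-tune a pretrained model on only the latest (update)
    data. The 'blueprint' keyword is a hypothetical way of resuming from
    a pretrained watcher_blueprint; check the API reference."""
    from watcher.training import train_watcher

    train_watcher(blueprint=blueprint_path)  # hypothetical argument
```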

8. [OPTIONAL] Simulator demo with GUI

You can explore the pretrained model’s inference capabilities using our interactive demo GUI.
To run the demo, open the notebook at watcher/notebooks/demo_gui.ipynb.

9. [OPTIONAL] Use Simulation API

Note

  • This API is designed to be used as the simulator backend in our digital-twin EHR system (https://github.com/yuakagi/TwinEHR).

  • Use this API together with that digital-twin EHR system.

Launching the API Server
First, open api_launcher.py and set the arguments (blueprint, gpu_ids, log_dir, etc.) in the script.
Then, run the API server using the following command:
docker exec -it watcher-pytorch gunicorn api_launcher:app --bind 0.0.0.0:63425 --workers 1
Currently, we do not support multiple Gunicorn workers; ensure that the number of workers is set to 1 (--workers 1).