Sendit is an on-demand application that works in two stages: it anonymizes images, then pushes the anonymized images and metadata to Google Cloud Storage and Google Cloud BigQuery, respectively. It works as follows:
- the researcher starts the anonymization pipeline with an input of one or more folders
- each folder is added as a “Batch” with status “QUEUE” to indicate it is ready for import
- anonymization is performed (status “PROCESSING”), meaning fields in the header and image data are removed or replaced
- when all batches in the queue reach status “DONEPROCESSING”, the researcher triggers the final job to send the data to storage (status “SENT”)
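The batch lifecycle above can be sketched as a simple state machine. The status names come from the pipeline description; the `advance` helper and its transition table are illustrative, not the application's actual code:

```python
from enum import Enum

class BatchStatus(Enum):
    """Batch statuses from the pipeline description above."""
    QUEUE = "QUEUE"                    # folder imported, ready for processing
    PROCESSING = "PROCESSING"          # anonymization in progress
    DONEPROCESSING = "DONEPROCESSING"  # anonymization complete
    SENT = "SENT"                      # data pushed to storage

# Each status advances to exactly one successor; SENT is terminal.
TRANSITIONS = {
    BatchStatus.QUEUE: BatchStatus.PROCESSING,
    BatchStatus.PROCESSING: BatchStatus.DONEPROCESSING,
    BatchStatus.DONEPROCESSING: BatchStatus.SENT,
}

def advance(status: BatchStatus) -> BatchStatus:
    """Move a batch to its next stage, raising on a terminal status."""
    if status not in TRANSITIONS:
        raise ValueError(f"{status.value} is a terminal status")
    return TRANSITIONS[status]
```

Note that the final QUEUE → … → SENT transition is only triggered for the whole queue at once, after every batch has reached “DONEPROCESSING”.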
The application's base image is distributed as sendit-base. This image bundles all of the dependencies, so the application container can be brought up and down easily.
- Application: If you are a new developer, please read about the application flow and infrastructure first. Sendit is a skeleton that uses other Python modules to handle interaction with Stanford and Google APIs, along with anonymization of datasets.
- Setup: Basic setup (download and install) of a new application for a server.
- Configuration: How to configure the application before starting it up.
- Start: Start it up!
- Interface: A simple web interface for monitoring batches.
- Management: An overview of controlling the application with manage.py
- Logging: An overview of the logger provided in the application
- Watcher: Configuration and use of the watcher daemon to detect new DICOM datasets
Steps in Pipeline
- Dicom Import: The logic for when a session directory is detected as finished by the Watcher.
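One common way a watcher decides that a session directory is “finished” is a quiet-period heuristic: no file in the directory has been modified for some window of time. The sketch below illustrates that idea only; the function name and threshold are assumptions, not the Watcher's actual logic:

```python
import os
import time

def is_finished(session_dir: str, quiet_seconds: int = 300) -> bool:
    """Hypothetical heuristic: treat a session directory as finished
    when none of its files have been modified for `quiet_seconds`."""
    newest = 0.0
    for root, _, files in os.walk(session_dir):
        for name in files:
            newest = max(newest, os.path.getmtime(os.path.join(root, name)))
    # An empty directory (newest == 0) is never considered finished.
    return newest > 0 and (time.time() - newest) >= quiet_seconds
```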
- Anonymize: The defaults (and configuration) for the anonymization step of the pipeline. This currently includes just header fields, and we expect to add pixel anonymization.
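Conceptually, header anonymization removes some fields and replaces others. The toy function below shows that remove/replace pattern on a plain dictionary; the specific field names and replacement values are illustrative assumptions, and the real defaults are described in the Anonymize documentation:

```python
# Illustrative field sets; the application's actual defaults differ.
REMOVE = {"PatientName", "PatientAddress"}
REPLACE = {"PatientID": "anonymized-id"}

def anonymize_header(header: dict) -> dict:
    """Return a copy of a DICOM-like header with identifying fields
    removed or replaced (a simplified sketch of the header step)."""
    cleaned = {k: v for k, v in header.items() if k not in REMOVE}
    for field, value in REPLACE.items():
        if field in cleaned:
            cleaned[field] = value
    return cleaned
```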
- Storage: The final step, which moves the anonymized DICOM files to Orthanc and/or Google Cloud Storage.
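Before upload, each anonymized file needs an object name in the destination bucket. The sketch below pairs local files with batch-scoped object names; the naming scheme is an assumption for illustration, not the application's real bucket layout (an actual upload would then hand each pair to a Google Cloud Storage client):

```python
import os

def storage_plan(batch_id: str, files: list) -> list:
    """Sketch of the upload step: pair each anonymized file with the
    object name it would get in the bucket (hypothetical layout)."""
    plan = []
    for path in files:
        # Prefix objects with the batch id so batches stay grouped.
        object_name = f"{batch_id}/{os.path.basename(path)}"
        plan.append({"source": path, "object": object_name})
    return plan
```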
- Error Handling: An overview of how the application manages server, API, and other potential issues.
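Transient server or API failures are often handled with retries and exponential backoff. The helper below is a generic sketch of that pattern, not the application's actual error-handling code:

```python
import time

def with_retry(func, attempts: int = 3, base_delay: float = 1.0):
    """Call `func`, retrying on any exception with exponential backoff.

    A generic sketch: re-raises the last error once attempts run out.
    """
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise
            # Back off: base_delay, 2*base_delay, 4*base_delay, ...
            time.sleep(base_delay * (2 ** attempt))
```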