Closed
Description
Is your feature request related to a problem? Please describe.
There are cases where multiple AI models are needed in the same application to produce the final inference result. Typically, one model provides the image region of interest (ROI) for another model, for example:
- A prostate tumor segmentation model requires an image of the prostate itself as input, which is the output of a prostate segmentation model that takes a DICOM CT series as input.
- A COVID classification model requires an ROI containing only the lungs, which is segmented by a lung segmentation model.
The ROI image can be generated with a non-DL, computer-vision-based algorithm, but generating it with DL models is becoming common.
Describe the solution you'd like
- The App SDK has been designed to support multiple units of processing logic/algorithms, called operators, e.g. multiple inference operators, each supporting a specific named model.
- The App SDK supports sharing results, as in-memory objects, from one operator to downstream operators. For example, a segmentation inference operator can generate an organ segmentation image object, which is linked as the input to the tumor segmentation operator by creating the network within the application.
- The App SDK inference operator needs to load a named model and contain the model-specific transformation and inference logic. As of now, the segmentation inference operator loads the default model; it needs to be passed a model name/UID on instantiation.
- Multiple models can already be loaded and made available in the execution context, by the base Application class.
- The App SDK packager supports packaging multiple models in the MONAI App Package (Docker image), accessible to the app for loading at start-up.
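
The flow described in the bullets above can be sketched in plain Python. This is only an illustration of the idea, not the App SDK's API: `ExecutionContext`, `SegmentationOperator`, `get_model`, `mask_image`, and the model names `organ_seg` and `tumor_seg` are all hypothetical, and the "models" are toy threshold functions on a 2D list standing in for real networks and volumetric images.

```python
from typing import Callable, Dict, List

Image = List[List[int]]            # toy 2D image, stand-in for the in-memory image object
Model = Callable[[Image], Image]   # toy "model": maps an image to a binary mask


class ExecutionContext:
    """Hypothetical context holding every model loaded at start-up, keyed by name."""

    def __init__(self, models: Dict[str, Model]) -> None:
        self._models = dict(models)

    def get_model(self, name: str) -> Model:
        try:
            return self._models[name]
        except KeyError:
            available = sorted(self._models)
            raise KeyError(f"model {name!r} not loaded; available: {available}") from None


class SegmentationOperator:
    """Hypothetical inference operator bound to one named model on instantiation."""

    def __init__(self, model_name: str) -> None:
        self.model_name = model_name

    def compute(self, image: Image, context: ExecutionContext) -> Image:
        model = context.get_model(self.model_name)
        return model(image)


def mask_image(image: Image, mask: Image) -> Image:
    """Keep pixels inside the mask and zero the rest, producing the ROI image."""
    return [[p if m else 0 for p, m in zip(irow, mrow)]
            for irow, mrow in zip(image, mask)]


# Toy models (assumptions for illustration): simple intensity thresholds.
def organ_model(image: Image) -> Image:
    return [[1 if p >= 5 else 0 for p in row] for row in image]


def tumor_model(image: Image) -> Image:
    return [[1 if p >= 8 else 0 for p in row] for row in image]


def run_app(image: Image, context: ExecutionContext) -> Image:
    """Chain two operators: the organ mask defines the ROI fed to the tumor model."""
    organ_op = SegmentationOperator("organ_seg")
    tumor_op = SegmentationOperator("tumor_seg")
    organ_mask = organ_op.compute(image, context)   # in-memory result ...
    roi = mask_image(image, organ_mask)             # ... passed downstream
    return tumor_op.compute(roi, context)


context = ExecutionContext({"organ_seg": organ_model, "tumor_seg": tumor_model})
image = [[1, 5, 9],
         [2, 6, 8],
         [0, 4, 7]]
tumor_mask = run_app(image, context)
print(tumor_mask)
```

Both operators share one class and differ only in the model name they receive, which is the change requested above; the intermediate organ mask never leaves memory.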
Alternative Solution
- Create an operator that encapsulates the complete inference logic and the model itself, a "black box" operator. For example, for an organ segmentation operator, the input would be the in-memory volumetric image (converted from a DICOM series), and the output would be an in-memory organ segmentation image. The disadvantage is that the model file needs to be embedded in the operator code itself.
- Build model-specific applications, each as a MAP (a current limitation of the MAP spec: a single Docker image), and then use an external orchestrator or platform to manage the execution of each MAP. The disadvantages are that shared memory is unlikely and that external orchestration is needed.
Additional context
The App SDK standardizes the in-memory image representation, ensuring consistency and correctness when passing image objects among operators within the same app.
See also: Make DICOMSeriesToVolumeOperator consistent with ITK in serving NumPy array #238