Deploying healthcare ML functions in the PHYSICS way

Automatically understanding patients in order to drive tailored interventions is becoming important in personalized healthcare, especially for patients with chronic conditions, who ideally spend most of their patient journey away from clinical facilities. In most cases, running model inference on patients is a sporadic task: each patient is processed once or a few times per day. Deployed inference services are therefore mostly idle, occasionally serving bursts of requests. This usage pattern clearly indicates that such services benefit from being deployed as functions.

PHYSICS provides the tools for designing, testing, deploying and evaluating such services under the Function as a Service (FaaS) model. Engineers first design their service in the provided Design Environment (DE). The PHYSICS DE integrates Node-RED and extends its palette of nodes, facilitating the graphical implementation of a service as a flow. Some of the nodes accept JavaScript and Python scripts for advanced functionality. The flow can be tested locally, optimized and finalized.

An eHealth inference service designed as a Node-RED flow

The DE then provides the means to build the flow into an action that can be invoked within OpenWhisk. A Jenkins pipeline is executed and, as a result, the engineer obtains a new action built from their flow.

An eHealth inference flow converted to actions to be used within OpenWhisk
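
For readers who want to call such an action outside the DE, the following is a minimal Python sketch of a blocking invocation through the OpenWhisk REST API. The host name, the action name `ehealth/infer`, the credentials and the payload fields are hypothetical placeholders, not values from the PHYSICS deployment.

```python
import base64
import json
from urllib import request
from urllib.parse import quote


def build_invocation(api_host, namespace, action, auth_key, payload):
    """Build (but do not send) a blocking invocation request for an action.

    auth_key is the OpenWhisk "user:password" style key; it is sent as
    HTTP Basic authentication. All argument values are supplied by the caller.
    """
    url = (
        f"https://{api_host}/api/v1/namespaces/{quote(namespace, safe='')}"
        f"/actions/{quote(action)}?blocking=true&result=true"
    )
    token = base64.b64encode(auth_key.encode("utf-8")).decode("ascii")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Basic {token}",
    }
    body = json.dumps(payload).encode("utf-8")
    return request.Request(url, data=body, headers=headers, method="POST")


def invoke(req, timeout=60):
    """Send the prepared request and return the action result as a dict."""
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical values, for illustration only.
req = build_invocation(
    "openwhisk.example.org", "_", "ehealth/infer",
    "user:secret", {"patient_id": "p-001"},
)
```

Calling `invoke(req)` would then wait for the action to finish and return its result; the `blocking=true&result=true` query parameters ask OpenWhisk to return only the action output rather than the full activation record.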

The deployed actions are then tested using flows dedicated to invoking them. Their outputs are compared with those of the local flows.

Flow for testing deployed actions
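
The comparison step can be sketched in Python as follows. This is only an illustration of the idea, assuming that both the local flow and the deployed action return JSON-like structures and that small floating-point differences are acceptable; the function name, the tolerances and the sample fields (`risk`, `label`) are hypothetical.

```python
import math


def outputs_match(local, deployed, rel_tol=1e-6, abs_tol=1e-9):
    """Recursively compare two JSON-like outputs, tolerating float noise."""
    # Booleans are compared strictly so that True never matches 1.
    if isinstance(local, bool) or isinstance(deployed, bool):
        return (
            isinstance(local, bool)
            and isinstance(deployed, bool)
            and local == deployed
        )
    # Numbers are compared within the given tolerances.
    if isinstance(local, (int, float)) and isinstance(deployed, (int, float)):
        return math.isclose(local, deployed, rel_tol=rel_tol, abs_tol=abs_tol)
    # Dictionaries must have the same keys and matching values.
    if isinstance(local, dict) and isinstance(deployed, dict):
        return local.keys() == deployed.keys() and all(
            outputs_match(local[k], deployed[k], rel_tol, abs_tol) for k in local
        )
    # Lists must have the same length and matching elements in order.
    if isinstance(local, list) and isinstance(deployed, list):
        return len(local) == len(deployed) and all(
            outputs_match(a, b, rel_tol, abs_tol) for a, b in zip(local, deployed)
        )
    # Everything else (strings, None) must be equal.
    return local == deployed


# Hypothetical outputs of a local flow and of the deployed action.
local_out = {"risk": 0.7312, "label": "high"}
deployed_out = {"risk": 0.7312000001, "label": "high"}
same = outputs_match(local_out, deployed_out)
```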

Finally, deployed actions can be evaluated under different scenarios implemented using dedicated load generator nodes.

Flow for generating and evaluating different load scenarios for deployed actions
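
Outside Node-RED, a comparable load scenario could be approximated with a few lines of Python. The sketch below is not the PHYSICS load generator node; it assumes an open-loop generator that fires one request every `delay_s` seconds regardless of whether earlier requests have completed, and `send` is a hypothetical callable that performs a single invocation and raises on failure.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_fixed_delay_load(send, delay_s, duration_s, max_workers=32):
    """Call send() every delay_s seconds for duration_s seconds.

    Returns counts of sent, completed and failed requests plus the
    achieved rate (completed requests per second of wall-clock time).
    """
    futures = []
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        next_at = start
        while next_at - start < duration_s:
            now = time.monotonic()
            if next_at > now:
                time.sleep(next_at - now)  # wait for the next scheduled slot
            futures.append(pool.submit(send))
            next_at += delay_s
    # Leaving the with-block waits for all in-flight requests to finish.
    elapsed = time.monotonic() - start
    failed = sum(1 for f in futures if f.exception() is not None)
    completed = len(futures) - failed
    return {
        "sent": len(futures),
        "completed": completed,
        "failed": failed,
        "achieved_rate": completed / elapsed if elapsed > 0 else 0.0,
    }
```

Repeating the run for a range of `delay_s` values and recording `achieved_rate` for each gives one point per inter-request delay.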

The results of such tests can look as follows. Each experiment lasts 2 minutes of continuous requests, each request arriving at a fixed delay after the previous one. As long as the delay is larger than the inference time, the achieved rate follows that of a system with infinite resources, increasing as the delay shrinks. When the delay drops below the execution time, the achieved rate reaches a plateau. Even more frequent requests push OpenWhisk beyond its maximum accepted request level, so requests are dropped, resulting in a performance collapse.

Evaluating achieved response rate for different inter-request delays
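
The first two regimes described above can be captured by a toy model, sketched here in Python. It assumes, as a simplification, that the plateau equals one request per execution time. The collapse regime caused by dropped requests is deliberately not modelled, and the 0.5 s execution time is an illustrative value, not a measured one.

```python
def expected_rate(delay_s, exec_time_s):
    """Idealized achieved rate (requests/s) for a fixed inter-request delay.

    Offered rate is 1 / delay_s. An infinitely scalable system would serve
    all of it; here the rate is capped at 1 / exec_time_s (the plateau).
    """
    offered = 1.0 / delay_s
    plateau = 1.0 / exec_time_s
    return min(offered, plateau)


# Illustrative sweep with an assumed execution time of 0.5 s.
exec_time_s = 0.5
sweep = {delay: expected_rate(delay, exec_time_s) for delay in (2.0, 1.0, 0.5, 0.1)}
```

In this sweep the rate grows as the delay shrinks from 2.0 s to 0.5 s and then stays flat at 2 requests per second for shorter delays, matching the shape described in the text up to the point where requests start being dropped.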
