An ontology is a formal description of domain knowledge that encompasses categories, entities, their relationships, and their properties. This structured representation follows the paradigm of semantic and linked data and allows machines to read knowledge and infer new facts from it. Typically, it uses triples of the form subject, predicate, object. For example, in “John is a friend of Sophia”, “John” is the subject, “is a friend of” is the predicate, and “Sophia” is the object. If the ontology explicitly defines that this relationship requires both the subject and the object to be of type human, a reasoner can conclude that both John and Sophia are humans. If the relationship is also declared symmetric, the reasoner can further infer that Sophia is a friend of John. These simple examples show some of the potential of ontologies. Ontologies have long been widely used in the Semantic Web domain, and recent work applies them in other domains such as cloud service provisioning, to tackle problems such as vendor differences in service descriptions, both in terms of offerings and functional properties.
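The friendship example can be sketched in a few lines of Python. This is only an illustration of the idea, not how a real reasoner works: the property name `isFriendOf`, the class `Human`, and the two hand-written rules (type inference from a declared subject/object type, and symmetry) are assumptions made for this sketch.

```python
# Minimal triple store with two hand-written inference rules (illustrative only).
# The names isFriendOf, Human and rdf:type are assumptions for this sketch.

Triple = tuple[str, str, str]

# Schema: property -> (required subject type, required object type).
DOMAIN_RANGE = {"isFriendOf": ("Human", "Human")}
# Properties declared symmetric in the schema.
SYMMETRIC = {"isFriendOf"}


def infer(facts: set[Triple]) -> set[Triple]:
    """Return the given facts plus everything derivable from the two rules."""
    closed = set(facts)
    changed = True
    while changed:  # repeat until no rule adds a new triple
        changed = False
        for s, p, o in list(closed):
            new: set[Triple] = set()
            if p in DOMAIN_RANGE:
                subject_type, object_type = DOMAIN_RANGE[p]
                new.add((s, "rdf:type", subject_type))
                new.add((o, "rdf:type", object_type))
            if p in SYMMETRIC:
                new.add((o, p, s))
            if not new <= closed:
                closed |= new
                changed = True
    return closed


facts = {("John", "isFriendOf", "Sophia")}
inferred = infer(facts)
```

Starting from the single stated fact, `inferred` also contains the type of each person and the reverse friendship. In practice these rules would be expressed in the ontology itself and evaluated by an off-the-shelf reasoner.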
Semantic description of resources
Within the PHYSICS platform we use this kind of semantic modelling on two occasions: the first when an application is modelled, and the second when a cluster is registered to the platform. One key aspect of PHYSICS is the use of a multi-cluster scenario to optimize application deployments. While the reader can refer to a previous blog post about the semantics block as a whole, in this post we discuss the specifics of the resources ontology.
To guide our ontology creation, we based our process on four key aspects to be described for each cluster:
- Cluster capabilities: Functional properties of the cluster, such as the nodes available at the time of description, their allocatable CPU and RAM values, whether they are GPU-enabled, etc.
- SLA: The necessary classes and relationships to address SLA terms, their target values, and the rebate in case of an agreement breach.
- Cost: The cost of a cloud service, such as the instance maintenance cost or the cost per service request.
- Energy: Classes that address how energy efficient the machines that comprise a cluster are, and what percentage of their energy comes from renewable sources.
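To make the four aspects above more concrete, here is a minimal sketch of how a cluster description might look as plain Python data classes, together with a toy selection rule. The class names, field names, sample values, and the rule itself (enough CPU, optional GPU, then highest renewable share) are all hypothetical and are not taken from the PHYSICS ontology.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    cpu_allocatable: float       # cores
    ram_allocatable_gib: float
    gpu_enabled: bool = False


@dataclass
class SlaTerm:
    metric: str                  # e.g. "monthly uptime"
    target_value: float          # e.g. 99.9 (percent)
    rebate_percent: float        # refund if the target is breached


@dataclass
class Cluster:
    name: str
    nodes: list[Node] = field(default_factory=list)          # capabilities
    sla_terms: list[SlaTerm] = field(default_factory=list)   # SLA
    cost_per_instance_hour: float = 0.0                      # cost
    renewable_energy_percent: float = 0.0                    # energy

    def total_cpu(self) -> float:
        return sum(n.cpu_allocatable for n in self.nodes)

    def has_gpu(self) -> bool:
        return any(n.gpu_enabled for n in self.nodes)


def select_cluster(clusters: list[Cluster], cpu_needed: float,
                   need_gpu: bool = False) -> Optional[Cluster]:
    """Toy rule: filter by capabilities, then prefer the greenest cluster."""
    candidates = [
        c for c in clusters
        if c.total_cpu() >= cpu_needed and (c.has_gpu() or not need_gpu)
    ]
    return max(candidates, key=lambda c: c.renewable_energy_percent,
               default=None)


# Invented sample data.
edge = Cluster(name="edge", nodes=[Node("e1", 4, 16)],
               cost_per_instance_hour=0.10, renewable_energy_percent=80.0)
cloud = Cluster(name="cloud",
                nodes=[Node("c1", 16, 64, gpu_enabled=True), Node("c2", 16, 64)],
                sla_terms=[SlaTerm("monthly uptime", 99.9, 10.0)],
                cost_per_instance_hour=0.40, renewable_energy_percent=35.0)

chosen = select_cluster([edge, cloud], cpu_needed=2)
gpu_choice = select_cluster([edge, cloud], cpu_needed=2, need_gpu=True)
```

With this sample data the plain request lands on the edge cluster, while the GPU request can only be served by the cloud cluster. A real selection would weigh SLA and cost as well.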

These four pillars of information provide the knowledge needed to compare clusters effectively, whether to manage them or to select one for a specific application that is to be deployed. Several classes, properties, and relationships are defined to capture these concepts, as in the following picture, which shows the essentials of SLA terms.

Information Extraction
After the ontology is formulated, the next question that arises is: “How are we going to retrieve this kind of information?”. For the Cost and SLA categories, we can rely on the public documents provided by cloud vendors in the case of public clouds; for private clouds, we can apply a formula to approximate the energy consumption cost, provided the respective rates are available for the region where they reside. For public SLA documents specifically, pattern matching and natural language processing techniques have previously been used successfully to extract information automatically into the ontology, and we will use this approach. For energy certificates and information on energy sources, unfortunately, there has not been much standardization in how providers list these details, so for the time being we can only gather this information manually.
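As a rough illustration of the pattern-matching idea, the sketch below pulls a target value and a rebate out of a single SLA sentence with a regular expression, and adds a simple energy cost estimate. The sample sentence is invented, the pattern is tailored to that sentence only, and the cost formula (average power × hours × rate) is an assumption rather than the formula used in the project.

```python
import re
from typing import Optional

# Invented sample sentence in the style of a public SLA document.
SLA_TEXT = (
    "If the Monthly Uptime Percentage is less than 99.9% but equal to or "
    "greater than 99.0%, the customer is eligible for a 10% Service Credit."
)

# Captures the uptime target and the rebate percentage for this phrasing only.
SLA_PATTERN = re.compile(
    r"less than (?P<target>\d+(?:\.\d+)?)%"
    r".*?(?P<rebate>\d+(?:\.\d+)?)% Service Credit",
    re.IGNORECASE | re.DOTALL,
)


def extract_sla_term(text: str) -> Optional[dict]:
    """Return the target value and rebate found in the text, or None."""
    match = SLA_PATTERN.search(text)
    if match is None:
        return None
    return {
        "target_value": float(match.group("target")),
        "rebate_percent": float(match.group("rebate")),
    }


def energy_cost(avg_power_kw: float, hours: float, rate_per_kwh: float) -> float:
    """Assumed approximation: energy used (kWh) times the regional rate."""
    return avg_power_kw * hours * rate_per_kwh


sla_term = extract_sla_term(SLA_TEXT)
daily_cost = energy_cost(avg_power_kw=2.0, hours=24, rate_per_kwh=0.25)
```

Real SLA documents vary widely in wording, which is why pattern matching is usually combined with natural language processing rather than relying on one expression.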

Finally, the cluster capabilities can be retrieved through the Kubernetes API for any Kubernetes cluster. The available API client libraries can be configured from within a pod, which makes REST API calls efficient and accessible from the same service that injects the response information into the defined ontology format. Once all clusters have been described in the ontology, the information is passed to the project’s knowledge base, where it can be examined and reasoned over to guide the cluster selection and management process.
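A minimal sketch of this retrieval step, using the official Kubernetes Python client, might look as follows. The predicate names (`hasAllocatableCpu` and so on) are hypothetical, GPU detection assumes GPUs are exposed as the `nvidia.com/gpu` resource, and the quantity parsers handle only the most common suffixes.

```python
def parse_cpu(quantity: str) -> float:
    """Convert a Kubernetes CPU quantity such as '4' or '3920m' to cores."""
    if quantity.endswith("m"):
        return float(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory_gib(quantity: str) -> float:
    """Convert a memory quantity such as '16384Mi' to GiB.

    Only the Ki/Mi/Gi suffixes and plain byte counts are handled here.
    """
    units = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}
    for suffix, factor in units.items():
        if quantity.endswith(suffix):
            return float(quantity[:-2]) * factor / 2**30
    return float(quantity) / 2**30


def node_to_triples(name: str, allocatable: dict) -> list[tuple]:
    """Map one node's allocatable resources to (subject, predicate, object)."""
    gpu_enabled = int(allocatable.get("nvidia.com/gpu", "0")) > 0
    return [
        (name, "hasAllocatableCpu", parse_cpu(allocatable["cpu"])),
        (name, "hasAllocatableRamGiB", parse_memory_gib(allocatable["memory"])),
        (name, "isGpuEnabled", gpu_enabled),
    ]


def describe_cluster() -> list[tuple]:
    """Query the API server from inside a pod and describe every node."""
    # Imported here so the helpers above work without the client installed.
    from kubernetes import client, config

    config.load_incluster_config()  # uses the pod's service account
    triples = []
    for node in client.CoreV1Api().list_node().items:
        triples += node_to_triples(node.metadata.name, node.status.allocatable)
    return triples


# Offline example with values shaped like a node's status.allocatable.
example = node_to_triples("node-1", {"cpu": "3920m", "memory": "16384Mi"})
```

`describe_cluster()` only works inside a pod whose service account is allowed to list nodes; the offline example shows the shape of the output without needing a cluster.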