
Enhancing Machine Learning with Symbolic Knowledge Injection

December 21, 2023

Machine learning models have revolutionized various applications, from image recognition to natural language processing. However, these models often suffer from limitations such as a high memory footprint, large data requirements, lack of interpretability, and long training times. Symbolic knowledge injection techniques aim to address these challenges by incorporating prior knowledge into machine learning models. In this article, we explore the concept of symbolic knowledge injection, its benefits and challenges, and how the PSyKI library can be used to enhance machine learning models.

What is Symbolic Knowledge Injection?

Symbolic knowledge injection (SKI) is a neuro-symbolic integration technique that involves incorporating prior symbolic knowledge into sub-symbolic predictors. The goal is to utilize existing knowledge to improve the performance and interpretability of machine learning models. PSyKI, which stands for Platform for Symbolic Knowledge Injection, is a Python library that provides SKI algorithms and quality of service (QoS) metrics for knowledge injection.

Benefits of Symbolic Knowledge Injection

There are several benefits of injecting symbolic knowledge into machine learning models:

Sustainability: By reducing the learning time required, knowledge injection lowers the computational cost of training and so makes machine learning models more sustainable. It can also improve model accuracy and reduce the need for extensive training on large datasets.
Data Efficiency: Injecting prior knowledge allows the model to learn from smaller datasets. This can be particularly useful in situations where gathering large amounts of labeled data is challenging or expensive.
Model Interpretability: By incorporating symbolic knowledge, machine learning models can become more interpretable. The injected knowledge is human-readable, so it helps keep the model from being a complete black box and offers insight into its decision-making process.

Challenges of Symbolic Knowledge Injection

While symbolic knowledge injection offers significant benefits, there are challenges that need to be addressed:

Quantifying Benefits: Accurately quantifying the benefits of knowledge injection, such as sustainability, accuracy improvements, dataset needs, and interpretability, in a consistent manner remains a challenge. PSyKI provides QoS metrics to address this challenge, which we will discuss later in the article.
Knowledge Representation: Knowledge can be represented in various ways, the most common being logic formulae. Each injector in PSyKI may have its own requirements on the knowledge representation. Integrating different knowledge representations and ensuring compatibility can be a complex task.
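
To make the idea of a logic formula concrete, here is a minimal sketch in plain Python of a single rule and how it could be checked against one data sample. The feature name, threshold, and label are invented for illustration, and the dictionary layout is an assumption of this sketch, not the format that any PSyKI injector expects.

```python
import operator

# One rule of the form "IF every condition holds THEN predict `label`".
# It says roughly what a Prolog clause such as
#   class(PetalLength, setosa) :- PetalLength =< 2.5.
# would say. Feature name, threshold, and label are hypothetical.
RULE = {
    "label": "setosa",
    "conditions": [("petal_length", operator.le, 2.5)],
}


def rule_fires(rule, sample):
    """Return True when the sample satisfies every condition of the rule."""
    return all(
        compare(sample[feature], threshold)
        for feature, compare, threshold in rule["conditions"]
    )
```

Calling `rule_fires(RULE, {"petal_length": 1.4})` checks whether the rule applies to that sample. A real injector would need such rules in its own supported representation.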

PSyKI: Platform for Symbolic Knowledge Injection

The PSyKI library is designed to facilitate symbolic knowledge injection in machine learning models. It offers a range of SKI algorithms, or “injectors,” along with QoS metrics for evaluating the impact of knowledge injection. The library integrates 2ppy, the Python port of 2p-kt, a multi-paradigm logic programming framework. This allows PSyKI to support knowledge expressed in Prolog, including various subsets of the language.

Some of the available injectors in PSyKI include:

KBANN: One of the first injectors introduced in the literature.
KILL: Constrains a neural network (NN) by altering its predictions using symbolic knowledge.
KINS: Structures the knowledge by adding ad-hoc layers to an NN.
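
The idea of constraining a network through its predictions can be pictured as a penalty added to the training loss whenever the output contradicts a rule. The sketch below is plain Python. It is not PSyKI code and not the KILL algorithm itself. The rule, the feature name, and the `weight` parameter are assumptions made purely for illustration.

```python
import math


def cross_entropy(probs, target_index):
    """Standard cross-entropy for one sample given class probabilities."""
    return -math.log(max(probs[target_index], 1e-12))


def rule_violation(sample, probs):
    """Hypothetical rule: if petal_length <= 2.5, the class must be index 0.

    Returns the probability mass placed on classes that contradict the rule,
    or 0.0 when the rule does not apply to this sample.
    """
    if sample["petal_length"] <= 2.5:
        return 1.0 - probs[0]
    return 0.0


def penalised_loss(sample, probs, target_index, weight=1.0):
    """Data loss plus a knowledge penalty, so violating the rule costs more."""
    return cross_entropy(probs, target_index) + weight * rule_violation(sample, probs)
```

With this shape of loss, a prediction that agrees with the rule is cheaper than one that contradicts it, which nudges training toward outputs consistent with the knowledge.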

Evaluating Symbolic Knowledge Injection with QoS Metrics

To evaluate the impact of symbolic knowledge injection, PSyKI provides QoS metrics that measure various aspects of the injected models, including memory footprint, data efficiency, latency, and energy consumption.

Memory Footprint

The memory footprint metric measures the total number of operations (FLOPs or MACs) required by the injected model. Comparing this count with that of the uneducated model quantifies the reduction in learning complexity achieved through knowledge injection.
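
As a rough illustration of counting operations, the sketch below tallies the multiply-accumulate operations (MACs) of one forward pass through a fully connected network, given only its layer sizes. It is a simplification written for this article, not PSyKI's metric implementation: biases and activation functions are ignored, and the function names are hypothetical.

```python
def dense_macs(layer_sizes):
    """MACs for one forward pass through a fully connected network.

    layer_sizes lists the number of units per layer, input first,
    e.g. [4, 16, 3]. Biases and activations are not counted.
    """
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def macs_saved(uneducated_sizes, educated_sizes):
    """Positive when the educated network needs fewer MACs than the baseline."""
    return dense_macs(uneducated_sizes) - dense_macs(educated_sizes)


print(dense_macs([4, 16, 3]))  # 4*16 + 16*3 = 112
```

A negative result from `macs_saved` would mean the educated network is the more expensive one, which can happen when an injector adds layers.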

Data Efficiency

The data efficiency metric measures the amount of training data required to achieve a certain level of performance with the injected model. By incorporating prior knowledge, some portions of the training data become unnecessary, resulting in reduced dataset needs.
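
One simple way to picture this, under the assumption that a learning curve (accuracy at several training-set sizes) has already been measured, is to look for the smallest training set that reaches a target accuracy. The helper below is a hypothetical sketch and not the formula PSyKI uses.

```python
def min_samples_for_accuracy(learning_curve, target):
    """Smallest training-set size whose accuracy reaches the target.

    learning_curve is a list of (n_samples, accuracy) pairs measured
    beforehand. Returns None when no point reaches the target.
    """
    reached = [n_samples for n_samples, accuracy in learning_curve if accuracy >= target]
    return min(reached) if reached else None
```

Calling it on the learning curves of the uneducated and educated predictors with the same target gives two sample counts, and their difference is one possible reading of how much data the injected knowledge saves.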

Latency

The latency metric measures the average time required to draw a prediction from the injected model. Knowledge injection can remove unnecessary computations, leading to faster inference times.
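
Average prediction time can be approximated with a wall-clock timer around repeated calls. The sketch below uses only the Python standard library. `predict` stands for any callable that maps one input to one prediction, and both the function name and the `repeats` parameter are assumptions of this example, not part of PSyKI.

```python
import time


def average_latency(predict, inputs, repeats=10):
    """Average wall-clock seconds per predict(x) call over all inputs."""
    if not inputs or repeats < 1:
        raise ValueError("need at least one input and one repeat")
    start = time.perf_counter()
    for _ in range(repeats):
        for x in inputs:
            predict(x)
    elapsed = time.perf_counter() - start
    return elapsed / (repeats * len(inputs))
```

Measuring the uneducated and educated predictors on the same inputs and the same machine keeps the comparison fair, since absolute timings depend heavily on hardware and load.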

Energy Consumption

The energy consumption metric measures the average energy required to train and run the injected model. By reducing the learning complexity and unnecessary computations, knowledge injection can lead to energy savings during both training and inference.
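
Energy is power integrated over time, so if power readings are available from an external meter or profiling tool, the energy of a training or inference run can be estimated by numerical integration. The sketch below assumes such readings already exist as two parallel lists. It does not read any hardware counter and is not how PSyKI obtains its measurements.

```python
def energy_joules(timestamps, power_watts):
    """Trapezoidal estimate of energy (J) from power samples (W) over time (s)."""
    if len(timestamps) != len(power_watts):
        raise ValueError("timestamps and power_watts must have the same length")
    total = 0.0
    for i in range(1, len(timestamps)):
        dt = timestamps[i] - timestamps[i - 1]
        total += 0.5 * (power_watts[i] + power_watts[i - 1]) * dt
    return total


print(energy_joules([0.0, 1.0, 2.0], [10.0, 10.0, 10.0]))  # constant 10 W for 2 s = 20.0 J
```

Dividing the result by the number of predictions made during the measured window gives an average energy per inference.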

Example Usage of PSyKI

Let’s see an example of how to inject symbolic knowledge into a neural network with PSyKI. The resulting educated model is the one whose QoS metrics would then be compared against the uneducated baseline:

```python
import pandas as pd
from tensorflow.keras.models import Model
from psyki.logic import Theory
from psyki.ski import Injector


# Create and train the uneducated predictor
def create_uneducated_predictor() -> Model:
    ...


dataset = pd.read_csv("path_to_dataset.csv")  # load dataset
knowledge_file = "path_to_knowledge_file.pl"  # path to the knowledge file
theory = Theory(knowledge_file, dataset)  # create a theory
uneducated = create_uneducated_predictor()  # create an uneducated NN

# Create an injector and inject knowledge into the uneducated predictor
injector = Injector.kins(uneducated)  # create an injector
educated = injector.inject(theory)  # inject knowledge into the NN

# Now you can use the educated predictor for inference
# (test_data is a placeholder for held-out inputs, not defined in this snippet)
predictions = educated.predict(test_data)
```

In the example above, we load a dataset and knowledge file, create an uneducated neural network, and then use an injector from PSyKI to inject knowledge into the uneducated model. Finally, we use the educated model for making predictions.

Conclusion

Symbolic knowledge injection techniques provide a powerful approach to enhance machine learning models by incorporating prior knowledge. The PSyKI library offers SKI algorithms and QoS metrics to evaluate the impact of knowledge injection on memory footprint, data efficiency, latency, and energy consumption. By leveraging symbolic knowledge, machine learning models can achieve improved sustainability, accuracy, and interpretability. With PSyKI, developers and researchers can explore the potential of knowledge injection and create more efficient and interpretable machine learning models.

References:

Matteo Magnini, Giovanni Ciatto, Andrea Omicini. “On the Design of PSyKI: A Platform for Symbolic Knowledge Injection into Sub-Symbolic Predictors”, in: Proceedings of the 4th International Workshop on EXplainable and TRAnsparent AI and Multi-Agent Systems, 2022.
Andrea Agiollo, Andrea Rafanelli, Matteo Magnini, Giovanni Ciatto, Andrea Omicini. “Symbolic knowledge injection meets intelligent agents: QoS metrics and experiments”, in: Autonomous Agents and Multi-Agent Systems, 2023.
