Merged
1 change: 1 addition & 0 deletions .coveragerc
@@ -1,6 +1,7 @@
[run]
omit =
tests/*
source=.

[report]
exclude_lines =
2 changes: 1 addition & 1 deletion .github/workflows/test.yml
@@ -73,5 +73,5 @@ jobs:
- name: Testing and coverage report
run: |
pip install coverage
coverage run -m pytest -p no:warnings -x
coverage run -m pytest -p no:warnings -x --durations=0
coverage report -m
42 changes: 29 additions & 13 deletions README.md
@@ -133,17 +133,34 @@ Fore more please refer to the [examples](https://github.com/dice-group/Ontolearn

<details><summary> Click me! </summary>

Load an RDF knowledge graph
The webservice exposes a lightweight HTTP/JSON API for running Ontolearn learners remotely.
Start it with either a local knowledge base or a remote triplestore.
Submit learning problems as JSON to the `/cel` endpoint
(e.g., `POST http://<host>:8000/cel` with `pos`, `neg`, `model`,
and optional model-specific parameters such as `path_embeddings` and `max_runtime`).
The service returns the learned OWL class expressions (in DL and SPARQL/OWL serializations)
together with performance metrics in the JSON response.
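
As a minimal sketch of the request described above, the snippet below only assembles a `/cel` request with Python's standard library and does not send it. The helper name `build_cel_request`, the `localhost` host, and the example IRIs are illustrative assumptions, not part of Ontolearn; the HTTP method is left as a parameter because the examples in this README call the endpoint with `requests.get`.

```python
import json
import urllib.request


def build_cel_request(host, pos, neg, model, method="POST", **model_params):
    """Assemble (but do not send) a JSON request for the /cel endpoint.

    `build_cel_request` is a hypothetical helper used for illustration only.
    """
    payload = {"pos": list(pos), "neg": list(neg), "model": model, **model_params}
    return urllib.request.Request(
        url=f"http://{host}:8000/cel",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


request = build_cel_request(
    host="localhost",                    # assumed host of a running webservice
    pos=["http://example.org/pos_1"],    # hypothetical positive example IRI
    neg=["http://example.org/neg_1"],    # hypothetical negative example IRI
    model="TDL",
    max_runtime=10,                      # optional model-specific parameter
)

# Sending is left to the caller once a webservice is actually running, e.g.:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Keeping the payload construction separate from the network call makes it easy to inspect the JSON body before submitting it.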

### Local Dataset

```shell
ontolearn-webservice --path_knowledge_base KGs/Mutagenesis/mutagenesis.owl
```
or launch a triplestore server and load Mutagenesis there.
Some leads to launch the triplestore server:
- https://docs.tentris.io/binary/load.html
- https://ontolearn-docs-dice-group.netlify.app/usage/04_knowledge_base#loading-and-launching-a-triplestore

### Remote Dataset

```shell
ontolearn-webservice --endpoint_triple_store <your_triples_store_sparql_endpoint>
```

Some pointers for hosting your own triplestore endpoint:
- https://docs.tentris.io/binary/load.html
- https://ontolearn-docs-dice-group.netlify.app/usage/04_knowledge_base#loading-and-launching-a-triplestore

### Using the Webservice

#### DRILL

The code below trains DRILL on 6 randomly generated learning problems,
provided that **path_to_pretrained_drill** does not point to a directory containing a pretrained DRILL.
The trained DRILL is then saved in the directory **path_to_pretrained_drill**.
@@ -169,6 +186,9 @@ for str_target_concept, examples in learning_problems.items():
})
print(response.json()) # {'Prediction': '∀ hasAtom.(¬Nitrogen-34)', 'F1': 0.7283582089552239, 'saved_prediction': 'Predictions.owl'}
```

#### TDL

TDL (a more scalable learner) can also be used as follows:
```python
import json
@@ -180,6 +200,9 @@ response = requests.get('http://0.0.0.0:8000/cel',
"model": "TDL"})
print(response.json())
```

#### NCES

NCES is another scalable learner. The following first trains NCES if the provided path `path_to_pretrained_nces` does not exist:
```python
import json
@@ -250,15 +273,8 @@ To compute the test performance, we compute F1-score of H w.r.t. test positive a
python examples/concept_learning_cv_evaluation.py --kb ./KGs/Family/family-benchmark_rich_background.owl --lps ./LPs/Family/lps_difficult.json --path_of_nces_embeddings ./NCESData/family/embeddings/ConEx_entity_embeddings.csv --path_of_clip_embeddings ./CLIPData/family/embeddings/ConEx_entity_embeddings.csv --max_runtime 60 --report family_results.csv
```

```shell
# To download learning problems and benchmark with selected learners on the Family benchmark dataset with benchmark learning problems.
python examples/concept_learning_cv_evaluation.py --kb ./KGs/Family/family-benchmark_rich_background.owl --lps ./LPs/Family/lps_difficult.json --learner_types ocel drill tdl nces --path_of_nces_embeddings ./NCESData/family/embeddings/ConEx_entity_embeddings.csv --path_of_clip_embeddings ./CLIPData/family/embeddings/ConEx_entity_embeddings.csv --max_runtime 60 --report family_results.csv
```
You can also select specific learners with the `--learner_types` flag, followed by the learners' short names separated by spaces, e.g., `--learner_types ocel drill tdl nces`.

```shell
# To download learning problems and benchmark with a single learner on the Family benchmark dataset with benchmark learning problems.
python examples/concept_learning_cv_evaluation.py --kb ./KGs/Family/family-benchmark_rich_background.owl --lps ./LPs/Family/lps_difficult.json --learner_types nces --path_of_nces_embeddings ./NCESData/family/embeddings/ConEx_entity_embeddings.csv --path_of_clip_embeddings ./CLIPData/family/embeddings/ConEx_entity_embeddings.csv --max_runtime 60 --report family_results.csv
```
The following Python script summarizes the results and generates the markdown displayed below.
```python
import pandas as pd
5 changes: 3 additions & 2 deletions examples/concept_learning_neural_evaluation.py
@@ -26,7 +26,8 @@
import numpy as np
from ontolearn.utils.static_funcs import compute_f1_score
from ontolearn.triple_store import TripleStore
from ontolearn.owl_neural_reasoner import TripleStoreNeuralReasoner
from owlapy.owl_reasoner import EBR
from owlapy.owl_ontology import NeuralOntology
from owlapy import owl_expression_to_dl

pd.set_option("display.precision", 5)
@@ -40,7 +41,7 @@ def dl_concept_learning(args):
drill_with_symbolic_retriever = Drill(knowledge_base=kb, path_embeddings=args.path_drill_embeddings,
quality_func=F1(), max_runtime=args.max_runtime,verbose=0)

neural_kb = TripleStore(reasoner=TripleStoreNeuralReasoner(path_neural_embedding=args.kge))
neural_kb = TripleStore(reasoner=EBR(NeuralOntology(path_neural_embedding=args.kge)))

drill_with_neural_retriever = Drill(knowledge_base=neural_kb,
path_embeddings=args.path_drill_embeddings,
54 changes: 41 additions & 13 deletions ontolearn/learners/__init__.py
@@ -29,25 +29,53 @@
This module provides various concept learning algorithms for ontology engineering and OWL class expression learning.

Available Learners:

Refinement-Based Learners:
- CELOE: Class Expression Learning for Ontology Engineering
- OCEL: A limited version of CELOE

Neural/Hybrid Learners:
- Drill: Neuro-Symbolic Class Expression Learning
- TDL: Tree-based Description Logic Learner

- CELOE: A refinement-operator based learner (originating from DL-Learner).
It performs heuristic-guided search over class expression refinements to
find compact OWL class expressions that fit positive/negative examples.
Suitable when symbolic search with ontological reasoning is required.
- OCEL: A lightweight / constrained variant of CELOE. It uses a smaller set
of refinements or simplified search heuristics to trade expressivity for
speed and lower computational cost.

Neural / Hybrid Learners:
- Drill: A neuro-symbolic learner that combines neural scoring or guidance
with symbolic refinement/search. It typically uses learned models to rank
candidates while keeping final outputs in an interpretable DL form.
- CLIP: A hybrid approach that leverages pretrained embeddings to assist
candidate generation or scoring (e.g., using semantic similarity signals).
Useful when distributional signals complement logical reasoning.
- NCES, NCES2: Neural concept-expression search variants. These rely on
neural encoders or learned scorers to propose and rank candidate
class expressions; NCES2 represents an improved/iterated version.
- NERO: A neural embedding model that learns permutation-invariant
embeddings for sets of examples tailored towards predicting F1
scores of pre-selected description logic concepts.
- ROCES: A hybrid/refinement-based approach that combines ranking,
coverage estimation, and refinement operators to discover candidate
expressions efficiently. Extension of NCES2.

Evolutionary:
- EvoLearner: Evolutionary search-based learner that evolves candidate
descriptions (e.g., via genetic operators) using fitness functions
derived from coverage and other objectives.

Query-Based Learners:
- SPARQLQueryLearner: Learning SPARQL queries from DL concepts

Experimental:
- NERO: Neural Evolutionary Reinforcement Ontology learner (experimental)
- SPARQLQueryLearner: Learns query patterns expressed as SPARQL queries
that capture the target concept. Useful when working directly with
SPARQL endpoints or large RDF datasets where query-based retrieval is
preferable to reasoning-heavy symbolic search.

Tree / Rule-Based Learners:
- TDL: Tree-based Description Logic Learner. Adapts decision-tree style
induction to construct DL class expressions from attribute-like splits
or tests, producing interpretable, rule-like descriptions.

Example:
>>> from ontolearn.learners import CELOE, Drill
>>> from ontolearn.knowledge_base import KnowledgeBase
>>>
>>>
>>> kb = KnowledgeBase(path="example.owl")
>>> model = CELOE(knowledge_base=kb)
>>> model.fit(pos_examples, neg_examples)
5 changes: 3 additions & 2 deletions ontolearn/learners/base.py
@@ -46,6 +46,7 @@
from owlapy.render import DLSyntaxObjectRenderer
from ontolearn.abstracts import BaseRefinement, AbstractScorer, AbstractHeuristic, \
AbstractConceptNode, AbstractLearningProblem, AbstractKnowledgeBase
from ontolearn.triple_store import TripleStoreOntology

_N = TypeVar('_N', bound=AbstractConceptNode) #:
_X = TypeVar('_X', bound=AbstractLearningProblem) #:
@@ -343,9 +344,9 @@ def save_best_hypothesis(self, n: int = 10, path: str = './Predictions', rdf_for
if rdf_format != 'rdfxml':
raise NotImplementedError(f'Format {rdf_format} not implemented.')

assert isinstance(self.kb, KnowledgeBase)
assert isinstance(self.kb, AbstractKnowledgeBase)

if isinstance(self.kb.ontology, Ontology):
if isinstance(self.kb.ontology, Ontology) or isinstance(self.kb.ontology, TripleStoreOntology):
ontology = Ontology(IRI.create(NS), load=False)
elif isinstance(self.kb.ontology, SyncOntology):
ontology = SyncOntology(IRI.create(NS), load=False)
6 changes: 3 additions & 3 deletions ontolearn/learners/celoe.py
@@ -37,7 +37,7 @@
from contextlib import contextmanager
from sortedcontainers import SortedSet
from owlapy.utils import OrderedOWLObject
from owlapy.utils import EvaluatedDescriptionSet, ConceptOperandSorter, OperandSetTransform
from owlapy.utils import EvaluatedDescriptionSet, ConceptOperandSorter, CESimplifier
import time
from itertools import islice
from owlapy.render import DLSyntaxObjectRenderer
@@ -262,7 +262,7 @@ def _add_node(self, ref: OENode, tree_parent: Optional[TreeNode[OENode]]):
# ignoring refinement, it has been refined from another parent
return False

norm_concept = OperandSetTransform().simplify(ref.concept)
norm_concept = CESimplifier().simplify(ref.concept)
if norm_concept in self._seen_norm_concepts:
norm_seen = True
else:
@@ -288,7 +288,7 @@ def _add_node(self, ref: OENode, tree_parent: Optional[TreeNode[OENode]]):
return True

def _add_node_evald(self, ref: OENode, eval_: EvaluatedConcept, tree_parent: Optional[TreeNode[OENode]]): # pragma: no cover
norm_concept = OperandSetTransform().simplify(ref.concept)
norm_concept = CESimplifier().simplify(ref.concept)
if norm_concept in self._seen_norm_concepts:
norm_seen = True
else:
15 changes: 10 additions & 5 deletions ontolearn/learners/nces.py
@@ -100,7 +100,7 @@ def _set_prerequisites(self):
del dicee
except Exception:
print('\x1b[0;30;43m dicee is not installed, will first install it...\x1b[0m\n')
subprocess.run('pip install dicee==0.1.4')
subprocess.run('pip install dicee==0.2.0')
if self.auto_train:
print("\n"+"\x1b[0;30;43m"+"Embeddings not found. Will quickly train embeddings beforehand. "
+"Poor performance is expected as we will also train the synthesizer for a few epochs."
@@ -111,16 +111,21 @@ def _set_prerequisites(self):
try:
path_temp_embeddings = self.path_temp_embeddings if self.path_temp_embeddings and isinstance(
self.path_temp_embeddings, str) else "temp_embeddings"
path_temp_embeddings = os.path.abspath(path_temp_embeddings)
if not os.path.exists(path_temp_embeddings):
os.makedirs(path_temp_embeddings)
path_temp_triples = "temp_embeddings/abox.nt"
if os.path.exists(path_temp_triples):
os.remove(path_temp_triples)
# Use a separate directory for triples to avoid deletion by dicee
temp_triples_dir = os.path.abspath("temp_triples")
if not os.path.exists(temp_triples_dir):
os.makedirs(temp_triples_dir)
path_temp_triples = os.path.join(temp_triples_dir, "abox.nt")

with open(path_temp_triples, "a") as f:
with open(path_temp_triples, "w") as f:
for s, p, o in self.knowledge_base.abox():
f.write(f"<{s.str}> <{p.str}> <{o.str}> .\n")

assert os.path.exists(path_temp_triples), "Triples file not found"

self.knowledge_base_path = path_temp_triples

subprocess.run(f"dicee --path_single_kg {self.knowledge_base_path} "
9 changes: 4 additions & 5 deletions ontolearn/learners/nces2.py
@@ -65,13 +65,12 @@ def __init__(self, knowledge_base, nces2_or_roces=True,
drop_prob, num_heads, num_seeds, m, ln, learning_rate, tmax, eta_min, clip_value, batch_size,
num_workers, max_length, load_pretrained, verbose)

temp_triples_dir = "temp_embeddings"
# Use a separate directory for triples to avoid deletion
temp_triples_dir = os.path.abspath("temp_triples")
if not os.path.exists(temp_triples_dir):
os.makedirs(temp_triples_dir)
path_temp_triples = "temp_embeddings/abox.nt"
if os.path.exists(path_temp_triples):
os.remove(path_temp_triples)
with open(path_temp_triples, "a") as f:
path_temp_triples = os.path.join(temp_triples_dir, "abox.nt")
with open(path_temp_triples, "w") as f:
for s, p, o in self.knowledge_base.abox():
f.write(f"<{s.str}> <{p.str}> <{o.str}> .\n")

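
Both `nces.py` and `nces2.py` now write the ABox to `temp_triples/abox.nt` using the same steps. A shared helper along the following lines could avoid the duplication. This is only a sketch: the name `write_abox_triples` and its plain-string triple input are assumptions rather than code from this PR, which iterates `self.knowledge_base.abox()` and reads `.str` from each term.

```python
import os
import tempfile


def write_abox_triples(triples, directory="temp_triples", filename="abox.nt"):
    """Write (subject, predicate, object) IRI strings as N-Triples lines.

    Hypothetical helper: returns the absolute path of the written file.
    """
    directory = os.path.abspath(directory)
    os.makedirs(directory, exist_ok=True)  # no separate os.path.exists check needed
    path = os.path.join(directory, filename)
    # "w" truncates any previous file, so repeated calls never accumulate triples.
    with open(path, "w") as f:
        for s, p, o in triples:
            f.write(f"<{s}> <{p}> <{o}> .\n")
    return path


# Demonstration in a throwaway directory with one made-up triple.
with tempfile.TemporaryDirectory() as tmp:
    triples = [("http://example.org/a", "http://example.org/knows", "http://example.org/b")]
    path = write_abox_triples(triples, directory=tmp)
    write_abox_triples(triples, directory=tmp)  # second call overwrites the file
    with open(path) as f:
        lines = f.readlines()
```

Opening the file in `"w"` mode is what makes a repeated call safe: the second call replaces the file instead of appending duplicate triples, which is why the earlier `os.remove` step is no longer needed.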
117 changes: 0 additions & 117 deletions ontolearn/scripts/litserve_neural_reasoner.py

This file was deleted.
