Commit ddb0d92

Merge pull request #8 from pietercolpaert/main
Feedback before CR
2 parents 22654e6 + fa66342 commit ddb0d92

9 files changed

+38
-39
lines changed

code/Q1.ttl

Lines changed: 4 additions & 4 deletions
@@ -1,12 +1,12 @@
 PREFIX sosa: <http://www.w3.org/ns/sosa/>
 PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
 PREFIX wgs: <http://www.w3.org/2003/01/geo/wgs84_pos#>
-PREFIX etsi: <https://saref.etsi.org/core/>
+PREFIX saref: <https://saref.etsi.org/core/>

 SELECT * WHERE {
-?s etsi:hasTimestamp ?t.
-?s etsi:hasValue ?result.
-?s etsi:measurementMadeBy ?sensor.
+?s saref:hasTimestamp ?t.
+?s saref:hasValue ?result.
+?s saref:measurementMadeBy ?sensor.
 ?sensor <https://dahcc.idlab.ugent.be/Ontology/Sensors/analyseStateOf> ?stateOf.
 ?sensor <https://saref.etsi.org/core/measuresProperty> <https://dahcc.idlab.ugent.be/Homelab/SensorsAndActuators/energy.consumption>

code/Q2.ttl

Lines changed: 4 additions & 4 deletions
@@ -1,12 +1,12 @@
 PREFIX sosa: <http://www.w3.org/ns/sosa/>
 PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
 PREFIX wgs: <http://www.w3.org/2003/01/geo/wgs84_pos#>
-PREFIX etsi: <https://saref.etsi.org/core/>
+PREFIX saref: <https://saref.etsi.org/core/>

 SELECT * WHERE {
-?s etsi:hasTimestamp ?t.
-?s etsi:hasValue ?result.
-?s etsi:measurementMadeBy ?sensor.
+?s saref:hasTimestamp ?t.
+?s saref:hasValue ?result.
+?s saref:measurementMadeBy ?sensor.
 ?sensor <https://dahcc.idlab.ugent.be/Ontology/Sensors/analyseStateOf> ?stateOf.
 ?sensor <https://saref.etsi.org/core/measuresProperty> <https://dahcc.idlab.ugent.be/Homelab/SensorsAndActuators/environment.light>

code/Q3.ttl

Lines changed: 4 additions & 4 deletions
@@ -1,12 +1,12 @@
 PREFIX sosa: <http://www.w3.org/ns/sosa/>
 PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
 PREFIX wgs: <http://www.w3.org/2003/01/geo/wgs84_pos#>
-PREFIX etsi: <https://saref.etsi.org/core/>
+PREFIX saref: <https://saref.etsi.org/core/>

 SELECT * WHERE {
-?s etsi:hasTimestamp ?t.
-?s etsi:hasValue ?result.
-?s etsi:measurementMadeBy ?sensor.
+?s saref:hasTimestamp ?t.
+?s saref:hasValue ?result.
+?s saref:measurementMadeBy ?sensor.
 ?sensor <https://dahcc.idlab.ugent.be/Ontology/Sensors/analyseStateOf> ?stateOf.
 ?sensor <https://saref.etsi.org/core/measuresProperty> <https://dahcc.idlab.ugent.be/Homelab/SensorsAndActuators/energy.consumption>

code/Q4.ttl

Lines changed: 4 additions & 4 deletions
@@ -1,12 +1,12 @@
 PREFIX sosa: <http://www.w3.org/ns/sosa/>
 PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
 PREFIX wgs: <http://www.w3.org/2003/01/geo/wgs84_pos#>
-PREFIX etsi: <https://saref.etsi.org/core/>
+PREFIX saref: <https://saref.etsi.org/core/>

 SELECT * WHERE {
-?s etsi:hasTimestamp ?t.
-?s etsi:hasValue ?result.
-?s etsi:measurementMadeBy ?sensor.
+?s saref:hasTimestamp ?t.
+?s saref:hasValue ?result.
+?s saref:measurementMadeBy ?sensor.
 ?sensor <https://dahcc.idlab.ugent.be/Ontology/Sensors/analyseStateOf> ?stateOf.
 ?sensor <https://saref.etsi.org/core/measuresProperty> <https://dahcc.idlab.ugent.be/Homelab/SensorsAndActuators/energy.consumption>

code/example_sparql_query.ttl

Lines changed: 4 additions & 5 deletions
@@ -1,12 +1,11 @@
 PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
-PREFIX etsi: <https://saref.etsi.org/core/>
-PREFIX dahcc: <https://dahcc.idlab.ugent.be/Ontology/Sensors/>
 PREFIX saref: <https://saref.etsi.org/core/>
+PREFIX dahcc: <https://dahcc.idlab.ugent.be/Ontology/Sensors/>

 SELECT * WHERE {
-?s etsi:hasTimestamp ?t;
-etsi:hasValue ?result;
-etsi:measurementMadeBy ?sensor.
+?s saref:hasTimestamp ?t;
+saref:hasValue ?result;
+saref:measurementMadeBy ?sensor.
 ?sensor dahcc:analyseStateOf ?stateOf;
 saref:measuresProperty {:property}.
 FILTER(?t="2022-01-03T10:57:54.000000"^^xsd:dateTime)

code/example_tree_relation.ttl

Lines changed: 5 additions & 5 deletions
@@ -1,11 +1,11 @@
 @prefix tree: <https://w3id.org/tree#> .
 @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
 @prefix ex: <https://example.be/> .
-@prefix etsi: <https://saref.etsi.org/core/>
+@prefix saref: <https://saref.etsi.org/core/> .

-ex:node tree:relation [
+<> tree:relation [
 a tree:GreaterThanOrEqualToRelation ;
-tree:node ex:nextNode ;
-tree:value "2022-01-03T09:47:59.000000"^^xsd:dateTime ;
-tree:path etsi:hasTimestamp
+tree:node <nextNode> ;
+tree:value "2022-01-03T09:47:59"^^xsd:dateTime ;
+tree:path saref:hasTimestamp
 ] .

section/conclusion.tex

Lines changed: 2 additions & 2 deletions
@@ -1,9 +1,9 @@
 \section{Conclusion}

-The publication of linked data in SPARQL endpoints is not always a sustainable approach due to unavailability and cost problems.
-Our work is centered around decentralized alternatives for linked data publication.
+This paper reported on preliminary tests to add guided link traversal support into the Comunica querying engine using a rule-based reachability approach.
 Our preliminary results show that our rule-based reachability criterion can significantly reduce the execution time of queries aligned with hypermedia description constraints compared to predicate-based reachability
 opening the possibility for faster and more versatile traversal-based query execution over fragmented RDF documents.
 Our experiment also highlights that the size of the internal data store might have more impact on performance than noted in previous studies.
 In future work, we will perform more exhaustive evaluations of other types of domain-oriented fragmentation strategies such as string evaluation and geospatial,
 and investigate how to generalize our approach to support more expressive online reasoning for online source selection during traversal queries.
+Furthermore, we also showed there is still room for optimization by researching ways for pruning useless quads from the internal quadstore as the link traversal is happening.

section/guided_link_traversal.tex

Lines changed: 7 additions & 7 deletions
@@ -21,7 +21,7 @@ \section{A Rule-Based Reachability Criterion}

 We define our approach as a rule-based reachability criterion.
 Our approach builds upon the concept of structural assumptions~\cite{taelman2023} to exploit the structural properties of TREE annotated datasets.
-Concretely, we interpret the hypermedia descriptions of constraints in TREE fragments as boolean expressions $E$ ($?t>= \text{2022-01-03T09:47:59.000000}$ in Figure~\ref{lst:system}).
+We therefore interpret the hypermedia descriptions of constraints in TREE fragments as boolean expressions $E$ ($?t>= \text{2022-01-03T09:47:59.000000}$ in Figure~\ref{lst:system}).
 Upon discovery of a document, the query engine gathers the relevant triples to form the boolean expression of the constraint on the data of reachable fragments.
 After the parsing of the expression, the filter expression $F$ of the SPARQL query is \textit{pushed down} into the engine's source selection component.
 The source selection component can be formalized as a reachability criterion~\sepfootnote{sf:reachabilityCriterion}
@@ -36,16 +36,16 @@ \section{A Rule-Based Reachability Criterion}
 \end{equation}
 hold true given $x$ is the variable targeted by $E_i$ and $i$ is the link towards the next fragment (\texttt{ex:nextNode} from \texttt{ex:node tree:node ex:nextNode} in Figure~\ref{lst:system}).
 A variable targetted by $E$ is defined by an RDF object where the predicate as a value \texttt{?target} from the triple
-defining the fragmentation path in the form \texttt{?s tree:path ?target} (\texttt{etsi:hasTimestamp} in Figure~\ref{lst:system}).
+defining the fragmentation path in the form \texttt{?s tree:path ?target} (\texttt{saref:hasTimestamp} in Figure~\ref{lst:system}).
 Upon satisfaction the IRI targeting the next fragment is added to the link queue otherwise the IRI is pruned.
 The process is schematized in Figure~\ref{fig:process}.

 \begin{figure}[htbp]
 \centering
 \includegraphics[width=\linewidth]{image/running_example.drawio.pdf}
 \caption{A schematization of our rule-based reachability criteria with a TREE document.
-First a TREE node is dereferenced, then the TREE relations are transformed into boolean expressions $E$,
-followed by the construction of $F$ from the filter expression related to the path of $E$ (the variable $t$ related to \texttt{sosa:resultTime}),
+First a TREE node is dereferenced, then the TREE relations are transformed into boolean expressions $E$,
+followed by the construction of $F$ from the filter expression related to the path of $E$ (the variable $t$ related to \texttt{saref:hasTimestamp}),
 then the satisfiability $E \land F$ is determined and finally links to non-query relevant data are pruned.}
 \label{fig:process}
 \end{figure}
@@ -92,6 +92,6 @@ \subsection{Preliminary Results}
 With Q3 we see that the percentage of reduction is 33\%, this lowering of performance gain might be caused by the increase by a factor of 6 in HTTP requests.
 This raises an interesting observation because we do not observe a reduction in execution time with a reduction in HTTP requests.
 Previous research has proposed that inefficient query plans might be the bottleneck of some queries in structured environments~\cite{taelman2023,eschauzier_quweda_2023}.
-However, our results seem to show that the size of the internal data source might have a bigger impact on performance than noted in previous studies.
-This observation might have significant consequences because large-scale web querying might result in the acquisition of a large number of triples.
-The query Q4 was not able to be answered, with any setup, because the query requires a larger number of fragments than the other to be processed.
+However, our results seem to show that the size of the internal quad store might have a bigger impact on performance than noted in previous studies.
+As large-scale guided link traversal over the web will result in the acquisition of a large number of triples, a future interesting research direction would be to find ways to also remove quads that are certain to not lead to a query result anymore from the internal quad store.
+The query Q4 was not able to be answered, with any setup, because the query requires a larger number of fragments than the other to be processed.

section/introduction.tex

Lines changed: 4 additions & 4 deletions
@@ -14,8 +14,8 @@ \section{Introduction}
 For example, in the case of periodic measurements of sensor data, a fragmentation can be made on the publication date of each data entity.
 A fragment can be considered an RDF document published in a server.
 TREE aims to describes dataset fragmentation in ways that enable clients to easily fetch query-relevant subsets.
-The data inside a fragment are bounded with constraints expressed using hypermedia descriptions~\cite{thomasFieldingPhdThesis}.
-More precisely, each fragment describes the constraints of the data of every reachable fragment.
+The data within a fragment are bound by constraints expressed through hypermedia descriptions~\cite{thomasFieldingPhdThesis}.
+Each fragment contains relations to other pages, and those relations contain the constraints of the data of every reachable fragment.
 In this paper, we refer to those constraints as domain-specific expressions.
 They can be expressions such as $?t > \text{2022-01-09T00:00:00.000000} \implies \text{ex:afterFirstSeptember}$
 given that $?t$ is the date of publication of sensor data and the implication pertains to the location of the data respecting the constraint.
@@ -36,7 +36,7 @@ \section{Introduction}
 to define a mechanism of traversal centered around rules.

 In this paper, we propose to use a boolean solver as the main link pruning mechanism for a reachability criterion to traverse TREE documents.
-The logical operators are defined by the \href{https://treecg.github.io/specification/}{TREE specification}.~\sepfootnote{sf:treeSpec}
+The logical operators are defined by the \href{https://w3id.org/tree/specification/}{TREE specification}.~\sepfootnote{sf:treeSpec}
 As a concrete use case, we consider the publication of (historical) sensor data.
 An example query is presented in Figure~\ref{lst:system} along with the triples representing the link between two documents expressed using the TREE specification.

@@ -55,4 +55,4 @@ \section{Introduction}
 The constraint describes publication times ($?t$) where $?t>= \text{2022-01-03T09:47:59.000000}$.}
 \label{lst:system}
 \vspace*{-0.90cm}
-\end{figure}
+\end{figure}
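
The pruning step described in section/guided_link_traversal.tex (a link is kept only when $E \land F$ is satisfiable) can be illustrated with a minimal Python sketch. It assumes a single variable compared against `xsd:dateTime` values and treats each constraint as an interval. The names `to_interval`, `_tighter` and `satisfiable` are hypothetical and are not part of Comunica or the TREE specification. The two timestamps are taken from code/example_tree_relation.ttl and code/example_sparql_query.ttl.

```python
from datetime import datetime
from typing import Optional, Tuple

# A bound is (value, closed); value None means unbounded on that side.
Bound = Tuple[Optional[datetime], bool]
Constraint = Tuple[str, datetime]  # hypothetical encoding of `?t <op> value`


def to_interval(op: str, value: datetime) -> Tuple[Bound, Bound]:
    """Translate `?t <op> value` into a (lower, upper) pair of bounds."""
    if op == ">=":
        return (value, True), (None, False)
    if op == ">":
        return (value, False), (None, False)
    if op == "<=":
        return (None, False), (value, True)
    if op == "<":
        return (None, False), (value, False)
    if op == "=":
        return (value, True), (value, True)
    raise ValueError(f"unsupported operator: {op}")


def _tighter(a: Bound, b: Bound, lower: bool) -> Bound:
    """Return the more restrictive of two lower (or two upper) bounds."""
    if a[0] is None:
        return b
    if b[0] is None:
        return a
    if a[0] == b[0]:
        # Same value: the bound is closed only if both are closed.
        return (a[0], a[1] and b[1])
    if lower:
        return a if a[0] > b[0] else b
    return a if a[0] < b[0] else b


def satisfiable(e: Constraint, f: Constraint) -> bool:
    """True when some ?t satisfies both E (relation) and F (query filter)."""
    (lo_e, hi_e), (lo_f, hi_f) = to_interval(*e), to_interval(*f)
    lo = _tighter(lo_e, lo_f, lower=True)
    hi = _tighter(hi_e, hi_f, lower=False)
    if lo[0] is None or hi[0] is None:
        return True  # unbounded on one side, never empty
    if lo[0] < hi[0]:
        return True
    return lo[0] == hi[0] and lo[1] and hi[1]


# E: the tree:GreaterThanOrEqualToRelation from the example relation.
relation = (">=", datetime.fromisoformat("2022-01-03T09:47:59"))
# F: the FILTER of the example query.
query_filter = ("=", datetime.fromisoformat("2022-01-03T10:57:54.000000"))
follow_link = satisfiable(relation, query_filter)
```

With these inputs the filter timestamp lies after the relation's lower bound, so the link would be added to the link queue; a filter on an earlier timestamp would lead to the link being pruned. A real engine would also need to handle compound filters and other datatypes, which this sketch leaves out.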
