@@ -41,8 +41,7 @@ def rwr(
4141 Parameters
4242 ----------
4343 G : GraphV2
44- The input graph on which the Random Walk with Restart (RWR) will be
45- performed.
44+ The input graph to be sampled.
4645 graph_name : str
4746 The name of the new graph that is stored in the graph catalog.
4847 start_nodes : list of int, optional
@@ -106,55 +105,59 @@ def cnarw(
106105 job_id: Optional[str] = None,
107106 ) -> GraphWithSamplingResult:
108107 """
109- Computes a set of Random Walks with Restart (RWR) for the given graph and stores the result as a new graph in the catalog.
108+ Common Neighbour Aware Random Walk (CNARW) samples the graph by taking random walks from a set of start nodes.
110109
111- This method performs a random walk, beginning from a set of nodes (if provided),
112- where at each step there is a probability to restart back at the original nodes.
113- The result is turned into a new graph induced by the random walks and stored in the catalog.
110+ CNARW is a graph sampling technique that involves optimizing the selection of the next-hop node. It takes into
111+ account the number of common neighbours between the current node and the next-hop candidates. On each step of a
112+ random walk, there is a probability that the walk stops, and a new walk from one of the start nodes starts
113+ instead (i.e. the walk restarts). Each node visited on these walks will be part of the sampled subgraph. The
114+ resulting subgraph is stored as a new graph in the Graph Catalog.
114115
115116 Parameters
116117 ----------
117118 G : GraphV2
118- The input graph on which the Random Walk with Restart (RWR) will be
119- performed.
119+ The input graph to be sampled.
120120 graph_name : str
121- The name of the new graph in the catalog.
121+ The name of the new graph that is stored in the graph catalog.
122122 start_nodes : list of int, optional
123- A list of node IDs to start the random walk from. If not provided, all
124- nodes are used as potential starting points.
123+ IDs of the initial set of nodes in the original graph from which the sampling random walks will start.
124+ By default, a single node is chosen uniformly at random.
125125 restart_probability : float, optional
126- The probability of restarting back to the original node at each step.
127- Should be a value between 0 and 1. If not specified, a default value is used.
126+ The probability that a sampling random walk restarts from one of the start nodes.
127+ Default is 0.1.
128128 sampling_ratio : float, optional
129- The ratio of nodes to sample during the computation. This value should
130- be between 0 and 1. If not specified, no sampling is performed.
129+ The fraction of nodes in the original graph to be sampled.
130+ Default is 0.15.
131131 node_label_stratification : bool, optional
132- If True, the algorithm tries to preserve the label distribution of the original graph in the sampled graph.
132+ If True, preserves the node label distribution of the original graph.
133+ Default is False.
133134 relationship_weight_property : str, optional
134- The name of the property on relationships to use as weights during
135- the random walk. If not specified, the relationships are treated as
136- unweighted.
135+ Name of the relationship property to use as weights. If unspecified, the algorithm runs unweighted.
137136 relationship_types : list of str, optional
138- The relationship types used to select relationships for this algorithm run.
137+ Filter the named graph using the given relationship types. Relationships with any of the given types will be
138+ included.
139139 node_labels : list of str, optional
140- The node labels used to select nodes for this algorithm run.
140+ Filter the named graph using the given node labels. Nodes with any of the given labels will be included.
141141 sudo : bool, optional
142- Override memory estimation limits. Use with caution as this can lead to
143- memory issues if the estimation is significantly wrong.
142+ Bypass heap control. Use with caution.
143+ Default is False.
144144 log_progress : bool, optional
145- If True, logs the progress of the computation.
145+ Turn percentage logging on or off while the procedure runs.
146+ Default is True.
146147 username : str, optional
147- The username to attribute the procedure run to
148+ Use Administrator access to run an algorithm on a graph owned by another user.
149+ Default is None.
148150 concurrency : int, optional
149- The number of concurrent threads used for the algorithm execution.
151+ The number of concurrent threads used for running the algorithm.
152+ Default is 4.
150153 job_id : str, optional
151- An identifier for the job that can be used for monitoring and cancellation
154+ An ID that can be provided to more easily track the algorithm’s progress.
155+ By default, a random job id is generated.
152156
153157 Returns
154158 -------
155159 GraphWithSamplingResult
156- Tuple of the graph object and the result of the Random Walk with Restart (RWR), including the sampled
157- nodes and their scores.
160+ Tuple of the graph object and the result of the Common Neighbour Aware Random Walk (CNARW), including the dimensions of the sampled graph.
158161 """
159162 pass
160163
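The sampling behaviour described in the new `cnarw` docstring can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: the graph is an undirected adjacency dict of sets without self-loops, the helper name `cnarw_sample` is hypothetical, and the acceptance rule `1 - common / min(deg(u), deg(v))` is one common way to penalise next-hop candidates that share many neighbours with the current node. The defaults mirror the documented `sampling_ratio` and `restart_probability` values.

```python
import random


def cnarw_sample(adj, start_nodes, sampling_ratio=0.15,
                 restart_probability=0.1, seed=None):
    """Hypothetical sketch of common-neighbour-aware sampling.

    adj maps each node to the set of its neighbours (undirected, no self-loops).
    Returns the set of visited nodes, i.e. the nodes of the sampled subgraph.
    """
    rng = random.Random(seed)
    target = max(1, round(sampling_ratio * len(adj)))
    current = rng.choice(start_nodes)
    visited = {current}
    # Cap the number of steps so the sketch always terminates, even if the
    # start nodes sit in a component smaller than the target size.
    for _ in range(100 * len(adj)):
        if len(visited) >= target:
            break
        neighbours = adj[current]
        if not neighbours or rng.random() < restart_probability:
            # Restart: abandon this walk and begin again from a start node.
            current = rng.choice(start_nodes)
        else:
            candidate = rng.choice(sorted(neighbours))
            common = len(neighbours & adj[candidate])
            # Assumed acceptance rule: fewer common neighbours means the
            # candidate is more likely to be accepted as the next hop.
            accept = 1.0 - common / min(len(neighbours), len(adj[candidate]))
            if rng.random() < accept:
                current = candidate
        visited.add(current)
    return visited


# Small example graph: a triangle (0, 1, 2) attached to a path 2-3-4-5.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
sample = cnarw_sample(example, start_nodes=[0], sampling_ratio=0.5, seed=7)
```

With `sampling_ratio=0.5` on this six-node graph the walk stops once three distinct nodes have been visited. The real procedure additionally handles relationship weights, label stratification and filtering, none of which are modelled here.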