Commit db782eb

committed
add readme
1 parent 6770809 commit db782eb

File tree

3 files changed

+39
-31
lines changed

.DS_Store

0 Bytes
Binary file not shown.

README.md

Lines changed: 39 additions & 31 deletions
@@ -2,61 +2,69 @@
22

33
**Authors: GEMS Lab Team @ University of Michigan**
44

5-
This SEMB library allows fast onboarding to explore structural embedding of graph data using heterogeneous methods, with a unified API interface and a modular codebase enabling easy integration of 3rd party methods and datasets.
5+
This SEMB library allows fast onboarding to get and evaluate structural node embeddings. With the unified API interface and the modular codebase, the SEMB library enables easy integration of 3rd-party methods and datasets.
66

7-
The library itself has already included a set of popular methods and datasets ready for use immediately.
7+
The library itself has already included a set of popular methods and datasets ready for immediate use.
8+
9+
- Built-in methods: [node2vec](https://github.com/aditya-grover/node2vec), [struc2vec](https://github.com/leoribeiro/struc2vec), [GraphWave](https://github.com/snap-stanford/graphwave), [xNetMF](https://github.com/GemsLab/REGAL), [role2vec](https://github.com/benedekrozemberczki/role2vec), [DRNE](https://github.com/tadpole/DRNE), [MultiLENS](https://github.com/GemsLab/MultiLENS), [RiWalk](https://github.com/maxuewei2/RiWalk), [SEGK](https://github.com/giannisnik/segk)
10+
11+
- Built-in datasets:
12+
13+
| Dataset | # Nodes | # Edges |
14+
| ------------------------------------------------------------ | ------- | ------- |
15+
| [BlogCatalog](http://snap.stanford.edu/node2vec/) | 10,312 | 333,983 |
16+
| [Facebook](http://snap.stanford.edu/data/egonets-Facebook.html) | 4,039 | 88,234 |
17+
| [ICEWS](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/QI2T9A) | 1,255 | 1,414 |
18+
| [PPI](http://snap.stanford.edu/graphsage/) | 56,944 | 818,786 |
19+
| [BR air-traffic](https://github.com/leoribeiro/struc2vec/tree/master/graph) | 131 | 1,038 |
20+
| [EU air-traffic](https://github.com/leoribeiro/struc2vec/tree/master/graph) | 399 | 5,995 |
21+
| [US air-traffic](https://github.com/leoribeiro/struc2vec/tree/master/graph) | 1,190 | 13,599 |
22+
| [DD6](https://ls11-www.cs.tu-dortmund.de/staff/morris/graphkerneldatasets) | 4,152 | 20,640 |
823

924
The library requires *Python 3.7+*.
1025

11-
## Getting started
26+
## Installation and Usage
1227

1328
Make sure you are using *Python 3.7+* for all of the steps below.
1429

15-
### Installation
16-
`python setup.py install` (TODO: Pip support will be added soon)
17-
18-
### Import and load a dataset
19-
```py
20-
from semb.datasets import load, get_dataset_ids
21-
# explore all datasets (both built in and extended by 3rd party)
22-
ids = get_dataset_ids()
23-
# load a dataset
24-
graph = load(ids[0])
25-
```
26-
27-
### Import and load a method
28-
```py
29-
from semb.methods import load, get_method_ids
30-
# explore all methods (both built in and extended by 3rd party)
31-
ids = get_method_ids()
32-
# load a method, returns a constructor for a method's base class
33-
Method = load(ids[0])
34-
# create and run a method.
35-
# NOTE: except for the first "graph" arg, every other argument MUST be in keyword form!
36-
method = Method(graph, a=1, b=2, c=3, ...)
37-
method.train()
38-
embeddings = method.get_embeddings()
39-
```
30+
`python setup.py install`
31+
32+
After installation, we highly recommend going through our [Tutorial](https://github.com/GemsLab/StrucEmbeddingLibrary/blob/master/Tutorial.ipynb) to see how the SEMB library works.
33+
34+
4035

4136
## Extending SEMB
4237

4338
First make sure the `semb` library is installed.
4439

4540
### Developing 3rd party Dataset extension
4641

47-
- Create a Python 3.7+ [package](https://packaging.python.org/tutorials/packaging-projects/) with a name in form of `semb-dataset[$YOUR_CHOSEN_DATASET_ID]`
42+
Currently, SEMB only supports embedding and evaluation on *undirected* and *unweighted* graphs.
43+
44+
- Create a Python 3.7+ [package](https://packaging.python.org/tutorials/packaging-projects/) with a name in the form of `semb/datasets/[$YOUR_CHOSEN_DATASET_ID]`
4845
- Within the package root directory, make sure `__init__.py` is present
4946
- Create a `dataset.py` and define a `Method` class that inherits from `BaseDataset` (`from semb.datasets import BaseDataset`) and implements the required methods. See `semb/datasets/airports/dataset.py` for more details.
47+
- To use the built-in `load_dataset()` method, provide the graph as an edge list in the following format:
48+
- `<Node1_id (int)> <Blank> <Node2_id (int)> <\n>`
49+
- Otherwise, you can override `load_dataset()` with your own implementation. Make sure that the returned graph is a `networkx.classes.graph.Graph` object.
50+
- If the dataset comes with a label file, the built-in `load_label()` function expects each line of that file in the following format:
51+
- `<Node_id (int)> <delimiter> <Node_label (int)>`
52+
- Otherwise, you can override `load_label()` with your own implementation. Make sure that it returns a Python built-in `dict` with `<Node_id (int)>` as the key and `<Node_label (int)>` as the value.
5053
- Install the package via `setup.py` or pip.
5154
- Now the dataset is loadable by the main client program that uses `semb`!
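
The two file formats described above can be sketched in plain Python. The helpers `parse_edgelist` and `parse_labels` are hypothetical and are not part of `semb`; they only illustrate the expected input formats and return types, assuming whitespace-separated edge lists and a caller-supplied label delimiter.

```python
from typing import Dict, List, Optional, Tuple


def parse_edgelist(text: str) -> List[Tuple[int, int]]:
    """Parse lines of the form '<Node1_id> <Node2_id>' into integer pairs."""
    edges = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        u, v = line.split()[:2]
        edges.append((int(u), int(v)))
    return edges


def parse_labels(text: str, delimiter: Optional[str] = None) -> Dict[int, int]:
    """Parse lines of the form '<Node_id><delimiter><Node_label>' into a dict.

    delimiter=None splits on any run of whitespace.
    """
    labels = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        node, label = line.split(delimiter)
        labels[int(node)] = int(label)
    return labels


# Illustrative inputs only; real files would be read from disk.
edges = parse_edgelist("0 1\n1 2\n")
labels = parse_labels("0,1\n1,0\n2,1\n", delimiter=",")
```

With `networkx` installed, the parsed pairs can be turned into the required graph type with `networkx.Graph(edges)`, or the file can be read directly with `networkx.read_edgelist(path, nodetype=int)`.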
5255

5356
### Developing 3rd party Method extension
5457

55-
- Create a Python 3.7+ [package](https://packaging.python.org/tutorials/packaging-projects/) with a name in form of `semb-method[$YOUR_CHOSEN_METHOD_ID]`
58+
- Create a Python 3.7+ [package](https://packaging.python.org/tutorials/packaging-projects/) with a name in form of `semb/methods/[$YOUR_CHOSEN_METHOD_ID]`
5659
- Within the package root directory, make sure `__init__.py` is present
57-
- Create a `dataset.py` and make a `Dataset` class that inherits from `from semb.methods import BaseMethod` and implement the required methods. See `semb/methods/node2vec/method.py` for more details.
60+
- Create a `method.py` and define a `Method` class that inherits from `BaseMethod` (`from semb.methods import BaseMethod`) and implements the required methods. See `semb/methods/node2vec/method.py` for more details.
61+
- Please make sure that your implemented method accepts `networkx.classes.graph.Graph` as input.
62+
- Please make sure that after `train()` is called, `self.embeddings` is a Python built-in `dict` with `<Node_id (int)>` as the key and the embedding, a `<List(float)>`, as the value.
5863
- Install the package via `setup.py` or pip.
5964
- Now the method is loadable by the main client program that uses `semb`!
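
As a rough illustration of the embedding format required above, the following check can be run on `self.embeddings` after `train()`. The function `is_valid_embedding_dict` is a hypothetical helper, not part of `semb`, and it assumes a strict reading of the rule: `int` keys and `list` values whose items are all `float`.

```python
from typing import Any


def is_valid_embedding_dict(embeddings: Any) -> bool:
    """Return True if embeddings maps int node ids to lists of floats."""
    if not isinstance(embeddings, dict):
        return False
    for node_id, vector in embeddings.items():
        # bool is a subclass of int, so reject it explicitly
        if not isinstance(node_id, int) or isinstance(node_id, bool):
            return False
        if not isinstance(vector, list):
            return False
        if not all(isinstance(x, float) for x in vector):
            return False
    return True


good = {0: [0.1, 0.2], 1: [0.3, 0.4]}
bad = {"0": [0.1, 0.2]}  # string key instead of int
```

If a method produces NumPy arrays, calling `.tolist()` on each row converts it to a list of Python floats before it is stored.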
6065

6166
### Note
6267
For both `dataset` and `method` extensions, make sure `get_id()` is overridden and returns the same ID as the one you chose for your package name.
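
A minimal sketch of this rule follows. `BaseDatasetStandIn`, `MyDataset`, the ID `my_dataset`, and `id_matches_package` are all hypothetical names used for illustration; the real `semb` base classes and the exact signature of `get_id()` may differ.

```python
class BaseDatasetStandIn:
    """Hypothetical stand-in for semb's base class; the real interface may differ."""

    def get_id(self) -> str:
        raise NotImplementedError


class MyDataset(BaseDatasetStandIn):
    # Assumed package directory: semb/datasets/my_dataset
    def get_id(self) -> str:
        return "my_dataset"


def id_matches_package(extension, package_dir: str) -> bool:
    """Check that get_id() equals the last component of the package path."""
    return extension.get_id() == package_dir.rstrip("/").split("/")[-1]
```

A check like `id_matches_package(MyDataset(), "semb/datasets/my_dataset")` could be run in a test to catch a mismatch between the ID and the package name.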
68+
69+
70+

semb/.DS_Store

0 Bytes
Binary file not shown.
