Commit 0564cda

[SHARE-698][Task] Replace outdated model diagram with a link to the schema endpoint
1 parent 2ba9c73 commit 0564cda

3 files changed: +16 -233 lines changed

docs/harvesters_and_transformers.rst

Lines changed: 12 additions & 29 deletions
@@ -1,7 +1,7 @@
 .. _harvesters-and-transformers:

 Harvesters and Transformers
-==========================
+===========================

 A `harvester` gathers raw data from a source using their API.

@@ -11,49 +11,31 @@ Start Up
 --------

 1. Install `Docker <https://docs.docker.com/engine/installation/>`_.
-2. Make sure you're using Python3 - install with `conda <http://conda.pydata.org/miniconda.html>`_ , or `homebrew <http://blog.manbolo.com/2013/02/04/how-to-install-python-3-and-pydev-on-osx#2>`_
+2. Make sure you're using Python3 - install with `miniconda <http://conda.pydata.org/miniconda.html>`_ , or `homebrew <http://blog.manbolo.com/2013/02/04/how-to-install-python-3-and-pydev-on-osx#2>`_
 3. Install everything inside a Virtual Enviornment - created with `Conda <http://conda.pydata.org/docs/using/envs.html>`_ or `Virtualenv <https://virtualenv.pypa.io/en/stable/>`_ or your python enviornment of choice.

 Installation (inside a virtual environment)::

     pip install -r requirements.txt

-    // Creates and starts containers for elasticsearch, rabbitmq,
+    // Creates, starts, and sets up containers for elasticsearch,
     // postgres, and the server
-    docker-compose up -d web
-
-    ./up.sh
-    ---------------- or ----------------
-    pg
-    createuser share
-    psql
-    CREATE DATABASE share;
-    python manage.py makemigrations
-    python manage.py maketriggermigrations
-    python manage.py makeprovidermigrations
-    python manage.py migrate
-    python manage.py createsuperuser
-
+    docker-compose build web
+    docker-compose run --rm web ./bootstrap.sh

 To run the server in a virtual environment instead of Docker::

-    docker stop share_web_1
+    docker-compose stop web
     python manage.py runserver

 To run celery worker::

     python manage.py celery worker -l DEBUG

-To monitor your celery tasks::
-
-    python manage.py celery flower
-
-Visit http://localhost:5555/dashboard to keep an eye on your harvesting and transforming tasks
-
 .. _running-sources:

 Running Existing Harvesters and Transformers
--------------------------------------------
+--------------------------------------------

 To see a list of all sources and their names for harvesting, visit https://share.osf.io/api/sources/

@@ -126,7 +108,7 @@ To automatically add all harvested and accepted documents to Elasticsearch::


 Writing a Harvester and Transformer
-----------------------------------
+-----------------------------------

 See the transformers and harvesters located in the ``share/transformers/`` and ``share/harvesters/`` directories for more examples of syntax and best practices.

@@ -162,10 +144,11 @@ Writing a source.yaml file
 The ``source.yaml`` file contains information about the source itself, and one or more configs that describe how to harvest and transform data from that source.

 .. code-block:: yaml
+
     name: com.example
     long_title: Example SHARE Source for Examples
     home_page: http://example.com/
-    user: providers.com.example
+    user: sources.com.example
     configs:
     - label: com.example.oai
       base_url: http://example.com/oai/
@@ -224,11 +207,11 @@ Best practices for writing a non-OAI Harvester
 .. _writing-transformers:

 Best practices for writing a non-OAI Transformer
-"""""""""""""""""""""""""""""""""""""""""""""""
+""""""""""""""""""""""""""""""""""""""""""""""""

 - The transformer should be defined in ``share/transformers/{transformer name}.py``.
 - When writing the transformer:
-    - Determine what information from the source record should be stored as part of the ``CreativeWork`` :ref:`model <creative-work>` (i.e. if the record clearly defines a title, description, contributors, etc.).
+    - Determine what information from the source record should be stored as part of the ``CreativeWork`` :ref:`model <share-models>` (i.e. if the record clearly defines a title, description, contributors, etc.).
     - Use the :ref:`chain transformer tools <chain-transformer>` as necessary to correctly parse the raw data.
         - Alternatively, implement ``share.transform.BaseTransformer`` to create a transformer from scratch.
     - Utilize the ``Extra`` class

docs/share_api.rst

Lines changed: 1 addition & 1 deletion
@@ -487,7 +487,7 @@ Example Data


 Code Examples
-~~~~~~~~
+~~~~~~~~~~~~~

 Python

docs/share_models.rst

Lines changed: 3 additions & 203 deletions
@@ -3,208 +3,8 @@
 SHARE Models
 ============

-Model Descriptions
-------------------
-
-SHARE model descriptions will be useful when writing normalizers for new providers.
-See existing provider normalizers for more detailed examples.
-
-.. _creative-work:
-
-Creative Work
-"""""""""""""
-
-**Metadata Fields:**
-
-- title
-- description
-- contributors
-    - A list of contributors associated with the work, passed to the ``Person`` class via the ``Contributor`` class
-
-.. code-block:: python
-
-    class Person:
-        family_name = ctx.family_name
-        given_name = ctx.given_name
-
-    class Contributor:
-        cited_name = ctx.cited_name
-        person = Delegate(Person, ctx)
-
-    class CreativeWork:
-        contributors = Map(Delegate(Contributor), ctx.contributors)
-
-- awards
-    - A list of awards associated with the work, passed to the ``Award`` class via the ``ThroughAwards`` class
-
-.. code-block:: python
-
-    class Award:
-        description = ctx.award_description
-        url = ctx.<award_url>
-
-    class ThroughAwards:
-        award = Delegate(Award, ctx)
-
-    class CreativeWork:
-        awards = Map(Delegate(ThroughAwards), ctx.awards)
-
-- links
-    - A list of links associated with the work, passed to the ``Link`` class via the ``ThroughLinks`` class
-
-
-.. code-block:: python
-
-    class Link:
-        url = ctx.<url>
-        # The type field must be either 'doi', 'provider', or 'misc'.
-        # If the type field is always the same, it can be made Static,
-        # otherwise, a function should be written to determine the link type.
-        type = Static('provider')
-        # OR
-        type = RunPython('get_link_type', ctx.<url>):
-
-        def get_link_type(self, link):
-            if 'dx.doi.org' in link:
-                return 'doi'
-            return 'misc'
-
-    class ThroughLinks:
-        link = Delegate(Link, ctx)
-
-    class CreativeWork:
-        links = Map(Delegate(ThroughLinks), ctx.links)
-
-- publishers
-    - A list of publishers associated with the work, passed to the ``Publisher`` class via the ``Association`` class:
-
-.. code-block:: python
-
-    class Publisher:
-        name = ctx.publisher_name
-
-    class Association:
-        entity = Delegate(Publisher, ctx)
-
-    class CreativeWork:
-        publishers = Map(Delegate(Association), ctx.publishers)
-
-- funders
-    - A list of funders associated with the work, passed to a ``Funder`` class via the ``Association`` class (syntax follows the ``publishers`` example above).
-- institutions
-    - A list of institutions associated with the work, passed to an ``Institution`` class via the ``Association`` class (syntax follows the ``publishers`` example above).
-- organizations
-    - A list of organizations associated with the work, passed to an ``Organization`` class via the ``Association`` class (syntax follows the ``publishers`` example above).
-- subjects
-    - A list of subjects associated with the work, passed to the ``Subject`` class via the ``ThroughSubjects`` class:
-
-.. code-block:: python
-
-    class Subject:
-        name = ctx.subject_name
-
-    class ThroughSubjects:
-        link = Delegate(Subject, ctx)
-
-    class CreativeWork:
-        subjects = Map(Delegate(ThroughSubjects), ctx.subjects)
-
-- tags
-    - A list of tags associated with the work, passed to the ``Tag`` class via the ``ThroughTags`` class
-
-.. code-block:: python
-
-    class Tag:
-        name = ctx.<tag_name>
-
-    class ThroughTags:
-        tag = Delegate(Tag, ctx)
-
-    class CreativeWork:
-        tags = Map(Delegate(ThroughTags), ctx.tags)
-
-- date_created
-- date_published
-- date_updated
-- free_to_read_type
-- free_to_read_date
-- rights
-- language
-
-**Subclasses:**
-
-- ``Article``
-- ``Book``
-- ``ConferencePaper``
-- ``Dataset``
-- ``Dissertation``
-- ``Lesson``
-- ``Poster``
-- ``Preprint``
-- ``Presentation``
-- ``Project``
-- ``ProjectRegistration``
-- ``Report``
-- ``Section``
-- ``Software``
-- ``Thesis``
-- ``WorkingPaper``
-
-
-Person
-""""""
-
-**Metadata Fields:**
-
-- family_name
-- given_name
-- additional_name
-- suffix
-- identifiers
-    - A list of identifiers associated with a person (such as an ORCID), passed to the ``Identifier`` class via the ``ThroughIdentifiers`` class
-
-.. code-block:: python
-
-    class Identifier:
-        url = ctx.url
-
-    class ThroughIdentifiers:
-        identifier = Delegate(Identifier, ctx)
-
-    class Person:
-        identifiers = ctx.identifiers
-
-- emails
-    - A list of emails associated with a person, passed to the ``Email`` class via the ``PersonEmails`` class (syntax follows the ``identifiers`` example above).
-- affiliations
-    - A list of affiliations associated with a person, passed to an appropriate entity class via the ``Affiliation`` class
-
-.. code-block:: python
-
-    class Institution:
-        name = ctx.<institution_affiliation_name>
-
-    class Affiliation:
-        # The entity used here could be any of the entity subclasses (Institution, Publisher, Funder, Organization).
-        entity = Delegate(Institution, ctx)
-
-    class Person:
-        affiliations = Map(Delegate(Affiliation), ctx.<affiliations>)
-
-- location
-- url
-
-Entity
-""""""
-
-**Subclasses**
-- ``Organization``
-- ``Publisher``
-- ``Funder``
-- ``Institution``
-
-Model Diagram
--------------
-.. image:: _static/share_vertical_models.png
+Due to the nature of the data that are collected by SHARE, the schema model is subject to change.

+The current JSON Schema and field descriptions, when available, can be found in `our API`_.

+.. _our API: https://share.osf.io/api/v2/schema
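
Since the model reference now lives behind the schema endpoint rather than in the docs, it may be useful to read it programmatically. The following is a minimal sketch, not part of this commit. It assumes the endpoint at ``https://share.osf.io/api/v2/schema`` is reachable and returns a JSON document. The helper names ``fetch_schema`` and ``top_level_names`` are hypothetical, and the layout of the response is not described in the commit, so only its top-level keys are inspected.

```python
import json
from urllib.request import urlopen

# URL taken from the link target added in docs/share_models.rst.
SCHEMA_URL = "https://share.osf.io/api/v2/schema"


def fetch_schema(url=SCHEMA_URL, opener=urlopen, timeout=10):
    """Download ``url`` and decode the body as JSON.

    ``opener`` is injectable so the function can be exercised without
    network access; by default it is ``urllib.request.urlopen``.
    """
    with opener(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def top_level_names(schema):
    """Return the sorted top-level keys of a decoded schema.

    Assumption: the endpoint returns a JSON object. Anything else
    (for example a list) yields an empty result instead of an error.
    """
    return sorted(schema) if isinstance(schema, dict) else []
```

Calling ``top_level_names(fetch_schema())`` would list whatever top-level entries the endpoint currently exposes. Because the commit notes that the schema is subject to change, results should be treated as a snapshot rather than a stable contract.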
