-2. Make sure you're using Python3 - install with `conda <http://conda.pydata.org/miniconda.html>`_ , or `homebrew <http://blog.manbolo.com/2013/02/04/how-to-install-python-3-and-pydev-on-osx#2>`_
+2. Make sure you're using Python3 - install with `miniconda <http://conda.pydata.org/miniconda.html>`_ , or `homebrew <http://blog.manbolo.com/2013/02/04/how-to-install-python-3-and-pydev-on-osx#2>`_
 3. Install everything inside a Virtual Environment - created with `Conda <http://conda.pydata.org/docs/using/envs.html>`_ or `Virtualenv <https://virtualenv.pypa.io/en/stable/>`_ or your Python environment of choice.

 Installation (inside a virtual environment)::

     pip install -r requirements.txt

-    // Creates and starts containers for elasticsearch, rabbitmq,
+    // Creates, starts, and sets up containers for elasticsearch,
     // postgres, and the server
-    docker-compose up -d web
-
-    ./up.sh
-    ---------------- or ----------------
-    pg
-    createuser share
-    psql
-    CREATE DATABASE share;
-    python manage.py makemigrations
-    python manage.py maketriggermigrations
-    python manage.py makeprovidermigrations
-    python manage.py migrate
-    python manage.py createsuperuser
-
+    docker-compose build web
+    docker-compose run --rm web ./bootstrap.sh

 To run the server in a virtual environment instead of Docker::

-    docker stop share_web_1
+    docker-compose stop web
     python manage.py runserver

 To run celery worker::

     python manage.py celery worker -l DEBUG

-To monitor your celery tasks::
-
-    python manage.py celery flower
-
-Visit http://localhost:5555/dashboard to keep an eye on your harvesting and transforming tasks
-
 .. _running-sources:

 Running Existing Harvesters and Transformers
--------------------------------------------
+--------------------------------------------

 To see a list of all sources and their names for harvesting, visit https://share.osf.io/api/sources/

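Review note: the setup steps in this hunk assume Python 3 running inside a virtual environment. A minimal pre-flight sketch of that check follows. The function names are hypothetical and not part of SHARE, and the conda detection relies on the assumption that `conda activate` has set the `CONDA_DEFAULT_ENV` variable.

```python
import os
import sys


def in_virtual_environment() -> bool:
    """Best-effort guess at whether a venv, virtualenv, or conda env is active."""
    # venv (and recent virtualenv) point sys.prefix at the environment,
    # while sys.base_prefix keeps pointing at the base installation.
    in_venv = sys.prefix != getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead.
    in_legacy_virtualenv = hasattr(sys, "real_prefix")
    # Assumption: an activated conda environment exports CONDA_DEFAULT_ENV.
    in_conda = bool(os.environ.get("CONDA_DEFAULT_ENV"))
    return in_venv or in_legacy_virtualenv or in_conda


def check_environment() -> list:
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    if sys.version_info[0] < 3:
        problems.append("Python 3 is required")
    if not in_virtual_environment():
        problems.append("no active virtual environment detected")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running this before `pip install -r requirements.txt` would flag a system-wide install early; it only warns and never changes the environment.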
@@ -126,7 +108,7 @@ To automatically add all harvested and accepted documents to Elasticsearch::


 Writing a Harvester and Transformer
------------------------------------
+-----------------------------------

 See the transformers and harvesters located in the ``share/transformers/`` and ``share/harvesters/`` directories for more examples of syntax and best practices.

@@ -162,10 +144,11 @@ Writing a source.yaml file
 The ``source.yaml`` file contains information about the source itself, and one or more configs that describe how to harvest and transform data from that source.

 .. code-block:: yaml
+
     name: com.example
     long_title: Example SHARE Source for Examples
     home_page: http://example.com/
-    user: providers.com.example
+    user: sources.com.example
     configs:
     - label: com.example.oai
       base_url: http://example.com/oai/
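Review note: this hunk renames the `user` value from the `providers.` prefix to the `sources.` prefix. The sketch below shows how such a convention could be linted on an already-parsed `source.yaml`. It is an illustration only: the `validate_source` helper, the set of required keys, and the rule that `user` equals `sources.` plus `name` are assumptions inferred from the example, not documented SHARE behavior.

```python
# Keys that appear in the example source.yaml; treating all of them as
# required is an assumption made for this sketch.
REQUIRED_KEYS = ("name", "long_title", "home_page", "user", "configs")


def validate_source(source: dict) -> list:
    """Return a list of problems found in a parsed source.yaml mapping."""
    errors = []
    for key in REQUIRED_KEYS:
        if key not in source:
            errors.append(f"missing key: {key}")

    name = source.get("name")
    user = source.get("user")
    # Assumed convention, inferred from the diff: user == "sources." + name.
    if name and user and user != f"sources.{name}":
        errors.append(f"user should be 'sources.{name}', got '{user}'")

    for index, config in enumerate(source.get("configs") or []):
        if "label" not in config:
            errors.append(f"configs[{index}] has no label")
    return errors


example = {
    "name": "com.example",
    "long_title": "Example SHARE Source for Examples",
    "home_page": "http://example.com/",
    "user": "sources.com.example",
    "configs": [
        {"label": "com.example.oai", "base_url": "http://example.com/oai/"},
    ],
}

if __name__ == "__main__":
    print(validate_source(example))
```

The dictionary stands in for the result of loading the YAML file, so the sketch has no third-party dependency.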
@@ -224,11 +207,11 @@ Best practices for writing a non-OAI Harvester
 .. _writing-transformers:

 Best practices for writing a non-OAI Transformer
-"""""""""""""""""""""""""""""""""""""""""""""""
+""""""""""""""""""""""""""""""""""""""""""""""""

 - The transformer should be defined in ``share/transformers/{transformer name}.py``.
 - When writing the transformer:
-    - Determine what information from the source record should be stored as part of the ``CreativeWork`` :ref:`model <creative-work>` (i.e. if the record clearly defines a title, description, contributors, etc.).
+    - Determine what information from the source record should be stored as part of the ``CreativeWork`` :ref:`model <share-models>` (i.e. if the record clearly defines a title, description, contributors, etc.).
     - Use the :ref:`chain transformer tools <chain-transformer>` as necessary to correctly parse the raw data.
     - Alternatively, implement ``share.transform.BaseTransformer`` to create a transformer from scratch.
-    - A list of funders associated with the work, passed to a ``Funder`` class via the ``Association`` class (syntax follows the ``publishers`` example above).
-- institutions
-    - A list of institutions associated with the work, passed to an ``Institution`` class via the ``Association`` class (syntax follows the ``publishers`` example above).
-- organizations
-    - A list of organizations associated with the work, passed to an ``Organization`` class via the ``Association`` class (syntax follows the ``publishers`` example above).
-- subjects
-    - A list of subjects associated with the work, passed to the ``Subject`` class via the ``ThroughSubjects`` class:
-    - A list of tags associated with the work, passed to the ``Tag`` class via the ``ThroughTags`` class
-
-    .. code-block:: python
-
-        class Tag:
-            name = ctx.<tag_name>
-
-        class ThroughTags:
-            tag = Delegate(Tag, ctx)
-
-        class CreativeWork:
-            tags = Map(Delegate(ThroughTags), ctx.tags)
-
-- date_created
-- date_published
-- date_updated
-- free_to_read_type
-- free_to_read_date
-- rights
-- language
-
-**Subclasses:**
-
-- ``Article``
-- ``Book``
-- ``ConferencePaper``
-- ``Dataset``
-- ``Dissertation``
-- ``Lesson``
-- ``Poster``
-- ``Preprint``
-- ``Presentation``
-- ``Project``
-- ``ProjectRegistration``
-- ``Report``
-- ``Section``
-- ``Software``
-- ``Thesis``
-- ``WorkingPaper``
-
-
-Person
-""""""
-
-**Metadata Fields:**
-
-- family_name
-- given_name
-- additional_name
-- suffix
-- identifiers
-    - A list of identifiers associated with a person (such as an ORCID), passed to the ``Identifier`` class via the ``ThroughIdentifiers`` class
-
-    .. code-block:: python
-
-        class Identifier:
-            url = ctx.url
-
-        class ThroughIdentifiers:
-            identifier = Delegate(Identifier, ctx)
-
-        class Person:
-            identifiers = ctx.identifiers
-
-- emails
-    - A list of emails associated with a person, passed to the ``Email`` class via the ``PersonEmails`` class (syntax follows the ``identifiers`` example above).
-- affiliations
-    - A list of affiliations associated with a person, passed to an appropriate entity class via the ``Affiliation`` class
-
-    .. code-block:: python
-
-        class Institution:
-            name = ctx.<institution_affiliation_name>
-
-        class Affiliation:
-            # The entity used here could be any of the entity subclasses (Institution, Publisher, Funder, Organization).
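Review note: the removed documentation describes a "through" pattern, where each tag on a work is wrapped in a `ThroughTags` record that points at a `Tag`. For readers unfamiliar with it, here is a plain-Python sketch of the resulting shape. It deliberately does not use SHARE's `ctx`, `Delegate`, or `Map`; the function names and the `type` key are hypothetical stand-ins chosen for illustration.

```python
def parse_tag(tag_name: str) -> dict:
    # Stand-in for `class Tag: name = ctx.<tag_name>`.
    return {"type": "Tag", "name": tag_name}


def parse_through_tags(tag_name: str) -> dict:
    # Stand-in for `class ThroughTags: tag = Delegate(Tag, ctx)`:
    # the through record delegates the same input to the Tag parser.
    return {"type": "ThroughTags", "tag": parse_tag(tag_name)}


def parse_creative_work(record: dict) -> dict:
    # Stand-in for `tags = Map(Delegate(ThroughTags), ctx.tags)`:
    # apply the through parser to every entry of the record's tag list.
    tags = [parse_through_tags(name) for name in record.get("tags", [])]
    return {"type": "CreativeWork", "tags": tags}


if __name__ == "__main__":
    work = parse_creative_work({"tags": ["open science", "metadata"]})
    for through in work["tags"]:
        print(through["tag"]["name"])
```

The same nesting would apply to the `identifiers` and `affiliations` examples, with the through record replaced by `ThroughIdentifiers` or `Affiliation`.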