
Commit bdc7f35

enh: flesh out emeritus dashboard (#145)
* enh: flesh out emeritus dashboard
* enh: add editorial rotation data
1 parent 880edb4 commit bdc7f35

5 files changed: +123 -62 lines changed


.github/workflows/update-pr-data.yml

Lines changed: 8 additions & 2 deletions
@@ -45,8 +45,14 @@ jobs:
         env:
           GITHUB_TOKEN: ${{ secrets.PROJECTS_READ }}
         run: python scripts/get-sprint-data.py
-      #- name: Update editors
-      # run: python scripts/get-editors.py
+      - name: Update editors
+        if: github.event_name == 'workflow_dispatch' ||
+          github.event_name == 'schedule' ||
+          ( github.event_name == 'pull_request' &&
+          github.event.pull_request.head.repo.full_name == github.repository )
+        env:
+          GITHUB_TOKEN: ${{ secrets.PYOS_GHA_TEAMS_READ }}
+        run: python scripts/get-editors.py
       - name: get-review-contributors
         run: python scripts/get-review-contributors.py
       - name: get-package-data
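
The `if:` expression added to the `Update editors` step limits it to manual dispatches, scheduled runs, and pull requests whose head branch lives in the same repository. A likely motivation (an assumption; the commit does not state it) is that workflows triggered by pull requests from forks do not receive repository secrets such as `PYOS_GHA_TEAMS_READ`. The Python sketch below only mirrors that boolean logic. The function `should_update_editors` and its parameter names are hypothetical and exist purely for illustration.

```python
from typing import Optional


def should_update_editors(
    event_name: str,
    repository: str,
    pr_head_repo: Optional[str] = None,
) -> bool:
    """Mirror the step's `if:` expression as plain boolean logic.

    All names here are hypothetical; they stand in for the
    github.event_name, github.repository, and
    github.event.pull_request.head.repo.full_name contexts.
    """
    # Manual and scheduled runs always execute the step.
    if event_name in ("workflow_dispatch", "schedule"):
        return True
    # Pull requests execute it only when the head repo is the base repo.
    return event_name == "pull_request" and pr_head_repo == repository


# Placeholder repository names, not the real project paths.
same_repo_pr = should_update_editors("pull_request", "org/repo", "org/repo")
fork_pr = should_update_editors("pull_request", "org/repo", "fork/repo")
scheduled = should_update_editors("schedule", "org/repo")
pushed = should_update_editors("push", "org/repo")
```

Under this reading, a `push` event or a pull request from a fork skips the step, while the other steps in the job still run.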

_data/editorial_team_domains.csv

Lines changed: 18 additions & 18 deletions
@@ -1,18 +1,18 @@
-gh_username,active,first_name,last_name,country,state,OS,Domain_areas,Description,technical_areas
-ab93,yes,Avik,Basu,United States,California,"Mac, Linux","NLP, text analysis, Linguistics, Mathematics, Statistics, ML, AI, Computer sciences, Education","Deep Learning, time series, industry data science, deep unsupervised learning, ML in finance ","Data visualization, Data munging, Python package structure, Documentation quality, Unit Testing, Continuous Integration, Object oriented programming, Web API's, Docker, Tool usability / accessibility, Python best practices"
-banesullivan,yes,Bane,Sullivan,United States,California,Mac,"Spatial data, spatial analysis, GIS, Geosciences / earth science, 3D visualization","Remote sensing of the environment and subsurface, developer advocacy, data science, 3D visualization","Data visualization, Python package structure, Documentation quality, Unit Testing, Continuous Integration, Object oriented programming, Web API's, Docker, Tool usability / accessibility"
-batalex,yes,Alexandre,Batisse,France,,,"Statistics, ML, AI, Computer sciences, Bioinformatics","I work as a Data Scientist on health care data. I conduct epidemiology studies and maintain private packages (analytics, dataviz).","Data visualization, Data extraction & retrieval, Data munging, Python package structure, Documentation quality, Unit Testing, Continuous Integration, Object oriented programming, Web API's, Docker"
-cmarmo,yes,Chiara,Marmo,United States,Hawaii,Linux,"Spatial data, spatial analysis, GIS, Space sciences, Geosciences / earth science, Astronomy","Data processing in Astronomy, Planetary Sciences, Geospatial data. Standard development, interoperability.","Data extraction & retrieval, Data munging, Data deposition, Documentation quality, Continuous Integration"
-coatless,yes,James,Balamuta,United States,California,"Mac, Linux","NLP, text analysis, Spatial data, spatial analysis, GIS, Mathematics, Statistics, ML, AI, Computer sciences, Bioinformatics, Education","Latent variable modeling, restricted latent class models, deep learning, computational statistics, psychometrics, item response theory, biostatistics, genomics","Data visualization, Data extraction & retrieval, Data munging, Data deposition, Python package structure, Documentation quality, Unit Testing, Continuous Integration, Object oriented programming, Web API's, Web scraping, Security, Docker, Tool usability / accessibility"
-crhea93,yes,Carter,Rhea,Canada,,,"NLP, text analysis, Spatial data, spatial analysis, GIS, Physics, Mathematics, Statistics, ML, AI, Computer sciences, Hydrology, Space sciences, Geosciences / earth science",Astronomical Image and Spectral Pipeline and Analysis,"Data visualization, Data extraction & retrieval, Documentation quality, Unit Testing, Web API's, Tool usability / accessibility"
-ctb,yes,Titus,,,,,,,
-eliotwrobson,yes,Eliot,Robson,United States,Illinois,"Windows, Linux","Mathematics, Computer sciences, Education","Algorithms, specifically involving randomness, geometry, and graph theory.","Data visualization, Data extraction & retrieval, Data munging, Python package structure, Documentation quality, Unit Testing, Continuous Integration, Object oriented programming"
-hamogu,yes,Hans Moritz,Günther,United States,MA,"Mac, Linux","Physics, Astronomy","Astronomy wit ha focus on star formation and high-energy observations, also instrument development","Data visualization, Data extraction & retrieval, Data munging, Data deposition, Python package structure, Documentation quality, Unit Testing, Continuous Integration, Object oriented programming"
-haozeke,yes,Rohit,Goswami,Switzerland,Vaud,,"Physics, Chemistry, Mathematics, Statistics, ML, AI","Transition state searches, kinetic monte carlo, excited state calculations, heavy element (relativistic) calculations, Gaussian Process Regression, Bayesian Hierarchical models, Numerical lasing studies, molecular dynamics","Data visualization, Data extraction & retrieval, Data munging, Data deposition, Python package structure, Documentation quality, Unit Testing, Continuous Integration, Object oriented programming, Docker"
-jonas-eschle,yes,Jonas,Eschle,Switzerland,Geneva,,"Physics, Statistics, ML, AI",statistical analysis of physics at CERN,"Data visualization, Data extraction & retrieval, Python package structure, Documentation quality, Unit Testing, Continuous Integration, Object oriented programming"
-julimillan,Yes,Julieta,Millan,Argentina,Buenos Aires,,"Statistics, ML, AI, Ecology / Biology","Biology, neuroscience, industry data science","Data visualization, Data extraction & retrieval, Python package structure, Documentation quality, Object oriented programming, Tool usability / accessibility"
-mjhajharia,,Meenal,,,,,,,
-simonmolinsky,yes,Simon,Molinsky,,,,,,
-slobentanzer,,Sebastian,Lobentanzer,,,,,,
-tkoyama010,yes,Tetsuo,Koyama,Japan,Tokyo,Linux,"Physics, Mathematics",Scientific computing,"Data visualization, Python package structure, Documentation quality, Unit Testing, Continuous Integration, Object oriented programming, Tool usability / accessibility"
-yeelauren,yes,Lauren,Yee,Canada,Ontario,"Windows, Mac, Linux","Spatial data, spatial analysis, GIS, Statistics, ML, AI, Ecology / Biology, Epidemiology, Geosciences / earth science","data scientist, consultant, machine learning and remote sensing, ecology based projects, computer vision, deep learning","Data visualization, Data extraction & retrieval, Data munging, Documentation quality, Web scraping, Docker, Tool usability / accessibility"
+gh_username,active,first_name,last_name,country,state,OS,Domain_areas,Description,technical_areas
+ab93,yes,Avik,Basu,United States,California,"Mac, Linux","NLP, text analysis, Linguistics, Mathematics, Statistics, ML, AI, Computer sciences, Education","Deep Learning, time series, industry data science, deep unsupervised learning, ML in finance ","Data visualization, Data munging, Python package structure, Documentation quality, Unit Testing, Continuous Integration, Object oriented programming, Web API's, Docker, Tool usability / accessibility, Python best practices"
+banesullivan,yes,Bane,Sullivan,United States,California,Mac,"Spatial data, spatial analysis, GIS, Geosciences / earth science, 3D visualization","Remote sensing of the environment and subsurface, developer advocacy, data science, 3D visualization","Data visualization, Python package structure, Documentation quality, Unit Testing, Continuous Integration, Object oriented programming, Web API's, Docker, Tool usability / accessibility"
+batalex,yes,Alexandre,Batisse,France,,,"Statistics, ML, AI, Computer sciences, Bioinformatics","I work as a Data Scientist on health care data. I conduct epidemiology studies and maintain private packages (analytics, dataviz).","Data visualization, Data extraction & retrieval, Data munging, Python package structure, Documentation quality, Unit Testing, Continuous Integration, Object oriented programming, Web API's, Docker"
+cmarmo,yes,Chiara,Marmo,United States,Hawaii,Linux,"Spatial data, spatial analysis, GIS, Space sciences, Geosciences / earth science, Astronomy","Data processing in Astronomy, Planetary Sciences, Geospatial data. Standard development, interoperability.","Data extraction & retrieval, Data munging, Data deposition, Documentation quality, Continuous Integration"
+coatless,yes,James,Balamuta,United States,California,"Mac, Linux","NLP, text analysis, Spatial data, spatial analysis, GIS, Mathematics, Statistics, ML, AI, Computer sciences, Bioinformatics, Education","Latent variable modeling, restricted latent class models, deep learning, computational statistics, psychometrics, item response theory, biostatistics, genomics","Data visualization, Data extraction & retrieval, Data munging, Data deposition, Python package structure, Documentation quality, Unit Testing, Continuous Integration, Object oriented programming, Web API's, Web scraping, Security, Docker, Tool usability / accessibility"
+crhea93,yes,Carter,Rhea,Canada,,,"NLP, text analysis, Spatial data, spatial analysis, GIS, Physics, Mathematics, Statistics, ML, AI, Computer sciences, Hydrology, Space sciences, Geosciences / earth science",Astronomical Image and Spectral Pipeline and Analysis,"Data visualization, Data extraction & retrieval, Documentation quality, Unit Testing, Web API's, Tool usability / accessibility"
+ctb,yes,Titus,,,,,,,
+eliotwrobson,yes,Eliot,Robson,United States,Illinois,"Windows, Linux","Mathematics, Computer sciences, Education","Algorithms, specifically involving randomness, geometry, and graph theory.","Data visualization, Data extraction & retrieval, Data munging, Python package structure, Documentation quality, Unit Testing, Continuous Integration, Object oriented programming"
+hamogu,yes,Hans Moritz,Günther,United States,MA,"Mac, Linux","Physics, Astronomy","Astronomy wit ha focus on star formation and high-energy observations, also instrument development","Data visualization, Data extraction & retrieval, Data munging, Data deposition, Python package structure, Documentation quality, Unit Testing, Continuous Integration, Object oriented programming"
+haozeke,yes,Rohit,Goswami,Switzerland,Vaud,,"Physics, Chemistry, Mathematics, Statistics, ML, AI","Transition state searches, kinetic monte carlo, excited state calculations, heavy element (relativistic) calculations, Gaussian Process Regression, Bayesian Hierarchical models, Numerical lasing studies, molecular dynamics","Data visualization, Data extraction & retrieval, Data munging, Data deposition, Python package structure, Documentation quality, Unit Testing, Continuous Integration, Object oriented programming, Docker"
+jonas-eschle,yes,Jonas,Eschle,Switzerland,Geneva,,"Physics, Statistics, ML, AI",statistical analysis of physics at CERN,"Data visualization, Data extraction & retrieval, Python package structure, Documentation quality, Unit Testing, Continuous Integration, Object oriented programming"
+julimillan,Yes,Julieta,Millan,Argentina,Buenos Aires,,"Statistics, ML, AI, Ecology / Biology","Biology, neuroscience, industry data science","Data visualization, Data extraction & retrieval, Python package structure, Documentation quality, Object oriented programming, Tool usability / accessibility"
+mjhajharia,,Meenal,,,,,,,
+simonmolinsky,yes,Simon,Molinsky,,,,,,
+slobentanzer,,Sebastian,Lobentanzer,,,,,,
+tkoyama010,yes,Tetsuo,Koyama,Japan,Tokyo,Linux,"Physics, Mathematics",Scientific computing,"Data visualization, Python package structure, Documentation quality, Unit Testing, Continuous Integration, Object oriented programming, Tool usability / accessibility"
+yeelauren,yes,Lauren,Yee,Canada,Ontario,"Windows, Mac, Linux","Spatial data, spatial analysis, GIS, Statistics, ML, AI, Ecology / Biology, Epidemiology, Geosciences / earth science","data scientist, consultant, machine learning and remote sensing, ecology based projects, computer vision, deep learning","Data visualization, Data extraction & retrieval, Data munging, Documentation quality, Web scraping, Docker, Tool usability / accessibility"

_data/emeritus_editor_domains.csv

Lines changed: 7 additions & 4 deletions
@@ -1,5 +1,8 @@
 gh_username,active,first_name,last_name,country,state,OS,Domain_areas,Description,technical_areas
-isabelizimm,yes,Isabel,Zimmerman,USA,Florida,Mac,"Statistics, ML, AI, Computer sciences",building IDEs and MLOps Python frameworks,"Data visualization, Data extraction & retrieval, Python package structure, Documentation quality, Continuous Integration"
-lwasser,,,,,,,,,
-nimasarajpoor,,,,,,,,,
-sneakers-the-rat,yes,Jonny,Saunders,US,CA,"Mac, Linux","NLP, text analysis, Linguistics, Ecology / Biology, Bioinformatics, Bibliometrics (Library science, scientific literature access), Social Sciences","Formerly auditory neuroscience, animal models of phonetics. Realtime experimental hardware/software, low-resource computing, data modeling and schema creation. Currently peer-to-peer social/data systems.","Data visualization, Data extraction & retrieval, Data munging, Data deposition, Python package structure, Documentation quality, Unit Testing, Continuous Integration, Object oriented programming, Web API's, Web scraping, Security, Tool usability / accessibility"
+cmarmo,False,Chiara,Marmo,United States,Hawaii,Linux,"Spatial data, spatial analysis, GIS, Space sciences, Geosciences / earth science, Astronomy","Data processing in Astronomy, Planetary Sciences, Geospatial data. Standard development, interoperability.","Data extraction & retrieval, Data munging, Data deposition, Documentation quality, Continuous Integration"
+isabelizimm,False,,,,,,,,
+lwasser,False,,,,,,,,
+nickledave,False,,,,,,,,
+nimasarajpoor,False,,,,,,,,
+sneakers-the-rat,False,,,,,,,,
+xmnlab,False,,,,,,,,
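
The two CSV files in this commit encode the `active` column inconsistently: `yes`, `Yes`, and empty values in `editorial_team_domains.csv`, and `False` in `emeritus_editor_domains.csv`. Code that reads both files may want to normalize the flag first. Below is a minimal sketch using only the standard library. The helper `is_active`, the `TRUTHY` set, and the sample usernames are hypothetical, and treating an empty value as inactive is an assumption rather than something the commit specifies.

```python
import csv
import io

# Hypothetical set of spellings treated as "active".
TRUTHY = {"yes", "true"}


def is_active(raw: str) -> bool:
    """Normalize an `active` cell; empty or unknown values count as inactive (assumption)."""
    return raw.strip().lower() in TRUTHY


# Made-up sample rows covering the spellings seen in the two CSV files.
sample = io.StringIO(
    "gh_username,active\n"
    "editor_a,yes\n"
    "editor_b,Yes\n"
    "editor_c,\n"
    "editor_d,False\n"
)

active_flags = {
    row["gh_username"]: is_active(row["active"])
    for row in csv.DictReader(sample)
}
```

With a single normalized boolean, downstream filters do not need to know which file a row came from.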

peer-review/editorial-dashboard.qmd

Lines changed: 64 additions & 23 deletions
@@ -109,17 +109,22 @@ active_editor_counts = (
     .rename_axis("gh_username")
     .reset_index(name="count")
 )
+
 ```
 
 
 ```{python}
 # Open editor team data
 # The total list is derived from the scripts/editors.py script which parses the (active) editorial team data.
+# The data opened in this section is collected using the scripts/get-editors.py workflow.
 
 editor_path = Path.cwd().parents[0] / "_data" / "editorial_team_domains.csv"
 all_editors = pd.read_csv(editor_path)
 all_editors = all_editors[["gh_username","first_name","Domain_areas","Description","technical_areas"]]
 
+emeritus_editor_path = Path.cwd().parents[0] / "_data" / "emeritus_editor_domains.csv"
+emeritus_editors = pd.read_csv(emeritus_editor_path)
+
 editor_activity = (
     all_editors.merge(active_editor_counts, on="gh_username", how="left")
     .fillna({"count": 0})
@@ -133,15 +138,31 @@ busy_editors = editor_activity[editor_activity["count"] != 0]
 num_available_editors = len(available_editors)
 
 # Rename and cleanup
-editor_activity=editor_activity.rename(columns={"count": "active review count"})
+editor_activity = editor_activity.rename(columns={"count": "active review count"})
 col = editor_activity.pop("active review count")
 editor_activity.insert(1, "active review count", col)
 
-```
+# Next, compare editor activity to editors that want to offboard after that review
+offboard_usernames = emeritus_editors["gh_username"]
+active_editors = open_reviews["editor"].dropna()
+active_editors = active_editors[active_editors != "TBD"]
+active_editors = active_editors.astype(str).str.split(r"[ ,&]", n=1).str[0]
+
+# TODO: note that blockingpy by data need to be updated - carter should be first here...
 
+# TODO - filter reviews df by editors that are in the emeritus df (emeritus_editors)
+offboard_usernames = emeritus_editors["gh_username"]
+
+# This doesn't work - pick up here
+# offboard_reviews = reviews[editor.isin(offboard_usernames)].copy()
+# offboard_reviews
+
+# Active editors are ones that are currently leading reviews. But some may intend to offboard after. Generate a list of reviews lead by those
+offboarding_editor_reviews = open_reviews[open_reviews["editor"].isin(offboard_usernames)].copy()
+```
 
 
-## Row {height=.5%}
+## Row {height=auto}
 
 ```{python}
 #| content: valuebox
@@ -176,29 +197,42 @@ dict(
 )
 ```
 
-## Row {height=1%}
+## Row {height=5%}
+
 ```{python}
-#| title: "pyOpenSci Active Editorial Team"
+#| title: "pyOpenSci Current Active Editorial Review Counts"
 show(editor_activity)
 ```
 
-## Row {height=1%}
+## Row {height=auto}
+
+```{python}
+#| title: "Editors Leading Reviews That Are or Have Offboarded After"
+
+# TODO: The table below is a good start but it should capture AMS lead by Nima as well. And the one that Chiara still has opened.
+print("Below are reviews that the peer review lead and Editor in Chief should watch. We may need to assign an additional editor the ensure these reviews moves forward.")
+
+offboarding_editor_reviews = offboarding_editor_reviews.drop(columns=["date_accepted", "Categories"], errors="ignore")
+offboarding_editor_reviews["Date Opened"] = pd.to_datetime(offboarding_editor_reviews["Date Opened"]).dt.strftime("%Y-%b-%d")
+
+offboarding_editor_reviews
+
+```
+
+## Row {height=auto}
 
 ```{python}
 #| title: "Busy editors running reviews "
 show(busy_editors)
 ```
 
-## Row {height=.8%}
+## Row {height=auto}
 ```{python}
 #| title: "Available Editors"
 show(available_editors)
 ```
 
-
-
 ```{python}
-# TODO: make this focus only on current open reviews vs all reviews over time.
 
 # Get a list of all editors over time that have supported pyOpenSci
 ignore_editors = ["TBD"]
@@ -234,23 +268,23 @@ edits = reviews.rename(columns={"Date Opened": "Date"}).copy()
 
 ```{python}
 
-# TODO: If this uses open_reviews it's only showing current load
-# if it uses the reviews df it's showing reviews all time 2019 to present. open_reviews has a slightly different structure
+# Ensure datetime and a clean quarter label
+# 2023 is when we started running again with funding
 edits = reviews[["editor", "Name", "Date Opened"]]
 edits = edits.rename(columns={"Date Opened": "Date", "Name":"package_name"})
 edits = edits[edits["editor"] != "TBD"]
+edits["Date"] = pd.to_datetime(edits["Date"])
+edits["Year"] = edits["Date"].dt.year
+edits = edits[edits["Year"] >= 2023]
 
 ```
 
-## Editor availability
+## Editor Activity by Quarter
 
-## Row {height=6%}
+## Row {height=auto}
 
 ```{python}
-# Cleanup
-# Ensure datetime and a clean quarter label
-edits["Date"] = pd.to_datetime(edits["Date"])
-edits["Year"] = edits["Date"].dt.year
+# Add quarter counts
 edits["QuarterNum"] = edits["Date"].dt.quarter
 edits["QuarterLabel"] = edits["Year"].astype(str) + " Q" + edits["QuarterNum"].astype(str)
 
@@ -276,6 +310,7 @@ df_full = (
     .reindex(full_index, fill_value=0)
     .reset_index()
 )
+
 ```
 
 ```{python}
@@ -285,7 +320,7 @@ df["QuarterLabel"] = pd.Categorical(df["QuarterLabel"], categories=quarter_order
 facet_wrap = 2
 num_editors = len(df["editor"].unique())
 num_rows = (num_editors + facet_wrap - 1) // facet_wrap
-row_height = 400
+row_height = 300
 
 fig = px.bar(
     df,
@@ -296,8 +331,8 @@ fig = px.bar(
     facet_col_spacing=0.06,
     facet_col_wrap=facet_wrap,
     color_discrete_sequence=["indigo"],
-    labels={"count": "Number of edits", "QuarterLabel": "Quarter"},
-    title="Editor Activity by Quarter (Current Editor Team)",
+    labels={"count": "Number of edits"},
+    title="Review Count by Quarter (Since 2023)",
     height=row_height * num_rows,
     width=1200,
 )
@@ -308,16 +343,16 @@ fig = fig.for_each_annotation(
 fig = fig.update_xaxes(
     tickangle=45,
     tickfont=dict(size=10),
-    title_text="Quarter",
     showticklabels=True
 )
 fig = fig.update_yaxes(
     dtick=1,
     tickformat=",d",
-    title_text="Number of edits",
+    title_text="Review Count",
     range=[0, 4]
 )
 fig = fig.update_layout(
+    xaxis_title="",
     showlegend=False,
     margin=dict(t=80),
     title_font_size=24,
@@ -326,3 +361,9 @@ fig = fig.update_layout(
 fig.show()
 
 ```
+
+## Row {height=auto}
+
+## Summary
+
+This is the end of the dashboard.
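
One detail in the dashboard changes: `active_editors` is built by splitting the `editor` field on `[ ,&]` and keeping the first token, but `offboarding_editor_reviews` is then filtered on the raw `editor` column. If an `editor` cell lists more than one name, the raw membership test would not match it. The sketch below shows the normalize-then-filter idea with the standard library only. `lead_editor`, the package names, and the usernames are hypothetical, and whether this resolves the TODOs in the commit is not known from the diff.

```python
import re
from typing import Optional


def lead_editor(raw: Optional[str]) -> Optional[str]:
    """Return the first username in a possibly multi-editor field.

    Uses the same separator pattern as the dashboard code, r"[ ,&]".
    Returns None for missing values and the "TBD" placeholder.
    """
    if raw is None:
        return None
    text = str(raw).strip()
    if not text or text == "TBD":
        return None
    return re.split(r"[ ,&]", text, maxsplit=1)[0]


# Made-up stand-ins for rows of open_reviews and the emeritus usernames.
open_reviews = [
    {"package": "pkg_one", "editor": "editor_a"},
    {"package": "pkg_two", "editor": "editor_b & editor_c"},
    {"package": "pkg_three", "editor": "TBD"},
    {"package": "pkg_four", "editor": "editor_d, editor_b"},
]
offboard_usernames = {"editor_b", "editor_d"}

# Filter on the normalized lead editor instead of the raw cell.
watch_list = [
    review["package"]
    for review in open_reviews
    if lead_editor(review["editor"]) in offboard_usernames
]

# For comparison: matching the raw cell misses the multi-editor rows.
raw_matches = [
    review["package"]
    for review in open_reviews
    if review["editor"] in offboard_usernames
]
```

In pandas, the equivalent would be to apply `.isin(offboard_usernames)` to the already normalized series rather than to `open_reviews["editor"]` directly.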
