Peer Recommendation Engine Designed for American Geophysical Union

By training a machine learning algorithm on the abstracts of previously published journal articles and live presentations, UDig designed a recommendation system to automate the process of matching peer reviewers for the American Geophysical Union (AGU).

STRATEGIC SNAPSHOT

Challenge

Improve AGU’s approach to matching reviewers with articles submitted to its journals.

Strategy

Automate the matching of peer reviewers to incoming abstracts based on prior experience, and in doing so diversify the pool of peer reviewers.

Outcome

A recommendation engine that uses Natural Language Processing techniques to automatically associate articles by topic and present recommended peer reviewers according to expertise.

By using the methodology developed by UDig, AGU ensures an equitable distribution of Peer Reviewers with representation across many demographics.

Challenge

Scientists from around the world submit articles to be published by the American Geophysical Union (AGU). Each of these articles must first pass peer review, but the process of selecting individuals to review submitted content relied heavily on people manually identifying suitable authors to act as reviewers. As a result, the set of scientists and authors most often selected to provide peer reviews narrowed, which led to an overrepresentation of certain socioeconomic groups.

Outcome

Using the abstracts from previously published journal articles and live presentations, UDig designed an NLP-backed recommendation system. The NLP portion consisted of a term frequency-inverse document frequency (TF-IDF) model and a Doc2Vec model. TF-IDF is a measure used in information retrieval to reflect how relevant a term is to a particular document. A word that occurs many times within a document is given more importance, since it is likely to be meaningful in that document. At the same time, a word that occurs frequently in the target document and in the other documents in the corpus is given less weight, since it may just be a common word, such as the stopwords “the” or “for”.
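The weighting idea described above can be sketched in a few lines of standard-library Python. This is a minimal illustration only, not UDig's implementation: the four abstracts are invented, the tokenizer is deliberately crude, and the weight used here (relative term frequency times the natural log of corpus size over document frequency) is one common TF-IDF variant chosen as an assumption.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Crude normalization (assumption): lowercase and keep alphabetic runs only.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(target_index, vectors):
    """Index of the most similar document, excluding the target itself."""
    scores = [
        (cosine(vectors[target_index], v), i)
        for i, v in enumerate(vectors)
        if i != target_index
    ]
    return max(scores)[1]


# Hypothetical abstracts, invented for illustration.
abstracts = [
    "Seismic waves reveal the structure of the mantle beneath the ocean",
    "Ocean temperature trends and the warming of the deep sea",
    "The structure of the upper mantle inferred from seismic tomography",
    "Rainfall variability and the monsoon over the tropics",
]

vectors = tfidf_vectors(abstracts)
print(vectors[0]["the"])      # "the" appears in every abstract, so its weight is 0.0
print(recommend(0, vectors))  # index of the abstract most similar to the first one
```

Because “the” occurs in every abstract, its inverse document frequency is log(1) = 0 and it contributes nothing to the similarity, which is the stopword behavior described above. In practice a library vectorizer would typically replace the hand-rolled functions.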

Doc2Vec’s purpose is to convert words or entire documents into numerical representations. It maintains order and semantic information for text of any length. In our Doc2Vec model, we used the abstracts as the text corpus and each abstract ID to represent the article’s associated authors. After text normalization, the modeling phase began. This phase consisted of hyperparameter tuning, training, and result evaluation. Both the Doc2Vec and the TF-IDF models compute the similarity between a target document and the documents in the corpus. The abstract with the highest similarity score from each model represented that model’s recommendation. Next, we randomly selected 20 target abstracts and output 40 recommendations in total: one from TF-IDF and one from Doc2Vec for each target abstract.
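The selection step described above (sample target abstracts at random, then take the single highest-scoring abstract from each model) might look roughly like the sketch below. Everything in it is hypothetical: the abstract IDs are made up, the similarity scores are random placeholders standing in for real TF-IDF and Doc2Vec output, and the function names are chosen for illustration only.

```python
import random


def top_match(target_id, scores):
    """Return the candidate ID with the highest similarity to the target."""
    candidates = {cid: s for cid, s in scores[target_id].items() if cid != target_id}
    return max(candidates, key=candidates.get)


def build_review_sheet(abstract_ids, model_scores, n_targets, seed=0):
    """Sample targets at random and collect one recommendation per model."""
    rng = random.Random(seed)
    targets = rng.sample(abstract_ids, n_targets)
    return [
        (target, model_name, top_match(target, scores))
        for target in targets
        for model_name, scores in model_scores.items()
    ]


# Placeholder data (assumption): 50 made-up abstract IDs with random scores.
rng = random.Random(42)
abstract_ids = [f"A{i:03d}" for i in range(50)]
model_scores = {
    name: {t: {c: rng.random() for c in abstract_ids} for t in abstract_ids}
    for name in ("tfidf", "doc2vec")
}

rows = build_review_sheet(abstract_ids, model_scores, n_targets=20)
print(len(rows))  # 20 targets x 2 models = 40
```

Each row pairs a target abstract with the model that produced the recommendation, so reviewers can judge the two models side by side on the same targets.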

AGU then had 21 reviewers assess the recommendations for relevance. Their feedback made clear that the TF-IDF model outperformed the Doc2Vec model. By using the methodology developed by UDig, AGU ensures an equitable distribution of Peer Reviewers with representation across many demographics.

How We Did It

Automated Taxonomy Creation

Recommendation Engines

Tech Stack

Python
AWS
Postgres
