International Women’s Day Data Challenge 2024
International Women’s Day takes place on March 8th, 2024. The theme for this year is “Invest in women: Accelerate progress”.
In honour of this theme, R-Ladies Ottawa is hosting a data challenge!
Check out this year’s submissions!
See the 2024 submissions here
Challenge
The objective of this data challenge is to analyze investing in women. This is a very open-ended challenge, so be as creative as you’d like!
Some examples include:
- Analyze venture capital investment data to understand trends in funding for women-led startups.
- Analyze businesses or organizations that promote women’s empowerment.
- Analyze data on gender diversity in leadership positions.
You can use any data source, but the following are examples:
Guidelines
Although this challenge is hosted by R-Ladies Ottawa, you don’t have to use R to participate. You could use R, Python, a combination of the two, or anything you’re comfortable with!
No previous programming experience is required! If you’re interested in participating, but aren’t sure where to start, check out the recordings from our Introduction to R workshops that we hosted in early 2024. The workshops will give you a solid foundation in data analysis and data visualization using R.
While anyone is welcome to participate in the challenge and submit an entry, only submissions by women and gender minorities who reside in Canada will be eligible for the prizes.
You can participate individually, or in teams of up to 5 people!
Need help finding a team member?
If you’re interested in participating in the challenge as part of a team, but need help finding team members, come to our International Women’s Day Networking Event and Data Challenge Kickoff on March 8th!
The slides from our March 8th event can be found below.
Prizes
Prize details will be released at a later date.
How to submit an entry
This will be a month-long data challenge, and you’ll have until April 7th to submit your entry. Your submission can be in any form you like – it can be a Quarto document, an image, a link to a webpage, or a link to a video, for example.
There are two options for submitting an entry:
- Option 1 (preferred option): If you have experience with GitHub and Quarto, you can create a pull request on this repository and add your submission to the “Submissions” part of this page
- Option 2: If you don’t have experience with GitHub, you can email ottawa@rladies.org a link to your submission materials and we will add the link to the “Submissions” part of this page
Showcasing your analysis
After the submission deadline, R-Ladies Ottawa will host a virtual event where participants can showcase their work. This is a great way to learn from one another and to connect with other like-minded individuals!
The date and time of the showcase event will be announced soon.
Submissions
Submissions of Islem Meherzi & Moufida Sidaoui
This submission outlines our participation in the R-Ladies Ottawa data challenge, focusing on analyzing investments in women, specifically within data science research and women’s health. Our project aims to explore women’s contributions in data science, investigate global research content, and examine advancements in women’s health research investments.
Main Objectives
Analyzing Women’s Contributions in Data Science Research
- Identify authors of research papers in data science and assess the significance of women’s contributions.
- Investigate the distribution of research publications conducted by women per country, considering the population size of each country.
Investigating Research on Women’s Health
- Analyze the distribution of research publications on women’s health per country.
- Identify trends in research papers related to women’s health.
Main Steps
- Defining Relevant Keywords
- Identified keywords related to data science and data technologies for the first objective.
- Identified keywords related to women’s health for the second objective.
- Data Extraction from PubMed
- Focused on extracting data concerning the first author of the publication from PubMed.
- Gender Detection Using APIs
- Utilized APIs to detect the gender of the first author by their name.
Country Extraction Algorithm Developed an algorithm and utilized other sources to extract the most likely and relevant country from the given affiliation in PubMed data.
Data Enhancement
- Enhanced the dataset by incorporating additional information such as population and employment ratio.
- Content Analysis
- Analyzed content based on keywords of the articles.
- Dynamic Analysis and Visualization
- Performed dynamic analysis and visualization to provide insights into publication details and countries on a map.
- Deployment
- Deployed the Shiny app using rsconnect (utilizing the free plan of shinyapps.io which permits 25 active hours per month).
Shiny App
We have developed a Shiny app to showcase our analysis and insights. The app dynamically presents details of publications, countries involved, and thematic analysis. You can access the app https://islemmeh.shinyapps.io/shinyappmapsearch/.
Conclusion
Our project involved a systematic approach, from data extraction to analysis and visualization, providing valuable insights into women’s contributions to data science research and investments in women’s health. We are confident that our idea can be further evolved, particularly through future optimizations, additional data gathering, and in-depth content and correlation analysis.This project contributes to ongoing efforts to advance the theme of investing in women and accelerating progress in these critical areas.
Nyamekye Asare
I took on this new challenge to build a dashboard based on “flexdashboard”. Nyamekye Asare’s submission can be found here
Joy Liu
Submission from Joy Liu can be found here.
Alexandra McSween
I created a quarto presentation with some fun astrological analysis of the PWHL. Thank you to Brittny Vongdara for sharing the PWHL idea with me <3 (code here)
Brittny’s submission
You can find the link to my submission here.
It’s a quarto presentation with some visualizations using webscraped Professional Women’s Hockey League data using the fastRhockey
R package. I go through some fun plots and walk through how to make my favourite one. Which plot is my favourite? You’ll have to look at my submission to find out ;)