GraphEidos Dataset
Published dataset on visual rhetoric in digital humanities
The Problem
Researchers studying visual rhetoric in digital humanities lacked a comprehensive, structured dataset of publications in the field. Manually collecting and organizing thousands of publications was infeasible.
Approach
I engineered a modular Python pipeline to scrape publication metadata from multiple sources, then built cleaning and deduplication workflows so that each publication appears once, with consistent fields. The final dataset was structured for accessibility and deposited as an open-access resource.
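The deduplication step described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the pipeline's actual code: the field names (`title`, `year`, `doi`), the helper names, the matching rule (DOI first, otherwise normalized title plus year), and the sample records are all hypothetical.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", no_accents.lower())
    return " ".join(cleaned.split())


def record_key(record: dict) -> str:
    """Build a comparison key (assumed rule: DOI if present, else title + year)."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return "doi:" + doi
    title = normalize_title(record.get("title") or "")
    return "title:" + title + "|" + str(record.get("year") or "")


def deduplicate(records: list) -> list:
    """Keep the first record seen for each key, preserving input order."""
    seen = {}
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen[key] = record
    return list(seen.values())


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        {"title": "Visual Rhetoric in the Archive", "year": 2019, "doi": "10.0000/EXAMPLE.1"},
        {"title": "Visual rhetoric in the archive.", "year": 2019, "doi": "10.0000/example.1 "},
        {"title": "Mapping the Émigré Image", "year": 2020},
        {"title": "Mapping the Emigre Image", "year": 2020},
    ]
    for record in deduplicate(sample):
        print(record["title"], record["year"])
```

A key-based approach like this is simple and deterministic, but it would not merge a record that has a DOI with a copy of the same publication that lacks one. Catching those cases would need an extra pass, for example comparing normalized titles across both groups.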
Results
- 3,274 publication records scraped, cleaned, and structured
- Published as an open-access dataset in the Journal of Open Humanities Data