Rec-Dating Project
This project studies the rec-dating dataset as a role-based bipartite network.
The notebooks are arranged in four steps and can be run in order from data preparation to the final figures.
Core Idea
- rater nodes send ratings
- profile nodes receive ratings
- edge weight is the observed score from 1 to 10
This framing lets us separate outgoing activity from received attention and apply network measures such as HITS in a role-consistent way.
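As a minimal sketch of what "role-consistent" can mean here, the snippet below runs a plain power-iteration HITS on weighted (rater, profile, rating) triples, keeping hub scores only for raters and authority scores only for profiles. The function name `hits`, the toy edges, and the fixed iteration count are illustrative assumptions, not the project's actual implementation, which may rely on a library routine instead.

```python
from collections import defaultdict
from math import sqrt


def hits(edges, iterations=50):
    """Weighted HITS by power iteration.

    edges: list of (rater, profile, weight) triples (hypothetical toy format).
    Returns (hubs, auths): hub scores keyed by rater, authority scores keyed
    by profile, each normalized to unit Euclidean length.
    """
    hubs = {rater: 1.0 for rater, _, _ in edges}
    auths = {profile: 1.0 for _, profile, _ in edges}
    for _ in range(iterations):
        # Authority update: weighted sum of the hub scores of incoming raters.
        new_auths = defaultdict(float)
        for rater, profile, weight in edges:
            new_auths[profile] += weight * hubs[rater]
        norm = sqrt(sum(v * v for v in new_auths.values())) or 1.0
        auths = {k: v / norm for k, v in new_auths.items()}

        # Hub update: weighted sum of the authority scores of rated profiles.
        new_hubs = defaultdict(float)
        for rater, profile, weight in edges:
            new_hubs[rater] += weight * auths[profile]
        norm = sqrt(sum(v * v for v in new_hubs.values())) or 1.0
        hubs = {k: v / norm for k, v in new_hubs.items()}
    return hubs, auths


# Toy data: the same ID may appear as a rater and as a profile; the two
# dictionaries keep those roles apart.
edges = [(1, 2, 10), (1, 3, 5), (2, 3, 8), (3, 2, 9)]
hubs, auths = hits(edges)
```

In this toy example both profiles receive two ratings, so they tie on a simple count, while the authority scores separate them by who rated them and how highly.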
Main Questions
- How strongly do popularity and prestige align on the profile side?
- How concentrated is received attention?
- Do the profiles dominating overall interaction also dominate high-rating buckets?
- Which profile-side features are most aligned with elite interaction and elite high-rating status?
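The concentration question can be made concrete with two common summary measures. The sketch below computes a Gini coefficient and a top-fraction share over a list of per-profile received counts; the helper names `gini` and `top_share` and the sample values are assumptions for illustration, and the notebooks may use different measures.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly equal)."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum formula on the ascending-sorted values.
    weighted = sum((i + 1) * v for i, v in enumerate(ordered))
    return (2 * weighted) / (n * total) - (n + 1) / n


def top_share(values, fraction=0.1):
    """Share of the total held by the top `fraction` of items (at least one)."""
    ordered = sorted(values, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    total = sum(ordered)
    return sum(ordered[:k]) / total if total else 0.0


# Hypothetical received-rating counts for ten profiles.
received = [10, 1, 1, 1, 1, 1, 1, 1, 1, 1]
concentration = gini(received)
share_of_top_decile = top_share(received, fraction=0.1)
```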
Project Structure
- data/: raw dataset
- src/rec_dating_project/: reusable project code
- scripts/: analysis scripts used by the notebooks
- notebooks/: the ordered notebook workflow
- paper/: LaTeX paper draft and bibliography
- outputs/data/: generated tables reused across notebooks
- outputs/figures/: generated figures reused in notebooks and paper
Environment Setup
The project was last checked with Python 3.11.9.
From the project root:
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
The raw dataset is expected at:
data/rec-dating/rec-dating.edges
Download it manually from the Network Repository page and place it at the path above:
https://networkrepository.com/rec-dating.php
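Once the file is in place, a quick sanity check can be done by streaming it line by line. The sketch below assumes each data line starts with three fields (rater ID, profile ID, rating) separated by commas or whitespace, and that lines starting with % or # are comments; the actual layout of rec-dating.edges is not stated here, so verify it before relying on this. The names `parse_edge_line` and `read_edges` are hypothetical.

```python
import re


def parse_edge_line(line):
    """Parse one line into (rater, profile, rating), or None if it is skipped.

    Assumed layout: "<rater><sep><profile><sep><rating>[...]" where <sep> is a
    comma or whitespace. Any further columns are ignored.
    """
    line = line.strip()
    if not line or line[0] in "%#":
        return None
    parts = re.split(r"[,\s]+", line)
    return int(parts[0]), int(parts[1]), float(parts[2])


def read_edges(lines):
    """Yield parsed edges from any iterable of text lines (e.g. an open file)."""
    for line in lines:
        parsed = parse_edge_line(line)
        if parsed is not None:
            yield parsed


# Small in-memory sample standing in for the real file.
sample = ["% comment line", "1,133,8", "1 720 6", "", "2,133,10"]
sample_edges = list(read_edges(sample))
```

For the real dataset, passing an open file handle for data/rec-dating/rec-dating.edges to `read_edges` avoids loading the whole file into memory at once.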
Recommended Notebook Workflow
Run the notebooks in this order:
- notebooks/01_data_preparation.ipynb
- notebooks/02_rec_dating_exploration.ipynb
- notebooks/03_applications.ipynb
- notebooks/04_final_plots_for_paper.ipynb
What each notebook does:
- 01: inspects the raw file, explains the role-based modeling choice, and builds the cached dataset summary
- 02: explores popularity, prestige, inequality, and descriptive plots
- 03: applies the framework to bucket concentration and feature alignment
- 04: gathers the final figures and reference values used to check the project outputs
How To Run The Notebooks
Option A: Interactive
Launch JupyterLab from the project root:
jupyter lab
Then open the notebooks and run them in the numbered order above.
Option B: Fully Reproducible Terminal Execution
If you want to execute everything from the terminal:
MPLCONFIGDIR=/tmp/matplotlib-codex python3 -m nbconvert --to notebook --execute notebooks/01_data_preparation.ipynb --inplace --ExecutePreprocessor.timeout=1200
MPLCONFIGDIR=/tmp/matplotlib-codex python3 -m nbconvert --to notebook --execute notebooks/02_rec_dating_exploration.ipynb --inplace --ExecutePreprocessor.timeout=1200
MPLCONFIGDIR=/tmp/matplotlib-codex python3 -m nbconvert --to notebook --execute notebooks/03_applications.ipynb --inplace --ExecutePreprocessor.timeout=1200
MPLCONFIGDIR=/tmp/matplotlib-codex python3 -m nbconvert --to notebook --execute notebooks/04_final_plots_for_paper.ipynb --inplace --ExecutePreprocessor.timeout=1200
The MPLCONFIGDIR prefix helps in headless or restricted environments where Matplotlib cannot write to its default configuration and cache directory.
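The four commands above can also be driven from a short Python script. This is a sketch under a few assumptions: it uses `sys.executable` in place of `python3` so that the active virtual environment's interpreter is used, and the names `build_command` and `run_all` are hypothetical helpers, not part of the project.

```python
import os
import subprocess
import sys

NOTEBOOKS = [
    "notebooks/01_data_preparation.ipynb",
    "notebooks/02_rec_dating_exploration.ipynb",
    "notebooks/03_applications.ipynb",
    "notebooks/04_final_plots_for_paper.ipynb",
]


def build_command(notebook, timeout=1200):
    """Return the nbconvert argument list for executing one notebook in place."""
    return [
        sys.executable, "-m", "nbconvert",
        "--to", "notebook",
        "--execute", notebook,
        "--inplace",
        f"--ExecutePreprocessor.timeout={timeout}",
    ]


def run_all(notebooks=NOTEBOOKS, mpl_dir="/tmp/matplotlib-codex"):
    """Execute the notebooks in order, stopping at the first failure."""
    # Copy the current environment and point Matplotlib at a writable directory.
    env = dict(os.environ, MPLCONFIGDIR=mpl_dir)
    for notebook in notebooks:
        subprocess.run(build_command(notebook), check=True, env=env)
```

Calling `run_all()` from the project root runs the workflow end to end; `check=True` makes a failing notebook raise immediately, so later notebooks do not run against stale outputs.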
Generated Outputs And Rebuild Behavior
- The notebooks reuse cached files in outputs/data/ and outputs/figures/ when they already exist.
- Missing artifacts are rebuilt automatically by the relevant scripts.
- If you want a full refresh, set FORCE_REBUILD = True in the setup cell of the notebook you are running.
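The reuse-or-rebuild behavior described above follows a common caching pattern, sketched below. Only the `FORCE_REBUILD` flag name comes from the notebooks; the helper `load_or_build`, the text-file format, and the demo paths are assumptions for illustration, and the project's scripts may organize this differently.

```python
import tempfile
from pathlib import Path

FORCE_REBUILD = False  # mirrors the flag set in a notebook's setup cell


def load_or_build(path, build, force_rebuild=FORCE_REBUILD):
    """Return cached text at `path`, or call `build()` and cache its result."""
    path = Path(path)
    if path.exists() and not force_rebuild:
        return path.read_text()
    content = build()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content)
    return content


# Demo in a temporary directory so nothing under outputs/ is touched.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "outputs" / "data" / "summary.txt"
    calls = []

    def build_summary():
        calls.append(1)  # count how often the expensive step actually runs
        return "rows=3\n"

    first = load_or_build(target, build_summary)    # missing: built and cached
    second = load_or_build(target, build_summary)   # present: read from cache
    third = load_or_build(target, build_summary, force_rebuild=True)  # rebuilt
```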
Scripts Used By The Notebooks
The notebooks rely on these scripts:
- scripts/01_dataset_overview.py
- scripts/02_full_project_analysis.py
- scripts/03_profile_rating_extremes.py
- scripts/04_profile_feature_alignment.py
- scripts/05_degree_distribution_fit.py
If the notebook templates ever need to be regenerated, run:
python3 scripts/06_rebuild_notebooks.py
Paper Reproduction
After 04_final_plots_for_paper.ipynb has been executed, the paper figures should be available under outputs/figures/.
To compile the paper from the project root:
latexmk -pdf -cd paper/main.tex
Notes
- The dataset is large, so cached artifacts are used intentionally to keep the notebook workflow responsive.
- The last notebook is a final check that the generated figures and summary values still line up with the rest of the project.