Example: Joins + Aggregation
Demonstrates joining tables, grouping, summarizing, and producing tidy outputs for downstream analysis.
Curated examples of SQL, R, and Python work focused on reproducible data workflows. Repositories include small demonstration datasets where appropriate. Dissertation-specific code and data archives live under Data Repositories.
Query examples demonstrating joins, aggregation, and analysis-ready extracts. (Where relevant, projects include a small example schema or mock dataset.)
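A minimal sketch of a join-then-aggregate query, run through Python's built-in sqlite3 module so it executes without a database server. The `sites` and `samples` tables, their columns, and the values are hypothetical mock data, not taken from any repository listed here; the SQL itself uses only standard syntax that should also be valid in PostgreSQL.

```python
import sqlite3

# In-memory database with a small mock schema (hypothetical tables and values).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sites (site_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE samples (sample_id INTEGER PRIMARY KEY, site_id INTEGER, value REAL);
INSERT INTO sites VALUES (1, 'north'), (2, 'south'), (3, 'south');
INSERT INTO samples VALUES (1, 1, 2.0), (2, 1, 4.0), (3, 2, 10.0), (4, 3, 20.0);
""")

# Join samples to their site, then summarize to one tidy row per region.
query = """
SELECT s.region,
       COUNT(m.sample_id) AS n_samples,
       AVG(m.value)       AS mean_value
FROM sites AS s
LEFT JOIN samples AS m ON m.site_id = s.site_id
GROUP BY s.region
ORDER BY s.region;
"""
rows = conn.execute(query).fetchall()

for region, n_samples, mean_value in rows:
    print(region, n_samples, mean_value)
```

The LEFT JOIN keeps regions that have no samples (they would appear with a count of 0), which is often what a downstream analysis table needs.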
Queries that flag missingness, duplicates, and outliers, for use in QA/QC before modeling.
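The three kinds of checks can be sketched as follows, again using sqlite3 only so the example is self-contained. The `measurements` table, its columns, and the plausible range of 0 to 100 are assumptions made for illustration; a real project would take its thresholds from the codebook or domain knowledge.

```python
import sqlite3

# Mock table (hypothetical): one NULL value, one duplicated record, one extreme value.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE measurements (record_id INTEGER, subject_id TEXT, value REAL);
INSERT INTO measurements VALUES
  (1, 'a', 10.0), (2, 'b', 11.0), (3, 'c', NULL),
  (4, 'd', 9.0),  (4, 'd', 9.0),  (5, 'e', 250.0);
""")

# Missingness: COUNT(value) skips NULLs, so the difference is the NULL count.
missing = conn.execute(
    "SELECT COUNT(*) - COUNT(value) FROM measurements"
).fetchone()[0]

# Duplicates: record IDs that appear more than once.
duplicates = conn.execute("""
    SELECT record_id, COUNT(*) AS n
    FROM measurements
    GROUP BY record_id
    HAVING COUNT(*) > 1
    ORDER BY record_id
""").fetchall()

# Outliers: values outside an assumed plausible range (NULLs are not flagged here).
outliers = conn.execute("""
    SELECT record_id, value
    FROM measurements
    WHERE value NOT BETWEEN 0 AND 100
    ORDER BY record_id
""").fetchall()

print("missing values:", missing)
print("duplicated record_ids:", duplicates)
print("out-of-range rows:", outliers)
```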
Typical tools: SQL (PostgreSQL-style), structured query logic, reproducible extracts for analysis.
Statistical workflows, reproducible analysis scripts, visualization, and model evaluation. Examples emphasize clean data handling, transparent transformations, and interpretable outputs.
End-to-end analysis script illustrating data cleaning, exploratory analysis, dimensionality reduction, and model fitting with exportable outputs.
Example plots and reporting outputs designed for scientific communication (figures, summaries, codebook notes).
Typical tools: tidyverse, ggplot2, data.table; reproducible scripts; versioned outputs.
Scientific programming, data preprocessing, feature extraction, and machine-learning workflows. Portfolio examples are designed to run end-to-end and emphasize clarity, documentation, and reproducibility.
Demonstration pipeline that ingests raw inputs, performs QA/QC and transformations, extracts structured features, and outputs analysis-ready tables.
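A compact sketch of such a pipeline in pandas, assuming a tiny inline CSV in place of real raw inputs. The column names (`sample_id`, `collected`, `reading`), the values, and the function name `run_pipeline` are hypothetical; the stages mirror the description above: ingest, QA/QC, transform, and export an analysis-ready table.

```python
import io

import pandas as pd

# Hypothetical raw input: one duplicated row and one missing reading.
RAW_CSV = """sample_id,collected,reading
s1,2024-01-05,12.5
s1,2024-01-05,12.5
s2,2024-01-06,
s3,2024-02-10,14.0
s4,2024-02-11,15.5
"""


def run_pipeline(raw_text):
    """Ingest raw text, apply QA/QC, and return (features, qa_report)."""
    # Ingest: parse the CSV and convert the date column.
    df = pd.read_csv(io.StringIO(raw_text), parse_dates=["collected"])
    n_raw = len(df)

    # QA/QC: drop exact duplicate rows, then rows with no reading.
    df = df.drop_duplicates()
    n_dedup = len(df)
    df = df.dropna(subset=["reading"])
    qa_report = {
        "duplicates_removed": n_raw - n_dedup,
        "missing_removed": n_dedup - len(df),
    }

    # Transform: derive a month key, then summarize to one row per month.
    df = df.assign(month=df["collected"].dt.strftime("%Y-%m"))
    features = df.groupby("month", as_index=False).agg(
        n_samples=("sample_id", "count"),
        mean_reading=("reading", "mean"),
    )
    return features, qa_report


features, qa_report = run_pipeline(RAW_CSV)
print(qa_report)
print(features)
```

Returning the QA counts alongside the output table keeps a record of what was dropped and why, which helps when a run needs to be audited or reproduced.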
Lightweight classification workflow showing train/test split strategy, baseline models, evaluation metrics, and clear reporting.
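One way such a workflow might look with scikit-learn, using a synthetic dataset so the sketch runs without external files. The dataset parameters, the choice of logistic regression, and the metric selection are illustrative assumptions, not a description of the actual portfolio project.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Synthetic two-class data (assumed for illustration only).
X, y = make_classification(
    n_samples=400,
    n_features=6,
    n_informative=3,
    n_redundant=0,
    n_clusters_per_class=1,
    class_sep=2.0,
    random_state=0,
)

# Stratified split keeps class proportions equal in train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Compare a trivial baseline against a simple, interpretable model.
models = {
    "baseline (most frequent)": DummyClassifier(strategy="most_frequent"),
    "logistic regression": LogisticRegression(max_iter=1000),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "f1": f1_score(y_test, pred, zero_division=0),
    }

# Report every model with the same metrics, side by side.
for name, metrics in results.items():
    print(f"{name}: accuracy={metrics['accuracy']:.3f}, f1={metrics['f1']:.3f}")
```

Including the most-frequent-class baseline makes the reported scores easier to interpret: a model is only informative to the extent that it beats the trivial predictor.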
Typical tools: pandas, numpy, scikit-learn, matplotlib; Git/GitHub; reproducible scripts + READMEs.