Coding Portfolio

Curated examples of SQL, R, and Python work focused on reproducible data workflows. Repositories include small demonstration datasets where appropriate. Dissertation-specific code and data archives live under Data Repositories.

SQL

Query examples demonstrating joins, aggregation, and analysis-ready extracts. Where relevant, projects include a small example schema or mock dataset.

Example: Joins + Aggregation

Demonstrates joining tables, grouping, summarizing, and producing tidy outputs for downstream analysis.
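As a rough illustration of the join-then-aggregate pattern described above, the sketch below runs one query against an in-memory SQLite database from Python. The `sites` and `measurements` tables, their columns, and the sample rows are hypothetical and are not taken from any portfolio repository. SQLite stands in for PostgreSQL here only so the example is self-contained.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sites (
            site_id   INTEGER PRIMARY KEY,
            site_name TEXT NOT NULL
        );
        CREATE TABLE measurements (
            site_id INTEGER REFERENCES sites(site_id),
            value   REAL
        );
        INSERT INTO sites VALUES (1, 'north'), (2, 'south'), (3, 'west');
        INSERT INTO measurements VALUES (1, 2.0), (1, 4.0), (2, 10.0);
        """
    )
    return conn


def site_summary(conn):
    """Return one tidy row per site: name, observation count, mean value."""
    query = """
        SELECT s.site_name,
               COUNT(m.value) AS n_obs,      -- counts non-NULL values only
               AVG(m.value)   AS mean_value  -- NULL when a site has no rows
        FROM sites AS s
        LEFT JOIN measurements AS m
               ON m.site_id = s.site_id
        GROUP BY s.site_id, s.site_name
        ORDER BY s.site_name
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in site_summary(build_demo_db()):
        print(row)
```

The LEFT JOIN keeps sites that have no measurements, so the extract has exactly one row per site. An inner join would silently drop those sites.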

Typical tools: SQL (PostgreSQL-style), structured query logic, reproducible extracts for analysis.

R

Statistical workflows, reproducible analysis scripts, visualization, and model evaluation. Examples emphasize clean data handling, transparent transformations, and interpretable outputs.

Example Project: Reproducible Analysis Workflow

End-to-end analysis script illustrating data cleaning, exploratory analysis, dimensionality reduction, and model fitting with exportable outputs.

Example Project: Reporting + Visualization

Example plots and reporting outputs designed for scientific communication (figures, summaries, codebook notes).

Typical tools: tidyverse, ggplot2, data.table; reproducible scripts; versioned outputs.

Python

Scientific programming, data preprocessing, feature extraction, and machine-learning workflows. Portfolio examples are designed to run end-to-end and emphasize clarity, documentation, and reproducibility.

Example Project: Data Pipeline + Feature Extraction

Demonstration pipeline that ingests raw inputs, performs QA/QC and transformations, extracts structured features, and outputs analysis-ready tables.

  • Coming Soon!
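The stages described above (ingest, QA/QC, feature extraction, analysis-ready output) can be sketched with the standard library alone. This is a minimal illustration and not the project's code: the `RAW` sample, the `subject` and `reading` column names, and the valid range of 0 to 100 are all assumptions made for the example.

```python
import csv
import io
from statistics import mean

# Hypothetical raw input: one missing value, one non-numeric value,
# and one sentinel (-999) that falls outside the assumed valid range.
RAW = """subject,reading
a,1.0
a,3.0
b,
b,5.0
c,not_a_number
a,-999
"""


def load_rows(text):
    """Ingest: parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows, lo=0.0, hi=100.0):
    """QA/QC: keep only rows whose reading is numeric and within [lo, hi]."""
    kept = []
    for row in rows:
        try:
            value = float(row["reading"])
        except (TypeError, ValueError):
            continue  # missing or non-numeric reading
        if lo <= value <= hi:
            kept.append((row["subject"], value))
    return kept


def extract_features(pairs):
    """Feature extraction: one summary row per subject, sorted by subject."""
    grouped = {}
    for subject, value in pairs:
        grouped.setdefault(subject, []).append(value)
    return [
        {"subject": s, "n": len(v), "mean": mean(v), "max": max(v)}
        for s, v in sorted(grouped.items())
    ]


if __name__ == "__main__":
    for feature_row in extract_features(clean(load_rows(RAW))):
        print(feature_row)
```

Keeping each stage as a separate function makes it possible to test the stages in isolation and to log how many rows the QA/QC step discards.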

Example Project: Model Training + Evaluation

Lightweight classification workflow showing train/test split strategy, baseline models, evaluation metrics, and clear reporting.

  • Coming Soon!
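To make the idea of a split strategy plus a baseline concrete, here is a small standard-library sketch of a stratified train/test split, a majority-class baseline, and an accuracy score. It is a simplified stand-in for what a scikit-learn workflow would provide, not the project's implementation. The function names and the `neg`/`pos` labels are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Split indices so each class contributes about test_fraction to the test set."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # At least one test example per class; a class with a single
        # example therefore ends up entirely in the test set.
        n_test = max(1, round(len(indices) * test_fraction))
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


def majority_baseline(train_labels):
    """Baseline model: always predict the most common training label."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    labels = ["neg"] * 12 + ["pos"] * 4
    train_idx, test_idx = stratified_split(labels)
    guess = majority_baseline([labels[i] for i in train_idx])
    y_test = [labels[i] for i in test_idx]
    print(guess, accuracy(y_test, [guess] * len(y_test)))
```

Reporting the majority-class score alongside any trained model shows how much of the accuracy comes from class imbalance alone.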

Typical tools: pandas, numpy, scikit-learn, matplotlib; Git/GitHub; reproducible scripts + READMEs.

Notes on Data Sharing

Portfolio repositories may include synthetic, anonymized, or downsampled datasets for demonstration purposes. Dissertation-related code and derived datasets are organized separately under Dissertation Data & Code.