Analytics & Modeling

Interactive tools, predictive models, statistical code, and dashboards — translating research-grade methodology into decisions organizations can act on.

Models, dashboards, case studies

Discrete-Time Survival · Stata

Work Arrangements & Mental Health Onset

Survey-weighted discrete-time hazard models for depression and loneliness onset across 9 Household Pulse Survey cycles. Logit and cloglog link functions with marginsplot interaction visualization.

svy: logit · HPS Cycles 1–9 · Age 18–34
Preprint

Ordinal Logistic Regression · SAS

Working Caregiver Strain & Productivity Loss

Cumulative logit model on NSOC data identifying caregiver strain as a predictor of productivity impairment. Results power the ROI calculator above. Proportional odds assumption verified.

n=1,058 · OR: 40× · NSOC
View on ResearchGate

Multivariate Regression · SAS GLMSELECT

Competitive Tuition Pricing Model

Stepwise selection from 11 predictors for parochial institution pricing. Multiple imputation, Box-Cox transformation, 60/40 train-test validation split. Adj R² = 0.71.

n=1,283 · Adj R²=0.71 · F=392.19
View Code & Output

Multivariate Regression · Minitab/SAS

Call Campaign Impact on Order Volume

Stepwise regression with interaction terms examining how call campaigns, traffic, prospects, and cancellations jointly predict retail order volume. Cook's D diagnostics, VIF checks.

R²=0.64 · p<.001 · −1.4 units/call
View Report

Comparative Analysis · R

USG Faculty Salary Analysis (5 Institutions)

R-based salary analysis across Georgia System institutions and faculty ranks. Welch t-tests, Kruskal-Wallis, boxplots with jitter overlay. YOY percent change by position and institution.

5 institutions · sqldf · ggplot2 · 2011–2012
View R Code

Time Series Forecasting · SAS PROC ARIMA

Demand Forecasting for Operations Planning

Seasonal ARIMA model for operational demand — enabling staffing optimization and capacity planning. ACF/PACF analysis, seasonal decomposition, rolling forecast validation.

PROC ARIMA · 12-period forecast · α=0.05
View on ResearchGate

Workforce Analytics · Dashboard · Prototype

Multigenerational Workforce Benefits Dashboard

23%
Wasted Spend (High use · low satisfaction)
$2.1M
Est. Productivity Loss (WPAI-modeled absenteeism)

Visualizes benefits utilization, engagement, and health priorities segmented by generation (Gen Z, Millennial, Gen X, Boomer). Supports targeted benefits strategy and workforce planning decisions.

WPAI-GH · USPQ-Aligned · Efficiency Matrix / Benefit Scoring Framework · HTML
View Demo ↗

Objectives · Performance Monitoring · Risk Identification · Gap Analysis

TotalHealth OKR Performance Dashboard · Prototype

7
Identified Objectives
8
Financial Performance Indicators (Including

Interactive executive-level OKR tracking dashboard linking workforce health objectives to measurable key results. Built for health plan and employer stakeholders to monitor program performance and goal attainment.

Custom Survey · Evaluation & Measurement · Economic Models · HTML
View Demo ↗

Case Study 01

Chat Expansion A/B Test & Business Case

Pre-post analysis and business case supporting the expansion of chat as a service channel. Regression on handle time, cost-per-contact comparison, and projected ROI from channel shift.

📊 Business case approved — chat expansion implemented with projected 18% cost reduction per contact
View Case Study

Case Study 02

Operations Demand Forecasting & Staffing Model

End-to-end demand forecasting using ARIMA time series modeling. Enabled data-driven staffing decisions by predicting contact volume 12 weeks out with quantified confidence intervals.

📈 Forecast accuracy improved; cross-validating model predictions against actuals reduced over- and understaffing events
View Case Study

Case Study 03

NLP Survey Text Analysis & Theme Extraction

Text mining of open-ended employee survey responses to surface latent themes in workplace wellbeing data. Results directly informed benefit program changes and communication strategy.

🔍 5 actionable themes from 2,000+ responses — directly informed 2023 benefits redesign
View Case Study

The work, underneath

Real code from real studies, not pseudocode. SQL, Stata, R, and SAS samples from published or documented analyses.

Canadian_Member_Claims_Integration.sql
-- Canadian Member Claims Integration Query
-- Integrates member demographics, claims activity,
-- and digital-health platform engagement | Ex-Quebec

CREATE OR REPLACE TABLE ANALYTICS_DB.canadian_member_claims AS
WITH base_claims AS (
    SELECT
          c.member_id
        , c.claim_id
        , c.claim_type
        , c.diagnosis_group
        , c.service_date
        , c.paid_amount
        , c.provider_specialty
        , c.province
        , c.country
    FROM CLAIMS_DB.claim_header c
    WHERE c.country = 'Canada'
      AND c.province <> 'QC'
),

platform_activity AS (
    SELECT
          p.member_id
        , COUNT(*) AS total_platform_events
        , COUNT(DISTINCT p.program_id) AS distinct_programs
        , MAX(p.last_active_date) AS last_platform_activity
        , SUM(CASE WHEN p.event_type = 'COACHING_SESSION' THEN 1 ELSE 0 END) AS coaching_sessions
        , SUM(CASE WHEN p.event_type = 'DIGITAL_CHECKIN' THEN 1 ELSE 0 END) AS digital_checkins
    FROM PLATFORM_DB.member_activity p
    GROUP BY p.member_id
),

member_profile AS (
    SELECT
          m.member_id, m.account_owner
        , m.billing_city, m.billing_province, m.billing_postal
        , m.first_eligibility_year, m.last_eligibility_year
        , m.market_segment, m.primary_provider
    FROM MEMBER_DB.member_dim m
    WHERE m.country = 'Canada'
      AND m.billing_province <> 'QC'
),

claim_aggregates AS (
    SELECT
          member_id
        , COUNT(DISTINCT claim_id) AS total_claims
        , SUM(paid_amount) AS total_paid
        , AVG(paid_amount) AS avg_paid_per_claim
        , COUNT(DISTINCT diagnosis_group) AS distinct_conditions
        , MIN(service_date) AS first_claim_date
        , MAX(service_date) AS last_claim_date
    FROM base_claims
    GROUP BY member_id
)

SELECT
      mp.account_owner, mp.member_id AS cuid
    , mp.billing_city, mp.billing_province, mp.market_segment
    , mp.first_eligibility_year, mp.last_eligibility_year
    -- Claims aggregates
    , ca.total_claims, ca.total_paid
    , ca.avg_paid_per_claim, ca.distinct_conditions
    , ca.first_claim_date, ca.last_claim_date
    -- Platform engagement
    , pa.total_platform_events, pa.distinct_programs
    , pa.coaching_sessions, pa.digital_checkins
    -- Derived flags
    , CASE WHEN ca.total_claims > 0 THEN 1 ELSE 0 END AS has_claims
    , CASE WHEN pa.total_platform_events > 0 THEN 1 ELSE 0 END AS engaged_in_platform
FROM member_profile mp
LEFT JOIN claim_aggregates ca  ON mp.member_id = ca.member_id
LEFT JOIN platform_activity pa ON mp.member_id = pa.member_id
ORDER BY mp.market_segment, mp.account_owner, mp.member_id;
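The derived flags at the end of the query depend on how LEFT JOIN and CASE treat members with no matching rows: the aggregate columns come back NULL, `NULL > 0` is not true, and the flag falls through to 0. A minimal sketch of that behavior, written in Python against an in-memory SQLite database. The two toy tables and their rows are hypothetical stand-ins, not the production schema, and SQLite is assumed here only because it ships with Python.

```python
import sqlite3

# Hypothetical toy tables standing in for member_profile and claim_aggregates.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member_profile (member_id TEXT PRIMARY KEY);
    CREATE TABLE claim_aggregates (member_id TEXT, total_claims INTEGER);
    INSERT INTO member_profile VALUES ('A'), ('B');
    INSERT INTO claim_aggregates VALUES ('A', 3);  -- member B has no claims
""")

# Same join and flag pattern as the query above, reduced to one aggregate.
rows = conn.execute("""
    SELECT mp.member_id
         , ca.total_claims
         , CASE WHEN ca.total_claims > 0 THEN 1 ELSE 0 END AS has_claims
    FROM member_profile mp
    LEFT JOIN claim_aggregates ca ON mp.member_id = ca.member_id
    ORDER BY mp.member_id
""").fetchall()

for member_id, total_claims, has_claims in rows:
    print(member_id, total_claims, has_claims)
```

Member B keeps its row with a NULL claim count and a flag of 0, so non-claimants stay in the output and the flag needs no COALESCE of its own.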
// Discrete-Time Survival Analysis: Depression Onset
// Household Pulse Survey, Cycles 1–9 | JD-R Framework

// Set survey design weights
svyset [pw=pweight]

// Model D4: Full JD-R Model — Demands + Resources
svy: logit dep_first_event i.cycle ///
    dem_fininst not_married ///
    i.work_arrangement i.social3 ///
    i.age_group i.gender i.race_ethnicity ///
    i.education i.ann_income i.region, or

// Model D5: Work Arrangement × Social Connection Interaction
svy: logit dep_first_event i.cycle ///
    dem_fininst not_married ///
    i.work_arrangement##i.social3 ///
    i.age_group i.gender i.race_ethnicity ///
    i.education i.ann_income i.region

// Test interaction significance
testparm i.work_arrangement#i.social3

// Marginal effects at representative values
margins work_arrangement, at(social3=(1 2 3)) ///
    vce(unconditional) post
marginsplot, title("Predicted Depression Hazard")
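Both models treat each respondent-cycle as one row and model the probability of a first depression event in that cycle, given no event in earlier cycles. A minimal Python sketch of the underlying quantities, assuming a hypothetical person-period list rather than the HPS file and ignoring survey weights and covariates: the life-table hazard per cycle, the implied survival curve, and the logit and complementary log-log transforms that the two link functions apply to a hazard.

```python
import math
from collections import defaultdict

# Hypothetical person-period records: (person_id, cycle, first_event).
# Each person contributes rows until the first event or censoring.
person_period = [
    (1, 1, 0), (1, 2, 1),
    (2, 1, 0), (2, 2, 0), (2, 3, 0),
    (3, 1, 1),
    (4, 1, 0), (4, 2, 0), (4, 3, 1),
]

def life_table_hazard(rows):
    """Return {cycle: events / at-risk}, the unadjusted discrete-time hazard."""
    at_risk = defaultdict(int)
    events = defaultdict(int)
    for _, cycle, event in rows:
        at_risk[cycle] += 1
        events[cycle] += event
    return {c: events[c] / at_risk[c] for c in sorted(at_risk)}

def survival_curve(hazard):
    """Return {cycle: probability of no event through that cycle}."""
    surv, running = {}, 1.0
    for cycle, h in hazard.items():
        running *= 1.0 - h
        surv[cycle] = running
    return surv

def logit(h):
    """Log-odds of the hazard (the link used by a logit hazard model)."""
    return math.log(h / (1.0 - h))

def cloglog(h):
    """Complementary log-log of the hazard (the alternative link)."""
    return math.log(-math.log(1.0 - h))

hazard = life_table_hazard(person_period)
survival = survival_curve(hazard)
for cycle in hazard:
    print(cycle, round(hazard[cycle], 3), round(survival[cycle], 3))
```

A regression on person-period rows with cycle indicators generalizes this table: the cycle terms recover the baseline hazard on the link scale, and the covariates shift it.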
# USG Faculty Salary Analysis — R
# 5 Georgia System institutions, 2011–2012
library(sqldf); library(stringr); library(gplots)

# Merge years across all 5 institutions
mergeUSG <- rbind(gsouth11, gsouth12, gsu11, gsu12,
                   ksu11, ksu12, uwg11, uwg12, vsu11, vsu12)

# Clean salary: strip $, comma, whitespace → numeric
NewSalary <- str_trim(gsub("[$,]", "", completeUSG$Salary))
SalaryNum  <- as.numeric(NewSalary)

# T-test: salary change 2011 → 2012 by position
INST_t       <- t.test(SalaryNum ~ Fiscal.Year, data=INST)
PROFassist_t <- t.test(SalaryNum ~ Fiscal.Year, data=PROFassist)

# Kruskal-Wallis: salary distribution by fiscal year
kruskal.test(SalaryNum ~ as.factor(Fiscal.Year), data=INST)

# YOY percent change by position
INSTyoy <- (mean(INST12$SalaryNum, na.rm=TRUE) -
            mean(INST11$SalaryNum, na.rm=TRUE)) /
            mean(INST11$SalaryNum, na.rm=TRUE)
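The cleaning and year-over-year steps in the R sample can be checked by hand on a few values. A minimal Python sketch, assuming hypothetical salary strings rather than USG records: it strips the dollar sign, commas, and surrounding whitespace, drops values that are not numeric, and expresses the change in mean salary as a percentage.

```python
from statistics import mean

def clean_salary(raw):
    """Convert a string such as ' $52,300.00 ' to a float; None if not numeric."""
    text = raw.strip().replace("$", "").replace(",", "")
    try:
        return float(text)
    except ValueError:
        return None

def yoy_percent_change(prior, current):
    """Percent change in mean salary from the prior year to the current year."""
    prior_vals = [v for v in prior if v is not None]
    current_vals = [v for v in current if v is not None]
    prior_mean = mean(prior_vals)
    return 100.0 * (mean(current_vals) - prior_mean) / prior_mean

# Hypothetical values for illustration only, not USG data.
salaries_2011 = [clean_salary(s) for s in ["$50,000", " $60,000 ", "n/a"]]
salaries_2012 = [clean_salary(s) for s in ["$52,000", "$63,500"]]
change = yoy_percent_change(salaries_2011, salaries_2012)
print(f"{change:.1f}%")
```

The R expression returns a proportion; this sketch multiplies by 100 so the result reads directly as a percent.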
/* ARIMA Demand Forecasting: Operations Planning */
/* SAS PROC ARIMA — Seasonal ARIMA(1,1,1)(1,1,1)₁₂ */

PROC ARIMA DATA=ops_data;
  /* Identify autocorrelation structure */
  IDENTIFY VAR=demand(1,12) NLAG=36
    STATIONARITY=(ADF);

  /* Estimate seasonal model; differencing is already set by VAR=demand(1,12) above */
  ESTIMATE
    P=(1)(12) Q=(1)(12)
    NOINT METHOD=ML PRINTALL;

  /* 12-period-ahead forecast with 95% confidence limits */
  FORECAST
    LEAD=12 INTERVAL=MONTH ID=date
    NOOUTALL OUT=forecast_out ALPHA=0.05;
RUN;
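`VAR=demand(1,12)` in the IDENTIFY statement asks PROC ARIMA to difference the series at lag 1 and then at lag 12 before modeling. A minimal Python sketch of that transformation, using a hypothetical series built from a linear trend plus a fixed 12-period pattern; for a series of exactly that form, the double difference is zero everywhere.

```python
def difference(series, lag):
    """Return series[t] - series[t - lag] for every t with a valid lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Hypothetical demand: linear trend plus a fixed 12-period seasonal pattern.
seasonal = [5, 3, 0, -2, -4, -6, -5, -3, 0, 2, 4, 6]
demand = [100 + 2 * t + seasonal[t % 12] for t in range(36)]

regular = difference(demand, 1)       # removes the linear trend
stationary = difference(regular, 12)  # removes the repeating seasonal pattern

print(len(demand), len(stationary))
print(stationary[:5])
```

The two differences consume 13 observations in total, which is worth keeping in mind when the available history is short relative to the seasonal period.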
Code↗

Let's build something
rigorous and useful

Available for full-time remote roles and consulting engagements in health analytics, people analytics, and workforce health research.