Analytics & Modeling Portfolio

Applied quantitative analysis translating research-grade methodology into organizational intelligence — from interactive ROI tools and predictive models to operations dashboards and formal case studies. Work spans multivariate and logistic regression, discrete-time survival analysis, ARIMA forecasting, cost-consequence modeling, A/B testing, longitudinal comparisons, and survey-weighted estimation using Stata, SAS, R, and Power BI. Data sources include NSOC, NHIS, Household Pulse Survey, ICD-10 administrative claims, employer HR systems, and POS operational databases.

Survival Analysis | Logistic Regression | Multivariate Regression | ARIMA Forecasting | Survey-Weighted Estimation | ROI Modeling | Stata · SAS · R | Power BI · Excel | A/B Testing | Claims Data | NSOC · NHIS · HPS | SQL

Featured Interactive Tool

Built on NSOC Regression Analysis
Interactive Calculator

Caregiver Strain ROI Calculator

Quantifies productivity loss by caregiver strain level and calculates employer ROI from caregiver-support benefit investments. Powered by ordinal logistic regression results from the National Study of Caregiving (NSOC), with strain-level odds ratios ranging from 4× to 40×.

High-strain OR: 40× | Low-strain OR: 4× | NSOC sample n: 1,058 | Model sig.: p<.001

Caregiver Strain → Productivity Loss Odds Ratios

No Strain | Low Strain | Moderate: 15× | High Strain: 40×

Source: NSOC Ordinal Logistic Regression · n=1,058 · p<.001
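
The calculator's logic can be illustrated with a minimal sketch, written in Python rather than the Stata and SAS used for the analyses themselves. It converts an odds ratio into an implied probability of productivity loss and then into a simple ROI figure. Every input here (baseline probability, headcount, loss per case, relative reduction, program cost) is a hypothetical placeholder and not an NSOC estimate, and the sketch treats the odds ratio as applying to a binary productivity-loss outcome, which simplifies the ordinal model.

```python
def prob_from_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Convert a baseline probability and an odds ratio into the implied probability."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    odds = baseline_odds * odds_ratio
    return odds / (1.0 + odds)


def benefit_roi(headcount, baseline_prob, odds_ratio, loss_per_case,
                relative_reduction, program_cost):
    """Return (expected_loss, avoided_loss, roi) for one strain group.

    roi is net savings divided by program cost, so 0.0 means break-even.
    """
    p = prob_from_odds_ratio(baseline_prob, odds_ratio)
    expected_loss = headcount * p * loss_per_case
    avoided_loss = expected_loss * relative_reduction
    roi = (avoided_loss - program_cost) / program_cost
    return expected_loss, avoided_loss, roi


# Hypothetical inputs: 2% baseline risk, 100 high-strain employees,
# $5,000 loss per case, 20% relative reduction, $20,000 program cost.
p_high = prob_from_odds_ratio(0.02, 40.0)
expected_loss, avoided_loss, roi = benefit_roi(
    headcount=100, baseline_prob=0.02, odds_ratio=40.0,
    loss_per_case=5000.0, relative_reduction=0.20, program_cost=20000.0,
)
print(f"P(loss | high strain) = {p_high:.3f}, ROI = {roi:.2f}")
```

The conversion step matters because an odds ratio of 40 does not mean 40 times the probability: the implied probability depends on the baseline risk, so the calculator needs that baseline as an explicit input.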

Interactive Tools & Dashboards

Live Analytic Applications
Health Analytics · Interactive

Caregiver Work Productivity Risk Stratification Tool

Stratifies employees by caregiver strain level and predicts productivity risk tier. Powered by NSOC ordinal logistic regression — 40× peak odds ratio. Built for HR and benefits decision-makers.

Risk Stratification | Logistic Regression | NSOC | Interactive
Workforce Analytics · Dashboard

Multigenerational Workforce Benefits Dashboard

Visualizes benefits utilization, engagement, and health priorities segmented by generation (Gen Z, Millennial, Gen X, Boomer). Supports targeted benefits strategy and workforce planning decisions.

Multigenerational | Benefits Analytics | Workforce | Interactive
Strategic Analytics · OKR

TotalHealth OKR Performance Dashboard — FY2024

Executive-level OKR tracking dashboard linking workforce health objectives to measurable key results. Built for health plan and employer stakeholders to monitor program performance and goal attainment.

OKR Tracking | Health Plan | Executive Reporting | Interactive

Statistical Models

Formal Analyses & Applied Research
Survival Analysis · Stata

Discrete-Time Survival Analysis: Young Adult Worker Mental Health

Survey-weighted discrete-time hazard models examining timing of depression and loneliness onset across 9 Household Pulse Survey cycles. Grounded in Job Demands-Resources theory with marginsplot visualizations.

svy: logit | xtlogit | marginsplot | n=7,000+
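
A discrete-time hazard model is fit on person-period data: one row per respondent per cycle at risk, with an indicator that switches to 1 in the cycle where onset occurs. The sketch below, in Python rather than Stata, shows that expansion and the resulting life-table hazards. The records are hypothetical, survey weights are omitted for brevity, and the function names are illustrative only. A logit of the event indicator on cycle dummies alone would reproduce these per-cycle hazards.

```python
from collections import defaultdict


def person_period(records):
    """Expand (person_id, last_cycle, event) into one row per cycle at risk."""
    rows = []
    for pid, last_cycle, event in records:
        for cycle in range(1, last_cycle + 1):
            # The indicator is 1 only in the final observed cycle of an event case.
            y = 1 if (event and cycle == last_cycle) else 0
            rows.append((pid, cycle, y))
    return rows


def life_table(rows):
    """Return (cycle, n_at_risk, n_events, hazard, survival) for each cycle."""
    at_risk = defaultdict(int)
    events = defaultdict(int)
    for _, cycle, y in rows:
        at_risk[cycle] += 1
        events[cycle] += y
    table = []
    survival = 1.0
    for cycle in sorted(at_risk):
        hazard = events[cycle] / at_risk[cycle]
        survival *= 1.0 - hazard
        table.append((cycle, at_risk[cycle], events[cycle], hazard, survival))
    return table


# Hypothetical respondents: (id, last cycle observed, onset in that cycle?)
records = [("a", 2, True), ("b", 3, False), ("c", 1, True), ("d", 3, True)]
rows = person_period(records)
table = life_table(rows)
for cycle, n, d, h, s in table:
    print(f"cycle {cycle}: at risk={n}, events={d}, hazard={h:.3f}, survival={s:.3f}")
```

Respondent "b" is censored: they contribute three at-risk rows but never an event, which is how the person-period layout handles incomplete follow-up.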
Multivariate Regression

Impact of Consumer Call Campaign on Order Volume

Retail POS data regression analysis identifying call volume as a significant negative predictor of order volume (−1.398 units/call, p<.001). Adj. R² = 0.64. Interaction and non-linear terms examined.

Stepwise Regression | VIF | Adj R²=0.64
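
The two diagnostics named in the tags can be sketched briefly in Python. With exactly two predictors, the variance inflation factor reduces to 1 / (1 - r²), where r is their correlation; adjusted R² penalizes R² for the number of predictors. The data and the predictor names below are hypothetical, and the two-predictor shortcut is a simplification of the general VIF, which regresses each predictor on all the others.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF for either predictor in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical predictor values (not the retail POS data).
calls = [1, 2, 3, 4, 5]
promo = [2, 1, 4, 3, 5]
vif = vif_two_predictors(calls, promo)
adj = adjusted_r2(0.65, n_obs=100, n_predictors=3)
print(f"VIF = {vif:.2f}, adjusted R^2 = {adj:.4f}")
```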
Time Series · SAS

Seasonal Demand Forecasting: ARIMA Model with 12-Period Ahead Forecast

PROC ARIMA with seasonal decomposition and spectral analysis. Auto-selected ARIMA(1,1,1)(1,1,1) specification for operational call volume forecasting. Residual diagnostics and confidence interval output.

PROC ARIMA | Seasonal | 12-Period Forecast
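
The two differencing orders in an ARIMA(1,1,1)(1,1,1) model with a 12-month season correspond to applying (1 - B)(1 - B^12) to the series before the AR and MA terms are estimated. A minimal Python sketch on a hypothetical series (a linear trend plus a repeating monthly pattern, not the call-volume data) shows how the two differences remove both components:

```python
def difference(series, lag):
    """Return series[t] - series[t - lag] for every t where both terms exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def seasonal_difference(series, season=12):
    """Apply (1 - B)(1 - B^season): a seasonal difference, then a first difference."""
    return difference(difference(series, season), 1)


# Hypothetical monthly series: linear trend plus a fixed 12-month pattern.
pattern = [5, 3, 0, -2, -4, -5, -3, 0, 2, 4, 6, 8]
series = [100 + 2 * t + pattern[t % 12] for t in range(36)]

stationary = seasonal_difference(series)
print(len(series), len(stationary), set(stationary))
```

Each difference shortens the series: 36 observations become 24 after the lag-12 difference and 23 after the first difference, which is why seasonal models need several full seasons of history.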

Code Showcase

SQL · Stata · SAS · SAS ARIMA
canadian_member_claims_integration.sql
-- Canadian Member Claims Integration Query
-- Integrates member demographics, claims activity, and digital health
-- platform engagement into a unified analytic dataset.

CREATE OR REPLACE TABLE ANALYTICS_DB.canadian_member_claims AS

WITH base_claims AS (
  SELECT
      c.member_id, c.claim_id, c.claim_type, c.diagnosis_group
    , c.service_date, c.paid_amount, c.provider_specialty
    , c.province, c.country
  FROM CLAIMS_DB.claim_header c
  WHERE c.country  =  'Canada'
    AND c.province <> 'QC'
),

platform_activity AS (
  SELECT
      p.member_id
    , COUNT(*)                                                        AS total_platform_events
    , COUNT(DISTINCT p.program_id)                                      AS distinct_programs
    , MAX(p.last_active_date)                                            AS last_platform_activity
    , SUM(CASE WHEN p.event_type = 'COACHING_SESSION' THEN 1 ELSE 0 END) AS coaching_sessions
    , SUM(CASE WHEN p.event_type = 'DIGITAL_CHECKIN'  THEN 1 ELSE 0 END) AS digital_checkins
  FROM PLATFORM_DB.member_activity p
  GROUP BY p.member_id
),

claim_aggregates AS (
  SELECT
      member_id
    , COUNT(DISTINCT claim_id)       AS total_claims
    , SUM(paid_amount)                 AS total_paid
    , AVG(paid_amount)                 AS avg_paid_per_claim
    , COUNT(DISTINCT diagnosis_group)  AS distinct_conditions
    , MIN(service_date)                AS first_claim_date
    , MAX(service_date)                AS last_claim_date
  FROM base_claims
  GROUP BY member_id
)

SELECT
    mp.account_owner, mp.member_id AS cuid
  , mp.billing_province, mp.market_segment, mp.primary_provider
  , mp.first_eligibility_year, mp.last_eligibility_year
  , ca.total_claims, ca.total_paid, ca.avg_paid_per_claim
  , ca.distinct_conditions, ca.first_claim_date, ca.last_claim_date
  , pa.total_platform_events, pa.distinct_programs
  , pa.last_platform_activity, pa.coaching_sessions, pa.digital_checkins
  , CASE WHEN ca.total_claims          > 0 THEN 1 ELSE 0 END AS has_claims
  , CASE WHEN pa.total_platform_events > 0 THEN 1 ELSE 0 END AS engaged_in_platform
FROM MEMBER_DB.member_dim mp
  LEFT JOIN claim_aggregates ca ON mp.member_id = ca.member_id
  LEFT JOIN platform_activity pa ON mp.member_id = pa.member_id
WHERE mp.country = 'Canada' AND mp.billing_province <> 'QC'
ORDER BY mp.market_segment, mp.account_owner, mp.member_id;
// Discrete-Time Survival Analysis: Depression Onset
// Household Pulse Survey · Survey-weighted · JD-R Theory

svyset [pw=pweight]

// Model D4: Full JD-R Model
svy: logit dep_first_event i.cycle ///
    dem_fininst not_married ///
    i.work_arrangement i.social3 ///
    i.age_group i.gender i.race_ethnicity ///
    i.education i.ann_income i.region
estimates store D4_full

// Model D5: Interaction — Work Arrangement × Social Connection
svy: logit dep_first_event i.cycle ///
    dem_fininst not_married ///
    i.work_arrangement##i.social3 ///
    i.age_group i.gender i.race_ethnicity ///
    i.education i.ann_income i.region

testparm i.work_arrangement#i.social3

margins work_arrangement, at(social3=(1 2 3)) ///
    vce(unconditional) post
marginsplot, title("Predicted Depression Hazard", size(small)) ///
    xtitle("Social Connection Frequency") ///
    ytitle("P(Depression Onset | At Risk)") ///
    scheme(stgcolor_mv)
graph export "margins_depression_interaction.png", replace width(1200)
/* Tuition Pricing Model — PROC GLMSELECT */
/* n=1,283 · Stepwise selection · Adj R²=0.71 */

PROC MI DATA=project OUT=proj_impute SEED=97071 NIMPUTE=1;
  fcs;
  VAR tuition top25 sf_ratio fac_comp collrate
      graduat pct_phd fulltime alumni num_enrl parochial;
RUN;

PROC GLMSELECT DATA=proj_impute PLOTS=All SEED=523654;
  CLASS parochial;
  PARTITION fraction(Test=0.4);
  MODEL tuition = top25 sf_ratio fac_comp collrate
      graduat pct_phd fulltime alumni num_enrl parochial
      / Selection=Stepwise(Select=SL SLE=0.15 SLS=0.15
        Choose=AdjRSQ) Details=ALL Hierarchy=Single
        Stats=ALL showpvalues;
  OUTPUT Out=Orig;
RUN;

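/* Assumption: enrlproch is the enrollment-by-parochial interaction
   (num_enrl * parochial, with parochial coded 0/1). It is not created
   elsewhere in this excerpt, so it is built here before PROC REG. */
DATA proj_model;
  SET proj_impute;
  enrlproch = num_enrl * parochial;
RUN;

PROC REG DATA=proj_model;
  MODEL tuition = sf_ratio fac_comp graduat pct_phd
      fulltime alumni parochial enrlproch / PARTIAL;
RUN;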
/* ARIMA Demand Forecasting — 12-Period Ahead */
/* Operations Call Volume · Seasonal ARIMA(1,1,1)(1,1,1) */

PROC ARIMA DATA=monthly_calls;
  IDENTIFY VAR=call_volume(1,12)
    NLAG=36 STATIONARITY=(ADF=2);
RUN;

PROC ARIMA DATA=monthly_calls;
  IDENTIFY VAR=call_volume(1,12);
  /* Seasonal AR and MA terms use the multiplicative factor syntax: (1)(12) */
  ESTIMATE P=(1)(12) Q=(1)(12) NOSTABLE;
  FORECAST LEAD=12 INTERVAL=month
    ID=date OUT=forecast_out;
RUN;

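DATA forecast_final;
  SET forecast_out;
  /* The PROC ARIMA OUT= data set has no _TYPE_ variable. The 12 forecast
     rows are those where the observed call_volume is missing (assumes no
     gaps inside the historical series). */
  IF MISSING(call_volume);
  upper_bound = forecast + 1.96 * std;
  lower_bound = forecast - 1.96 * std;
  FORMAT date MONYY7.;
  KEEP date forecast std upper_bound lower_bound;
RUN;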

Visualizations & Infographics

Complex models to decision-ready visuals

Caregiver Strain & Workforce Productivity

NSOC regression findings translated into a clear, executive-ready narrative.

Dataset: NSOC | Regression: β | Impact: Work
Regression | Caregiving | Productivity

Mental Health & Work Arrangement

HPS survival analysis visualized for intuitive interpretation.

Dataset: HPS | Hazard Ratio: HR | Work: Hybrid
Survival Models | Mental Health | Work Arrangement

"I Can Tell the Story in Any Medium"

Power BI, Excel, Canva, and PowerPoint—same analytic spine, different audiences.

Dashboards: BI | Decks: PPT | Visuals: Canva
Power BI | Excel | Canva | PowerPoint

Full Analytics Lifecycle

Methodology to Action
1. Descriptive: Data prep, EDA, and baseline profiling
2. Diagnostic: Root cause analysis and pattern detection
3. Predictive: Regression, survival, and forecasting models
4. Prescriptive: ROI modeling and decision frameworks
5. Communicate: Dashboards, reports, and executive summaries

Let's Build Something Together

Available for full-time remote roles and consulting engagements in health analytics, people analytics, and workforce health research.
