About — Shubham Gupta | AI Engineer, Penn State CS


My interest in engineering started with robotics—moving atoms. It evolved into software—moving information. Today, I focus on the 'plumbing' that makes AI viable: the infrastructure, cost-efficiency, and interpretability that allow powerful models to function in the real world.

Whether I am architecting a distributed storage system in C, deploying EfficientNet-B3 to production via Railway and HuggingFace, or researching LLM-based annotation pipelines on HPC clusters — my goal is the same: build systems that are not just fast, but reliable, transparent, and trustworthy.

Engineering Philosophy

Systems First

Optimize the whole before the parts. A 10× faster model loses to a 2× faster pipeline.

Responsible AI

Interpretability and fairness are constraints from line one — not features bolted on at the end. My AIES-26 research asks whether humans actually accept algorithmic explanations, not just whether they exist.

Efficiency Obsessed

Tokens, milliseconds, and bytes are the budget. Class-imbalanced data, cold-start inference, GPU memory pressure — every project on this site is a different version of "make it work under real constraints."

Education

Penn State University

Schreyer Honors College

B.S. Computer Science

Minor in AI Engineering

Expected Graduation: Dec. 2026

GPA: 3.83

Research

Accepted

AAAI/ACM AIES-26 Conference on AI, Ethics & Society

"Do Personalized Evaluation Functions Reflect Human Preferences? A Study of Weighted Proximity in Algorithmic Recourse"

My contribution:

Engineered the AWP evaluation pipeline and ran ablation studies across 5 distance metrics
Wrote the experimental design section; designed the user study instrument
Result: AWP-generated recourse achieved 84% human preference prediction vs. proximity-only baseline

In Progress

Penn State Schreyer Honors Thesis · With Prof. Qunhua Li, Department of Statistics

"LLM-based Codebook Annotation for Political Science Text"

Research question:

Can LLMs reliably annotate political events using expert-defined codebooks — and do they follow definitions, or just label-name heuristics?

Novel contributions:

Definition compliance metric — swapped-label accuracy to test whether models use definitions vs. surface label names
RAG-augmented annotation pipeline (k=5) — weighted F1 = 0.7012 vs. 0.6494 baseline
Running experiments on PSU ICDS HPC across 3 datasets: CCC, BFRS, Manifestos
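
The swapped-label idea can be sketched roughly as follows. This is a minimal illustration, not the thesis code: the function names, the two-label codebook, and the exact scoring rule are assumptions made for the example. The model is shown a codebook in which each definition has been filed under a different label name. A prediction that matches the new name counts as following the definition, and a prediction that matches the original name counts as following the surface label.

```python
from typing import Dict, List


def swap_codebook(codebook: Dict[str, str], mapping: Dict[str, str]) -> Dict[str, str]:
    """Return a codebook in which each definition is filed under a new label name.

    mapping[original_label] is the label name that now carries
    original_label's definition (hypothetical convention for this sketch).
    """
    return {mapping[label]: definition for label, definition in codebook.items()}


def swapped_label_scores(
    gold: List[str], preds_swapped: List[str], mapping: Dict[str, str]
) -> Dict[str, float]:
    """Score predictions that were made under the swapped codebook.

    gold holds the original (unswapped) gold labels.
    """
    if not gold:
        raise ValueError("gold must not be empty")
    if len(gold) != len(preds_swapped):
        raise ValueError("gold and preds_swapped must have the same length")

    n = len(gold)
    # Prediction equals the name that now carries the gold definition.
    follows_definition = sum(p == mapping[g] for g, p in zip(gold, preds_swapped))
    # Prediction equals the original gold name, ignoring the swapped definition.
    follows_name = sum(p == g for g, p in zip(gold, preds_swapped))
    return {
        "definition_compliance": follows_definition / n,
        "label_name_heuristic": follows_name / n,
    }


if __name__ == "__main__":
    # Toy codebook and model outputs, invented purely for illustration.
    codebook = {
        "protest": "A public gathering that expresses a political demand.",
        "strike": "A collective work stoppage by employees.",
    }
    mapping = {"protest": "strike", "strike": "protest"}
    gold = ["protest", "strike", "protest", "strike"]
    preds = ["strike", "protest", "protest", "protest"]
    print(swap_codebook(codebook, mapping))
    print(swapped_label_scores(gold, preds, mapping))
```

The mapping should have no fixed points (no label mapped to itself), otherwise a single prediction could count toward both scores.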

Experience

AI Innovation Fellow

Deloitte · June 2026 – Aug. 2026

Selected for the competitive AI Innovation program in Philadelphia, PA, to build production software systems and scalable AI infrastructure for enterprise applications.

LLM Engineering · Enterprise AI · Production Systems

Undergraduate Research Assistant

Penn State, Prof. Qunhua Li · Jan. 2026 – Present

Engineered end-to-end NLP pipeline for large-scale text annotation on the ICDS Roar HPC cluster using distributed compute. Designed automated evaluation metrics comparing LLM annotation outputs across thousands of political science text samples.

NLP · HPC / SLURM · LLM Evaluation · Python

Systems Programming Learning Assistant

Penn State University · Aug. 2025 – Jan. 2026

Guided 100+ students in C and Unix systems programming — memory management, process control, file I/O, and concurrency. Reduced runtime errors by 30%.

C · Unix Systems · Concurrency

IT Summer Associate

UPMC · June 2025 – Aug. 2025

Engineered the Nursing Matrix application using ASP.NET Core and C#, integrating Epic EHR data across four hospital units serving 200+ staff. Architected a RESTful API layer with Redis caching, reducing query latency by 18% under concurrent load.

C# / ASP.NET · Redis · Epic EHR · React + TypeScript

AI & Automation Developer

Catalyst Solutions · May 2024 – Sep. 2024

Built AI and RPA solutions to automate business processes on the Artificial Intelligence/Automation team. Worked with Generative AI, OpenAI API, N8N workflows, Google Cloud, and Google Colab. Remote.

OpenAI API · N8N · Google Cloud · RPA

Open Source

Merged PR

pytorch/ignite · 5.1K ⭐ · PyTorch high-level training library

Implemented `CharacterErrorRate` metric — `ignite.metrics.nlp`

Contribution:

Character-level Levenshtein distance metric for ASR/OCR evaluation — fills a gap in ignite's NLP metrics suite
235 lines of implementation · 15 test cases · full Sphinx documentation
Shipped in v0.5.2
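
For readers unfamiliar with the metric, character error rate is the total character-level edit distance between hypotheses and references, divided by the total number of reference characters. The sketch below is a standalone pure-Python illustration of that definition. It is not the code merged into ignite, and the function names `levenshtein` and `character_error_rate` are chosen here for the example only.

```python
from typing import Sequence


def levenshtein(ref: str, hyp: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    # prev[j] holds the distance between the first i-1 chars of ref
    # and the first j chars of hyp; only two rows are kept in memory.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete r from ref
                    curr[j - 1] + 1,     # insert h into ref
                    prev[j - 1] + cost,  # substitute r with h (or match)
                )
            )
        prev = curr
    return prev[-1]


def character_error_rate(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level CER: total edits divided by total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(character_error_rate(["hello"], ["hallo"]))
```

Because the rate is normalized by reference length only, it can exceed 1.0 when a hypothesis is much longer than its reference.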

PyTorch · Python · NLP Metrics · Levenshtein Distance

Current Focus

Honors Thesis — NLP Research

LLM-based codebook annotation for political science text with Prof. Qunhua Li (Penn State Statistics). Novel contributions: definition-aware synthetic QA generation and a definition compliance metric. Targeting an EMNLP or ACL venue.

Deloitte AI Innovation Fellow

Starting June 2026 — building production AI systems and enterprise LLM applications in Philadelphia.

Portfolio Projects

Shipping production-quality ML products: SkinIQ (live at skin-iq.vercel.app), LectureLens (in progress). Each project goes from model training through full deployment.

Beyond the Resume

FIRST Global Mentor

5 years mentoring FTC robotics teams across 4 countries (Belize, Niger, Rwanda, Bolivia). 50+ students. Leadership under hard constraints — robots have to work on competition day.

Ri3D Founding Member + Mechanical Lead

Robot in 3 Days — full competition robot designed and built in 72 hours. Ships under pressure.

Reading

Currently working through Kleppmann's Designing Data-Intensive Applications. Long reads on infra trade-offs over short opinions on frameworks.

Golf

Rewards systems thinking — every round is a feedback loop between strategy, execution, and adjustment under pressure.

Writing

Published

Medium

How I Got My First Contribution Merged Into PyTorch/Ignite

A walkthrough of contributing CharacterErrorRate to pytorch/ignite: finding the right issue, navigating open source conventions, writing the metric and tests, and getting a PR merged into a 5.1K ⭐ library.