About
I am a Ph.D. researcher in Computer Science at the University of Central Florida (Institute of Artificial Intelligence), advised by Dr. Mubarak Shah.
My research focuses on inference-time scaling and alignment for foundation models. I specialize in developing training-free methods that elicit stronger reasoning performance and structural consistency from generative models, and I am currently applying these test-time compute methods to enforce physical and temporal consistency in generative world models.
Prior to my Ph.D., I worked as a Computer Engineer building high-fidelity physics simulations in Unreal Engine 4. I hold a B.Sc. in Computer Science, Physics, and Mathematics (Triple Major) from Vanderbilt University.
Selected Manuscripts
Test-Time Scaling in Diffusion LLMs via Hidden Semi-Autoregressive Experts
J. Lee, H. Moon, K. Zhai, A.K. Chithanar, A.K. Sahu, S. Kar, C. Lee, S. Chakraborty, A.S. Bedi
[ACCEPTED] ICLR 2026
Proposed HEX, a training-free inference-time scaling method that leverages hidden semi-autoregressive experts to outperform fine-tuned baselines.
CORE: Context-Robust Remasking for Diffusion Language Models
K. Zhai, S. Mollah, Z. Wang, M. Shah
Under Review
Demonstrated that existing diffusion language models suffer from context rigidity. Developed Context-Robust Remasking (CORE) to identify and revise context-brittle tokens, boosting MBPP code generation by up to +9.2%.
Mitigating Reward Hacking in Inference-Time Noise Optimization for Text-to-Image Diffusion Models
K. Zhai, U. Singh, A. Thatipelli, S. Chakraborty, A.K. Sahu, F. Huang, A.S. Bedi, M. Shah
Under Review
Identified that inference-time noise optimization fundamentally leads to reward hacking. Designed a training-free alignment algorithm that uses score-based KL regularization to mitigate this vulnerability, achieving over 80% human preference win rates.
Consensus-to-Personal: Inference-Time Alignment for Text-to-Image Personalization
K. Zhai, U. Singh, S. Chakraborty, C. Rastogi, F. Huang, A.S. Bedi, M. Shah
Under Review
Addressed the inherent conflict between consensus alignment and user personalization. Architected a two-stage framework that delegates personalization to inference-time steering, achieving up to 85.31% win rates on personalized safety tasks.
Repair-Aware Forgetting: An Iterative Approach to Unlearning in T2I Diffusion Models
S. Ghosh, K. Zhai, U. Singh, S. Chakraborty, A.S. Bedi, M. Shah
Under Review