I'm an AI Engineer with 4+ years of experience across the full ML stack, from NLP pipelines and distributed inference systems to multi-agent LLM architectures and Vision-Language-Action (VLA) models for robot learning.
My work sits at the intersection of research and deployment. At Espercare, I architected a router-orchestrated multi-agent system on Gemini handling 2–3K queries/day at 92% routing accuracy and ~2 s latency. I shipped a grounded-retrieval revenue-code service that replaced ~8 hours of daily manual coding (98% automated across ~14K line items/day), and I built a structured-extraction pipeline for UB-04 and HCFA-1500 claims that auto-extracts 96% of records at high confidence, with a 27% accuracy lift from domain-specific fine-tuning. I also designed a continuous evaluation harness on Vertex AI with pairwise preference scoring, human feedback, and an LLM-as-judge layer for automated QA across agents and pipelines.
Earlier in my career, I built production OCR + CNN pipelines that cut document processing time by 82% at 0.92 AUC, and designed async distributed inference systems on RabbitMQ and Celery. Most recently, I trained a 77M-parameter VLA model for multi-task robot manipulation using behavior cloning and cross-attention fusion.
Across these roles, the through-line has been the same: take a well-defined problem, pick the right architecture, instrument it properly, and iterate until the system is reliable enough to depend on.
I recently completed a 77M-parameter VLA model that combines EfficientNet-B0 (vision), DistilBERT (language), and cross-attention fusion to predict robot manipulation actions from camera images and natural-language instructions. Using behavior cloning on scripted expert policies for pick/push/place tasks, I achieved a 22% loss reduction through multi-task learning on an Apple Silicon GPU with MPS acceleration. The full implementation is on GitHub.
I'm actively looking for teams working on real-world policy learning, sim-to-real transfer, or deploying RL agents in production environments, particularly in robotics, autonomous systems, or LLM-driven decision-making.
I'm most excited by roles where the research-to-deployment gap is the core problem: building reliable systems around learned behavior, designing evaluation frameworks for policies, and making models that generalize beyond their training distribution.
If your team is working on embodied AI, multi-agent coordination, or the infrastructure side of RL at scale, I'd love to connect.
Built a 77M-parameter Vision-Language-Action model combining an EfficientNet-B0 vision encoder, a DistilBERT language encoder, and cross-attention fusion to predict robot manipulation actions from camera images and natural-language instructions. Achieved a 22% loss reduction via multi-task behavior cloning across pick/push/place tasks.
Implemented a sequential Deep Q-Learning agent on OpenAI Gym's continuous Mountain Car environment, converging in 1,643 episodes with 12 hidden dimensions.
GAN for image-to-image translation from line art to colored butterfly images using a Pix2Pix architecture (U-Net generator + PatchGAN discriminator). Achieved discriminator and generator losses of 0.91 and 1.23, respectively.
Instance segmentation model using Mask R-CNN to detect nuclei and predict segmentation masks on microscopy images. Achieved a mean average precision (mAP) of 0.73.
Implemented a modified SAM-SLR framework for real-time continuous sign language recognition, integrating multi-cue visual streams for improved temporal modeling. Master's thesis project at SJSU.
Scraped IMDb movie plot data, applied preprocessing (stop-word removal, PorterStemmer), and fine-tuned a Pegasus model with a self-supervised objective for abstractive summarization.
Comparative analysis of sentiment classification on Amazon reviews, implementing and benchmarking FNN, CNN, LSTM, and DistilBERT models with word embeddings and data preprocessing.
Trained an LSTM model on lo-fi music data to generate stylistically similar music sequences from learned temporal patterns.