Actively looking for opportunities

Dev Garg

Resume

ABOUT

Hi, I'm Dev Garg — a software engineer focused on building cloud-native, AI-first systems that scale reliably in the real world.

I currently work as a Founding Software Engineer at Encando.AI, where I architect backend systems from the ground up for an AI-powered learning platform serving thousands of active users. My work spans designing distributed systems, building agentic RAG pipelines, setting up evaluation frameworks to reduce hallucinations, and optimizing infrastructure for both performance and cost.

Before this, I spent over 2.5 years at Societe Generale, working on large-scale data and risk analytics systems. There, I built high-throughput data pipelines using Spark and Kafka, improved system reliability through automated testing, and supported real-time decision-making in production-critical environments.

My interest in AI started during my undergraduate years at IIT (BHU) and deepened through hands-on work across NLP, retrieval systems, and MLOps. I'm currently completing my Master's in Computer Science at Texas A&M University, where I focus on Large Language Models, system design, and applied machine learning. Along the way, I became a certified Google Cloud Professional Machine Learning Engineer and worked on numerous research and industry AI projects.

I enjoy working at the intersection of AI, software engineering, and systems design — especially on problems where strong engineering discipline is essential to make advanced models useful, reliable, and scalable.

Here's what I've been working with:

  • Python
  • Java
  • C++
  • JavaScript
  • SQL

EXPERIENCE

  1. May 2025 - Present

    Senior Software Engineer @Encando.AI

    Architected a scalable, AI-first LMS backend from zero-to-one, scaling to 5,000+ active students in 5 months while optimizing architecture to reduce median latency by 95% (1.4s to 67ms) and improve P95 latency by 3.5x.

    Developed and optimized a multi-stage agentic RAG pipeline using LangChain, the Pinecone vector database, and the OpenAI API to automate complex workflows, boosting response relevance by 20% (measured with the Ragas framework).

    Led a team of 5 engineers to establish a high-velocity CI/CD culture, enabling reliable daily releases by implementing rigorous code review standards, agile workflows and automated testing pipelines.

    Took end-to-end ownership of the platform and optimized system efficiency, reducing monthly AWS infrastructure costs by 15% and cutting AI inference costs by 35% through token-efficient prompt design.

    Designed and migrated the monolithic backend to a highly available, Multi-AZ architecture using Application Load Balancers, eliminating a single point of failure and increasing system throughput by 50% (18 to 27 RPS) under load.

    Ensured product compliance with WCAG 2.1 AA standards and ADA Title II by implementing accessible React components and conducting regular audits, improving usability for all students.

    • AWS
    • DynamoDB
    • EC2
    • S3
    • React
    • Next.js
    • OpenAI APIs
    • Pinecone
  2. May 2021 - December 2023

    Software Engineer (Data) @Societe Generale

    Led the development of scalable data processing systems for credit risk analysis, enabling high-volume data ingestion for downstream analytics using big data technologies such as Apache Spark and Kafka.

    Accelerated system validation cycle times by 50% and reduced quarterly compute costs by €20,000 by building an automated regression testing suite and designing auto-scaling data pipelines.

    Built and optimized distributed data workflows using Java, Spring Boot, SQL, Spark and Scala, supporting critical analytics capabilities for real-time risk management applications.

    Won a Spot Award for being an excellent team player.

    Mentored new teammates and contributed to every phase of the software development lifecycle, from requirements gathering and data analysis to deployment and production support.

    • Java
    • Spring Boot
    • Apache Spark
    • Kafka
    • Elasticsearch
    • Jenkins
    • SQL
    • Azure
    • Scala
  3. May 2020 - June 2020

    Data Scientist Intern @Societe Generale

    Developed an ML-based incident resolution recommendation system, reducing operational risks and improving response times by 38%. Worked closely with cross-functional teams to design, develop, and test the MVP, meeting strict deadlines.

    • Python
    • Scikit-learn
    • NLTK
    • Pandas
    • spaCy
    • NetworkX

EDUCATION

  1. 2024 - Present

    M.S. Computer Science @Texas A&M University

    GPA: 4.0/4.0

    • Large Language Models
    • Deep Learning
    • Software Engineering
  2. 2017 - 2021

    B.Tech. Electronics Engineering @Indian Institute of Technology (BHU)

    GPA: 8.78/10.0

    • Natural Language Processing
    • Computer Vision

PROJECTS

  • PaperFormer: A Citation-Graph Enhanced Language Model for Scientific Applications

    Rithik Kapoor, Dev Garg, Ruihong Huang. [Under Review at Association for Computational Linguistics (ACL 2025)]. A novel citation-aware model for research papers, built on a LoRA fine-tuned LLaMA, that achieves a 51% perplexity reduction and state-of-the-art summarization improvement.

    • PyTorch
    • Ray
    • LLaMA
  • News Aggregation and Recommendation System

    A distributed, AI-driven platform that ingests and clusters news from multiple sources, generates real-time summaries, and delivers personalized, bias-aware recommendations based on user interactions. The system integrates MLOps for model monitoring and retraining, ensuring scalable, low-latency content delivery.

    • Kafka
    • Spark
    • FAISS
    • MLflow
    • Kubeflow
    • Redis
    • PostgreSQL
    • Elasticsearch
    • ETL
  • Agentic Self-Corrective RAG

    A multi-agent, web-search-enabled Retrieval-Augmented Generation system based on Llama 3, designed to minimize hallucinations. Achieved a 20% increase in answer relevance and a 5% improvement in faithfulness.

    • LangChain
    • Ollama
    • AWS
    • Docker
  • Optimized CNN for CIFAR-10

    Implemented a paper to achieve fast training and high accuracy.

    • PyTorch
  • Stock Market Charting App

    Create, read, update, and delete complex financial data, then analyze and visualize it through intuitive charts.

    • Angular
    • Spring Boot
    • PostgreSQL

BLOGS

  • December, 2025

    LLMOps: A Comprehensive Guide

    A comprehensive guide to LLMOps, including the latest advancements in the field and best practices for building and deploying LLMs.

  • February, 2023

    Designing Data Intensive Applications: Notes

    My notes on the book Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems.