ABOUT
Hi, I'm Dev Garg — a software engineer focused on building cloud-native, AI-first systems that scale reliably in the real world.
I currently work as a Founding Software Engineer at Encando.AI, where I architect backend systems from the ground up for an AI-powered learning platform serving thousands of active users. My work spans designing distributed systems, building agentic RAG pipelines, setting up evaluation frameworks to reduce hallucinations, and optimizing infrastructure for both performance and cost.
Before this, I spent over 2.5 years at Societe Generale, working on large-scale data and risk analytics systems. There, I built high-throughput data pipelines using Spark and Kafka, improved system reliability through automated testing, and supported real-time decision-making in production-critical environments.
My interest in AI started during my undergraduate years at IIT (BHU) and deepened through hands-on work across NLP, retrieval systems, and MLOps. I'm currently completing my Master's in Computer Science at Texas A&M University, where I focus on Large Language Models, system design, and applied machine learning. Along the way, I earned the Google Cloud Professional Machine Learning Engineer certification and worked on numerous research and industry AI projects.
I enjoy working at the intersection of AI, software engineering, and systems design — especially on problems where strong engineering discipline is essential to make advanced models useful, reliable, and scalable.
Here's what I've been working with:
- Python
- Java
- C++
- JavaScript
- SQL
EXPERIENCE
May 2025 - Present Senior Software Engineer @Encando.AI
Architected a scalable, AI-first LMS backend from zero-to-one, scaling to 5,000+ active students in 5 months while optimizing architecture to reduce median latency by 95% (1.4s to 67ms) and improve P95 latency by 3.5x.
Developed and optimized a multi-stage agentic RAG pipeline using LangChain, Pinecone vector databases, and the OpenAI API to automate complex workflows, boosting response relevance by 20% (measured via the Ragas framework).
Led a team of 5 engineers to establish a high-velocity CI/CD culture, enabling reliable daily releases by implementing rigorous code review standards, agile workflows and automated testing pipelines.
Took end-to-end ownership of the platform and optimized system efficiency, reducing monthly AWS infrastructure costs by 15% and cutting AI inference costs by 35% through token-efficient prompt design.
Designed a highly available Multi-AZ architecture behind Application Load Balancers and migrated the monolithic backend to it, eliminating a single point of failure and increasing system throughput by 50% (18 to 27 RPS) under load.
Ensured product compliance with WCAG 2.1 AA standards and ADA Title II by implementing accessible React components and conducting regular audits, improving usability for all students.
- AWS
- DynamoDB
- EC2
- S3
- React
- Next.js
- OpenAI APIs
- Pinecone
May 2021 - December 2023 Software Engineer (Data) @Societe Generale
Led the development of scalable data processing systems for credit risk analysis, enabling high-volume data ingestion for downstream analytics by utilizing Big Data technologies like Apache Spark and Kafka.
Accelerated system validation cycle times by 50% and reduced quarterly compute costs by €20,000 by building an automated regression testing suite and designing auto-scaling data pipelines.
Built and optimized distributed data workflows using Java, Spring Boot, SQL, Spark and Scala, supporting critical analytics capabilities for real-time risk management applications.
Won Spot award for being an excellent team player.
Mentored new teammates and contributed to every phase of the Software Development Lifecycle, from requirements gathering and data analysis to deployment and production support.
- Java
- Spring Boot
- Apache Spark
- Kafka
- Elasticsearch
- Jenkins
- SQL
- Azure
- Scala
May 2020 - June 2020 Data Scientist Intern @Societe Generale
Developed an ML-based incident resolution recommendation system, reducing operational risks and improving response times by 38%. Worked closely with cross-functional teams to design, develop, and test the MVP, meeting strict deadlines.
- Python
- Scikit-learn
- NLTK
- Pandas
- spaCy
- NetworkX
2024 - Present M.S. Computer Science @Texas A&M University
GPA: 4.0/4.0
- Large Language Models
- Deep Learning
- Software Engineering
2017 - 2021 B.Tech. Electronics Engineering @Indian Institute of Technology (BHU)
GPA: 8.78/10.0
- Natural Language Processing
- Computer Vision
PROJECTS
PaperFormer: A Citation-Graph Enhanced Language Model for Scientific Applications
Rithik Kapoor, Dev Garg, Ruihong Huang. [Under Review at Association for Computational Linguistics (ACL 2025)]. A novel citation-aware model for research papers with LoRA fine-tuned LLaMA that achieves 51% perplexity reduction and SOTA summarization improvement.
- PyTorch
- Ray
- LLaMA
News Aggregation and Recommendation System
A distributed, AI-driven platform that ingests and clusters news from multiple sources, generates real-time summaries, and delivers personalized, bias-aware recommendations based on user interactions. The system integrates MLOps for model monitoring and retraining, ensuring scalable, low-latency content delivery.
- Kafka
- Spark
- FAISS
- MLflow
- Kubeflow
- Redis
- PostgreSQL
- Elasticsearch
- ETL
Agentic Self-Corrective RAG
A multi-agent, web-search-enabled Retrieval-Augmented Generation system based on Llama 3, built to minimize hallucinations. Achieved a 20% increase in answer relevance and a 5% improvement in faithfulness.
- LangChain
- Ollama
- AWS
- Docker
Stock Market Charting App
Perform CRUD operations on complex financial data, then analyze and visualize it through intuitive charts.
- Angular
- Spring Boot
- PostgreSQL
BLOGS
December, 2025 LLMOps: A Comprehensive Guide
A comprehensive guide to LLMOps, including the latest advancements in the field and best practices for building and deploying LLMs.
February, 2023 Designing Data Intensive Applications: Notes
My notes on the book Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems.