About
As a consultant and advisor:
- I specialize in working with early-stage founders at the 0-to-1 stage of product development, and with later-stage teams adding a new feature to an existing product
- For non-coding founders, I act as a tour guide for advanced AI technologies - improving their understanding of potential applications and limitations alike
This includes things like LLMs and vector databases: amazing tools that unlock a lot of new, intelligent value but also have their own limits.
Trivia:
- Dr. Andrew Ng recommends Awesome NLP, a repo I’ve maintained since 2018, in Stanford’s Deep Learning course CS 230.
- Named among the Top 5 GenAI Scientists in India by Analytics India Magazine
- I have written a book on Natural Language Processing
I am on a sabbatical. I enjoy my time playing board games and running India’s largest Generative AI community. I am learning to be better at software library design (agentai) and writing.
Background
Machine Learning and AI Engineer with 7+ years of experience in chatbots, retrieval, ranking and language modeling.
As a Machine Learning Engineer, I
- Trained the first Hindi LM
- Deployed Sentence Transformers and Annoy (vector search library) in production in 2018 for cosine-similarity-powered search
- Managed a team of 3 engineers to build a support chatbot for 1M chat messages per month
As an AI Engineer, I
- Built and deployed Question Answering systems for 3+ years, including 2 projects with OpenAI LLMs, e.g. text-davinci-003, GPT-3.5 and GPT-4
- Built hallucination-free summarization and question answering systems
PS: What is an AI Engineer? Here you go: https://latent.space/p/ai-engineer
Book
Book: NLP in Python: Quickstart Guide
Code: Github
I wrote this book in 2018 to make Natural Language Processing more accessible for software engineers and programmers. It took a very design- and code-first view of tools and their limitations. Today, most of it is outdated, and I do not recommend buying it.

Papers and Open Source Contributions
Hinglish: github; the paper, focussed on code-mixed languages, was published at ACL 2019.
Awesome Project Ideas
- Curated list of machine learning (mostly deep learning) project ideas with datasets. These ideas range from Vision, Text and Forecasting to Recommender Systems
Awesome NLP
Curated list of Natural Language Processing Resources. I’ve been the Primary Maintainer for Awesome-NLP
- Recommended by Dr. Andrew Ng’s (Stanford) CS 230
- Featured in Github’s Official Machine Learning Collection since 2016
State of the Art Language Modeling in Hindi + new datasets; check the code at hindi2vec
Comparative Study of Preprocessing and Classification Methods in Character Recognition of Natural Scene Images. In: Machine Intelligence and Signal Processing. Advances in Intelligent Systems and Computing, vol 390. Springer, New Delhi (2015). https://doi.org/10.1007/978-81-322-2625-3_11
Talks
- Fifth Elephant MLOps Conf 2021: Slides
- PyCon India 2019: Slides and Youtube
- inMobi Tech Talks: A Nightmare on the LM Street; Slides
- Wingify DevFest: NLP for Indian Languages; Slides, Youtube
- PyData Bengaluru Inaugural Talk: Quiz Generation with spaCy; Youtube
Web Mentions - My 5 minutes of Internet fame
Search and Informational Retrieval Ranking Challenge hosted by Bing AI Team (2019)
Won the Kaggle Kernel Prize (2019)
- The Hitchhiker’s Guide to NLP in spaCy won the first-ever NLP-themed Kaggle Kernel award. With it, I won a free licensed copy of Prodi.gy worth $390 and $500 in cash.
Exploratory Programming Notes found helpful by Nobel Laureate (2018)
- Tips, Tricks, Best Practices for working with Jupyter Notebooks was appreciated by Economics Nobel Laureate 2018 Dr. Paul Romer:
Nirant, this looks very helpful.
Re your recommendation to use f-strings, do you know a good place to learn about them for someone new to Python?
Everything I’ve found seems to be for someone making the transition from older ways that a newbie doesn’t need to learn.
— Paul Romer (@paulmromer) April 15, 2018
FactorDaily’s piece on The great rush to data sciences in India ends with a direct quote from me.
- FactorDaily is a new-age news company which sits at the intersection of technology with life, culture and society in India.
First Runner-Up at the Future Group Datathon (March 2019)
- Two stage Machine Learning hackathon called Tathastu, working on recommendation systems and item information extraction problems
Opened AI Hackathon (2019)
- Awesome NCERT won the Best use of IBM Watson API; blog
- Idea: Find recent and relevant news articles against any NCERT chapter in sciences and social studies