Skip to content

Writing

Function, Industry, Geography: Career Framework

Your career is a combination of Function, Industry, Geography.

That's it.

That's the framework.

You can change one of these. Not all three at a time.

Why only one at a time?

If you want to change your function, you need to learn new skills.

If you want to change your industry, you need to understand the new industry and the skills required to be successful in it.

If you want to change your geography, you need to uproot your life and move to a new place.

But what if I want to change all three?

You can, but it's difficult and perhaps needs a lot of thinking cycles and internal conviction.

Cheat code: Higher Education

MS/PhD/MBA helps folk change 2 of these at a time:

  1. You might get into a PhD program which changes your function and industry both
  2. MS abroad program which changes your function and geography
  3. MBA program which changes your industry (e.g. IT services to consulting) and geography (e.g. India to US)

Clarity Helps a Lot

The more granular you get, the easier it is to decide convert the wants into actionable steps.

Painful: I want to be a Machine Learning Engineer.

Tolerable: I want to be a Machine Learning Engineer at a Big Tech company.

Acceptable: I want to be a Machine Learning Engineer at Google.

Good: I want to be a Machine Learning Engineer at Google in New York.

Great: I want to be a Machine Learning Engineer working on the problems around generating human-like speech at Google in New York.

What if I am not clear about what I want?

If you are not clear about what you want, that's totally fine.

You can start with the most granular level and work your way up.

Start with the job description of the role you want. Talk to at least 12 people who are in that role.

Ask them what they do on a day-to-day basis. Ask how they got to where they are today. Ask what they ask when they are hiring or interviewing.

People on the Internet call this thing informational interviews and there's plenty of decent advice out there.

  1. The Antidote to I'm Feeling Stuck? from Swanand
  2. Act Like You're 35 from Nirant

Retrieval Augmented Generation Best Practices

Retrieval and Ranking Matter!

Chunking

  1. Including section title in your chunks improves that, so does keywords from the documents
  2. Different token-efficient separators in your chunks e.g. ### is a single token in GPT

Examples

  1. Few examples are better than no examples
  2. Examples at the start and end have the highest weight, the middle ones are kinda forgotten by the LLM

Re Rankers

Latency permitting — use a ReRanker — Cohere, Sentence Transformers and BGE have decent ones out of the box

Embedding

Use the right embedding for the right problem:

GTE, BGE are best for most support, sales, and FAQ kind of applications.

OpenAI is the easiest for Code Embedding to use.

e5 family does outside English and Chinese

If you can, finetune the embedding to your domain — takes about 20 minutes on a modern laptop or Colab notebook, improves recall by upto 30-50%

Evaluation

Evaluation Driven Development makes your entire "dev" iteration much faster.

Think of these as the "running the code to see if it works"

Strongly recommend using Ragas for something like this. They've Langchain and Llama Index integrations too which are great for real world scenarios.

Scaling

LLM Reliability

Have a failover LLM for when your primary LLM is down, slow or just not working well. Can you switch to a different LLM in 1 minute or less automatically?

Vector Store

When you're hitting latency and throughput limits on the Vector Store, consider using scalar quantization with a dedicated vector store like Qdrant or Weaviate

Qdrant also has Binary Quantization which allows you to scale 30-40x with OpenAI Embeddings.

Finetuning

LLM: OpenAI GPT3.5 will often be as good as GPT4 with finetuning.

Needs about 100 records and you get the 30% latency improvements for free

So quite often worth the effort!

This extends to OSS LLM models. Can't hurt to "pretrain" finetune your Mistral or Zephyr7B for $5

AI4Humans aka Software x LLMs

AI4Bharat, IIT Madras, July 2023

Namaste! 🙏 I'm Nirant and here's a brief of what we discussed in our session.

Why You Should Care?

I have a track record in the field of NLP and machine learning, including a paper at ACL 2020 on Hinglish, the first Hindi-LM, and an NLP book with over 5000 copies sold. I've contributed to IndicGlue by AI4Bharat, built and deployed systems used by Nykaa, and consulted for healthcare enterprises and YC companies. I also manage India’s largest GenAI community with regular meetups since February 2023.

Here's my Github.

AI4Humans: Retrieval Augmented Generation for India

We dived into two main areas:

  1. Retrieval Augmented Generation: Examples of RAG for India, engineering choices, open problems, and how to improve it
  2. LLM Functions: Exploring tool augmentation and "perfect" natural language parsing

Retrieval Augmented Generation (RAG)

RAG is a popular pattern in AI. It's used in various applications like FAQ on WhatsApp, customer support automation, and more. It's the backbone of services like Kissan.ai, farmer.chat and Bot9.ai.

However, there are several open problems in RAG, such as text splitting, improving ranking/selection of top K documents, and embedding selection.

Adding Details to RAG

We can improve RAG by integrating models like OpenAI's GPT4, Ada-002, and others. We can also enhance the system by adding a Cross-Encoder and 2 Pass Search.

RAG Outline

Despite these improvements, challenges remain in areas like evaluation, monitoring, and handling latency/speed. For instance, we discussed how to evaluate answers automatically, monitor model degradation, and improve system latency.

Using LLM to Evaluate

An interesting application of LLM is to use it for system evaluation. For example, we can use LLM to auto-generate a QA test set and auto-grade the results of the specified QA chain. Check out this auto-evaluator as an example.

Addressing Open Problems

We discussed the best ways to improve system speed, including paged attention, caching, and simply throwing more compute at it. We also touched on security concerns, such as the need for separation of data and the use of Role Based Access Control (RBAC).

LLM “Functions”

We explored how LLMs can be used for tool augmentation and converting language to programmatic objects or code. The Gorilla LLM family is a prime example of this, offloading tasks to more specialized, reliable models.

In the context of AgentAI, we discussed how it can help in converting text to programmatic objects, making it easier to handle complex tasks. You can check out the working code here.

Thank you for attending the session! Feel free to connect with me: Twitter, LinkedIn or learn more about me here.

References

Images in this blog are taken from the slides presented during the talk.