AI Topics: Embeddings, ChatGPT, Motion Capture & More

OpenAI and Dense Embeddings:

  • Currently using OpenAI and dense embeddings
  • Planning to move to hybrid soon
  • Using similarity search on a flat embedding space to retrieve context
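
As a rough illustration of what similarity search over a flat embedding space involves, the sketch below scores every stored vector against the query with cosine similarity and keeps the best matches. The three-dimensional vectors, the `INDEX` contents, and the function names are made up for the example; real embeddings would come from an embedding API and have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def flat_search(query_vec, index, top_k=2):
    """Brute-force ("flat") search: score every entry, return the top_k."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical toy index: (text, embedding) pairs with invented vectors.
INDEX = [
    ("Postgres query tuning notes", [0.9, 0.1, 0.0]),
    ("Stable Diffusion on Mac", [0.0, 0.2, 0.9]),
    ("Row-level security policies", [0.8, 0.3, 0.1]),
]

results = flat_search([1.0, 0.2, 0.0], INDEX, top_k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

A flat scan like this is exact but costs time linear in the number of stored vectors. A hybrid setup would typically combine this dense score with a keyword-based score, which is not shown here.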

GitHub and LLM:

  • Came across a GitHub repository where someone used PEFT to fine-tune an LLM on their iMessage chats, creating a bot that impersonates them and talks like them

Supabase:

  • Discussion about Supabase and Postgres query performance
  • Suggestions to check RLS policies and use explain.dalibo.com to visualize queries
  • Offer to help with specific links on GitHub or the Discord forum
  • Mention of a new branch for Mac release of Automatic1111

ChatGPT:

  • Discussion about school kids in India using ChatGPT
  • Mention of e2eml.school/transformers.html
  • Anecdotal evidence of kids using ChatGPT for school essays

Generative AI Speakers:

  • Discussion about having more Amod-esque speakers
  • Suggestions for speakers from AI21 Labs, Cohere, and Anthropic
  • Plans to set up an event/session link to register for upcoming talks
  • Discussion about potential themes for upcoming talks, including image generation, video, and sound

Motion Capture and Pose Estimation:

  • Request for someone with experience working with motion capture/face-body reenactment or topics related to pose estimation
  • Offers from multiple people to chat about these topics

Newlines and OpenAI Embedding API:

  • Discussion about the difference between keeping newlines and replacing them before calling OpenAI’s embedding API
  • Explanation that, for the Hugging Face model discussed, newlines and tabs are not part of the vocabulary and so carry no semantic meaning when encoded
  • Mention of character splitters and the importance of context length
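
A minimal sketch of the two preprocessing steps mentioned above: replacing newlines before embedding, and splitting text into overlapping character chunks so each piece stays within a model's context length. The function names, chunk size, and overlap are illustrative assumptions, and character counts are only a rough proxy for token counts.

```python
def normalize_newlines(text):
    """Replace newlines with spaces and collapse repeated whitespace."""
    return " ".join(text.replace("\n", " ").split())


def split_by_characters(text, chunk_size=40, overlap=10):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut
    at a boundary still appears with some context in the next chunk.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


raw = "First line of a note.\nSecond line,\n\nwith a blank line before it."
clean = normalize_newlines(raw)
for chunk in split_by_characters(clean, chunk_size=30, overlap=5):
    print(repr(chunk))
```

Whether replacing newlines actually helps depends on the model and tokenizer in use, so it is worth comparing retrieval quality with and without this step.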

Miscellaneous:

  • Article recommendation on safetyism
  • Discussion about Interoperable Master Format
  • Mention of a hackathon link related to voice models

The description and link can be mismatched because of extraction errors.

  • https://github.com/orgs/supabase/discussions - A link to Supabase’s GitHub discussions page where a user can get a faster response to their query. The message also asks for help analyzing SQL if it’s an RPC call.
  • https://explain.dalibo.com/ - The website where someone posted something, but there has been no response yet. The context of the post is unclear.
  • https://github.com/brkirch/stable-diffusion-webui/releases: The releases page of the stable-diffusion-webui repository on GitHub, mentioned as the location of a new branch for the Mac release of Automatic1111, which is faster than the stock version but still experimental.
  • https://e2eml.school/transformers.html: The message discusses the increasing percentage of users in India for an unspecified platform, and mentions that OpenAI requires users to be 18 years old to use ChatGPT.
  • https://www.reddit.com/r/selfhosted/comments/12w4p2f/localai_openai_compatible_api_to_run_llm_models/ - The link is mentioned in a message asking for suggestions for speakers with a similar style to Amod for a meetup.
  • https://mitchellh.com/writing/prompt-engineering-vs-blind-prompting: The accompanying message mentions preferred tasks for May and offers help with delivering a talk; an addendum mentions the April theme of Question Answering.
  • The curator for the May theme on image generation, video, and sound is Soumyadeep, who runs a Generative AI company in Bengaluru. (https://www.linkedin.com/in/soumyadeepmukherjee/?originalSubdomain=in)
  • https://warpspeed2023.devfolio.co/ - Excellent criticism of safetyism and discusses the loudest detractors and their main arguments (excellent if you want to catch up!)
  • https://twitter.com/Uncanny_Harry/status/1650462479237931008?s=2 - The accompanying message relates to live performance capture projects. Other messages extracted alongside it discuss how Indian engineering colleges are leading generative AI research projects in Indic languages but face challenges in data sourcing and computing power, and explain that an NLP model from Hugging Face follows the transformers library specification and has a vocab file listing the words it was trained to encode; newlines and tabs are not part of that vocabulary.
  • https://twitter.com/aribk24/status/1650372832524926977?s=20 - A Twitter user offering help and expertise on certain topics.