OpenAI and Generative AI Developments

OpenAI:

  • Skepticism about the claim that OpenAI doesn’t need more data; they still need more training data
  • Multinationals may start using OpenAI models
  • OpenAI is looking for data partnerships but is currently only able to entertain proposals from big tech companies
  • No list of data partners available
  • OpenAI’s platform API data policy is public information
  • OpenAI is a small company and has exhausted internet and academic datasets for training its models
  • Replit’s code-generation model outperforms OpenAI Codex on many HumanEval tasks while being smaller
  • OpenAI’s models are a utility like EC2 and have greatly improved the UX of consuming models (see the API sketch after this list)
  • Many NLP models now work with the Hugging Face transformers library
  • Stanford NLP is highly regarded
  • Qualcomm has made progress in ML on edge
  • Palantir has launched an AI platform pitched as “ChatGPT for war”
  • Anduril may come up with something similar
  • OpenAI’s tech use cases range from disaster relief to drone warfare
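
The “models as a utility” point above is mostly about how little code it takes to consume a hosted model. A minimal sketch, assuming the openai Python package (the 0.27.x-era ChatCompletion API) and an OPENAI_API_KEY environment variable; the prompt itself is illustrative only:

    # Minimal sketch: calling a hosted OpenAI chat model like a utility.
    # Assumes the openai Python package (0.27.x API) and OPENAI_API_KEY set.
    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",  # or "gpt-4" if the key has access
        messages=[
            {"role": "system", "content": "You are a concise assistant."},
            {"role": "user", "content": "Summarize what HumanEval measures."},
        ],
        temperature=0.2,
    )
    print(response["choices"][0]["message"]["content"])

The point of the EC2 comparison is that this is essentially the whole integration: no weights, GPUs, or serving stack to manage.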

Music and Audio Generation:

  • Discussion on music generation, audio generation, jingle generation, song generation, and new AI instruments
  • Shared interest in the topic

Generative AI:

  • Accel’s slide deck on opportunities in generative AI and on fine-tuning vs. prompting
  • Replit Demo Day videos soon to be on YouTube
  • Dedicated group for music, images, and video on WhatsApp
  • Expert talk on music generation requested
  • Dust.tt is an internal tool for now
  • Paperspace recommended for GPU access
  • Google Colab is $12 for 100 compute units
  • RunPod, Lambda Labs, and FluidStack recommended for serverless GPU access
  • What Google Colab’s compute units correspond to in actual GPU time is unclear
  • T4 24GB GPU offered on Google Colab for $12
  • Weaviate discussed in relation to storing metadata alongside vectors in a vector DB (see the sketch below)
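
On the Weaviate point above (and the Weaviate metadata question among the links below): a common pattern is to store metadata as object properties right next to the vector, so a single query returns both. A minimal sketch, assuming a local Weaviate instance and the v3.x weaviate-client Python package; the Document class, its properties, and the vectors are made up for illustration:

    # Minimal sketch: storing metadata alongside vectors in Weaviate.
    # Assumes a local Weaviate at localhost:8080 and weaviate-client v3.x.
    # The "Document" class, its properties, and the vectors are illustrative.
    import weaviate

    client = weaviate.Client("http://localhost:8080")

    client.schema.create_class({
        "class": "Document",
        "vectorizer": "none",  # we supply our own embeddings
        "properties": [
            {"name": "title",  "dataType": ["text"]},
            {"name": "source", "dataType": ["text"]},
        ],
    })

    # Insert an object: metadata as properties, embedding as the vector.
    client.data_object.create(
        data_object={"title": "Demo Day recap", "source": "whatsapp"},
        class_name="Document",
        vector=[0.12, 0.34, 0.56],  # normally a model-generated embedding
    )

    # Query by vector and get the metadata back in the same call.
    result = (
        client.query.get("Document", ["title", "source"])
        .with_near_vector({"vector": [0.12, 0.34, 0.56]})
        .with_limit(3)
        .do()
    )
    print(result["data"]["Get"]["Document"])

Whether metadata lives in Weaviate like this or in a separate database is mostly a question of how much filtering is done at query time; keeping it in the same store keeps retrieval to one round trip.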

Links:

  • https://twitter.com/Mascobot/status/1651022921056555008?t=3qozUieRAPdsnScv3XnQLw&s=19
  • https://www.reddit.com/r/StableDiffusion/comments/12yzd2a/google_researchers_achieve_performance/
  • https://www.qualcomm.com/news/onq/2023/02/worlds-first-on-device-demonstration-of-stable-diffusion-on-android
  • https://chat.whatsapp.com/GThJJhoF3cL7QCmrfIoY8J
  • https://www.linkedin.com/posts/genai-center_ai-generated-pizza-commercial-tools-used-activity-7056952600516046848-mvqm?utm_source=share&utm_medium=member_ios
  • https://research.google.com/colaboratory/marketplace.html
  • https://www.chartgpt.dev/
  • https://news.ycombinator.com/item?id=35697627
  • https://github.com/ai-forever/Kandinsky-2

Descriptions and links below may be mismatched because of extraction errors.

  • https://twitter.com/Mascobot/status/1651022921056555008?t=3qozUieRAPdsnScv3XnQLw&s=19 - A tweet about a project apparently built in under two weeks, noting the differences between GPT-3.5-turbo, GPT-4, and a vanilla LLM.
  • https://www.reddit.com/r/StableDiffusion/comments/12yzd2a/google_researchers_achieve_performance/ - A Reddit post in r/StableDiffusion about Google researchers achieving a machine-learning performance breakthrough. The accompanying message asks whether anyone is working on ML on the edge.
  • https://www.qualcomm.com/news/onq/2023/02/worlds-first-on-device-demonstration-of-stable-diffusion-on-android - The message expresses excitement about Qualcomm’s on-device demonstration of Stable Diffusion on Android, showcased at this URL.
  • https://chat.whatsapp.com/GThJJhoF3cL7QCmrfIoY8J - PSA: Dedicated group for music, images, video. The message also discusses testing Chinchilla limits and training for more tokens than most people have tried for similarly sized models. An expert talk by the person mentioned would be helpful.
  • https://twitter.com/matthieurouif/status/1650904940036890626 - A message about the availability of an API for the gpt-4 multimodal model.
  • https://minigpt-4.github.io/ or https://llava-vl.github.io/ can be used instead of Photoroom, since Photoroom doesn’t have image understanding.
  • https://dust.tt - Described as “awesome,” but it is currently an internal tool. There is no mention of image understanding.
  • https://www.linkedin.com/posts/genai-center_ai-generated-pizza-commercial-tools-used-activity-7056952600516046848-mvqm?utm_source=share&utm_medium=member_ios - The LinkedIn post discusses using AI models to generate a background for a photoshoot and links to a pizza commercial created with these tools. The post also mentions someone playing with music generation.
  • https://www.paperspace.com - Suggested as an option for GPU access when experimenting with Stable Diffusion and running Gradio/Automatic1111, as well as for storing models like Lyriel/Deliberate plus ControlNets, with a note that the 5 GB of space on Gradient may not be sufficient (see the Stable Diffusion sketch at the end of this list).
  • https://research.google.com/colaboratory/marketplace.html - Mentioned in the context of connecting a GCE VM to Colab for persistent sessions and dedicated compute. The message also recommends RunPod, stopping the instance when not in use so that you are only charged for storage.
  • https://news.ycombinator.com/item?id=35697627 - Provides context for the statement in the same message that “For non persistent / spot instances of GPUs GOOG was always in under supply while we were testing.”
  • https://news.ycombinator.com/item?id=35697627 - A question about Weaviate usage, asking whether metadata is stored in Weaviate or in a different database.
  • https://www.youtube.com/watch?v=7TCqGslll-4 and https://github.com/ai-forever/Kandinsky-2 (the Kandinsky-2 repository) are mentioned in the context of discussing containerization and embedding content in an app; beyond that, no further context is available for either link.
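
For the GPU-rental items above (Paperspace, RunPod, Colab; see the Stable Diffusion bullet), this is roughly the workload being sized. A minimal sketch using the Hugging Face diffusers library with the base Stable Diffusion 1.5 checkpoint; the community checkpoints mentioned in the chat (Lyriel, Deliberate) would be loaded the same way, and a CUDA GPU such as a rented T4 is assumed. The prompt is illustrative only:

    # Minimal sketch: running Stable Diffusion on a rented GPU (Paperspace,
    # RunPod, Colab, etc.) with the diffusers library. Uses the base SD 1.5
    # checkpoint; community checkpoints like Lyriel/Deliberate load the same way.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,  # fits comfortably in a 16 GB T4
    )
    pipe = pipe.to("cuda")

    image = pipe(
        "an AI-generated pizza commercial storyboard frame, studio lighting",
        num_inference_steps=30,
        guidance_scale=7.5,
    ).images[0]
    image.save("out.png")

Each checkpoint is several gigabytes on disk, which is why the 5 GB Gradient storage limit comes up in the Paperspace item above.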