Generative AI Group Chat: Product Inpainting, 3D Capture, and AI Startups

Generative AI Group Chat Transcript

Product Inpainting:

  • Discussion on using separate LoRA for inpainting main product
  • ComfyUI tutorial shared for generating variations of the same character across different ethnicities and anime styles
  • Clarification that the main product is a matchstick box held by a person
  • Flair and Booth.ai tools discussed for retaining text and product fidelity
  • ControlNet, Adapters, and Composer suggested as better methods for use cases similar to Flair’s
  • With these methods, fine-tuning is not needed to control style, image composition, and color scheme
  • Copy-pasting the original product pixels onto the final image, in the region where the product sits, suggested for preserving product fidelity
  • Background subtraction and improvement suggested for e-commerce images
  • Adept.ai and the similar space discussed as a game changer in how we use computers
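
The copy-paste suggestion above can be sketched roughly as follows. This is a minimal illustration, not the method any participant actually used: it assumes the generated image and the original photo have the same size, and that a boolean mask marking the product region is already available. The function name `paste_product` and the tiny 2x2 images are hypothetical, and plain Python lists stand in for real image arrays to keep the sketch dependency-free.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B)
Image = List[List[Pixel]]      # rows of pixels
Mask = List[List[bool]]        # True where the product is


def paste_product(generated: Image, original: Image, mask: Mask) -> Image:
    """Return a copy of `generated` with the original pixels restored
    wherever `mask` is True (hypothetical helper for illustration)."""
    if not (len(generated) == len(original) == len(mask)):
        raise ValueError("images and mask must have the same height")
    result: Image = []
    for gen_row, orig_row, mask_row in zip(generated, original, mask):
        if not (len(gen_row) == len(orig_row) == len(mask_row)):
            raise ValueError("images and mask must have the same width")
        # Keep the original pixel inside the product region,
        # and the generated pixel everywhere else.
        result.append(
            [o if m else g for g, o, m in zip(gen_row, orig_row, mask_row)]
        )
    return result


# Tiny 2x2 example: the product occupies only the top-left pixel.
original = [[(255, 0, 0), (255, 0, 0)], [(0, 0, 0), (0, 0, 0)]]
generated = [[(200, 10, 10), (9, 9, 9)], [(1, 2, 3), (4, 5, 6)]]
mask = [[True, False], [False, False]]

restored = paste_product(generated, original, mask)
```

In practice the same idea would normally be applied to real image arrays, with the mask feathered at the edges so the pasted region blends into the generated background.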

3D Product Capture and Visualization:

  • NeRF capture apps like Luma suggested for capturing products in 3D and rendering them from novel viewpoints
  • Workflow for product photography and marketing discussed
  • Niche and industry-specific vertical AI products suggested for startups
  • Canva’s absence in GenAI space discussed

Adobe and Horizontal Gen AI Startups:

  • Adobe’s deep computer vision tools and products discussed
  • Criteria for horizontal products to succeed discussed
  • Firefly’s impressive features discussed
  • Yudkowsky’s propaganda discussed
  • Image captioning models like BLIP and CLIP_prefix_caption discussed
  • Bard and LaMDA models discussed

Links:

  • Tutorial using ComfyUI: (https://bit.ly/hf-nvidia-meetup)
  • Deepset’s Haystack: (https://www.deepset.ai/blog/build-a-search-engine-with-gpt-3)
  • Hood project: (https://dolorousrtur.github.io/hood/)
  • Adept.ai: (https://www.adept.ai/)
  • Shi et al.’s research on VQA: (https://proceedings.mlr.press/v70/shi17a.html)
  • Visual ChatGPT: (https://github.com/microsoft/visual-chatgpt)
  • Google’s BARD: (https://sites.google.com/view/bard-challenge/home)
  • CLIP-Interrogator-2: (https://huggingface.co/spaces/fffiloni/CLIP-Interrogator-2)
  • CLIP_prefix_caption: (https://github.com/rmokady/CLIP_prefix_caption)
  • Gist for converting Firefly into API: (https://gist.github.com/ovshake/69efb594f3b1e8d98b34687b16916145)
  • Video of Nvidia livestream: (https://www.youtube.com/watch?app=desktop&v=DiGB5uAYKAg&feature=youtu.be)
  • Yudkowsky’s tweet: (https://mobile.twitter.com/ESYudkowsky/status/1635577836525469697)
  • Vinnie Moura’s tweet: (https://twitter.com/vinniemourax/status/1638218512760971277?s=20)

The description and link can be mismatched because of extraction errors.

  • https://bit.ly/hf-nvidia-meetup: The message containing this link discusses Flair’s custom-trained models, which are mask-based and include outpainting. It also mentions other methods such as ControlNet, Adapters, and Composer for controlling style, image composition, and color scheme without fine-tuning.
  • https://news.ycombinator.com/item?id=35236275: The message containing this link discusses the possibility of building apps similar to the one linked, and the author mentions having worked on document search.
  • https://www.deepset.ai/blog/build-a-search-engine-with-gpt-3: The message suggests checking out this deepset post on building a search engine with GPT-3; deepset also has a framework called Haystack. The context also mentions some smart background replacement images and a slight halo visible when staring at something for too long.
  • https://dolorousrtur.github.io/hood/: A project with impressive demos, mentioned in a message along with a phone number.
  • https://a16z.com/2022/11/17/the-generative-ai-revolution-in-games/: The message containing this link relates to the article about the generative AI revolution in games.
  • https://www.adept.ai/: A link mentioned in a discussion about the NeRF space and its potential as a game changer in computer usage.
  • https://www.adept.ai/: Adept.ai and the similar space are discussed as a game changer in how we use computers; the message also mentions related research and prompt engineering links.
  • https://proceedings.mlr.press/v70/shi17a.html: A research area in RL is discussed, and the author is curious to learn more about it.
  • https://mobile.twitter.com/ESYudkowsky/status/1635577836525469697: Adobe is expected to release something big in generative AI, which means tough competition for horizontal gen AI startups. The message also suggests that people may avoid Adobe.
  • https://youtu.be/DiGB5uAYKAg: The Nvidia livestream is live right now; the message containing this link also talks about Canva’s visibility in the GenAI space and their focus on ML research.
  • https://huggingface.co/spaces/fffiloni/CLIP-Interrogator-2: A suggested alternative to the ImageCaptioning model BLIP, along with the recommendation to use a VPN.
  • https://twitter.com/vinniemourax/status/1638218512760971277?s=20: A tweet asking if people have seen something, possibly related to the following YouTube video.
  • https://www.youtube.com/watch?app=desktop&v=DiGB5uAYKAg&feature=youtu.be: A YouTube video that may be related to the tweet above.
  • https://twitter.com/vinniemourax/status/1638218512760971277?s=20: A tweet asking if people have seen a certain video, followed by a link to the video on YouTube and a question about whether there is an API version of it.
  • https://github.com/rmokady/CLIP_prefix_caption: A GitHub repository containing a script for converting text into an API, which may be useful for integrating with other applications. The accompanying message relates to the repository and mentions the possibility of using a tool called “replicate” for the same purpose.
  • https://gist.github.com/ovshake/69efb594f3b1e8d98b34687b16916145: A link to a potentially useful resource, mentioned in the context of a negative opinion about a model.