20241101 ‐ AiAT‐ Lip Syncing with lipdubai

Ras Robo edited this page Nov 4, 2024 · 2 revisions

AI Art Today Twitter Space Briefing - November 1, 2024

Replay

Listen to the full discussion here: AI Art Today Twitter Space

Main Themes

  1. Impact of closed-source AI models on the creative community
  2. Advancements in AI agents and their implications
  3. Integration of audio and video in AI workflows
  4. Evolving landscape of AI music generation
  5. Search for sustainable business models for AI tools

Key Ideas & Facts

  • Closed-source models like ReCraft face skepticism due to limitations on community involvement and customization
  • AI agents show promise but raise ethical concerns about potential misuse
  • Progress in optimizing video models for local use, particularly in ComfyUI
  • Rapid advancements in AI music generation tools offering new creative possibilities
  • Struggle to find pricing models balancing accessibility for creators and company sustainability

Important Quotes

On ReCraft being closed-source:

"No, I didn't care about it once it was closed source. I was like there's nothing like what's there? I mean, even though it's slightly better, even though it can do these like really impressive things, I feel like uh being closed source means there's no LoRA support, there's going to be no like you can't mix it with Comfy and all the cool stuff you've been doing."

On AI agents' potential:

"Whoever is going to use it for for good is going to use it for good. Whoever's going to use it for evil is going to use it for evil. And it's there's not really much you can do about that."

On future AI-generated video games:

"I hope one day we'll all be able to make them. Uh I know uh Google had Genie which did 2D sides uh sidescrollers. Um not very good though and took a long a very long time. Uh so it wasn't really playable. Uh but uh if you get more playable stuff like this that would be great and I could definitely see simulations of those very old uh sidescrollers or old uh adventures, RPG adventures um back in the '90s."

On circumventing copyright detection in Suno:

"Hey, with Suno, it's not hard to get stuff past the copyright detection. I just change the pitch a little bit and change the tempo in Ableton and it slips right through, my man."

On balancing quality and accessibility in Lip Dub:

"I mean, all the AI creators that I've spoken to at this point all say the same thing to me. They're like, 'The quality is outstanding.' Um the problem is, um I think Kling and Runway and some of these other companies have created this expectation of being able to get immediate feedback and, you know, prompt. You know, I can prompt 10 times in a row and hopefully one of my prompts gets me what I need. Whereas Lip Dub is more of this wait and see approach because it does take a little longer. So that's one potential frustration that I've heard. It's like is there any way to get like that more immediate feedback?"

Community Frustrations and Potential Solutions

Quality vs. Speed

  • High Expectations for Speed: Tools such as Kling and Runway have accustomed users to immediate feedback and rapid re-prompting, so Lip Dub's longer processing time stands out as a potential drawback
  • Desire for Preview Functionality: A low-resolution preview could let users check results before committing to a full render

Cost Concerns

  • Prohibitive Pricing for High-Volume Users: Current pricing model with training costs of $30-$40 per model and $4 per minute poses barriers for high-volume creators
  • Credit Expiration Policies: Non-rollover credit systems cause frustration for users with irregular creative cycles
  • Lack of Duration-Based Pricing: Current pricing doesn't account for variations in audio length
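To make the cost concern concrete, here is a rough Python sketch of how the quoted figures add up for a high-volume creator. It assumes only the numbers mentioned above ($30-$40 per trained model, $4 per minute of output). The function name `estimate_cost`, its parameters, and the assumption that minutes are billed proportionally are all illustrative and are not taken from Lip Dub's actual pricing page.

```python
from typing import Tuple


def estimate_cost(
    models: int,
    minutes: float,
    train_low: float = 30.0,   # assumed lower bound per trained model (from the discussion)
    train_high: float = 40.0,  # assumed upper bound per trained model (from the discussion)
    per_minute: float = 4.0,   # assumed per-minute generation price (from the discussion)
) -> Tuple[float, float]:
    """Return a (low, high) cost estimate in dollars.

    Hypothetical helper: assumes cost = training cost per model
    plus a flat per-minute generation charge, billed proportionally.
    """
    if models < 0 or minutes < 0:
        raise ValueError("models and minutes must be non-negative")
    generation = minutes * per_minute
    low = models * train_low + generation
    high = models * train_high + generation
    return low, high


if __name__ == "__main__":
    # Example: 3 trained models and 60 minutes of lip-synced video.
    low, high = estimate_cost(models=3, minutes=60)
    print(f"Estimated cost: ${low:.2f} to ${high:.2f}")
```

Under these assumptions, three models and an hour of footage come to roughly $330 to $360, which illustrates why per-minute charges dominate for creators producing a lot of video.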

User Experience and Workflow Integration

  • Desire for Seamless Workflow Integration: Strong preference for tools that integrate with existing video editing software or offer robust APIs
  • Need for Batch Processing: High-volume users require simultaneous processing capabilities
  • Customization and Fine-tuning Options: Advanced users seek more granular control over the lip-syncing process

Solutions and Strategies

  • Subscription Tiers: Implementation of usage-based and revenue-based subscription plans
  • Credit System Improvements: Consider credit rollover policies or alternative payment options
  • Workflow Enhancement: Development of plugins for popular video editing software and batch processing capabilities
  • Advanced Features: Introduction of "advanced mode" with detailed controls while maintaining simplicity for beginners
  • Developer Resources: Provision of robust API and comprehensive documentation
  • Improved Communication: Better transparency about training and generation processes

Further Actions

  • Follow up with Lip Dub and other AI companies to explore potential partnerships and collaborations
  • Investigate and experiment with open-source agentic AI environments and their applications in creative workflows
  • Stay informed about advancements in AI music generation tools and explore their potential for creative expression
  • Continue to engage in discussions about sustainable pricing models for AI tools that benefit both companies and creators

The Twitter space provided valuable insights into recent developments and challenges in the AI art community, highlighting the importance of open dialogue and collaboration for its future.