/ais/ - TortoiseTTS, 11labs and other TTS software

TortoiseTTS, 11labs and other TTS software Anonymous 02/02/2023 (Thu) 11:46:18 No. 522

Jointly operating with 4chan frens: we want to fork this text-to-speech synthesis project to compete with the 11labs speech synthesizer.
Source code for Tortoise TTS: https://github.com/neonbjb/tortoise-tts
Here is the specific code we need to break down and understand: https://github.com/neonbjb/tortoise-tts/blob/main/tortoise/utils/diffusion.py
Training models used: https://github.com/neonbjb/tortoise-tts/tree/main/tortoise/models
Source for the trainer project Tortoise used: https://github.com/neonbjb/DL-Art-School
Tortoise TTS videos: https://www.youtube.com/watch?v=J3-jfS29RF4 https://www.youtube.com/watch?v=Kfr_FZof_hs https://www.youtube.com/watch?v=FN3yxL0Rr0c
We have datasets: https://github.com/LAION-AI/audio-dataset
Given the potentially large datasets involved, it is a prudent strategy to decentralize training going forward. If you know anything about federated training or understand these GitHub projects, can you explain which are the best and most relevant? https://github.com/topics/federated-learning https://en.wikipedia.org/wiki/Federated_learning
This is new tech for many anons; we need some experienced ML anons to help push this in a useful direction. Impart your wisdom, anons.
Other datasets and info: https://rentry.org/AIVoiceStuff
Additionally, it would be extremely helpful if someone could create repositories of voice lines for individual voice actors or characters. If you're capable of doing that, please contact us in this thread.
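For anons new to the federated training idea above, here is a rough sketch of the core step usually called federated averaging: each participant trains on its own data, sends back its weights, and a coordinator averages them weighted by how much data each one had. This is not code from Tortoise or DL-Art-School. The function name `federated_average` and the flat-list weight format are assumptions made for illustration only; a real setup would average full model tensors and handle networking, stragglers, and untrusted clients.

```python
def federated_average(client_weights, client_sizes):
    """Average per-client weight vectors, weighted by local dataset size.

    client_weights: list of equal-length lists of floats (hypothetical flat
                    parameter vectors, one per participant)
    client_sizes:   number of training samples each participant used
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one dataset size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    n_params = len(client_weights[0])
    averaged = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != n_params:
            raise ValueError("all clients must send the same number of parameters")
        for i, w in enumerate(weights):
            # Clients with more data contribute proportionally more.
            averaged[i] += w * size / total
    return averaged


if __name__ == "__main__":
    # Two hypothetical participants; the second trained on 3x as much data.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

The coordinator would send the averaged weights back out and repeat for many rounds. Whether this works well for a model as large as a TTS network is an open question for this thread, not something the sketch settles.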

Anonymous 02/02/2023 (Thu) 11:54:53 No. 523

Posted like a retard in the /t/ thread >>522 So what can be done? I could just store all the shit I see about this offline. My computer isn't good enough to train or even run these models, so I'm kinda limited in how I can help make AIs less controlled by feds.

Anonymous 02/02/2023 (Thu) 12:01:43 No. 524

>>523 >Posted like a retard in the /t/ thread Sorry about that >So what can be done? Honestly, creating a repository of datasets containing voice lines is already good enough. You can also annotate them, either manually or using tools like https://github.com/openai/whisper which automate the process.

Anonymous 02/02/2023 (Thu) 12:23:02 No. 525

>>524 >Sorry about that It wasn't you I was talking about; I just answered there instead of here since I had the tab open. >https://github.com/openai/whisper If I understood this correctly, I can add more audio to this to make a larger dataset? AI models have been growing so fast that I'm still overwhelmed trying to keep up with all the new ones showing up all the time.

Anonymous 02/02/2023 (Thu) 12:38:20 No. 526

>>525 Whisper is just the speech recognition model. You give it an audio file and it gives you a transcription of what was said. You can then use that transcription to annotate audio files if your own version of the TTS network uses annotated files.
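A minimal sketch of that annotation workflow, assuming the `openai-whisper` package and ffmpeg are installed. `whisper.load_model` and `model.transcribe` are the package's documented calls; the helper names `transcribe_clips` and `write_metadata`, the `.wav` glob, and the `filename|text` output format are assumptions for illustration, so check what format your trainer actually expects.

```python
from pathlib import Path


def transcribe_clips(clip_dir, model_name="base"):
    """Run Whisper over every .wav in clip_dir and return {filename: text}."""
    import whisper  # pip install openai-whisper (imported lazily; needs ffmpeg)

    model = whisper.load_model(model_name)
    transcripts = {}
    for clip in sorted(Path(clip_dir).glob("*.wav")):
        result = model.transcribe(str(clip))
        transcripts[clip.name] = result["text"].strip()
    return transcripts


def write_metadata(transcripts, out_path):
    """Write one 'filename|text' line per clip (assumed annotation format)."""
    lines = [f"{name}|{text}" for name, text in sorted(transcripts.items())]
    Path(out_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


if __name__ == "__main__":
    # "voice_lines" is a hypothetical folder of clips for one speaker.
    transcripts = transcribe_clips("voice_lines")
    write_metadata(transcripts, "voice_lines/metadata.txt")
```

Automatic transcripts will contain mistakes, so it is worth skimming the output file and fixing lines by hand before using it for training.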

Anonymous 02/02/2023 (Thu) 13:46:22 No. 527

>>526 I will try to do what I can. I will surely have to learn how these AIs work in more depth. I never had much contact with shit like Python and the other stuff they use before, so it's still a bit confusing to make everything work.
