Serval - Honkai: Star Rail
RVC v2 · 1,000 epochs · Fictional · English

From the creator

"(CV: Natalie Van Sistine) 40K, hop length 8, trained on 28 minutes of in-game dialogue"

Description

Introducing the Serval [EN] - Honkai: Star Rail voice model, built with RVC. The model reproduces the voice of CV Natalie Van Sistine and was trained for 1,000 epochs (about 115k steps) with a hop length of 8 on 28 minutes of in-game dialogue. It is intended for AI covers and can also be used for text-to-speech applications. Try it out with our free AI tools!

Weekly Metrics

Views

9

Creations

0

Downloads

6

Audio Samples

🌎 All

#1 · Male · Speaking · 🇺🇸 English
#2 · Male · Singing · 🇺🇸 English
#3 · Male · Singing · 🇺🇸 English
#4 · Male · Singing · 🇺🇸 English
#5 · Male · Singing · 🇺🇸 English
#6 · Male · Singing · 🇺🇸 English
#7 · Male · Rapping · 🇺🇸 English
#8 · Female · Speaking · 🇺🇸 English
#9 · Female · Speaking · 🇺🇸 English
#10 · Female · Speaking · 🇺🇸 English
#11 · Female · Speaking · 🇺🇸 English
#12 · Male · Speaking · 🇪🇸 Spanish
#13 · Male · Speaking · 🇫🇷 French
#14 · Male · Speaking · 🇮🇹 Italian
#15 · Male · Speaking · 🇰🇷 Korean
#16 · Male · Speaking · 🇷🇺 Russian
#17 · Female · Speaking · 🇪🇸 Spanish
#18 · Female · Speaking · 🇫🇷 French
#19 · Female · Speaking · 🇮🇹 Italian
#20 · Female · Speaking · 🇰🇷 Korean
#21 · Female · Speaking · 🇷🇺 Russian

Model Files

Compressed File Verified

Serval - Honkai: Star Rail.zip

md5: 5458072494c1925be2cb4177cfb0612b

Individual Files Verified

model.index

md5: 745ce7d502d70c0f06549261c6aa6d94

250.44 MB

model.pth

md5: abe7ae3a75d9c98de06885ce08218c76

52.67 MB

Comments

Audio removed in response to copyright claim.

Audio removed in response to copyright claim.

Audio removed in response to copyright claim.

Audio removed in response to copyright claim.

Audio removed in response to copyright claim.

Audio removed in response to copyright claim.

Audio removed in response to copyright claim.

chose 1000epochs bc it had less artifacting

Audio removed in response to copyright claim.

Audio removed in response to copyright claim.

can't do covers cause im abroad rn lol

Audio removed in response to copyright claim.

wait isnt this over trained to oblivion

listen to this

what steps was e250 at

thats strange that it sounds better?

is that inferenced over the original voiceline?

the voice sounds fine even at 1000 epochs so thats why i kept it

¯\_(ツ)_/¯

also because i was using mangio to infer

with rmvpe these problems are usually non-existent

i was testing using worst-case scenario (in terms of sibilant artifacts)

240epoch was 27k so im assuming 250 was at around 28/29k
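
The estimate above can be checked with a quick calculation. This is only a sketch: it assumes the step count grows linearly with epochs (the same number of steps in every epoch), and `estimate_steps` is a hypothetical helper name, not part of any RVC tool.

```python
# Figure quoted in the thread: epoch 240 was at roughly 27k steps.
KNOWN_EPOCH = 240
KNOWN_STEPS = 27_000

# Assumption: every epoch contributes the same number of steps.
STEPS_PER_EPOCH = KNOWN_STEPS / KNOWN_EPOCH  # 112.5


def estimate_steps(epoch: int) -> float:
    """Linear estimate of the global step count at a given epoch."""
    return STEPS_PER_EPOCH * epoch


print(estimate_steps(250))   # 28125.0, in line with the "28/29k" guess
print(estimate_steps(1000))  # 112500.0, close to the ~115k steps in the description
```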

image.png

i think the graph smoothing was too much

do you have a 0.95 graph still

nope

unfortunate then

it does display the trend properly though

it's not like this is an exact science so

sometimes the graph skews up at the start quite a bit when at 0.999

nah its a normal trend for all these voices

image.png
image.png
image.png

sampo natasha tingyun graphs btw

all at 0.999 smoothing
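
To illustrate why a 0.999 smoothing value can hide detail that 0.95 still shows, here is a minimal sketch of a bias-corrected exponential moving average. It assumes the graphs were smoothed with something like this; the thread does not say which tool or exact formula produced them, and `ema_smooth` is a hypothetical helper.

```python
def ema_smooth(values, weight):
    """Bias-corrected exponential moving average of a sequence.

    weight is the smoothing factor in [0, 1); higher means heavier smoothing.
    """
    smoothed = []
    running = 0.0
    for i, value in enumerate(values, start=1):
        running = weight * running + (1 - weight) * value
        # Divide by (1 - weight**i) so early points are not pulled toward 0.
        smoothed.append(running / (1 - weight ** i))
    return smoothed


# Toy loss curve (made-up data): flat at 1.0, then an abrupt drop to 0.0.
curve = [1.0] * 50 + [0.0] * 50

light = ema_smooth(curve, 0.95)
heavy = ema_smooth(curve, 0.999)

# 50 points after the drop, the lightly smoothed line has mostly caught up,
# while the heavily smoothed line still sits near the overall average.
print(round(light[-1], 3), round(heavy[-1], 3))
```

With heavy smoothing, short dips get averaged away, so a curve at 0.999 mainly shows the long-term trend.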

seems like there is always 2 dips

although a 50k serval one might be good

ill have to test more

idk why my models are quite fine at low epoch

i was using mangio-crepe to infer which struggles a lot when it comes to sibilants

rmvpe usually has no issues with that

although i dont think inferencing over the original audio is a good idea cause it might overfit?

yeah it's just an easy way to determine how "faithful" the model is

that's why i add multiple examples

like a singing test tts test and the original audio test

to show the models stability
