
tahansa t1_j63fqca wrote

"Is it a memorization machine or can it create new songs?"

From the paper:

"Memorization analysis. Figure 3 reports both exact and approximate matches when the length of the semantic token prompt is varied between 0 and 10 seconds. We observe that the fraction of exact matches always remains very small (< 0.2%), even when using a 10 second prompt to generate a continuation of 5 seconds. Figure 3 also includes results for approximate matches, using τ = 0.85. We can see a higher number of matches detected with this methodology, also when using only MuLan tokens as input (prompt length T = 0) and the fraction of matching examples increases as the length of the prompt increases. We inspect these matches more closely and observe that those with the lowest matching score correspond to sequences characterized by a low level of token diversity. Namely, the average empirical entropy of a sample of 125 semantic tokens is 4.6 bits, while it drops to 1.0 bits when considering sequences detected as approximate matches with matching score less than 0.5. We include a sample of approximate matches obtained with T = 0 in the accompanying material. Note that acoustic modeling carried out by the second stage introduces further diversity in the generated samples, also when the semantic tokens match exactly."
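
To make the entropy figures in that passage concrete, here is a minimal Python sketch of one common way to measure token diversity: the Shannon entropy of the observed token frequencies, in bits. The function name `empirical_entropy_bits` and the toy sequences are illustrative assumptions, and the excerpt does not say exactly which estimator the authors used, so treat this as an approximation of the idea rather than their method.

```python
import math
from collections import Counter


def empirical_entropy_bits(tokens):
    """Shannon entropy (in bits) of the token frequencies in one sequence.

    Hypothetical helper for illustration; the paper's exact estimator
    is not given in the quoted excerpt.
    """
    if not tokens:
        raise ValueError("need at least one token")
    n = len(tokens)
    counts = Counter(tokens)
    # Sum -p * log2(p) over the distinct tokens actually observed.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Toy sequences (made up, not real semantic tokens).
four_way = [0, 1, 2, 3] * 25      # four tokens, equally frequent
two_way = [7, 9] * 50             # two tokens, equally frequent
all_distinct = list(range(125))   # 125 tokens, no repeats

print(empirical_entropy_bits(four_way))      # 2.0
print(empirical_entropy_bits(two_way))       # 1.0
print(empirical_entropy_bits(all_distinct))  # log2(125), about 6.97
```

Under this estimator a 125-token sequence can reach at most log2(125), roughly 6.97 bits, and 1.0 bit is what you get from a sequence that just alternates between two tokens. That gives a feel for how repetitive the low-scoring approximate matches are compared with the 4.6-bit average.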

15

bhendel t1_j63jzdy wrote

Anyone got a simpler explanation of that?

8

TFenrir t1_j63np8c wrote

ChatGPT (so take it with many grains of salt):

> The paper is discussing a machine that can create new songs or music. They are testing to see if the machine is able to memorize songs or if it can come up with new ones. They are looking at how well the machine does when given different amounts of information to work with. They found that even when given a lot of information, the machine is not able to create exact copies of songs. However, it can create similar songs. They also found that when the machine is given very little information, the songs it creates are not very diverse. They include examples of the machine's output in the accompanying material.

30