Digital Ventures


What is That Sound? Explore “new sounds” from AI



If you prefer sweet, add sugar. If you like salty, sprinkle salt. If you like spicy and pungent, add a pinch of pepper. These are familiar flavor mixtures, and a delicious meal is a combination of tastes. When we try delicious food, some say “it is one big orchestra on a plate”. It may sound exaggerated, but I’m sure you know what I mean.

What about other senses, like hearing? We know that different instruments have unique sounds, and many of us can tell a piano, a bass, and a guitar apart. That is not difficult (we can’t say the same when it comes to a violin and a viola).

With food, it is easy to imagine two flavors combined, such as sweet-bitter chocolate, salty-sweet soy sauce, or sour-salty plum candy. But what is a combination of a piano and a guitar?

By combining, I do not mean playing the piano and the guitar at the same time, but rather something like the “child” of a piano and a guitar (I still have no clue how they would mate). Can you imagine the “child” of a nimble piano and a gloomy guitar?

Luckily, we don’t need to imagine it, because Magenta, a small team of AI developers from Google, created NSynth.

NSynth creates “new sounds” by feeding thousands of sounds from various musical instruments into a neural network. The AI analyzes the sounds and learns their characteristics mathematically. NSynth then generates a “sound vector” for each instrument, and these vectors help it “mimic” the sounds accurately.

Beyond “mimicking”, because the sounds have been turned into mathematical representations, the AI can also compute a blend of two instruments.

Magenta also wrote a program that can mix these sounds. It has a slider that you can move left and right. At the far left, the sound is instrument number 1 (maybe a piano). At the far right, the sound is instrument number 2 (maybe a flute).

You can position the slider in the middle so the sound is exactly half piano and half flute. You can also be more precise and choose, for instance, 30% piano and 70% flute.
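To make the sliding control a little more concrete, here is a minimal sketch of a weighted mix between two “sound vectors”. It is only an illustration: the three-number vectors and the `blend` function are made up for this post, and the real NSynth works with far larger learned representations that a neural network turns back into audio.

```python
def blend(vec_a, vec_b, position):
    """Mix two equal-length vectors.

    position = 0.0 gives only vec_a, position = 1.0 gives only vec_b.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0 and 1")
    if len(vec_a) != len(vec_b):
        raise ValueError("vectors must have the same length")
    return [(1.0 - position) * a + position * b for a, b in zip(vec_a, vec_b)]


# Hypothetical toy vectors, not real NSynth data.
piano = [1.0, 0.0, 4.0]
flute = [0.0, 2.0, 0.0]

halfway = blend(piano, flute, 0.5)       # half piano, half flute
mostly_flute = blend(piano, flute, 0.7)  # 30% piano, 70% flute
```

Moving `position` from 0 to 1 plays the same role as dragging the control from the far left to the far right.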

Aside from mixing two sounds, Magenta went a step further and mixed four instruments through a two-dimensional interface with an X and a Y axis. The position of a point on the grid determines how much of each instrument’s characteristics you hear.
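One simple way to picture the four-instrument grid is to put one instrument in each corner and weight each corner by how close the point is to it. The sketch below does that with made-up two-number vectors. The corner layout and the function names are my own assumptions for illustration, not Magenta’s actual code.

```python
def corner_weights(x, y):
    """Weights for the four corners of a unit square.

    x and y run from 0 to 1. The four weights always add up to 1.
    """
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("x and y must be between 0 and 1")
    return {
        "bottom_left": (1 - x) * (1 - y),
        "bottom_right": x * (1 - y),
        "top_left": (1 - x) * y,
        "top_right": x * y,
    }


def blend_four(corners, x, y):
    """Mix four equal-length vectors according to the point (x, y)."""
    weights = corner_weights(x, y)
    length = len(corners["bottom_left"])
    return [
        sum(weights[name] * corners[name][i] for name in weights)
        for i in range(length)
    ]


# Hypothetical toy vectors, one instrument per corner.
corners = {
    "bottom_left": [4.0, 0.0],   # flute
    "bottom_right": [0.0, 4.0],  # organ
    "top_left": [8.0, 8.0],      # bass
    "top_right": [0.0, 0.0],     # horn
}

center = blend_four(corners, 0.5, 0.5)  # equal parts of all four
```

At the exact center every instrument contributes a quarter, and in a corner you hear only that corner’s instrument.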

Try imagining a cross between a flute, an organ, a bass, and a horn. It’s beyond my imagination! But with this technology, it is possible.

Blending different musical instruments is not new; it is what orchestras do. The difference is that an orchestra plays two or more instruments at the same time to make music. With NSynth, the instruments are not played simultaneously. Instead, their sounds are merged mathematically into a single new sound. And personally, I think the result sounds different from an orchestra.

Magenta explains that, beyond sounds, they hope to use this study to expand the boundaries of the arts. I can’t imagine what other cool stuff NSynth can uncover. With “colors”, we are already familiar with mixtures like white and red giving different shades of pink, and “smells” can be blended in a similar way. Still, I hope that Magenta can take us on new art journeys and add to the collection of “unseen colors”, “unimaginable scents”, “untouched textures”, and “one-of-a-kind sounds”.

Explore the unimaginable at