
Complex System Sound
BIRTH STORY OF VOIBOW
●Started development (January 1, 2014)
A certain company was pursuing the development of a speech generation method based on a mathematical model (as an independent theme), but the project ended due to company policy. I was transferred to another theme, but I couldn't demonstrate my abilities there, and I was hospitalized because of my health.
I decided to do what I wanted to do outside of company business, and started on January 1, 2014, right after I was discharged from the hospital. I had intended to resume work on voice generation, but I changed my mind and took up VOIBOW development instead. Using weekends and holidays, I started by rereading papers that had helped me in the past (papers by M.E. McIntyre, J.O. Smith, and others). That New Year, I felt like a fish back in water.
By the way, VOIBOW is a coined word meaning to perform BOWing (rubbing the string with a bow) with voice (VOIce). I got a hint from Voipa (voice percussion).
●A bowed stringed instrument is a complex system (chaotic system)
Bowed stringed instruments meet the requirements of a chaotic system, because they combine the interaction of bow, string, bridge, and body with the non-linear element of friction. So does VOIBOW. By skillfully controlling the parameters of the system, it is possible to obtain a timbre with a pleasant sense of fluctuation (a timbre that changes endlessly in response to control). As the strange attractor tells us, from the start to the end of a note the amplitude of the waveform never follows the same trajectory on the attractor twice, and the trajectory varies without limit with differences in playing control (bow speed, etc.). This is a feat that a sound source based on recording and playback can never achieve.
It is no exaggeration to say that this feat is the lifeblood of voices, bowed strings, wind instruments, and other sound sources whose timbre is controlled (expressed) in various ways while the note is sounding. Just look at the instruments professional musicians use on stage. However, even with such a feat available, the tone will be terrible if it is not well controlled. If the chaos is too strong, the result is a nasty "gee" sound like that of someone who has just started learning the instrument. This is what makes chaotic systems interesting, and also what makes them so troublesome. What kind of world would it be if we could establish a methodology for controlling chaotic systems well?
●Initial sensitivity
When God created the various complex systems (from nature to musical instruments), I think He took care that the sensitive response to initial values and disturbances would not run off in random directions; in other words, that the balance between stability and instability would be just right, because that is the most beautiful.
●Friction is the Heart of Bowed String Instruments
The aforementioned papers describe the mechanism of friction in detail, and I used them as a reference to develop my own model. After years of repeated revisions in pursuit of a satisfying sound, the model was already heavily patched by the middle stages of VOIBOW's development. Worse, it had become too complicated, and there were problems with stability.
"It's not a real instrument, so just do as you like!" "No, is it really okay to compromise here?" After much back-and-forth like this, the current friction model was completed. As a result, the timbre and stability improved dramatically, and the model became really simple.
In a nutshell, the mechanism of this friction model can be described as follows: "At the moment when a force exceeding the maximum static friction force is applied, the contact suddenly (discontinuously) transitions to the dynamic friction state" (see the diagram on the home page). This kind of discontinuous behavior creates a “gritty, warped sound” (or an edge voice, in the case of the voice). The model before the revision, on the other hand, failed to reproduce this behavior faithfully, resulting in a mushy, unfocused tone.
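The discontinuous transition described above can be illustrated with a minimal sketch. This is not the VOIBOW friction model itself, whose equations are not given here. The function name `friction_step`, the coefficients `mu_static` and `mu_dynamic`, and the numeric values are all assumptions chosen for illustration, and re-sticking after a slip is deliberately left out.

```python
def friction_step(demand, normal_force, mu_static, mu_dynamic, sticking):
    """Return (friction_force, sticking) for one time step.

    demand is the force the contact would have to supply to keep
    bow and string stuck together (hypothetical input).
    """
    static_limit = mu_static * normal_force
    if sticking and abs(demand) <= static_limit:
        # Static friction: the contact supplies exactly what is demanded.
        return demand, True
    # Slip: the force drops at once to the lower dynamic level,
    # with no smooth blend between the two states.
    sign = 1.0 if demand >= 0 else -1.0
    return sign * mu_dynamic * normal_force, False


# Just below and just above the static limit (0.8 with these values).
force_below, state_below = friction_step(0.79, 1.0, 0.8, 0.3, True)
force_above, state_above = friction_step(0.81, 1.0, 0.8, 0.3, True)
print(force_below, state_below)
print(force_above, state_above)
```

A tiny increase in demanded force makes the friction force jump from 0.79 down to the dynamic level, which is the kind of discontinuity the text credits for the gritty, edgy sound.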
(Maybe it's just me.) When I listen to the edgy sound like the former at a high volume, it shakes my guts and lures me into a state of ecstasy. The “sound that resonates with the internal organs” that I am aiming for was obtained through research into friction.
●Presence of bridge
When I was worried about the lack of high-frequency content, the spotlight fell on the bridge, which until then had been relegated to a corner. Indeed, the bridge is not simply a part that transmits vibrations to the body. The books say that it "converts the horizontal vibration of the string into vertical vibration and transmits it to the body." That is true as far as it goes, but according to E.V. Jansson's literature, the natural frequency of the bridge itself (the frequency band it amplifies is called the Bridge Hill) is the source of rich overtones. With reference to this literature, I modeled the bridge mathematically as well. As a result, a fine sawtooth wave (corresponding to the Bridge Hill band) appeared within one cycle of the VOIBOW waveform. It sounds natural (although this is subjective), so I would like to say proudly, "This mathematical model matches the physical phenomena of acoustic instruments." I want to prove it.
In response to the question, "Why does it sound natural if it matches the physical phenomena of acoustic instruments?", the only answer I can give is, "God created it so that it can be accepted without any discomfort by the people who live in the world." In other words, I don't know.
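To make the idea of a resonant bridge band concrete, here is a rough sketch that stands in for the bridge with a generic two-pole digital resonator. This is a swapped-in textbook technique, not the equation of motion used in VOIBOW. The center frequency (2.5 kHz), bandwidth (500 Hz), and sample rate (48 kHz) are placeholder assumptions, as are all function names.

```python
import cmath
import math


def resonator_coeffs(center_hz, bandwidth_hz, sample_rate):
    """Coefficients for y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius
    theta = 2.0 * math.pi * center_hz / sample_rate      # pole angle
    return 2.0 * r * math.cos(theta), -r * r


def gain_at(freq_hz, a1, a2, sample_rate):
    """Magnitude of H(z) = 1 / (1 - a1*z^-1 - a2*z^-2) at freq_hz."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_rate)
    return abs(1.0 / (1.0 - a1 * z_inv - a2 * z_inv * z_inv))


def resonate(samples, a1, a2):
    """Run the resonator over a list of input samples."""
    y1 = y2 = 0.0
    out = []
    for x in samples:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


SAMPLE_RATE = 48000  # assumed
a1, a2 = resonator_coeffs(2500.0, 500.0, SAMPLE_RATE)
gain_peak = gain_at(2500.0, a1, a2, SAMPLE_RATE)
gain_low = gain_at(200.0, a1, a2, SAMPLE_RATE)
gain_high = gain_at(8000.0, a1, a2, SAMPLE_RATE)
print(gain_peak, gain_low, gain_high)
```

The resonator boosts the band around its center frequency relative to both lower and higher frequencies. In a physical model this boost comes from the structure itself rather than from an equalizer applied afterwards.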
●It's better to stop being petty
Running aground during development is a daily occurrence. At such times I tried to cover things up with effects and filters, but it never went well. For example, when the Bridge Hill band was weak, I used a band-emphasizing filter to boost it. It was like applying a thick layer of makeup that ends up distorting your face and making you look worse. Incidentally, the cause of the weak band was a sloppy formulation of the bridge's equation of motion, and correcting it solved the problem.
Also, when the feeling of fluctuation (aperiodic fluctuation) is lacking, it is tempting to drive some parameter with random numbers. This causes no problem for things that are backed by the physics of acoustic instruments (e.g. fluctuating bow speed) or for spatial systems outside the chaotic system, such as reverberation and EQ, but otherwise it upsets the balance of the system and sounds unnatural. When it comes to modeling complex systems, there seems to be no other way than to tackle the modeling head-on.
●The bow is mom
The advice "Keep your bow soft" finally made sense to me. Instruments such as bowed strings and wind instruments, in which a sustained external driving force perturbs the body and causes self-excited vibration, can deviate from a beautiful tone and sound terrible depending on how they are driven. This is the very nature of complex systems. Compared to a living creature, such an instrument is like a rampaging horse, or an ill-mannered infant. However, if the owner or mother raises them well, even difficult children can truly demonstrate their excellent abilities. Yes, the bow is the owner or mother of a bowed stringed instrument, and that soothing, skillful upbringing corresponds to "playing with a soft bow." As a result, a state (the edge of chaos) emerges in which the instrument comes alive and produces its most beautiful timbre.
Based on this idea, I derived a mathematical model of the contact point between the bow and the string. I confirmed that the timbre changes in many different ways as the parameter values of the contact part are changed. However, I still haven't found the best tuning methodology for complex systems. Perhaps that is the real thrill of creating sounds.
●Bows and strings are electric wires and pantographs
At a time when I was struggling to model the interaction of bow and string in the vertical direction (the non-friction direction), I saw the pantograph of a train pulling into the platform while I was waiting to change trains, and the equation of motion came to mind. You can't hear any sound, but the overhead wire slides gently and really smoothly over the pantograph thanks to the pantograph's cushioning effect.
●Rosin is important
It is said that beautiful tones are created by applying rosin to the bow and forming moderate unevenness on its surface. I still don't understand the relationship between the irregular fluctuation generated by the complex system and the irregularity of friction caused by this uneven surface, but according to listening experiments, "making the surface of the bow uneven" is a necessary condition for "generating more powerful fricative sounds." I could not find a reference example of an unevenness model, so I developed a new one through trial and error, and managed to apply rosin to the bow of VOIBOW.
Reducing the unevenness of the bow surface reduces the peak level and variance of the static friction state shown in the figure on the home page, and as a result suppresses the size of the discontinuity when transitioning from the maximum static friction state to the dynamic friction state. The sound, which should be edgy, deteriorates into something like a dull saw blade (a sound like slipping sideways).
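The relationship described above (less unevenness means a lower and less variable static friction peak) can be mimicked with a toy sketch. The author's unevenness model is not published here, so everything below is hypothetical: the function `static_peak`, the linear dependence on an `unevenness` parameter, the uniform random draw, and the dynamic friction level of 0.3.

```python
import random
import statistics


def static_peak(dynamic_level, unevenness, rng):
    """Hypothetical breakaway level for one stick phase.

    Both the mean peak and its spread grow with unevenness; with
    unevenness 0 the peak equals the dynamic level, so there is no
    jump at the stick-to-slip transition.
    """
    return dynamic_level + unevenness * rng.uniform(0.5, 1.5)


def peak_stats(unevenness, n=2000, seed=0):
    """Mean and variance of the breakaway level over n stick phases."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    peaks = [static_peak(0.3, unevenness, rng) for _ in range(n)]
    return statistics.mean(peaks), statistics.pvariance(peaks)


rough_mean, rough_var = peak_stats(0.5)     # well-rosined bow
smooth_mean, smooth_var = peak_stats(0.05)  # nearly smooth bow
print(rough_mean, rough_var)
print(smooth_mean, smooth_var)
```

The random variation here is tied to a physical property of the bow surface rather than to an arbitrary parameter, which is the distinction the text draws for when random numbers are acceptable.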
●The feat of C.M. Hutchins
She studied under a master violin maker, disassembled Stradivari and Guarneri instruments, and studied the vibrations of the top and back plates of the body. I feel like I'm richer than I say! The "resonance characteristics of the plates" from her research results were a great reference for my development.
The conclusion of her paper "Acoustics of Violins" says that the natural vibration changes before and after assembly, "because the assembled violin is a very complex vibrating system." This points squarely at the limits of reductionism. Really interesting.
●What should be modeled to what extent?
At the beginning of development, the first question that arises is, "What should be modeled, and how far?" One hint for thinking about this was the critical reception of Nobuyuki Tsujii when he won the Van Cliburn International Piano Competition in 2009. Until then, I had believed that the bottoming noise of the hammers was one of the important elements that make a piano sound like a piano. That belief changed: "Elements that make the timbre dirty must be removed from the model, because top-notch musicians dislike dirty timbres."
Next is the sufficient condition (how far should we go?). I still don't have the answer. As mentioned above, a bowed stringed instrument is a complex system, and if the modeled instrument satisfies the necessary conditions for a complex system, it has the basis for producing complex behavior (irregular fluctuations) without a complicated model. What I want to say is that "if you can get a satisfying tone, the model is good enough even if it is not complicated." Based on this way of thinking, VOIBOW was created by repeating "analysis by synthesis" (updating the model and auditioning the synthesized sound) over and over again. It was like adjusting a recipe: "European herbs are a must," "A 7:5 ratio of sauce to ketchup is good," "Is there too much onion?" It was a daunting task because of the huge number of parameter combinations. If I don't like it, I stop.
As an aside, about 20 years ago my brother said to me, "Mr. F, who has long reigned as a top player in the world of Go, places more importance on whether he played a game he was satisfied with than on winning or losing." Since then, one of my mottos has been "Am I convinced or not?" VOIBOW's goal was to be able to answer "yes" to the question, "Is the sound really convincing to you? Can you promise not to make excuses when you put it before the world?"
●What not to model
What would happen if we modeled the load that the tension of the strings places on the bridge and body plates? If the model parameters for the thickness (strength) of these members were made smaller than a certain value, the plate would break. That would be too sad if it were a real instrument costing hundreds of millions of yen. And what if we don't model it? Then the material can be made as thin as desired. If the optimum thickness is controlled automatically according to the pitch, it should be possible to achieve sharp resonance at any pitch (I feel that the bridge matters more here than the body).
I think this is one form of "liberation from physical constraints" and the essential difference between the real thing (physics) and mathematics. A violin with a sharp outline ("ri" instead of "hi") even at a pitch of several kHz, like the sound of a bell. A contrabass with a huge cave-like body and a giant monster-class edge voice. Both seem likely to resonate with the viscera.
●Why is the fundamental level of the cello's 4th string low?
An analysis of real cello sound reveals a phenomenon peculiar to the 4th string: for example, the level of the fundamental of the note C2 is very low compared to its overtones. I suspect the physical design is such that this band is weakened by interference between the vibration of the string and the vibration of the body. In VOIBOW the fundamental also happens to be somewhat weaker, but not as weak as in the real thing. I considered modeling the phenomenon faithfully to close this gap, but decided not to. The reason is that when you can hear the fundamental clearly (as with the 1st to 3rd strings of a cello or the 4th string of a contrabass), the sound resonates in your gut (and I can be convinced by it). In any case, the question in the title is still a mystery.
●Bowed instruments are difficult to play
A bowed string instrument controls many parameters when producing a certain sound. For example, pitch / bow pressure / bow speed / position to rub the string (from the bridge) / angle of bow hair application etc. These parameters can be controlled independently, but beautiful tones cannot be obtained unless they are controlled in a well-balanced manner. It is because the performance ignores this balance that the unbearably dirty sound is produced when a person who has just learned the instrument plays.
Wait a moment. Wind instruments also control many parameters, so why is this not as much of a problem for them as for bowed string instruments? Well, that aside... Without daily practice and accurate feedback control by ear during the actual performance, it would be difficult to produce a beautiful tone. The magnificent melodies of bowed strings are works of art that only first-class performers can create.
●Voice drive function and calibration function
The real thrill of music is the sense of accomplishment that comes from mastering the techniques that only first-class performers can do. On the other hand, it is also the real pleasure of music that you can compete with your sensibility alone without skillful hands / good sense of pitch / practice time and expense / guts etc.
For those who sympathize with the latter, we have implemented a "voice-driven function" and a "calibration (automatic correction) function" that eliminate as much of the difficulty as possible and let you play easily (by intuition alone). This is another example of "liberation from physical constraints." Even a top-notch cellist will inevitably let some dirty sounds slip into fast passages, because they exceed the limits of human motor control. The “calibration (automatic correction) function” is designed to support this human limit with mathematics.
There are roughly two methods for the "voice-driven function": perturbing the model directly with the voice, or extracting features (e.g., pitch, volume) from the voice and using them to control the model. Since the trumpet model has a high affinity with the voice, the former was adopted for it. The bowed string instrument model, on the other hand, does not have a high affinity with the voice, so we had no choice but to adopt the latter. With the latter method, however, the pitch must be extracted, and responsiveness suffers. Therefore the pitch should be controlled by another means, and only the volume should be extracted from the voice and used to control bow pressure and similar parameters.
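As a rough illustration of the second method, restricted to volume only, the sketch below tracks the level of a voice signal with a one-pole envelope follower and maps it to a bow pressure value. The follower is a standard technique, not necessarily what VOIBOW uses. The time constants, the linear mapping, the pressure range, and the function names are all assumptions.

```python
import math


def envelope(samples, sample_rate, attack_ms=5.0, release_ms=50.0):
    """One-pole peak follower: fast rise, slower fall."""
    attack = math.exp(-1.0 / (sample_rate * attack_ms / 1000.0))
    release = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    level = 0.0
    out = []
    for x in samples:
        target = abs(x)
        coeff = attack if target > level else release
        level = coeff * level + (1.0 - coeff) * target
        out.append(level)
    return out


def bow_pressure(level, min_pressure=0.05, max_pressure=1.0):
    """Hypothetical linear map from voice level (0..1) to bow pressure."""
    level = min(max(level, 0.0), 1.0)
    return min_pressure + (max_pressure - min_pressure) * level


SAMPLE_RATE = 48000  # assumed
# Stand-in for a voice: 100 ms of a 200 Hz tone with amplitude 0.5.
voice = [0.5 * math.sin(2.0 * math.pi * 200.0 * n / SAMPLE_RATE)
         for n in range(4800)]
env = envelope(voice, SAMPLE_RATE)
pressures = [bow_pressure(level) for level in env]
print(env[-1], pressures[-1])
```

Because the follower updates on every sample and needs no pitch detection, it adds only the smoothing delay set by its time constants, which fits the responsiveness concern raised above.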
Aside from that, the "calibration function" in VOIBOW's bowed string instrument model optimizes the balance of the many parameters mentioned above. In the case of voice-driven control, for example, balance is achieved by internally calculating bow pressure and other factors from the pitch and volume extracted from the voice. While studying mathematical models such as friction, we found the "relational conditions (formulas) that the parameters must satisfy to achieve a stable timbre," and based on those formulas the parameter values are automatically kept at their optimal values. As a result, a stable timbre can always be generated. Since calibration is performed at intervals of one sampling period (for example, a few tens of microseconds), there is no problem with responsiveness.
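The author's calibration formulas are not given, so the following is only a sketch of what a per-sample correction of this kind might look like. It borrows the scaling of Schelleng's classic bow-force limits (minimum force proportional to bow speed divided by the square of the bow-to-bridge distance, maximum force proportional to bow speed divided by that distance). The constants `k_min` and `k_max`, the lower bound on the distance, and the function names are hypothetical.

```python
def pressure_limits(bow_speed, bridge_distance, k_min=0.002, k_max=0.2):
    """Hypothetical stable range for bow pressure.

    bridge_distance is the bow-to-bridge distance as a fraction of the
    string length; it is floored at 0.02 so that lower <= upper holds.
    """
    beta = max(bridge_distance, 0.02)
    lower = k_min * bow_speed / (beta * beta)
    upper = k_max * bow_speed / beta
    return lower, upper


def calibrate(requested_pressure, bow_speed, bridge_distance):
    """Clamp the requested bow pressure into the stable range."""
    lower, upper = pressure_limits(bow_speed, bridge_distance)
    return min(max(requested_pressure, lower), upper)


# Applied once per output sample: at 48 kHz that is every ~20.8 microseconds.
requested = [0.01, 0.4, 5.0]
corrected = [calibrate(p, bow_speed=0.5, bridge_distance=0.1)
             for p in requested]
print(corrected)
```

With these placeholder constants the stable range is roughly 0.1 to 1.0, so a request that is too light is raised, one that is too heavy is lowered, and one already in range passes through unchanged.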
However, if the timbre is always stable, it becomes uninteresting. For example, the strong fricative sound at the start of a bow stroke is a phenomenon caused by an imbalance between bow speed and bow pressure disturbing the resonance period of the string. If calibration corrects that imbalance away, the "strong fricative sound" is suppressed and the distinctive flavor of the strings (whether salty or sour) is diluted. Since this control is closely tied to performance expression and also affects sounding delay, I would like to consider it carefully in the future.
Sake, a part of Japanese food culture, is described with expressions that carry time information, such as "starting taste," "middle taste," and "aftertaste," and I feel this is somehow related to sound.
●Surprised by the evolution of PC processing power
When I first became an engineer (early 1980s), even workstation-class computers (e.g., the VAX) ran at only a few MIPS. At that time, it was not possible to simulate musical instrument sounds in real time on a computer. Development therefore started with building a circuit board for simulation out of DSPs and logic ICs. I was a youngster who couldn't even solder properly, so I remember having a hard time.
In the 1990s, it became possible to run sound sources based on recording and playback on a PC. About 30 years have passed since then, and it is now possible to run sound sources that are computed entirely from a model, like VOIBOW, on a PC alone. Looking at the resource monitor of a PC, people of my generation can't help thinking, "Just how much spare capacity do you have?"
The PC programs for the cello and trumpet were completed, and the trumpet, which requires relatively little processing, was developed into an Android application first. Since it is a musical instrument application, low latency is essential, so we designed it around the "AAudio" API and were able to hold the sound delay to 19 ms. If I may be greedy, I would like to get the latency down to the order of a few milliseconds, but AAudio itself does not seem able to reach that level. In any case, we plan to release it on Google Play under the name "App Trumpet" in 2023.
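For a sense of scale, the arithmetic below converts between an audio buffer size in frames and the delay that buffer alone contributes. The 48 kHz sample rate is an assumption, and measured end-to-end latency on a device also includes stages that this calculation ignores. The helper names are hypothetical and are not part of AAudio.

```python
def buffer_latency_ms(frames, sample_rate):
    """Delay contributed by a buffer of the given size, in milliseconds."""
    return 1000.0 * frames / sample_rate


def frames_for_latency(latency_ms, sample_rate):
    """Number of frames that corresponds to a given delay."""
    return round(latency_ms * sample_rate / 1000.0)


SAMPLE_RATE = 48000  # assumed
frames_19ms = frames_for_latency(19.0, SAMPLE_RATE)
frames_3ms = frames_for_latency(3.0, SAMPLE_RATE)
print(frames_19ms, buffer_latency_ms(frames_19ms, SAMPLE_RATE))
print(frames_3ms, buffer_latency_ms(frames_3ms, SAMPLE_RATE))
```

Under this assumption, 19 ms corresponds to 912 frames of audio in flight, while a delay of a few milliseconds would leave room for only around a hundred frames.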
●EzCello
Develop an Android application based on the cello model prototype (PC program).
FUTURE STORY
●Develop "EzDbass", "EzViolin" etc.
●iOS app development.
●Final goal is beyond strads (mathematics surpasses physics)