Making A Synth With Python — Controllers

This is the finale.
This is where we’ll finally take the components from the previous two posts on Oscillators and Modulators, and combine them with some additional stuff to make a playable synthesizer.
Note
- In case you don’t have a MIDI controller such as this, you’ll have to set up a virtual MIDI controller that will let you use your computer keyboard as one; you can use VMPK to set up the virtual controller.
- Some samples are in stereo, so it’s better to use earphones or headphones.
- I am a garbage keyboardist, so please excuse my terrible playing in the samples.
MIDI 🎹
MIDI stands for Musical Instrument Digital Interface and refers to the set of things (protocols, connectors, etc.) that allow digital instruments and controllers to communicate with each other.
For our purpose, MIDI defines the type of controller we’ll be using and the type of values we can expect from the controller when we use it. To use it we’ll have to first set it up.
Setup
To deal with MIDI input we’ll use `pygame.midi`. Using it is pretty simple: you first initialize it and then set it up:
from pygame import midi

midi.init()
default_id = midi.get_default_input_id()
midi_input = midi.Input(device_id=default_id)
This sets up the `midi.Input` object which will allow us to receive data from our controller.
A few things:
- Make sure that `default_id` isn’t -1; this happens if your MIDI controller isn’t connected properly or if one isn’t plugged in. To ensure that it’s being detected you can run `midi.get_device_info(default_id)` and check that the name you see matches your device.
- Also, a MIDI controller can have multiple devices on it; `midi.get_count()` should give you the number of devices, and you can switch the one being used by selecting a different id.
Usage
Now, using `midi_input`, we can get MIDI messages from our controller. There are two main functions here that we'll be using:
- `midi_input.read`: This will return the MIDI events (such as pressing a key) that occurred and were stored in a buffer, as a list of events. Each event in the list has the form `[[status, data1, data2, data3], timestamp]`; all the named values here are numbers.
- `midi_input.poll()`: This will basically tell us if there's any event to be read by returning a `bool`, so we need to call `midi_input.read` only if this returns `True`.
Each number, except `timestamp`, is a byte, and the value of `status` tells us what the other (`data*`) numbers mean.
Here we’ll be concerned with only two different `status` values:
- `0x90` i.e. 144: This stands for note on; this is the status code that is sent when you push down a key on the MIDI controller.
- `0x80` i.e. 128: This stands for note off; this is the status code that is sent when you lift your finger off a key.
For both the above `status` values, the `data*` values are the same:
- `data1`: This gives us the MIDI value of the note; for example, a value of 60 indicates C4.
- `data2`: This indicates the velocity for both note on and note off; you can use this to set the amplitude of your note. Think of this as how quickly a note has been pressed or released.
Note: Before using `data2` you'll have to ensure that your MIDI device is velocity sensitive.
On my MIDI controller `data3` just holds 0 for both the status codes, so it can be ignored. Both `data1` and `data2` have values in the range of 0 to 127.
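Putting those status and data values together, a small decoding helper might look like this (a sketch: the function name and the scaling of velocity into (0, 1] are my own, not from the post):

```python
NOTE_ON, NOTE_OFF = 0x90, 0x80  # 144 and 128

def parse_event(event):
    # each event from midi_input.read has the shape
    # [[status, data1, data2, data3], timestamp]
    (status, note, velocity, _), _timestamp = event
    if status == NOTE_ON:
        return ("on", note, velocity / 127)   # data1 = note, data2 = velocity
    if status == NOTE_OFF:
        return ("off", note, velocity / 127)
    return None  # some other message (knob, slider, ...)

print(parse_event([[0x90, 60, 127, 0], 0]))  # → ('on', 60, 1.0)
```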
If your MIDI controller is pressure sensitive or has some other fancy features (knobs, sliders) that you would like to map to some code you can use a simple loop to figure out what’s what:
try:
    while True:
        if midi_input.poll():
            print(midi_input.read(num_events=16))
except KeyboardInterrupt as err:
    print("Stopping...")
I found this link to be useful in decoding the `status` codes.
Super Simple Synth 🐥
Now that MIDI is kinda out of the way, we can create a synth using the most basic oscillator: a sine wave oscillator. The diagram below describes the flow of data from the MIDI controller to the speaker:

Super Simple Setup
In the first post on oscillators we had coded a simple sine wave oscillator; we’ll use this for the Super Simple Synth with just one small change: we’ll add an additional argument `amp`, to scale the amplitude of the note on the basis of the note on velocity.
Before we get to the playing part we need to route the synthesizer output to the speaker; for this we can use `pyaudio`. We first need to set up a PyAudio stream to which we can write the synth output values.
stream = pyaudio.PyAudio().open(
    rate=44100,
    channels=1,
    format=pyaudio.paInt16,
    output=True,
    frames_per_buffer=256
)
The parameters of the stream object:
- `rate` is the sample rate of the audio.
- `channels` is the number of audio channels.
- `format` is the format of each sample, which here is set to signed 16 bit integers.
- `output` is set to `True` because we are "writing" to a speaker rather than "reading" from some input.
- `frames_per_buffer` tells the stream object the number of samples we'll be feeding it at a time.
Now that we have our stream object, we just need to call the `stream.write` method and pass it the samples to get it to play our synth sounds.
We just need another function to get the required number of samples from an oscillator in the correct format, which here is signed 16 bit ints. This means that the numbers range from -2^15 to 2^15 - 1, i.e. (-32768, 32767), so we need to scale our oscillator output from (-1, 1) to the given range.
def get_samples(osc):
    return [int(next(osc) * 32767) for i in range(256)]

The above function will get us 256 samples from the oscillator `osc` in the correct format.
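As a quick sanity check, a bare generator can stand in for an oscillator (everything here besides `get_samples` is illustrative):

```python
import itertools
import math

def get_samples(osc):
    # same helper as above: 256 samples scaled to the signed 16-bit range
    return [int(next(osc) * 32767) for i in range(256)]

# a plain 440 Hz sine generator standing in for a real oscillator
osc = (math.sin(2 * math.pi * 440 * i / 44100) for i in itertools.count())

samples = get_samples(osc)
print(len(samples))                                # → 256
print(all(-32767 <= s <= 32767 for s in samples))  # → True
```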

Playing the SSS
We’ll be using a dict
to keep track of all the notes that are pressed down so that we can easily remove notes that are not being played. The dict
key will be the midi note value and item will be the oscillator.
This is the flow of how we’ll “play” the synth:
Note is played
- An oscillator of the correct frequency is created and added to a `dict`.
- Values from all the oscillators in the `dict` are added and returned in a buffer of a given size.
- Values in the buffer are written to `stream`.
Note is released
- The oscillator is removed from the `dict`.
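Those two branches can be sketched as a helper that applies one batch of `midi_input.read` events to the dict (a sketch: the helper names are mine, and the note-to-frequency formula is just the standard equal-temperament one, equivalent to `pygame.midi.midi_to_frequency`, inlined so the snippet stands alone):

```python
def midi_to_freq(note):
    # standard equal-temperament mapping: A4 (note 69) = 440 Hz
    return 440.0 * 2 ** ((note - 69) / 12)

def handle_events(events, notes_dict, osc_function):
    # events come from midi_input.read(); osc_function builds an oscillator
    for (status, note, velocity, _), _timestamp in events:
        if status == 0x90:    # note on: start an oscillator for this note
            notes_dict[note] = osc_function(midi_to_freq(note), velocity / 127)
        elif status == 0x80:  # note off: stop tracking the note
            notes_dict.pop(note, None)
```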
We can get the frequency of the note using the `pygame.midi.midi_to_frequency` function; this will be used to set the oscillator frequency.
Since we are using a `dict` of oscillators to play the notes from, we'll have to rewrite the `get_samples` function:
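The rewritten function isn't reproduced inline here; based on the description of its list comprehension, it is roughly (the `num_samples` default is an assumption):

```python
def get_samples(notes_dict, num_samples=256):
    # for each sample slot: pull the next value from every active
    # oscillator, scale to signed 16-bit, and sum them all together
    return [
        sum(int(next(osc) * 32767) for osc in notes_dict.values())
        for _ in range(num_samples)
    ]
```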
Basically what’s going on in the above list comprehension is:
- Get values from each oscillator in the `notes_dict` by calling `next` on them.
- Convert the values to the signed int 16 format.
- Sum all the values of the oscillators.
- Do the above 3 steps `num_samples` number of times to fill up the buffer.
Since we have to write to the stream in `bytes` format, we have to use `np.int16(samples).tobytes()`. If your input and output code is set up properly, then running the code below runs a polyphonic sine wave synthesizer that you can play using your MIDI controller:
I have put together all of the above code in this file here, so you can run it using:
$ python super_simple_synth.py
You can quit by pressing control + c, which is why we use the `try`, `except` block.
This is how it sounds: a very raw synth, no modulations, pure sine waves. Notice the clicks at the start and end of a note; this is because there is no ADSR applied to the amplitude.
I got a bit carried away while coding the Super Simple Synth trying to minimize the code, so here is the same thing in 19 lines. 😅
PolySynth 👯♂️
We’ll wrap up all the above code in a class, because classes generally make things easier to deal with.
The `PolySynth` class will help us create and play a polyphonic synth using a MIDI controller. The section below explains what’s going on in it.
The PolySynth Methods
__init__
The object initialization function `__init__` takes two new arguments:
- `amp_scale`: This scales the value of a sample; it is used to reduce the volume of a single note, because during playback when multiple notes are played their values are added, so the output can get loud.
- `max_amp`: This is used to clip all the values to a range of `[-max_amp, max_amp]` so that we don't damage our speakers if the volume gets too loud. Note: If you set this value to be sufficiently low (for example 0.01) it will create a distortion effect.
Both the above parameters are for speaker safety. Other than these two, it will also take in `num_samples` and `sample_rate`; these are the same as the ones explained in the earlier section.
When a `PolySynth` instance is created, it initializes the `midi.Input` object and saves the passed arguments.
_init_stream
This function is used to initialize the `Stream` object before a synth is played. It is set up as a separate function and not initialized in `__init__` because the number of channels is decided by the type of oscillator being used.
_get_samples
This function is mostly similar to the `get_samples` from the previous section. It will basically call `next` on each of the oscillators in the `notes_dict` a sufficient number of times to fill the buffer which will be fed to the output.
The output of each of the oscillators is summed up, then scaled using `amp_scale`, then clipped to `max_amp`, and finally converted to 16 bit integers before returning.
Since the oscillators can return 2 samples if it’s a stereo generator, we use `numpy` for quicker operations.
The `notes_dict` values are a list of two items: `[0]` is the oscillator and `[1]` is a boolean flag that indicates if a note can be deleted; this flag is used if the oscillator has a `trigger_release` function.
play
This is the main function of the class and is what will help us play the synthesizer and record what we play to a `.wav` file.
This function takes in two arguments:
1. `osc_function`: This is any function that will return an oscillator depending on the MIDI input; more on this in a while.
2. `close`: If this is set to `True` then it will close the MIDI input object.
The general structure of the `play` function is the same as the loop described earlier; the major differences are:
- It allows for generators that have release triggers (using `trigger_release`) on them, such as `ModifiedOscillators` using an `ADSREnvelope`.
- If a release trigger is present it doesn’t delete a note immediately, only when the note has ended and the delete flag in `notes_dict` has been set.
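The note-off side of that logic might look roughly like this (a sketch with hypothetical names; the real method interleaves this with reading events and writing samples):

```python
def on_note_off(notes_dict, note):
    # notes_dict values are [oscillator, can_be_deleted]
    if note not in notes_dict:
        return
    osc = notes_dict[note][0]
    if hasattr(osc, "trigger_release"):
        osc.trigger_release()       # start the envelope's release stage
        notes_dict[note][1] = True  # mark deletable once the note has ended
    else:
        del notes_dict[note]        # no release stage: remove immediately
```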
osc_function
The `osc_function` is the function that defines the synthesizer’s sound; it has the following signature:
osc_function(freq: float, amp: float, sample_rate: int) -> Iterable
When a MIDI note is played, the frequency and amplitude of the note are passed to the `osc_function`, which will then return an oscillator (`osc`) which, when `next` is called on it (`next(osc)`), will return the value to be played.
An example of an `osc_function` is the `get_sin_oscillator` function described in one of the earlier sections; this is the simplest example of an `osc_function`.
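That simplest case looks roughly like this (a sketch matching the signature above; the default arguments are assumptions):

```python
import math
from itertools import count

def get_sin_oscillator(freq=55, amp=1, sample_rate=44100):
    # endless generator: each next() call yields one sine sample in [-amp, amp]
    increment = (2 * math.pi * freq) / sample_rate
    return (math.sin(v) * amp for v in count(start=0, step=increment))
```

Each `next` call walks the phase forward by `increment`, i.e. one sample's worth of the wave at the given frequency.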
We can play the synth now using:
synth = PolySynth()
synth.play(osc_function=get_sin_oscillator)
PolySynth Playback
Using the components defined in the earlier parts of this series, we can make a more complex `osc_function` such as:
def osc_function(freq, amp, sample_rate):
    return iter(
        Chain(
            TriangleOscillator(freq=freq, amp=amp, sample_rate=sample_rate),
            ModulatedPanner(
                SineOscillator(freq/100, phase=90, sample_rate=sample_rate)
            ),
            ModulatedVolume(
                ADSREnvelope(0.01, release_duration=0.001, sample_rate=sample_rate)
            )
        )
    )
We have to call `iter` on it to initialize the underlying components such as the `Oscillator` before `next` can be called.
The above `osc_function` sets the `ModulatedPanner`'s oscillator as a function of the note frequency; this is what it sounds like.
Caveat
One caveat about the design of `PolySynth` is that it doesn’t limit the number of notes being played, so if the oscillators can’t generate samples fast enough your output will be messed up: timing may be off if you are recording from the buffer, or it may distort. Example:
This means that you can’t use very fancy `osc_function`s. For that to work properly you’d have to use more efficient code, for example code where the oscillator samples are generated in required batch sizes rather than one by one and all operations are vectorized, or a faster (perhaps compiled) language.
This is one of the reasons people say Python is slow: it can’t perform compile time optimization because the reference implementation is interpreted. But I can’t think of any other language that will let me code a polysynth in fewer than 20 lines, which is why it’s awesome.
MonoSynth 💃
The `PolySynth` class is just one way to go about creating a synth; you can have other designs too, for example a synth where only one note plays at a time, using just one oscillator by setting its `._f` parameter.
Here’s a `MonoSynth` class that does exactly that by overriding the `PolySynth`'s `play` and `_get_samples` functions. Instead of playing the next frequency immediately, it glides to the played note’s frequency; the duration is decided by the `glide` argument.
The `get_glider` function basically uses `np.linspace` to interpolate between the old and the new frequencies. This is what it sounds like.
If you increase the `glide` duration then it sounds very much like a theremin.
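A sketch of such a glider (the exact signature in the post may differ):

```python
import numpy as np

def get_glider(old_freq, new_freq, glide, sample_rate=44100):
    # glide linearly from the old to the new frequency over `glide` seconds,
    # then hold the new frequency forever
    for freq in np.linspace(old_freq, new_freq, int(glide * sample_rate)):
        yield freq
    while True:
        yield new_freq
```

An oscillator can then pull its frequency from this generator once per sample instead of holding it fixed.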
The `PolySynth` and `MonoSynth` may give you an idea on how to go about putting together a synth.
And that is basically how one would go about making a synth in Python. A few things could have been done better to increase the number of concurrent oscillators, but since we aren’t making a VST that we can plug into our DAW, it’s alright; if you do want to do that, you may have to use C++ or some other compiled language to create the synth.
Anyway here’s Für Elise, in stereoscopic and polyphonic glory:
The stereo was made possible by setting the `Panner` parameter `r` as a function of the input `freq`.

All the code is in this repo; the code used in this post is in this notebook.
For recording most of the samples I used BlackHole along with GarageBand.
Other posts in this series
Thanks for reading. 👋