Minor Improvements
Shortly after marking this project “complete”, I had some ideas for improvements. I put these changes on the back burner for a couple of months, but kept coming back to hack at them.
The first change was the addition of the command argument -l/--length-seconds, which sets the length of audio to generate in seconds.
Previously, the length of the output file was specified indirectly using the
count of samples to generate per horizontal pixel, which was cumbersome to figure
out.
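The relationship between the two settings is simple arithmetic. As a rough sketch, assuming a 44.1 kHz sample rate, a default of 5 seconds, a 512-pixel-wide image, and a hypothetical helper named samples_per_column (none of which come from the script itself), a length in seconds can be turned into a per-pixel sample count like this:

```python
import argparse

SAMPLE_RATE = 44100  # assumed sample rate in Hz; the real script may differ


def samples_per_column(length_seconds, image_width, sample_rate=SAMPLE_RATE):
    """Return how many audio samples each horizontal pixel should cover."""
    total_samples = length_seconds * sample_rate
    # Use at least one sample per column so very short lengths still produce audio.
    return max(1, round(total_samples / image_width))


def build_parser():
    parser = argparse.ArgumentParser(description="Image to audio (sketch)")
    # The 5.0 second default is an assumption made for this example.
    parser.add_argument("-l", "--length-seconds", type=float, default=5.0,
                        help="length of audio to generate, in seconds")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    # image_width=512 is a placeholder; a real run would read it from the image.
    print(samples_per_column(args.length_seconds, image_width=512))
```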
The next change was making the -v/--volume argument accept more standard
values of gain (0.0 = 0%, 1.0 = 100%). After a lot of experimentation, I found a
way to calculate the gain of each sine wave generator so that the sum of all
generators is loud, but unlikely to clip. I ended up with a formula that uses
the mean brightness of all pixels and the count of sine wave generators to
calculate a gain that is applied to each sine wave generator. I figured this out
by guessing and checking with bright and dark variations of the spectrogram
image. This is in no way scientific and will likely break given the right input
image.
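The formula itself is not reproduced here. Purely to illustrate how mean brightness and generator count can feed into a per-generator gain, the sketch below uses a conservative normalization chosen for this example; it is not the formula from the script, and the function names and the 0.0 to 1.0 brightness scale are assumptions.

```python
def per_generator_gain(volume, mean_brightness, generator_count):
    """Gain for each sine generator (illustrative, not the script's formula).

    volume: requested output gain, 0.0 to 1.0
    mean_brightness: mean pixel brightness, normalized to 0.0 to 1.0
    generator_count: number of sine wave generators being summed
    """
    if generator_count <= 0:
        raise ValueError("generator_count must be positive")
    # Avoid dividing by zero on an all-black image.
    brightness = max(mean_brightness, 1e-6)
    return volume / (generator_count * brightness)


def worst_case_peak(column, gain):
    """Peak of the summed output if every generator's crest lines up."""
    return gain * sum(column)
```

With this choice, a pixel column of average brightness reaches the requested volume only if every sine wave peaks at the same instant, so it is quieter than it needs to be. Columns brighter than average can still exceed the requested volume, which is the same kind of failure case described above.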
During testing I got frustrated waiting on each run to finish, so I took a little
time to try to speed it up. One of the main bottlenecks was pixel access using
Image.getpixel. I removed calls to getpixel and replaced them with access to a list
of pixel values loaded with Image.getdata.
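A minimal sketch of that swap, assuming the image is converted to grayscale and using hypothetical helper names (load_pixels and pixel_at are not from the script): the pixel values are copied into a flat list once, and each lookup becomes plain list indexing instead of a getpixel call.

```python
def load_pixels(path):
    """Load an image as grayscale and return (width, height, flat_pixels)."""
    # Pillow is imported here so pixel_at() below can be tried without it installed.
    from PIL import Image

    with Image.open(path) as img:
        gray = img.convert("L")  # one brightness value (0-255) per pixel
        width, height = gray.size
        # getdata() yields pixels row by row; list() copies them into a plain list.
        return width, height, list(gray.getdata())


def pixel_at(pixels, width, x, y):
    """Look up pixel (x, y) in a flat, row-major list of pixel values."""
    return pixels[y * width + x]
```

The flat list is row-major, so the pixel at column x and row y sits at index y * width + x.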
I did a very basic comparison between both versions of the program using the
same input image, which spanned 5.2 seconds of audio. The old version of the
script took just over 5 minutes to generate the audio, while the new version
took just over 2 minutes. That is an improvement, but generation still takes a
long time. I added a decimal place to the percentage indicator so the progress
display changes more often.
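The progress tweak amounts to formatting the percentage with one decimal place. A small sketch, with function names that are my own and not taken from the script:

```python
import sys


def format_progress(done, total):
    """Return progress as a percentage with one decimal place, e.g. '12.5%'."""
    return f"{100.0 * done / total:.1f}%"


def report_progress(done, total, stream=sys.stdout):
    """Overwrite the current terminal line with the latest percentage."""
    # '\r' moves the cursor back to the start of the line instead of scrolling.
    stream.write("\r" + format_progress(done, total))
    stream.flush()
```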
Other changes were minor. The count of sine wave generators was changed from a fixed count to a variable set to the number of integer frequencies between the low and high frequencies. The minimum frequency was also bumped from 1 Hz to 10 Hz after I realized the random fuzzing of generators was sometimes causing sub-hertz frequencies.
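As an illustration of both changes, here is a sketch that creates one generator frequency per integer hertz and applies a random offset to each. The inclusive range, the 0.5 Hz fuzz amount, and the function name are assumptions made for the example; the script's actual fuzzing may work differently.

```python
import random


def generator_frequencies(low_hz, high_hz, fuzz_hz=0.5, rng=random):
    """Return one fuzzed frequency per integer Hz from low_hz to high_hz inclusive."""
    # Each generator is nudged by up to fuzz_hz in either direction.
    return [hz + rng.uniform(-fuzz_hz, fuzz_hz) for hz in range(low_hz, high_hz + 1)]
```

With a 1 Hz floor, the lowest generator in this sketch could land anywhere from 0.5 Hz to 1.5 Hz, which is how sub-hertz frequencies can appear. With a 10 Hz floor, the lowest possible value is 9.5 Hz.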
I ran the final version of the script several times and compared the outputs to the previous version, but I couldn’t tell a difference by ear. Instructions and example usage have been uploaded to the code repo linked on the project page.