We have been trained in presentation techniques and rhetoric, but for the post-pandemic world the basics of “how to sound good online” is still new territory. So when YouTuber Anders Jensen asked a question about “audio” in LinkedIn I gladly threw some lessons-learned from the past years at him. If these pre-released key takeaways of our upcoming online event are unfamiliar then join us on February the 25th! Registration does not cost anything and after the event no one will try to sell anything to you.

My name is Christian Prior-Mamulyan and I would be glad to see you in the online event!

A flipped classroom … aims to increase student engagement and learning by having pupils complete readings at home and work on live problem-solving during class time. … With a flipped classroom, students watch online lectures, collaborate in online discussions, or carry out research at home, while actively engaging concepts in the classroom, with a mentor’s guidance. Source: Wikipedia entry for “Flipped_classroom”, visited at 2022-02-13

Key takeaways

Good sound for webinars or video conferences starts at the beginning of an audio signal chain with the room sound and the microphone. The room sound is greatly influenced by reflective, absorptive and diffusive qualities of the materials within. The microphone’s polar pattern and frequency response feeds on mic level into the production workflow. A dedicated audio interface solves the analog-digital conversion in hardware; by the way acceptable hardware involved is not wireless. Distributing to YouTube demands for tight control of the normalized loudness and true peak, which can be achieved with compressor, limiter and e.g. the software Youlean Loudness Meter. While doing so an applied EQ (“equalizer”) curve may “fix in post” some deficiencies at the beginning of the audio signal chain. Whether that happens analog at the beginning of the signal chain or digitally in post-production: Audio recording is best understood on the basis of electrical signals and their visual representations like waveforms and frequency spectrums.

Key takeaways
Preface
- Goals
Preparation
How it started
Analysis of the YouTube channel
Remedies
knowledge, facts, data
Skills, Workflows
Off-The-Shelf Technology
Applications

Automated Analysis of a YouTube Channel https://www.youtube.com/c/andersjensenorg

Importance of Audio as a success factor Analysis of channel https://www.youtube.com/c/andersjensenorg Remedies Recommended Reading/Watching

Preface

Join Anders Jensen and me on a live session where

we are going to wrap up first improvements for his video channel and his teaching business
I share the condensed learning of my past 2 years
and paint a roadmap for future improvements.

Participants will

be able to derive actionable ideas from this journey for their own improvements
distinguish elements of good audio to apply to their own online presence
rank these by difficulty and affordability
recognize basic terms and concepts of audio engineering

Goals

name the YouTube stats for nerds information relating to audio
predict the YouTube processing ahead of an upload
differentiate analog from digital components in a recording chain
rank factors of audio by importance
Execute a zielführend workflow/timeline of production and post-production

Preparation

Checklists

1. Dress Rehearsal Zoom Interview - Participant
1.	Is the Zoom username defined and unikely to change?	CHECK
2.	In the Zoom App, is under Settings > Video near Camera the HD checkbox ticked?	CHECK
3.	In the Zoom App, is under Settings > Audio the appropriate Suppress background noise set?	CHECK
4.	In the Dress Rehearsal, can Original Sound be set to On?	CHECK
5.	Is the microphone level adjustable?	CHECK
6.	Is the loudness meter visible to the participant?	CHECK

2. Dress Rehearsal Zoom Interview - Host
1.	Can the participant be pinned manually?	CHECK
2.	Can the participant be pinned by e.g. ZoomOSC?	CHECK
3.	What is the participant's Zoom name?	CHECK
4.	Can the participant be scraped from a (virtual) monitor with Zoom in fullscreen?	CHECK
5.	Will the participant be fed into OBS with a Window Capture?	CHECK
6.	a	CHECK
7.	b	CHECK
8.	c	CHECK
9.	d	CHECK
10.	e	CHECK
11.	f	CHECK
12.	g	CHECK
13.	h	CHECK
14.	i	CHECK
15.	j	CHECK
16.	k	CHECK

3. Streaming Preshow
1.	connect Wacom One before OBS start	CHECK
2.	Vicreo Hotkey started	CHECK
3.	EpicPen	STARTED
4.	EpicPen in VoiceMeeter	MUTED
5.	Companion instance atem	STATUS OK
6.	Companion instance Playoutbee	STATUS OK
7.	Companion instance OBS	STATUS OK
8.	Companion instance h2r http	STATUS OK
9.	Companion instance h2r http configured	HTTP://10.38.20.11:4001/API/LIVE/
10.	Companion instance VICREO hotkey	STATUS OK
11.	Moodlebox server	HTTP OK
12.	Moodlebox authorization	TUTOR LOGIN
13.	ATEM Software Control	STARTED
14.	ATEM Software Control restore	LATEST NET*.XML
15.	ATEM Software Control	MEDIA POOL CHECKED
16.	ATEM Mini	HDMI 1 IS CAM 1
17.	ATEM Mini	HDMI 2 IS
18.	ATEM Mini	HDMI 2 IS
19.	ATEM Mini	HDMI 4 IS PLAYOUTBEE
20.	ATEM Mini	PLAYOUTBEE MIC ON
21.	Camera	SWITCHED ON
22.	Camera	IN FOCUS
23.	OBS Broadcast	SELECTED
24.	OBS Profile	NET.CPRIMA.LIVE 1080P
25.	OBS Scene Collection	NET.CPRIMA.LIVE
26.	OBS Fullscreen Projector (preview)	WACOM ONE
27.	OBS Virtual Camera	STARTED
28.	Zoom Camera	OBS VIRTUAL CAMERA
29.	Zoom Microphone	VOICEMEETERVAIO3 OUTPUT
30.	YouTube Studio	OPENED
31.	Playout Preshow	SELECTED AND STOPPED
32.	Voicemeeter	VAIO3 HAS LOCAL AUDIOINTERFACE AND ZOOM AND ...
33.	Voicemeeter Input ATEM Mic	ROUTED TO BUS OUTPUT VAIO 3
34.	Voicemeeter Input Audiointerface	ROUTED TO BUS OUTPUT VAIO 3
35.	Voicemeeter Input with Zoom App	ROUTED TO BUS OUTPUT VAIO 3
36.	Voicemeeter Input with Desktop Apps	NOT ROUTED TO BUS OUTPUT VAIO 3
37.	OBS Audio Mixer	VAIO3 UNMUTED
38.	Mic preamp DBX286s input gain	45
39.	Mic preamp DBX286s phantom power	ON
40.	Mic preamp DBX286s 80Hy bypass	ON
41.	Mic preamp DBX286s Process Bypass	OFF
42.	Mic preamp DBX286s Compressor	MODERATE
43.	Mic preamp DBX286s De-Esser	MODERATE
44.	Mic preamp DBX286s Enhancer	LF OFF, HF OFF
45.	Mic preamp DBX286s Expander/Gate	30DB 2:1
46.	Mic preamp DBX286s Output Gain	NEUTRAL
47.	Zoom started	CHECK
48.	Evo Control with Line Input	GAIN AT 7-8 OCLOCK
49.	H2R Graphics Settings Data sources YouTube API key	FOR CORRECT CHANNEL (POST@CHRISTIANPRIOR.DE)
50.	H2R Graphics Data Source Livestream video ID	FRG5-22OCHS
51.	H2R Graphics Data Source Livestream status	CONNECTED
52.	ZOOM YLM Guest	JOINED

How it started

Picture description

Step 1: Reading the incident report

Let us read this comment on LinkedIn as if we were an IT Supporter and this was an incident:

The reporter mentions that the volume is being turned up
The room has a certain qualitiy attributed to it (“is quiet”)
Off-the-shelf technology is mentioned (The RØDE wirelss Go microphone)
The quality of the recording is described as having white noise

verfifying the incident report

Listening to the linked video the room is indeed quiet, but has a reverb like a kitchen or
The term “white noise” is not 100% correctly used, it is safe to assume that “noise floor” is meant
The description of the audio signal chain is not complete

ticket triage

more information is needed,
contact reporter

Analysis of the YouTube channel

YouTube recommendations

foo

https://support.google.com/youtube/answer/1722171

Sources of Noise

Evaluation of all 350+ videos

Hidden Treasures in the YouTube API and youtube-dl output

Remedies

Eliminate *noise

Room

switch off all noise sources
minimize sound from the outside coming in
use headphones (maybe in-ear)
consider “silent” keyboards
proper microphone placement as per cardioid pattern and keybord location
behind your microphone, place something absorbant

Microphone Technique

balance closeness to the microphone and amplification to minimize amplification of electrical noise

Audio Signal Chain

do not use an onboard soundcard
use noise cancellation with care

Room Acoustics, Reverberation Time and Clap Test

The journey to improve the audio quality of a recorded narration for webinars or in videoconferences does not start with buying equipment. Because equipment is used inside a room and the room is where it starts.

The requirements to record for webinars are very similar to those of the voice actors, or voice over artists. And they are quite different from music producers or audio mixers. And quite different from high fidelity music listeners. So this kind of narrows down from whom to take advice from. Another good source of advice are audio engineers in churches, by the way. And with a bit of training our own ears are another authoritative judge!

Audio Signal Chain

Within the room there will be direct sound and reflected sound from the walls and surfaces. The physical properties of these surfaces make or break a good room acoustic! Only so-called anechoc chambers have no reflected sound and they are not a good thing to strive for, by the way.

The reflected sound will hit our microphone again and decrease the quality of our recording. This reflected sound is THE most important aspect that needs to be controlled. Worst case is to sit between two hard walls which reflect our spoken word multiple times into the microphone until it dies down in volume.

And let’s not forget that there is a reflective ceiling, too!

Refelction Absorption Diffusion

To reduce reflections both absorption and diffusion will improve the room acoustics. Examples for absorptive materials are:

acoustic foam
furniture
curtains
rugs
acoustic bass traps and of course
sound blankets

Diffusive materials are

bookshelves
sound diffusors

A measurable property of the room is its reverberation time, which can be measured in according to ISO standards – or performed with simple tools which is good enough in our context.

First, what is exactly meant with reverberation time?

Reverberation is the acumulated reflection of sound. It is what we hear in a bathroom. Reveberation time is the time it needs to fade away.

Most common method is to measure the RT60 which is the time in seconds until a loud sound decays by 60 decibels.

Play Store detail page for myRaumklang - Nachhall messen und optimieren

All this is another way of saying: Do a clap test, there is even an app for this!

Either the app or our own recording with our existing microphone into a software that displays a waveform will give us a measurement in seconds.

Then we do a few improvements, and measure again, and listen to our recording again.

my room

In my room I have both hard and absorbant materials. But as the sound of my voice does not bounce back from the hard surfaces into the microphone I have not negative effect by the hard wooden cupboard doors. So a good room for audio recording does not need to be covered in foam all around!

The biggest problem of my room is that the heating pipes transmit sound from the teenagers room underneath. Reverberation time does not fix THAT. Maybe I will give them a gamer’s headset next christmas! ;)

Going by sich an RT60-like measurement means that it is easy to find out when to stop improving the room acoustic, and focus on other aspects of good audio in webinars!

Resources

https://www.nti-audio.com/en/applications/room-building-acoustics/reverberation-time

http://www.larsondavis.com/learn/building-acoustics/Reverberation-Time-in-Room-Acoustics

my room

https://imgur.com/a/NHX2XGe

Audio Settings on Windows

Checklist

1. Windows 10 Audio Recording Inventory
1.	Are ASIO drivers installed?	YES/NO
2.	What recording software is used?	AUDIO-VIDEO/DAW/VOICE RECORDER/OTHER
3.	Does the recording software also function as a communication software?	YES/NO
4.	What is the Default Device for Recording?	CHECK
5.	What is the Default Device for Playback?	CHECK
6.	What is the Default Communication Device for Recording?	CHECK
7.	What is the Default Communication Device for Playback?	CHECK
8.	If an audio interface is used, does it have a monitoring output?	CHECK
9.	If an audio interface is used, which driver type is used?	WDM/ASIO
10.	If an audio interface is used, does it come with a Bitfocus Companion module?	CHECK
11.	If an external microphone is used, what is its transducer type?	DYNAMIC/CONDENSER/RIBBON
12.	If an external microphone is used, what cable does it have?	3-PIN XLR/3,5MM TRS/OTHER
13.	If an external microphone is used, does it need phantom power?	YES 48V FOR MIC/YES 48V FOR PRE-AMPLIFIER/NO
14.	If an external microphone is used, what is its polar pattern?	CHECK
15.	If an external microphone is used, what is its frequency response?	CHECK
16.	If an external microphone is used, how is it adressed?	END-FIRED/SIDE-ADRESSED
17.	The Sound Settings of the Default Device for Recording has which level set?	100%
18.	The Sound Settings of the Default Device for Recording show which sample rate?	48000 HZ
19.	The Sound Settings of the Default Device for Recording show which setting for exclusive mode?	ALLOW AND/OR PRIORITY?
20.	The Sound Settings of each Device for Recording has which level set?	100%
21.	The Sound Settings of the Default Device for Recording show which sample rate?	100%
22.	The Sound Settings of the Default Device for Recording show which setting for exclusive mode?	100%
23.	In the Volume Mixer, which output levels are set?	SPECIFIC DEVICE OR DEFAULT DEVICE?
24.	In the "App volume and device preferences", which App Volume levels are set for Output?	CHECK
25.	In the "App volume and device preferences", which App Volume levels are set for Input?	CHECK
26.	In the recording software, which input device is set?	CHECK
27.	Is something plugged into headphone/speaker/microphone jack(s)?	CHECK
28.	In the Sound Settings Sounds panel, which sound scheme is set?	NO SOUNDS/WINDOWS DEFAULT/OTHER
29.	In the Sound Settings Communications panel, which audio ducking/volume control is set?	MUTE/80%/*50%/DO NOTHING
30.	If a Realtek chipset is built in, are in Realtek HD Audio Manager for the Microphone device Microphone Effects Disabled?	YES/NO
31.	In Settings > Privacy > Background Apps, are any apps switched off that might interfere with audio recording?	YES/NO
32.	In Control Panel > Hardware > Power Options, is High Performance Setting set?	CHECK

Loudness GUI

Controlling the loudness during the duration of a recording is a standard workflow in videoproduction.

Here a overview of commonly implemented GUI graphical user interface functionality is shown.

This video shows the software Davinci Resolve.

This video shows the software Audacity.

To make a future edit as easy as possible it is a good idea to work on a copy of the audio track.

There is a handle to change the level of a single clip with the mouse.

The whole track also has a level, here adjustable by a slider and direct entry.

A double click resets the setting.

The beginning of the clip may get a Fade-In. And the end of the clip may get a Fade-Out.

But now the most important part:

Along the timeline at desired positions “keyframes” are inserted.

Each keyframe may get an individual level,

either by dragging it with the mouse,
or by a slider,
or direct entry.

In between the keyframes the software calculates the intermediate value, by default with a linear algorithm.

Every program with more than just a basic feature set will also offer an ease-in and/or ease-out influce for finer control of an otherwise abrupt new state. This software implements this feature with Bezier curve handles. Although for audio projects this particular control it is less important than in – for example animation.

Checklist for All Things Audio

To reduce complexity a lot can be learned from the aviation industry: Their checklist culture is outstanding and based on very

Checklist

1. YouTube Pre-Upload Audiotrack
1.	How many audio tracks does the video contain?	1
2.	Is the audio track in stereo or mono?	STEREO
3.	Which sample rate was the audio encoded with	48.000 HZ
4.	What is the bit depth?	16 BIT, MAYBE 24 BIT
5.	What Audio Codec is used?	AAC
6.	What Bit Rate is used?	MONO: 128 KBPS, STEREO: 384 KBPS
7.	Does the waveform show spikes?	CHECK
8.	Was the track suitably compressed?	CHECK
9.	Is the Loudness below YouTube standard?	<= -14 LUFS INTEGRATED (CLOSE TO)
10.	Are the True Peaks kept in check, e.g. with a limiter?	<= -1DB TRUE PEAK MAX
11.	Is the noise floor quiet?	< -40DB
12.	If background music is used, is speech still intelligeable?	CHECK
13.	Are there any parts in the audio that should be replaced by room sound?	CHECK
14.	Are there any lessons learned for the next recording?	CHECK

knowledge, facts, data

Audio Signal Levels

Using a microphone with a computer is actually an interesting mix: Any digital recording actually starts and ends in the Analog Domain of the physical world, and digital is only the middle part. Both the microphone and the loudspeaker are analog devices.

And a microphone produces at its core a continous electrical signal, done by converting vibrations into electrical energy, and a very low signal that is: The result is an induced voltage of a few millivolt – that is thousandths of a volt. This is very LOW!

In our webinars and YouTube videos signals are passed between audio equipment on

microphone level and
line level (in its consumer line level variant)

chart of audio signal levels: microphone, line, instrument and speaker

It is the functionality of

either a pre-amplifier and
the audio interface

to raise the mic level to line level, which is suitable to pass signals between equipment.

Sometimes we see on equipment the label “instrument level”, this is slightly below consumer line level.

All this gives us a hint that a microphone is best understood with the mindset of an electrical engineer.

The Audio Frequency Spectrum

frequency spectrum human hearing human voice

logarithmic scale

20-20k Hz hearing “but…” 200 - 6000 male voice 400 - 8000 Hz female voice

while human hearing spans ten octaves, human speech only covers about five octaves. source: https://larryjordan.com/articles/eq-warm-a-voice-and-improve-diction/

ranges

Our speaking voice has three frequency ranges that need to be understood;

Fundamentals. The fundamental frequencies of speech occur roughly between 85Hz and 250Hz.

Vowels. Vowels sounds contain the maximum energy and power of the voice, occurring between 350Hz and 2KHz.

Consonants. Consonants occur between 1.5KHz and 4KHz. They contain little energy but are essential to intelligibility. https://www.behindthemixer.com/how-eq-speech-maximum-intelligibility/

Low-End Noise - 20 Hz to 80 Hz

Boominess - 80 Hz to 300 Hz

Muddiness - 250 Hz to 500 Hz

Nasal Honk - 800 Hz to 1.5 kHz

Presence - 4.5 kHz to 9 kHz

Breathiness - 10 kHz to 15 kHz https://ledgernote.com/columns/mixing-mastering/how-to-eq-vocals/

By the way, the reason why we record in 44.100 Hz or for videos 48.000 Hz is that this is 2 times 20.000 Hz plus a bit of extra for the sampling filter to work.

-> Nyquist frequency

sibilance:

Vocal sibilance is an unpleasant tonal harshness that can happen during consonant syllables (like S, T, and Z), caused by disproportionate audio dynamics in upper midrange frequencies. Vocal sibilance is a phenomenon of disproportionate dynamics within an isolated frequency range. In other words, it is a problem of too much loudness contrast within a small frequency range of a waveform that has a dynamic profile of its own. The de-esser technique typically uses a narrow peak EQ in the sidechain path to boost the most offensive sibilant frequencies. This EQ exaggerates the dynamic difference between the sibilant band and the rest of the vocal waveform, making it much easier to achieve gain reduction during those consonants (and only then).

https://theproaudiofiles.com/vocal-sibilance/

intelligibility: https://www.dpamicrophones.com/mic-university/facts-about-speech-intelligibility

Loudness

Do you remeber the times when TV commercials were so much louder than the actual TV content? That was the time when the human perception of loudness was not taken into consideration.

The signal level on microphone or line level are measured with a technical reference, and the loudness innovation was to measure according to the human perception. This started at about 2010.

Nowadays streaming service like Spotify and – you guessed it – YouTube use the loudness across a while song or video to adjust loudness. YouTube actually only lowers the loudness, and protects users with headphones that way.

Based on what is called the Fletcher Munson Curve of human loudness perception a loudness measurement is a statistically determined value over time, either

short-term or
integrated, that means over for example a whole YouTube video.

Software like Youlean Loudness Meter analyze the audio and returns a graphical representation.

Why is this important?

Because when the production loudness did differ from the target platform’s rules then it makes sense to adjust the distribution loudness, to not leave it up to the streaming service to mess about with the audio.

YouTube for example will

turn down videos that are integrated –which mean over the full duratio – louder than -14dB LUFS (which is the ludness unit).
but YouTube will not turn the loudness up if it was too low to begin with!

The loudness measurement makes our – hopefully many – recordings comparable comparable between each other.

Methods to influnce the loudness of a recording are

the complressor or
the amplifier

The loudness measurement also helps us to adjust an interview partner to the same loudness as the host.

Online tool: https://www.loudnesspenalty.com/ Stats for nerds: https://productionadvice.co.uk/stats-for-nerds/

YouTube loudness normalization

What you need to know https://www.youtube.com/watch?v=wIicS8hKbeQ

intro: loud commercials

goal: relevance for YouTuber and what are the key concepts behind it

//caveat: peak audio AND loudness

toc

what is loudness, what is loud
- perception
- (Fletcher Munson Curve)
- we hear the mids more prominently than bass and highs
excursion: music production, TV
European Broadcast Union (EBU) OR EU, consumer protection, 2010
- now https://www.itu.int/dms_pubrec/itu-r/rec/bs/R-REC-BS.1770-4-201510-I!!PDF-E.pdf
- also: playlist shuffle
- example overcompressed track https://magroove-files.s3.amazonaws.com/magroove-blog/wp-content/uploads/2019/08/Nivel-em-LUFS-de-diferentes-faixas-de-%C3%A1udio-768x480.jpg
peak, RMS, LUFS, …
- negative numbers because referenced to full scale of the maximum that a system can handle
- RMS is not adjusted to human hearing
- LUFS k-weight https://youlean-129cf.kxcdn.com/wp-content/uploads/2020/05/Screen-Shot-2020-05-03-at-7.06.08-PM-3.png https://youlean.co/how-to-hack-lufs-normalization/
- high LUFS “means compression” == loss of dynamic range
statistics over time
- short-term
- integrated
true peak
production loudness
distribution loudness

To remain on the practical side of things:

example LUFS graph over time
YouTube rules, and other platforms
YouTube rules -> YLM features
resulting action / Applying This Knowledge
- separate master mixes for each platform
- or rely on “them” to turn it down
- measure after the adjustments!

Youlean Loudness Meter 2 (Free and Pro) Mastering The Mix LEVELS (Paid) iZotope Insight 2 (Paid) Waves WLM Plus Loudness Meter (Paid)

Technical Document AESTD1008.1.21-9 (supersedes TD1004) Recommendations for Loudness of Internet Audio Streaming and On-Demand Distribution September 24, 202 https://www.aes.org/technical/documentDownloads.cfm?docID=731

Reading Microphone Datasheets

For the novice a less readable part of datasheets of microphones might be the graphs about polar patterns and frequency responses. But that part is sooo important!

Polar Patterns

The polar pattern depicts the sencsitivity to sounds arriving from different angles. A polar graph uses angles as reference system, different from the more ubiquitous cartesian graphs with their x, z and mazbe z axis. The practitioner can read from a microphone’s polar pattern graph how to best position the micriphone so that it does not pick up for example typing noise from the keyboard.

microphone polar patterns

Frequency Responses

With the microphone at the very beginning of a recording signal chain it is highly advisable before doing any later EQ “equalisation” changes to the frequency spectrum to study what the microphone did apply to the frequencies at the very beginning. If the practitioner knows that the microphone has what is commonly referred to as presence boost in the (roughly) 3 kHz to 6 kHz range: Then there will hardly be any need to further boost that range in post production.

The polar patterns also show in raw data what otherwise might get marketing terms applied to, like “Exciter” oder “ContourFX”.

microphone polar patterns

Skills, Workflows

GUI Elements for ampli…

Types of Audio Editing

7 types https://audioaural.com/what-are-the-7-types-of-audio-editing-techniques/

amplification: increase the volume
compression: change the dynamics
limiting: limit sound above a certain level
panning: move the sound in and between left and right stereo channel
equalization: changes to the frequency spectrum; low-cut, high-cut, bell boost
normalization: //fixme
stereo imaging: “from mono to stereo” //fixme

Preface:

In the context of audio, Dynamic Range is the difference in Decibel between the quietest and loudest part. It can refer to both the dynamic range of the audio track(s) in a video, or the capabilities of a signal chain consisting of hardware and software.

The quietest part of a signal chain is called noise floor and each piece of equipment has one – even a cable.

The dynamic range can be altered by compression and expansion: Compressors (and in its extreme form Limiters) reduce Dynamic Range, while Expander (and in its extreme form Gates) increase it.

Wait, what? A Noise Gate does INcrease what exactly? Heh – the dynamic range!

By the way, in an audio )post-)processing workflow they have their place, and quite often chained after each other.

https://ask.audio/articles/mixing-concepts-what-is-dynamics-why-you-should-care

noise gate
expander

Compressor

A compressor (is an electrical amplifier that) reduces the dynamic range of a signal. Basically, it brings the loud and quiet parts closer together in terms of volume. Most compressors have a few standard controls: Threshold determines how loud the incoming signal needs to be before it gets any treatment. Ratio adjusts how much the incoming signal, that is above the threshold, is compressed. The higher the ratio, the more squished the sound. Attack/Release controls how long it takes the compressor to react to the incoming signal. Lastly, input gain simply makes the pretreatment sound louder, allowing more of the signal to be above threshold, resulting in more compression. https://www.dancemusicnw.com/compressors-expanders-gates-explained/

Threshold: high for limiting, at quietest for overall compression. set at level of tyical distance to typical level Ratio 4:1 for voice recording (3:1 - 5:1) Attack: 3ms time attack quickly, release slower Release time 100ms <————- longer than release, for drawn-out endings of words; if too long it nerver releases and we hear a so-called “pumping effect” Knee <– for voice not so important, it’s like a bezier curve at the threshold

Make-up gain or output gain: Not necessarily an aspect of the compressor. But as the compressor, making the loudest parts quieter, tends to turn the overall level down, there is a need to raise the overall level by adding make-up gain.

Recommended “reading”: https://www.youtube.com/watch?v=DYOuClAWokg Compression for Voiceover Booth Junkie

Expander and Gate

A Gate, or noise gate, is like an inverted compressor: Instead of limiting signals above a given threshold, the gate lowers a signal below a given threshold.

Equalisation: Changes to the Frequency Spectrum

Sibilance

start with microphjone frequency response chart

Types of EQ

Graphic EQ
Parametric EQ

Parametric EQ

Use studio headphones

Low-pass: Removes all frequencies above a set frequency
High-pass: Removes all frequencies below a set frequency
Low-shelf: Attenuates all frequencies below a set frequency
High-shelf: Attenuates all frequencies above a set frequency
Peak: Increase frequencies at a set frequency
Notch: Decrease frequencies at a set frequency

Remember to use a narrow Q factor when you’re cutting frequencies and use > a wide Q factor when boosting. But rules can be broken, just as long as > it sounds good. https://talkinmusic.com/how-to-eq-vocals/

no recipes, rather stretegies

The most obvious approach to EQ is push up more of what you want to hear. Getting the right microphone for the voice and recording it well should mean little or no EQ is needed. … If the performer was sticky and click-y, you’ll be hard pressed to boost in the 2-5 kHz range and not bring out mouth noise in the process. In my experience EQ isn’t going to help you get rid of mouth noise, but it can definitely make it worse. https://theproaudiofiles.com/voice-processing-eq-cuts-boosts/

what happened before? (mic frequency response)

Roll off the low-end starting around 90 Hz.

Reduce the mud around 250 Hz.

Add a high shelf around 9 kHz & a high roll off around 18 kHz.

Add a presence boost around 5 kHz.

Boost the core around 1 kHz to 2 kHz.

Reduce sibilance around 5 kHz to 8 kHz.

https://ledgernote.com/columns/mixing-mastering/how-to-eq-vocals/

EQ in the postprocesing workflow

gate (before compressor with its make-up gain)
EQ cuts
compression
EQ boosts
limiter

Roll off the low frequencies if the proximity effect is causing unusual > bassiness. Don’t roll off so much low end as the voice loses some of its umph. Yes, > I’m using “umph” as a technical word. Boost in the 1KHz to 5KHz range for improving intelligibility and clarity. Boost in the 3Khz to 6Khz range to add brightness. This can help with > speakers with poor intonation. Boost in the 4.5Khz to 6Khz range to add presence. Note that too much > boosting in this area can produce a thin lifeless sound. Boost in the 100Hz to 250Hz for a boomy effect In case your head is about to explode from an information overload, remember these key points; The above points can contradict each other. There is no hard and fast rule. Mixing is as much an art as a science. Trust your ears over everything else. It’s possible that once you EQ the vocal channel that it’s a little lacking in the low end. Boost it a bit give it that full sound. Again, trust your ears. Close your eyes and ask yourself if it a) sounds natural and b) sounds clear. https://www.behindthemixer.com/how-eq-speech-maximum-intelligibility/

Electric Signals, Noise, Discretization

Why is the microphone such a strange thing for us IT guys?

As programmers, IT technicians or system administrators we are used to highly complex systems, comprised of hardware, operating system, programming languages with their respective runtimes or interpreters, peripheral devices and software.

These machines are based on basic physical laws, electrons and their digitization.

foo

To illustrate the various disciplines in physics, electrical engineering and computer sciende let’s have a look at the blackboard in the lecture “Circuits and Electronics” by Professor Anant Agarwal at the Massachusets Institure of Technology: About half a dozen disciplines fill his blackboard, and on the left we are dealing with simple electronic circuits and their behaviour that can be expressed in Volt, Ampere and Ohm.

foo

Here is a visualization of the effect of noise when transmitting a combination of two signals.

Coming from the high-level digital world on the right side of the blackboard we IT people can dive into the analog world for example when experimenting with an Arduino or the GPIO pins of a Raspberry Pi.

foo

Looking at the example again we see that just to determine if such a signal expresses a binary “on” or “off” poses a challenge – as everybody knows who tried to implement a bare-bones shutdown-button for the Raspberry Pi.

This is pretty for from the usual “Hello World!” example in our favorite programming languages!

Now I am coming back full circle:

The microphone is such a basic electrical device, producing electrical signals measured in 1/1000 of a Volt.

foo

Here is a chart of the signals strengths with the equivalents of Volt and Decibels referenced in Voltage. (Yes, there is a straightforward conversion!) The microphone “transduces” a pretty low signal! One millivolt plus minus a half!

foo

And on the way to our software on the operating system of our choice we need to traverse the whole blackboard of Professor Agarwal!

foo

And on the way we inevitably are dealing with the same problems as before, noise and / or the need to digitize the signal (not only into binary 1 or zero but commonly into 16 bit, er even 24 or 32 bit).

All this is just my long-winded approach to re-state best practices in dealing with audio recordings from a microphone:

foo

Produce the cleanest possible signal, raise it as quickly to line level, and use hardware dedicated to high quality conversion from the analog world to the digital world.

Next time you are using a microphone, remember the picture of noise sitting on top of a signal, and remeber the blackboard of Professor Agarwal!

//todo: incorporate analog playout //analog reconstruction of two consecutive digital samples peaking at 0 dBFS may result in an analog wave that peaks higher than either of the samples //https://www.mixinglessons.com/dbtp-decibel-true-peak/

Off-The-Shelf Technology

Audio Signal Chain

To sound good in webinars and on YouTube many small adjustments need to be made, and it is best to identify the components involved – so they can be adressed individually.

For the sake of completeness the room where we record should be put on the list first. As it will have sounds coming in from the outside, and sound reflecting within its walls.

The we obviously will talk into a microphone, one of the many types and models. A microphone converts soundwaves to analog audio signals.

The cable connecting it will have 2 or three connections, and the connector will determine where we can plug it in. AND the cable will give us a hint about the level.

Within the cable run there is the possibility to insert an additional microphone pre-amplifier.

Preferrably a standalone audio interface will give us the microphone pre-amplification, knobs to turn the gain level, and will transform the analog signal into a digital.

By the way, a USB microphone “just” has a built-in audio interface.

On the computer that an audio interface is connected to many things can happen to the signal, like

processing it with a digital workflow
embedding it with video
streaming it to YouTube
storing it to in lossless or lossy formats

The signal chain does not end there though, let’s keep in mind that the viewer or listener will

Audio Cables and Plugs

foo

Youlean Loudness Meter

OBS Multi-Track Recording

Applications

Makefile for youtube-dl

Any repetitive manual task should be automated, for example the analysis of a whole YouTube channel. For speedy access to the videos the files need to be downloaded to the local harddrive, which is the core functionality of the Python-based commandline program youtube downloader and its improved fork yt-dl.

And a conveninet way to make this whole process repeateble is to write a shell script in the form of a Makefile. To run such files from the Unix world on Windows the WSL Windows Subsystem for Linux can be used.

The Makefile is published as a GitHub Gist https://gist.github.com/cprima/0d331d5a2c366688998071f6b61774fd

Result is that a few commands may be run daily and produce the raw material for an analysis.

Youtube downloader keeps track of the videos already downloaded in the past.

The metadata from YouTube is returned in JSON Format and again Python is a good technological fit to aggregate each individual file into a tabular format. A quick Jupyter Notebook does the job well.

#/**
# * Helper commands to analyse YouTube videos
# *
# * @author: Christian Prior-Mamulyan <cprior@gmail.com>
# * @license: MIT
# *
# * @usage: export YTCHANNELVIDEOURL=https://www.youtube.com/watch?v=iavXywwUh1c
# * @usage: make -e download_audio
# *
# * depends on https://github.com/yt-dlp/yt-dlp
# */

YTCHANNELBASEURL=https://www.youtube.com/channel/
YTCHANNELID=UCPdtz4gd_iYebJFYq9N8pWA
YTCHANNELVIDEOURL?=https://www.youtube.com/watch?v=7UN1xRN24xY
BATCH_FILE=urls.txt
ARCHIVE_FILE=archive.txt
SLEEP_REQUESTS=8
MIN_SLEEP_INTERVAL=8
MAX_SLEEP_INTERVAL=128
SUBTILES_LANG=en
LIMIT_RATE=5M
TROTTLED_RATE=50K
_BESTEALTHY=--limit-rate $(LIMIT_RATE) --throttled-rate $(TROTTLED_RATE) --min-sleep-interval $(MIN_SLEEP_INTERVAL) --max-sleep-interval $(MAX_SLEEP_INTERVAL) --sleep-requests $(SLEEP_REQUESTS)
_WRITEMETADATA=--write-url-link --write-info-json --write-description --write-info-json --write-thumbnail --write-auto-subs --sub-lang $(SUBTILES_LANG)


.PHONY: list_channel_videourls
list_channel_videourls:
	@echo "# pipe the output to e.g. >> $(BATCH_FILE)"
	@yt-dlp --get-filename --no-warnings --quiet -o "https://www.youtube.com/watch?v=%(id)s" $(addprefix $(YTCHANNELBASEURL),$(YTCHANNELID))

.PHONY: download_videos
download_videos:
	yt-dlp --limit-rate $(LIMIT_RATE) $(_BESTEALTHY) --continue --ignore-errors --no-overwrites --output "%(upload_date)s_%(acodec)s_%(asr)s_%(title)s.%(ext)s" --batch-file $(BATCH_FILE) $(_WRITEMETADATA)

Audacity and Room Reverb

UiPath Robot Process for YLM Automation

My favorite software for loudness analysis, Youlean Loudness Meter, cannot be automated out-of-the-box. But fortunately any type of GUI automation can be done not just with software from test automation but it is a core feature of RPA software, Robotic Process Automation. And with the aim to analyze the YouTube channel of Anders Jensen, RPA developer and Most Valuable Professional in forum of the software vendor “UiPath” – what would be a better fit than UiPath’s CV activities Computer Vision activities.

UiPath is the world’s leading RPA software company, started in 2015 in Bukarest, Romania and has since seen a rockeetship growth. Part of the capabilities of the development tool “UiPath Studio” are the Computer Visicion activities. Even such hard to automate GUIs like the one of Youlean Loudness Meter can be automated by Computer Vision’s AI-enhanced Click, Type into or dropdown select capabilities. All without having to resort to image scraping.

An efficient anylssis of Anders Jensen’s 350+ video was impossible without this UiPath Robot Process!

Key takeaways

Preface

Goals

Preparation

Checklists

How it started

Step 1: Reading the incident report

verfifying the incident report

ticket triage

Analysis of the YouTube channel

YouTube recommendations

Sources of Noise

Evaluation of all 350+ videos

Hidden Treasures in the YouTube API and youtube-dl output

Remedies

Eliminate *noise

Room Acoustics, Reverberation Time and Clap Test

Resources

my room

Audio Settings on Windows

Checklist

Loudness GUI

Checklist for All Things Audio

Checklist

knowledge, facts, data

Audio Signal Levels

The Audio Frequency Spectrum

Loudness

Reading Microphone Datasheets

Polar Patterns

Frequency Responses

Skills, Workflows

GUI Elements for ampli…

Types of Audio Editing

Compressor

Expander and Gate

Equalisation: Changes to the Frequency Spectrum

Electric Signals, Noise, Discretization

Off-The-Shelf Technology

Audio Signal Chain

Audio Cables and Plugs

Youlean Loudness Meter

OBS Multi-Track Recording

Applications

Makefile for youtube-dl

Audacity and Room Reverb

UiPath Robot Process for YLM Automation

DaVinci Resolve Fairlight Audio Page