Video Recording

General workflow

  1. Capture the content. Pause (yourself, not the recording) for a second at section/slide boundaries to make it easier to cut and replace sections/slides.
  2. Separate the audio and video streams into different files.
  3. Clean up the audio with Audacity, and export the result as a WAV. Don't perform any cutting at this point.
  4. Add both streams (using the processed audio stream) to a Shotcut project and perform any cutting/editing that is necessary.
  5. Export the result using a lossless preset.
  6. Check the result. I have seen Shotcut output large files that are entirely black with no audio, which is kind of impressive, and also annoying.
  7. Transcode the result to the desired output format if necessary (for things like Blackboard).
  8. Compress all source and intermediate files that you want to keep into either FLAC (audio only), or H.265 + FLAC + Matroska using maximum lossless compression. These files can be uploaded to Echo 360 if needed - they accept these formats.
  9. Verify that the compressed versions are good, and if so, delete the original captures to save some disk space.

Screen capture

The aim here is to capture at the highest quality possible with minimal CPU usage (so that the capture can keep up without dropping frames, and to leave some headroom for other processes). Some quick testing suggests that H.264 in lossless mode is the best option. Of the lossless codecs tested it had the best combination of file size (way smaller than HuffYUV) and speed (it can mostly keep up with the capture even on my old laptop). It is also one of the best supported codecs for compatibility.

The capture file will be largish (around 20MB per minute), and will require some further encoding once the capture/editing is completed to reduce the file size.

Screen only (no audio)

Lossless H.264 video:

ffmpeg -hide_banner \
-f x11grab -video_size 1920x1080 -framerate 24 -probesize 20M -i :0 \
-c:v libx264rgb -crf 0 -preset ultrafast \
-f mp4 \
capture.mp4

Press q to stop the recording.

Using MP4 as the container seems to be faster, so for high-res screen-only demos, this is the best choice.

It is the -crf 0 option that makes the capture lossless. The -probesize option gives FFmpeg enough initial data to analyse the input stream. The value 20M was determined experimentally - lower values generate a warning.

Identifying audio capture devices

Under Linux, PulseAudio seems to be the best bet for configuring multiple audio devices.

To find a recording device's ID:

pactl list sources
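
The Name field from the pactl output can then be used in place of default for the pulse input. A quick test capture (the device name below is just an illustration - take yours from pactl list sources):

# device name is an example - substitute the Name field reported by pactl list sources
ffmpeg -hide_banner \
-f pulse -ac 1 -i alsa_input.usb-Logitech_H340-00.mono-fallback \
-c:a pcm_s16le \
mic_test.wav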

Screen and audio

We are using Matroska as the container since it is by far the most flexible container. MP4 does not allow for any lossless audio formats.

Lossless H.264 video, and 16bit PCM (WAV) audio:

ffmpeg -hide_banner \
-f x11grab -video_size 1920x1080 -framerate 24 -probesize 20M -i :0 \
-f pulse -ac 1 -i default \
-c:v libx264rgb -crf 0 -preset ultrafast \
-c:a pcm_s16le \
-f matroska \
capture.mkv
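
To quickly confirm that both the video and audio streams actually made it into the capture file, ffprobe will list them:

ffprobe -hide_banner capture.mkv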

Hardware accelerated capture via VAAPI (Linux)

Note that I have pretty much given up on accelerated capture/processing. The captured audio is glitchy, there is no way to capture losslessly, and the speed gain on older hardware is negligible. Basically, not worth it. I get the impression that CPU-only is the preferred mechanism even if you have hardware-acceleration capabilities - the results are just better and more consistent, assuming you have a reasonable CPU and the time to wait.

Requires the libva-intel-driver package for i915 chipsets.

This does improve encoder speed a bit, but has limited features. There is no true lossless mode, and no true CRF mode. However, the following will do a reasonable job, although it will likely produce larger file sizes:

Capture (screen only)

ffmpeg -hide_banner \
-hwaccel vaapi -vaapi_device /dev/dri/renderD128 \
-f x11grab -video_size 1920x1080 -framerate 15 -probesize 20M -i :0 \
-vf 'hwupload,scale_vaapi=format=nv12' -c:v h264_vaapi -qp 1 -compression_level 2 \
-f matroska \
capture.mkv

Capture (screen + audio)

ffmpeg -hide_banner \
-hwaccel vaapi -vaapi_device /dev/dri/renderD128 \
-f x11grab -video_size 1920x1080 -framerate 15 -probesize 20M -i :0 \
-f pulse -ac 1 -i default \
-vf 'hwupload,scale_vaapi=format=nv12' \
-c:v h264_vaapi -qp 1 -compression_level 2 \
-c:a pcm_s16le \
-f matroska \
capture.mkv

Windows

Requires a dshow screen recorder:

https://github.com/rdp/screen-capture-recorder-to-video-windows-free

Capture (screen + audio)

ffmpeg -f dshow -i video="screen-capture-recorder":audio="Microphone (3- Logitech USB Headset H340)" -r 15  -c:v libx264 -crf 0 -preset ultrafast -c:a pcm_s16le -f matroska capture.mkv

To view dshow devices (to find microphone device):

ffmpeg -list_devices true -f dshow -i dummy

Cutting and Editing

Shotcut

Shotcut is the best tool I have found for lossless frame-exact cutting and editing. You can easily add a lossless preset to Shotcut in the export options. A full transcode will be necessary to get frame-exact cuts, but it should be lossless, and fairly fast.

Shotcut can sometimes cause audio glitches around cut points. This is likely related to imperfect seeking inside the video container. A workaround is to separate the audio and video into different streams and add them both to the Shotcut timeline. This allows much more accurate cutting of the audio and does not seem to cause the same glitches. As an added bonus you now have separate audio and video streams, which can make things like replacing slides in the video much easier.
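
Splitting the capture into separate video and audio files is just a pair of stream copies (the same commands appear again in the audio processing section below):

# video only - stream copy, stays lossless
ffmpeg -i capture.mkv -an -c:v copy video_only.mkv

# audio only, as WAV for Audacity
ffmpeg -i capture.mkv -vn audio.wav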

Lossless preset

Shotcut does not include a lossless export preset by default, but it is easy to add. Create an export preset with the following settings:

preset=ultrafast
f=matroska
acodec=pcm_s16le
vcodec=libx264
crf=0
rescale=lanczos

OpenShot

OpenShot is easier to use, but is often very laggy, and currently has no lossless export ability. It is worth keeping an eye on since there is work underway to add lossless exporting.

Cutting with FFmpeg

You can also cut with FFmpeg. Again, a full transcode will be necessary to get frame-exact cuts, but it should be lossless, and fairly fast.

start="00:00:10.5"  # start at 10.5 seconds (drops the first 10.5 seconds)
end="00:48:32.4"    # stop at 48 minutes and 32.4 seconds
ffmpeg -hide_banner \
-i input.mkv \
-ss "${start}" -to "${end}" \
-c:a pcm_s16le \
-c:v libx264 -crf 0 -preset ultrafast \
-f matroska \
cut.mkv

You can also use a start and a duration:

start="00:00:10.5"  # drop the first 10.5 seconds
duration="00:00:30" # keep 30 seconds from the start point
ffmpeg -hide_banner \
-i input.mkv \
-ss "${start}" -t "${duration}" \
-c:a pcm_s16le \
-c:v libx264 -crf 0 -preset ultrafast \
-f matroska \
cut.mkv

Joining videos

You can join videos into a single file by creating a text file that looks like:

file scene1.mkv
file scene2.mkv
file scene3.mkv

You then run FFmpeg using:

ffmpeg -hide_banner -f concat -i scenes.txt -c copy joined.mkv
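
The list file can also be generated from the shell rather than written by hand (the file names are just the examples above):

printf "file '%s'\n" scene1.mkv scene2.mkv scene3.mkv > scenes.txt

Note that -c copy does no re-encoding, so the scene files need to share the same codecs and parameters for the join to work properly.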

Audio processing

If audio needs cleaning up (noise reduction, high pass filter, dynamic range compression, etc.) then export it to a WAV file, fix it using something like Audacity or SoX, and then replace the audio track.

  1. Extract audio to WAV:
    ffmpeg -i capture.mkv audio.wav
  2. Process the audio and export to WAV.
  3. Strip existing audio track from capture:
    ffmpeg -i capture.mkv -an -c:v copy capture_no_audio.mkv
  4. Add processed audio track back to capture:
    ffmpeg -i capture_no_audio.mkv -i processed_audio.wav -c:a copy -c:v copy capture_processed_audio.mkv
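
If you prefer to skip the intermediate no-audio file, steps 3 and 4 can be combined into a single re-mux using -map (a sketch of the same operation, not a separate workflow):

ffmpeg -i capture.mkv -i processed_audio.wav \
-map 0:v -map 1:a -c:v copy -c:a copy \
capture_processed_audio.mkv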

Using the Fraunhofer FDK AAC encoder

The FDK AAC encoder is generally considered to produce higher quality output than the default AAC encoder that is built into FFmpeg. You can get versions of FFmpeg that have the FDK encoder compiled in, but it is much easier to just encode the audio separately:

  1. Extract and strip the audio as shown above.
  2. Encode the audio using:
    fdkaac -b 64k --raw-channels 1 audio.wav
  3. Add the generated m4a file to the stripped video:
    ffmpeg -hide_banner \
    -i stripped.mp4 -i audio.m4a -c copy \
    -f mp4 \
    out.mp4

Encoding for streaming

In general, the best combination of codecs and container for compatibility with most video players (including embedded web-based players, and casting devices) is:

  • Video codec: H.264
  • Audio codec: AAC
  • Container: MP4

Echo 360 will transcode the video into their own preferred format (MP4/H.264/AAC) and bitrates, so don't bother doing any further processing beyond what is necessary. They seem to be pretty flexible with formats - I have uploaded a Matroska/H.265/FLAC video, and it worked fine.

If you are uploading to Blackboard then you should transcode into H.264/AAC/MP4 format and use lossy compression. You should also set a maximum bitrate to avoid spikes in bandwidth that may cause player buffering.

ffmpeg -hide_banner \
-i input.mkv \
-c:a aac -b:a 64k \
-c:v libx264 -crf 18 -preset veryslow -maxrate 500K -bufsize 2M \
-f mp4 \
blackboard.mp4

The bufsize represents the expected client player's buffer (how much of the video is expected to be buffered while playing) and defines how much wiggle room the encoder has when enforcing the maxrate. You would expect at least a couple of seconds of buffer, so bufsize should be at least twice the maxrate. Too small a bufsize can cause noticeable artefacts in complex scenes. Too large a bufsize means the client needs to buffer more - on a bad network that can cause the video to stall while the buffer refills. It is a balance between crf, maxrate, and bufsize: if your crf is too low then the required bitrate will regularly exceed the maxrate, and increasing the bufsize will not help. Aim for a crf that gives an average bitrate just below the maxrate, then choose a bufsize that seems reasonable (2-3 times the maxrate), and increase it if there are noticeable artefacts in complex scenes.
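
To check what average bitrate a given crf actually produced (so you can compare it against the maxrate), ffprobe can report it:

ffprobe -v error -show_entries format=bit_rate -of default=noprint_wrappers=1:nokey=1 blackboard.mp4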

Scaling

If you want to reduce the file size even more then scaling down to 720p might be a good idea:

ffmpeg -hide_banner \
-i input.mkv \
-vf scale=-1:720 \
-c:a aac -b:a 64k \
-c:v libx264 -crf 18 -preset veryslow -maxrate 300K -bufsize 1M \
-f mp4 \
final.mp4
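
If the source is not 16:9, scale=-1:720 can produce an odd width, which libx264 will reject. Using -2 instead forces an even width (the rest of the command is unchanged):

ffmpeg -hide_banner \
-i input.mkv \
-vf scale=-2:720 \
-c:a aac -b:a 64k \
-c:v libx264 -crf 18 -preset veryslow -maxrate 300K -bufsize 1M \
-f mp4 \
final.mp4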

Archival compression

You can shrink the original capture files by using the 'veryslow' preset. It will not impact the quality in any way - the result is still lossless (when combined with crf 0) - but it does a LOT more processing to pull off the smaller file size, so it can take a long time to complete. The audio can also be encoded to heavily compressed (but still lossless) FLAC:

ffmpeg -hide_banner \
-i input.mkv \
-c:a flac -compression_level 12 \
-c:v libx264 -crf 0  -preset veryslow \
-f matroska \
archive.mkv

H.265 offers better compression. For archival purposes, it is the better choice:

ffmpeg -hide_banner \
-i input.mkv \
-c:a flac -compression_level 12 \
-c:v libx265 -x265-params lossless=1 -preset veryslow \
-f matroska \
archived.mkv

None of H.264, H.265, or FLAC has built-in redundancy, so they are not proper archival formats. FFV1 is a video codec that does have redundancy and is intended for long-term archival purposes, but it produces file sizes that are approximately 25 times what can be achieved with lossless H.265. Unfortunately there does not appear to be a well recognised audio codec that contains redundancy. The data is stream-based however, so an occasional bit flip here and there should not be too catastrophic. Lossless H.265/FLAC seems to be the most appropriate format for medium-term archival purposes at this point.
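
For reference, an FFV1 encode looks something like the following (a sketch using commonly suggested options - untested here, and expect much larger files):

ffmpeg -hide_banner \
-i input.mkv \
-c:a flac -compression_level 12 \
-c:v ffv1 -level 3 -slicecrc 1 \
-f matroska \
archive_ffv1.mkv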

Testing for lossless encoding

It would be a good idea to verify that you have a good archival copy before deleting the original version of a file. There are two options:

Hash muxer

The hash muxer produces a hash of a stream, which can be used to compare two videos to ensure they are identical.

# video
ffmpeg -i input.mkv -an -f hash -hash md5 -

# audio
ffmpeg -i input.mkv -vn -f hash -hash md5 -
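
If the whole-stream hashes differ and you want to narrow down where, the framemd5 muxer works the same way but produces per-frame hashes that can be diffed:

# per-frame video hashes
ffmpeg -i input.mkv -an -f framemd5 -

# per-frame audio hashes
ffmpeg -i input.mkv -vn -f framemd5 -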

Using FFplay to display the difference between two videos

ffplay -f lavfi \
"movie=original.mkv[org]; \
 movie=compressed.mkv[enc]; \
 [org][enc]blend=all_mode=difference"

If the video is identical you will see solid green. This is probably more useful for comparing the amount of difference due to lossy encoding, but can still be used to check for lossless encoding.

Metadata

Metadata such as author, title, description, and comment can be added at any point by using the -metadata option with FFmpeg:

ffmpeg ... -metadata author="Boris McNorris"  -metadata title="Lecture 3: Something about some stuff" output.mkv

See the following for supported fields:

https://wiki.multimedia.cx/index.php/FFmpeg_Metadata

Reference material

https://slhck.info/posts/

Troubleshooting

"Too many packets buffered..."

You will sometimes see an error stating:

Too many packets buffered for output stream

This is a long-standing bug in FFmpeg where it picks an insufficient muxing queue length. You can manually increase the queue length using:

-max_muxing_queue_size 1024

1024 seems to be sufficient in most cases. If you still get the error then increase the number.
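
It is an output option, so it goes with the other output options - for example, added to the Blackboard transcode from earlier:

ffmpeg -hide_banner \
-i input.mkv \
-c:a aac -b:a 64k \
-c:v libx264 -crf 18 -preset veryslow -maxrate 500K -bufsize 2M \
-max_muxing_queue_size 1024 \
-f mp4 \
blackboard.mp4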

"Thread message queue blocking..."

On high-end CPUs that have a lot of cores, the thread message queue can become a significant bottleneck at its default size, resulting in poor performance.

You will see a message that states:

Thread message queue blocking; consider raising the thread_queue_size option

Raising the queue size will solve the problem:

-thread_queue_size 1024 \
-threads 16
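
-thread_queue_size is an input option and applies to the -i that follows it, so it is specified per input. For example, based on the screen and audio capture command from earlier:

ffmpeg -hide_banner \
-thread_queue_size 1024 -f x11grab -video_size 1920x1080 -framerate 24 -probesize 20M -i :0 \
-thread_queue_size 1024 -f pulse -ac 1 -i default \
-c:v libx264rgb -crf 0 -preset ultrafast -threads 16 \
-c:a pcm_s16le \
-f matroska \
capture.mkv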