FFmpeg examples

ffmpeg's documentation: ffmpeg.org/documentation.html

gluing video files together

inline, with different-sized video inputs:
ffmpeg -i IMG_3198.MOV -i IMG_3199.MOV -i IMG_3201.MOV -i IMG_3202.MOV -i IMG_3203.MOV -i title.mp4 -filter_complex "[0:0] [0:1] [1:0] [1:1] [2:0] [2:1] [3:0] [3:1] [4:0] [4:1] [5:0] [5:1] concat=n=6:v=1:a=1:unsafe=1 [v] [a]" -map "[v]" -map "[a]" -s 960x540 output.mp4

inline, with similar inputs:
ffmpeg -hide_banner -i "Wow-64 2015-01-19 22-08-49-73.avi" -i "Wow-64 2015-01-19 22-08-49-73.avi" -filter_complex "concat=n=2:v=1:a=1" -c:v libx264 -crf 24 -preset slow -b:a 96k turning_in_10_onyx_eggs.mp4

with a file:
ffmpeg -f concat -i filelist.txt output.webm

fixing "Unsafe filename": ffmpeg -safe 0 -f concat -i filelist.txt ...; remember to use single quotes in filelist.txt. concat demuxer documentation

video duration

stop encoding after output duration reaches 1 minute: ffmpeg -i longvid.avi -t 00:01:00 "first minute.mp4"

stop encoding after reading 1 minute from input: ffmpeg -t 00:01:00 longvid.avi "first minute.mp4"

stop writing output once it is 4 minutes long: ffmpeg -i longvid.avi -t 04:00 out.avi

stop encoding after N frames have been output: ffmpeg -i vid.avi -frames:v N out.mp4

seek past the first ~20.3 seconds of input: ffmpeg -ss 00:00:20.3 -i infile.avi outfile.avi
send first 20.4 seconds processed to /dev/null: ffmpeg -i infile.avi -ss 20.4 outfile.avi

combining both: skip 11 seconds of input and stop reading input at position 3:40 (=input duration is 3:29): ffmpeg -ss 11 -to 3:40 -i longvid.mp4 -c:v libx264 -crf 20 out.mp4
and a modification: read and decode those first 11 seconds but don't include them in the output (this avoids artifacts from the cut not landing cleanly at the start of the input): ffmpeg -to 3:40 -i longvid.mp4 -c:v libx264 -crf 20 -ss 11 out.mp4
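as a sanity check of the "3:29" claim above, the duration arithmetic in plain shell (awk, since ffmpeg isn't needed for it):

```shell
# reading from 0:11 to 3:40 gives (3*60+40) - 11 = 209 seconds of output
awk 'BEGIN{ s = (3*60 + 40) - 11; printf "%d:%02d\n", s/60, s%60 }'
# prints 3:29
```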

video size

specifying resolution: ffmpeg -i large.avi -s 1280x720 out_small.webm

scaling while keeping aspect ratio: ffmpeg -i large.avi -vf "scale=w=-1:h=320" out_small.webm (relevant doc section)

resolution as mnemonics: ffmpeg -i large.avi -s hvga "480 by 320.mp4" | relevant doc section | vga (640x480), hvga (480x320), qvga (320x240), cif (352x288), qcif (176x144), hd480 (852x480), hd720 (1280x720), hd1080 (1920x1080), qhd (960x540), nhd (640x360), pal (720x576), ntsc (720x480)

cropping: ffmpeg -i in.avi -filter:v "crop=out_w=702:out_h=540:x=129:y=0" out.avi | relevant doc section

an example with everything: ffmpeg -i cbnxcn.mp4 -filter:v "crop=x=115:y=145:out_w=(in_w-405-115):out_h=(in_h-115-145), scale=w=1280:h=720" -c:v libx264 -crf 24 -preset slow -c:a copy -t 15 -ss 2.3 ekjkbdko.mp4

add padding (in black) to the sides of the video so that the output is 1280x720, and center the video in both directions: ffmpeg -i oddsize.mov -filter:v "pad=w=1280:h=720:x=-1:y=-1" -c:v libx264 -crf 23 -movflags +faststart youtube_ready.mp4

video quality

specifying bitrate: ffmpeg -i in.avi -b:v 500k out.avi | documentation doesn't like this though

specifying quality, for h.264: ffmpeg -i in.avi -c:v libx264 -crf 23 out.mp4 | crf values of 18-28 are considered "sane" | crf 0 is lossless, 23 default, 51 worst | relevant wiki link

generic quality options: ffmpeg -i in.avi -q:v N out.avi | ffmpeg -i in.avi -qscale:v N out.avi | meaning of -q and -qscale is codec-dependent


burning in timecode

relevant doc section | an example where the video's timecode is burned in, in seconds.frames format:
ffmpeg -i infile.avi -filter:v drawtext=" fix_bounds=1: fontfile='//COMPUTER/Users/oatcookies/Desktop/DejaVuSansMono-Bold.ttf': fontcolor=white: fontsize=24: bordercolor=black: borderw=1: textfile=burn-in.txt: x=1: y=main_h-line_h-1:" outfile.avi
where burn-in.txt has the following contents: %{expr: (floor(n/30)) + (mod(n, 30)/100)}
of course, that specific expression applies to a 30 fps video. the odd fontfile path is because ffmpeg doesn't exactly like windows' drive letters: trying 'C\:\\Windows\\Fonts\\x.ttf' (with varying amounts of (back)slashes) always resulted in an error.

a different burn-in.txt: %{expr_int_format: floor((n/30)/60) : d : 2}:%{eif: mod(floor(n/30), 60) :d:2}.%{eif: mod(n, 30)+1 :d:2} | this shows the timecode in minutes:seconds.frames format.
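as a sanity check of that expression, the same frame-number arithmetic in plain shell (30 fps, with 3723 as a made-up frame number; frames are one-based, as in the drawtext expression):

```shell
# frame n at 30 fps -> minutes:seconds.frames
n=3723; fps=30
printf '%02d:%02d.%02d\n' $(( n / fps / 60 )) $(( n / fps % 60 )) $(( n % fps + 1 ))
# prints 02:04.04
```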

time code for text-burning filters


screen capture

from Windows

ffmpeg -f gdigrab -show_region 1 -framerate 30 -video_size 942x1003 -offset_x 8 -offset_y 30 -i desktop out.avi

With h264 compression but not a lot of it:
ffmpeg -f gdigrab -show_region 1 -framerate 25 -video_size 800x600 -offset_x 1 -offset_y 30 -i desktop -c:v libx264 -crf 18 -preset veryfast out.mp4

from linux

ffmpeg -f x11grab -framerate 25 -video_size 218x148 -i :0.0+0,95 -c:v libx264 -crf 21 -preset superfast screencast.mp4


audio

making louder: ffmpeg -hide_banner -i "quiet input.mp4" -af "volume=40dB" -vcodec copy "louder output.mp4"

removing entirely: ffmpeg -i with_sound.mp4 -vcodec copy -an no_sound.mp4

sound quality: ffmpeg -i in.mp4 -b:a 64k out.mp4

example filelist.txt

file 'G:\ff\day1\2015-05-15 21-53-49-96.avi'
file 'G:\ff\day1\2015-05-15 22-03-57-86.avi'
file 'G:\ff\day2\2015-05-16 22-08-42-72.avi'

perl script to generate a file list:

my $prefix = 'G:\\ff';
foreach my $subdir ("day1", "day2") {
	my $dirname = "$prefix\\$subdir";
	print("# $dirname\n");
	opendir(my $dirhandle, $dirname) || die "Can't opendir $dirname: $!\n";
	open(my $outfile, '>', "$subdir.txt") || die "Can't open > $subdir.txt: $!\n";

	while (my $entry = readdir $dirhandle) {
		if ($entry =~ /\.avi$/) {
			print $outfile "file '$dirname\\$entry'\n";
		}
	}

	close($outfile);
	closedir($dirhandle);
}


the same with sed, wrapping each input line in a file '...' directive: sed -E "s/(.*)/file '\1'/"

viewing file information

ffprobe infile.mp4

framerate acceleration

example: input was recorded at 5 fps. output should be the same frames but at 25 fps, making the output video 5x faster.

ffmpeg -i input_5fps.mp4 -r 25 -filter:v setpts=PTS/5 output_25fps.mp4
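the effect of setpts=PTS/5 on the first few frame timestamps, as plain arithmetic (5 fps in, 25 fps out):

```shell
# 5 fps frames are 0.2 s apart; dividing PTS by 5 makes them 0.04 s apart, i.e. 25 fps
awk 'BEGIN{ for (n = 0; n < 3; n++) printf "frame %d: %.2fs -> %.2fs\n", n, n/5, n/25 }'
# prints:
# frame 0: 0.00s -> 0.00s
# frame 1: 0.20s -> 0.04s
# frame 2: 0.40s -> 0.08s
```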

example: a web rip had borders and a speedup to avoid detection algorithms.

ffmpeg -i inrip.mp4 -r 30 -ar 44100 -filter:v "crop=out_w=640:out_h=360:x=345:y=235, setpts=PTS/0.9" -filter:a "asetrate=39690" -c:v libx264 -crf 24 "output.mp4"
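the asetrate value isn't arbitrary: it's the original 44100 Hz sample rate scaled by the same 0.9 factor that setpts applies to the video, so audio and video stay in sync:

```shell
# new sample rate = original rate × speed factor
awk 'BEGIN{ print 44100 * 0.9 }'
# prints 39690
```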

example: speeding up a video 16-fold but keeping its framerate

ffmpeg -hide_banner -i 1x.mp4 -filter:v "setpts=PTS/16" -r 30 16x.mp4

speed-up by 20× and scale down

ffmpeg -hide_banner -i $file -filter_complex "[0:v] setpts=PTS/20, scale=h=360:w=-2, pad=w=640:h=360:x=-1:y=-1, minterpolate=fps=60:mi_mode=blend [v]; [0:a] atempo=2, atempo=2, atempo=2, atempo=2, atempo=1.25 [a]" -map "[v]" -map "[a]" -c:v libx264 -crf 22 -preset veryfast -b:a 128k S/$file

one could also do atempo=20 instead of the 2,2,2,2,1.25 chain used here; the doc on atempo says "note that tempo greater than 2 will skip some samples rather than blend them in"; i didn't really hear a difference but i wanted to "blend it in" anyway. also, i noticed that if this speedup is done in two separate filters, with -filter:v and -filter:a, the encode kind of hangs on to the last frame and pads the output to the input's full length even though everything has been sped up: you get the sped-up video and then nothing until the file is as long as the input, really weird. doing video and audio simultaneously in one filtergraph with -filter_complex fixes this.
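chained atempo factors multiply, so the chain above is equivalent to a single ×20:

```shell
# 2 × 2 × 2 × 2 × 1.25 = 20, the same overall tempo as atempo=20
awk 'BEGIN{ print 2 * 2 * 2 * 2 * 1.25 }'
# prints 20
```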

speed up a video (which lacks audio) and add silent audio: ffmpeg -hide_banner -i $file -f lavfi -i anullsrc -filter_complex "[0:v] setpts=PTS/20, scale=h=360:w=-2, pad=w=640:h=360:x=-1:y=-1, minterpolate=fps=60:mi_mode=blend" -shortest -c:v libx264 -crf 22 -preset veryfast -b:a 128k S/$file

selective muting

ffmpeg -i copyright_trolled.mp4 -c:v copy -filter:a "volume='ifnot(between(t,20834,20987),1)':eval=frame" part_muted.mp4

mp3 metadata

ffmpeg -i in.mp3 -c:a copy -metadata TSOT="sort by this string" out.mp3

ffmpeg -hide_banner -i "424745724.mp3" -c:a copy -metadata TITLE="The Title" -metadata ARTIST="Whoever" -metadata DATE=2018 -metadata LYRICS="a windows command line cannot include a newline but a unix one can" -metadata TSOT="Title, The" forarchive.mp3


stabilizing shaky video

doc section

ffmpeg -hide_banner -i shaky.mp4 -filter:v "deshake" -c:v libx264 -crf 23 -c:a copy lessshaky.mp4

ffmpeg -hide_banner -i shaky.mp4 -filter:v "deshake=edge=original:rx=32:ry=32:blocksize=4" -c:v libx264 -crf 22 -c:a copy -t 15 less_shaky.mp4

compare videos side-by-side

ffmpeg -hide_banner -i a.mp4 -i b.mp4 -filter_complex "[0:v]pad=iw*2:ih[int];[int][1:v]overlay=W/2:0[vid]" -map [vid] -c:v libx264 -crf 22 -map 0:a -c:a copy ab.mp4

palettize and export as gif

ffmpeg -i 2020-02-24.mkv -vf palettegen palette.png

you can now edit palette.png if you want to change it, reduce the number of colours and stuff (but keep it a 16×16 image); the next command uses the edited copy, saved as palette2.png

ffmpeg -i 2020-02-24.mkv -i palette2.png -filter_complex "[0] crop=w=1180:h=720:x=0:y=0,scale=472x288,crop=w=449:h=265:x=10:y=12 [x] ; [x] [1] paletteuse=dither=none:diff_mode=rectangle" -t 3 thing3sec.gif

generate test screen with text and a silent audio track

ffmpeg -f lavfi -i anullsrc -f lavfi -i "yuvtestsrc=rate=60:size=1280x720,drawtext=text='lol ┐(´∀\` )┌':fontcolor=white:fontsize=48:bordercolor=black:borderw=3:x=(w-text_w)/2:y=(h-text_h)/2" -c:v libx264 -crf 22 -b:a 256k -t 5 testfoo.mp4

[test screen]

generate just a test screen, play it back immediately

ffplay -f lavfi "yuvtestsrc"
or instead of yuvtestsrc try testsrc (which has a built-in seconds counter) or pal75bars or rgbtestsrc or smptehdbars.

generate a mandelbrot set, zooming in

ffplay -f lavfi "mandelbrot=size=1024x768:inner=convergence" ; doc section

generate a video stream of oscillating greyness

ffplay -f lavfi -i "color=0x808080:size=640x480,hue=b=sin(PI*t)*1.5"

threshold another video, cycling the threshold value

ffmpeg -i input720p.mkv -f lavfi -i "color=0x696969:size=1280x720,hue=b=sin(PI*t)*2" -f lavfi -i color=0x111111:size=1280x720 -f lavfi -i color=0xDDDDDD:size=1280x720 -lavfi threshold cyclic_threshold.mkv

Doc on the threshold filter, doc on the hue filter. 0x696969, or "DimGray", seems to be a pretty decent threshold value.

burn subtitles

ffmpeg -i video.mp4 -vf "subtitles='subs.srt'" (-c:a copy -c:v libx264 etc...) video_sub.mp4 (filter documentation, ffmpeg wiki link)

play around with mixing color channels into a greyscale video

ffplay -i video.mkv -vf "split=3 [v1][v2][v3] ;
[v1] colorchannelmixer=.7:.2:.1:0:.7:.2:.1:0:.7:.2:.1:0[v4] ;
[v2] crop=in_w/3:ih:in_w/3:0, colorchannelmixer=.33:.34:.33:0:.33:.34:.33:0:.33:.34:.33:0 [v5] ;
[v3] crop=in_w/3:ih:2*in_w/3:0, colorchannelmixer=.2:.3:.5:0:.2:.3:.5:0:.2:.3:.5:0 [v6] ;
[v4][v5] overlay=main_w/3:0 [v7] ;
[v7][v6] overlay=2*main_w/3:0,
drawtext=text='70 20 10':x=0:y=10:bordercolor=white:borderw=1,
drawtext=text='33 34 33':x=w/3:y=10:bordercolor=white:borderw=1,
drawtext=text='20 30 50':x=2*w/3:y=10:bordercolor=white:borderw=1"


downloading with youtube-dl

youtube-dl https://www.youtube.com/watch?v=dQw4w9WgXcQ -f 137+140 -o "%(title)s.%(ext)s" --get-filename

-o "%(upload_date)s %(title)s %(id)s.%(ext)s"

youtube-dl https://www.youtube.com/watch?v=3MqYE2UuN24 --list-subs

youtube-dl https://www.youtube.com/watch?v=3MqYE2UuN24 -f 22 --write-sub --sub-format vtt --sub-lang en

-o "%(upload_date)s_%(title)s_%(id)s.%(ext)s" -f "720p60-0/480p/360p/160p"

turn an image file and an audio file into a video file

ffmpeg -loop 1 -i picture.png -i audio.mp3 -map 0:v -map 1:a -c:v libx264 -crf 24 -b:a 320k -r 30 video.mp4

This'll make the video the resolution of the image, and at 30 frames per second (the -r 30 part). For a different size video, use (for example) -s 1080x1080 after the -map 1:a flag but before -c:v.

Note that this will keep adding soundless frames of the picture to the end of the video until you quit the program (by pressing q); it doesn't stop when the input audio stops. Adding the -shortest output option should make the encode end with the shortest stream, i.e. the audio. Alternatively, first run ffprobe on the audio, get its exact length, then add (for example) -t 03:14.15 right before the output file name to make the video exactly 3 minutes and 14.15 seconds long.
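a sketch of the last step of that ffprobe workaround, assuming ffprobe reported a duration of 194.15 seconds (a made-up value): converting seconds to an HH:MM:SS.cc timestamp for -t:

```shell
# seconds (e.g. from ffprobe -show_entries format=duration) -> HH:MM:SS.cc
secs=194.15
awk -v s="$secs" 'BEGIN{ printf "%02d:%02d:%05.2f\n", s/3600, (s/60)%60, s%60 }'
# prints 00:03:14.15
```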

meaning of "fps/tbr/tbn/tbc" in ffmpeg's/ffprobe's output

Short answer: fps is average frames per second, tbr is a different kind of framerate, tbn is the time base that the time stamps of the video's frames are internally represented in, and tbc (when it existed) is the codec's time base. (tbc was removed in April 2021, in a commit called "remove remnants of codec timebase".) For most purposes, "fps" and "tbr" are the same, and "tbn" and "tbc" are irrelevant.

What does this mean? This StackOverflow answer explains it well; what follows is my explanation based on it.

(Modern) video codecs don't store a list of frames plus an instruction to "play these back at (for example) 30 fps". Variable frame rates are annoying but useful – maybe the camera lags, or maybe it's a stream over an unreliable internet connection where frames arrive out of order or get dropped. If each frame itself says what time it's supposed to be shown, frames can be buffered out of order and then played in order, or one frame can be held long enough to skip over missing frames so that the next one is shown at the correct time, keeping in sync with the separately-transmitted audio. Or maybe you're recording a video game, with exactly one game frame becoming one video frame, but video games (especially 3D ones) rarely run at an exactly constant framerate.

Each frame has a presentation timestamp, PTS, stored with it (this is what the setpts filter manipulates, and why it can be used to slow down or speed up video – if you double the PTS, each frame is shown twice as late as it usually would be, hence the video's speed is halved). The PTS is stored as an integer, as floating-point values are notoriously imprecise. It's typically relative to the start of the video, so the first frame has a PTS of zero.

The time base, tbn ("tb" for time base and "n" for "as a number"), indicates the magnitude of these PTS units; for example, if the tbn is a sixtieth of a second and the PTS of a frame is 181, then that frame ought to be displayed at 3 seconds + 1/60th of a second. (So, in principle, a video could be sped up or slowed down by changing its time base instead of all of its frames' timestamps, but it appears to be good practice not to mess with tbn.) What FFmpeg/FFprobe shows is the reciprocal of this time base – it's stored as a fraction of a second, for example 0.01666…, but it's then inverted to show 60 tbn. A tbn of 60 is rather unlikely, though, because it's so low in precision. I've seen tbn values of "1k", "15360", and "90k"; these are better suited because they have a lot of factors. (1000 doesn't, but it specifies a millisecond, which is probably good enough; 15360 is 512 (a power of two) times 30, so it can neatly specify multiples and quite a few fractions of 30. 90,000 is divisible by 24, 25, and 30, but it exists because some codecs (M2TS on Blu-ray) specify it.)
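the 181-PTS example as arithmetic: presentation time = PTS × time base (equivalently, PTS divided by the tbn that FFmpeg shows):

```shell
# PTS 181 with tbn 60 (time base 1/60 s) -> seconds
awk 'BEGIN{ printf "%.6f\n", 181 / 60 }'
# prints 3.016667
```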

The tbc field has the same idea as tbn, but instead of it being the video file's time base, it's the video codec's. These aren't necessarily the same, as for example the codec might have a default that a file can override. Use of tbc appears to have been phased out.

The tbr field is just the frame rate, but derived from a different source; it is typically but not always the same as the value in the "fps" field.

The code that writes this info in the command-line output is in the FFmpeg code tree's libavformat/dump.c, in the function dump_stream_format, and has been for years. See it on github.
