ffmpeg's documentation: ffmpeg.org/documentation.html
inline, with different-sized video inputs:
ffmpeg -i IMG_3198.MOV -i IMG_3199.MOV -i IMG_3201.MOV -i IMG_3202.MOV -i IMG_3203.MOV -i title.mp4 -filter_complex "[0:0] [0:1] [1:0] [1:1] [2:0] [2:1] [3:0] [3:1] [4:0] [4:1] [5:0] [5:1] concat=n=6:v=1:a=1:unsafe=1 [v] [a]" -map "[v]" -map "[a]" -s 960x540 output.mp4
inline, with similar inputs:
ffmpeg -hide_banner -i "Wow-64 2015-01-19 22-08-49-73.avi" -i "Wow-64 2015-01-19 22-08-49-73.avi" -filter concat=n=2:v=1:a=1 -c:v libx264 -crf 24 -preset slow -b:a 96k turning_in_10_onyx_eggs.mp4
with a file:
ffmpeg -f concat -i filelist.txt output.webm
if all the input files listed in filelist.txt have the same format, you can add -c:v copy -c:a copy just before the output file name to make the process much faster, and (afaik) lossless.
(e.g. "ffmpeg -f concat -i filelist.txt -c:v copy -c:a copy output.mkv". but again, all the files must have exactly the same codec/format.)
fixing "Unsafe filename": ffmpeg -safe 0 -f concat -i filelist.txt ...
; remember to use single quotes in filelist.txt.
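a minimal filelist.txt, for reference (file names made up):
file 'part1.mp4'
file 'part2.mp4'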
concat demuxer documentation
stop encoding after output duration reaches 1 minute: ffmpeg -i longvid.avi -t 00:01:00 "first minute.mp4"
stop encoding after reading 1 minute from input: ffmpeg -t 00:01:00 -i longvid.avi "first minute.mp4"
stop writing output once it is 4 minutes long: ffmpeg -i longvid.avi -t 04:00 out.avi
stop encoding after N frames have been output: ffmpeg -i vid.avi -frames:v N out.mp4
seek past the first ~20.3 seconds of input: ffmpeg -ss 00:00:20.3 -i infile.avi outfile.avi
send first 20.4 seconds processed to /dev/null: ffmpeg -i infile.avi -ss 20.4 outfile.avi
combining both: skip 11 seconds of input and stop reading input at position 3:40 (so 3:29 of the input is read): ffmpeg -ss 11 -to 3:40 -i longvid.mp4 -c:v libx264 -crf 20 out.mp4
and a modification: read and encode those 11 seconds in the beginning but don't include them in the output (this avoids the start of the output being weird because the input was cut into mid-stream): ffmpeg -to 3:40 -i longvid.mp4 -c:v libx264 -crf 20 -ss 11 out.mp4
specifying resolution: ffmpeg -i large.avi -s 1280x720 out_small.webm
scaling while keeping aspect ratio: ffmpeg -i large.avi -vf "scale=w=-1:h=320" out_small.webm
(relevant doc section)
resolution as mnemonics: ffmpeg -i large.avi -s hvga "480 by 320.mp4"
| relevant doc section | vga (640x480), hvga (480x320), qvga (320x240), cif (352x288), qcif (176x144), hd480 (852x480), hd720 (1280x720), hd1080 (1920x1080), qhd (960x540), nhd (640x360), pal (720x576), ntsc (720x480)
cropping: ffmpeg -i in.avi -filter:v "crop=out_w=702:out_h=540:x=129:y=0" out.avi
| relevant doc section
an example with everything: ffmpeg -i cbnxcn.mp4 -filter:v "crop=x=115:y=145:out_w=(in_w-405-115):out_h=(in_h-115-145), scale=w=1280:h=720" -c:v libx264 -crf 24 -preset slow -c:a copy -t 15 -ss 2.3 ekjkbdko.mp4
add padding (in black) to the sides of the video so that the output is 1280x720, and center the video in both directions:
ffmpeg -i oddsize.mov -filter:v "pad=w=1280:h=720:x=-1:y=-1" -c:v libx264 -crf 23 -movflags +faststart youtube_ready.mp4
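a sketch combining the two, for fitting an arbitrary input inside 1280x720 without distortion: force_original_aspect_ratio shrinks the video to fit, and pad fills the rest with black:
ffmpeg -i oddsize.mov -filter:v "scale=w=1280:h=720:force_original_aspect_ratio=decrease, pad=w=1280:h=720:x=-1:y=-1" -c:v libx264 -crf 23 fitted.mp4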
specifying bitrate: ffmpeg -i in.avi -b:v 500k out.avi
| the documentation advises against forcing a bitrate like this; prefer the quality options below
specifying quality, for h.264: ffmpeg -i in.avi -c:v libx264 -crf 23 out.mp4
| crf values of 18-28 are considered "sane" | crf 0 is lossless, 23 default, 51 worst | relevant wiki link
generic quality options: ffmpeg -i in.avi -q:v N out.avi
| ffmpeg -i in.avi -qscale:v N out.avi
| meaning of -q and -qscale is codec-dependent
relevant doc section | an example where the video's timecode is burned in in seconds.frame format:
ffmpeg -i infile.avi -filter:v drawtext=" fix_bounds=1: fontfile='//COMPUTER/Users/oatcookies/Desktop/DejaVuSansMono-Bold.ttf': fontcolor=white: fontsize=24: bordercolor=black: borderw=1: textfile=burn-in.txt: x=1: y=main_h-line_h-1:" outfile.avi
where burn-in.txt has the following contents: %{expr: (floor(n/30)) + (mod(n, 30)/100)}
of course, that specific expression applies to a 30 fps video. the odd fontfile path is because ffmpeg doesn't exactly like windows' drive letters: trying 'C\:\\Windows\\Fonts\\x.ttf' (with varying amounts of (back)slashes) always resulted in an error.
a different burn-in.txt: %{expr_int_format: floor((n/30)/60) : d : 2}:%{eif: mod(floor(n/30), 60) :d:2}.%{eif: mod(n, 30)+1 :d:2}
| this shows the timecode in minutes:seconds.frames format.
the same minutes:seconds idea for a 60 fps video, without the frames part: %{expr_int_format:floor((n/60)/60):d:2}:%{eif:mod(floor(n/60),60):d:2}
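a frame-rate-independent alternative: drawtext's pts text expansion prints each frame's timestamp (as hours:minutes:seconds.milliseconds) with no per-fps arithmetic, so the same filter should work for any video:
ffmpeg -i infile.avi -filter:v "drawtext=text='%{pts\:hms}': fontcolor=white: fontsize=24: bordercolor=black: borderw=1: x=1: y=main_h-line_h-1" outfile.avi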
recording the screen on windows with the gdigrab device:
ffmpeg -f gdigrab -show_region 1 -framerate 30 -video_size 942x1003 -offset_x 8 -offset_y 30 -i desktop out.avi
With h264 compression but not a lot of it:
ffmpeg -f gdigrab -show_region 1 -framerate 25 -video_size 800x600 -offset_x 1 -offset_y 30 -i desktop -c:v libx264 -crf 18 -preset veryfast out.mp4
the same on linux/x11 with x11grab:
ffmpeg -f x11grab -framerate 25 -video_size 218x148 -i :0.0+0,95 -c:v libx264 -crf 21 -preset superfast screencast.mp4
making louder: ffmpeg -hide_banner -i "quiet input.mp4" -af "volume=40dB" -vcodec copy "louder output.mp4"
removing entirely: ffmpeg -i with_sound.mp4 -vcodec copy -an no_sound.mp4
sound quality: ffmpeg -i in.mp4 -b:a 64k out.mp4
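the opposite of -an: keep just the audio without re-encoding (the .m4a extension assumes the input's audio is aac; pick a container that matches the codec):
ffmpeg -i with_sound.mp4 -vn -c:a copy just_sound.m4a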
an example filelist.txt with absolute windows paths:
file 'G:\ff\day1\2015-05-15 21-53-49-96.avi'
file 'G:\ff\day1\2015-05-15 22-03-57-86.avi'
file 'G:\ff\day2\2015-05-16 22-08-42-72.avi'
perl script to generate a file list:
my $prefix = 'G:\\ff\\';
foreach my $subdir ("day1", "day2") {
    my $dirname = "$prefix$subdir";   # $prefix already ends in a backslash
    print("# $dirname\n");
    opendir(my $dirhandle, $dirname) || die "Can't opendir $dirname: $!\n";
    open(my $outfile, '>', "$subdir.txt") || die "Can't open > $subdir.txt: $!\n";
    while (defined(my $entry = readdir $dirhandle)) {
        # the original pattern /*.avi$/ was a glob, not a valid regex
        print($outfile "file '$dirname\\$entry'\n") if $entry =~ /\.avi$/;
    }
    closedir($dirhandle);
    close($outfile);
}
or with sed, wrapping each line of input in a "file" directive:
sed -E "s/(.*)/file '\1'/"
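a full pipeline sketch, assuming the file names contain no newlines or single quotes:
ls *.avi | sed -E "s/(.*)/file '\1'/" > filelist.txt
ffmpeg -f concat -i filelist.txt -c:v copy -c:a copy joined.avi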
The way FFmpeg's documentation recommends this is done is by renaming all the images into something like img001.jpg, img002.jpg, ... and then using "-f image2 -i img%03d.jpg". I didn't want to do this, so I figured out a way to make it work with the concat format.
Here, I wanted the images to be played back at a rate of 10 a second, and I put them into a 30 fps mp4, so it's three frames per image:
ffmpeg -r 10 -f concat -i image-filelist -r 30 -c:v libx264 output.mp4
An important thing was to have the image-filelist text file have duration specifications for each image (concat format documentation). The docs say the duration is in seconds, but "duration 1" combined with the "-r 10" option ended up giving me ten images a second anyway. The image-filelist file looked something like this:
file 'screenshot_0000.jpg'
duration 1
file 'screenshot_0001.jpg'
duration 1
... and so on
Because the durations are all 1, maybe you can get away with not having them at all, I haven't tried.
I piped ls output into sed -E "s/(.*)/file '\1'\nduration 1/" to add the "file" directive and the "duration" directive on the following line.
checking a file's streams and frame rate first: ffprobe infile.mp4
example: input was recorded at 5 fps. output should be the same frames but at 25 fps, making the output video 5x faster.
ffmpeg -i input_5fps.mp4 -r 25 -filter:v "setpts=PTS/5" output_25fps.mp4
example: a web rip had borders and a speedup to evade (content-matching) algorithms.
ffmpeg -i inrip.mp4 -r 30 -ar 44100 -filter:v "crop=out_w=640:out_h=360:x=345:y=235, setpts=PTS/0.9" -filter:a "asetrate=39690" -c:v libx264 -crf 24 "output.mp4"
example: speeding up a video 16-fold but keeping its framerate
ffmpeg -hide_banner -i 1x.mp4 -filter:v "setpts=PTS/16" -r 30 16x.mp4
ffmpeg -hide_banner -i $file -filter_complex "[0:v] setpts=PTS/20, scale=h=360:w=-2, pad=w=640:h=360:x=-1:y=-1, minterpolate=fps=60:mi_mode=blend [v]; [0:a] atempo=2, atempo=2, atempo=2, atempo=2, atempo=1.25 [a]" -map "[v]" -map "[a]" -c:v libx264 -crf 22 -preset veryfast -b:a 128k S/$file
one could also do atempo=20 instead of the 2,2,2,2,1.25 done here;
the doc on atempo says "note that tempo greater than 2 will skip some samples rather than blend them in";
i didn't really hear a difference but i wanted to "blend it in" anyway.
also, i noticed that if this speedup is done in two separate filters, with -filter:v and -filter:a, the encoding will kind of hang on the last frame and make the entire video as long as the input even though it's sped up in its entirety: it'll be the sped-up video and then nothing until the file is as long as the input, really weird. doing video and audio simultaneously in one filtergraph with -filter_complex fixes this.
speedup video (which lacks audio) and add silent audio: ffmpeg -hide_banner -i $file -f lavfi -i anullsrc -filter_complex "[0:v] setpts=PTS/20, scale=h=360:w=-2, pad=w=640:h=360:x=-1:y=-1, minterpolate=fps=60:mi_mode=blend" -shortest -c:v libx264 -crf 22 -preset veryfast -b:a 128k S/$file
the minterpolate filter won't work with high acceleration rates: using "setpts=PTS/20" and then minterpolate works, but "setpts=PTS/60" and then minterpolate will give a floating point error and a crash. just leave out minterpolate and use the -r option to get the right frame rate in the end. frames will be dropped but what can you do, floating point is messy.
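a sketch of that fallback (file names made up; no minterpolate, -r just picks frames):
ffmpeg -hide_banner -i 1x.mp4 -filter:v "setpts=PTS/60" -r 60 -c:v libx264 -crf 22 -preset veryfast 60x.mp4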
muting part of the audio while copying the video: ffmpeg -i copyright_trolled.mp4 -c:v copy -filter:a "volume='ifnot(between(t,20834,20987),1)':eval=frame" part_muted.mp4
The "selective" part is documented as "Timeline editing" in the FFmpeg documentation, but the short version is that many filters (ffmpeg -filters
should list which) support the enable
option. The enable
option takes an expression that may operate on the values t
and n
, which are a timestamp in seconds and a frame number, and if that expression evaluates to anything non-zero then that filter is enabled. I haven't found a full list of all functions but there's at least between(x, a, b)
, geq(a, b)
, and ifnot(expr, val)
. You'll probably want to wrap the expression in (single) quotes, so that the comma in the expression isn't interpreted as a comma that separates filters.
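for instance, a sketch that desaturates the picture only between two timestamps (timestamps and file name made up):
ffplay -i input.mkv -vf "hue=s=0:enable='between(t,5,10)'"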
Here i use the hue filter for saturation and brightness modification, and the gblur filter for a Gaussian blur. By default, the gblur filter has a sigma of 0.5, which is very little blurring and is unsuitable for making text or such unreadable.
ffplay -i input.mkv -vf "gblur=sigma=5:
I have also used the geq ("generic expression", evaluated per pixel) filter for blurring and dimming, but I haven't yet figured out desaturation (the Cb and Cr plane values are a bit difficult). The expression here makes each output pixel the average luminosity of its neighbors in the input, in a plus shape instead of the more typical nine-pixel square, and it was sort of interesting. (Any references to a pixel out of bounds of the video are clipped to an edge, with no error.)
ffplay -i input.mkv -vf "geq=enable='between(t,1250,1260)':lum='(lum(X,Y) + lum(X-1,Y) + lum(X+1,Y) + lum(X,Y-1) + lum(X,Y+1))/5':cb=cb(X\,Y):cr=cr(X\,Y)"
geq documentation
ffplay -i input.mkv -vf "geq=enable='between(t,1250,12560)':cb=cb(X\,Y):cr=cr(X\,Y):lum='min(max(lum(X,Y), max(lum(X-1,Y), max(lum(X-2,Y), max(lum(X+1,Y), lum(X+2, Y)))))/1, 200)'"
geq=enable='between(t,1240,12560)':r='min(min(g(X,Y), min(g(X-1,Y), min(g(X-2,Y), min(g(X+1,Y), g(X+2, Y)))))/1, 160)':g='min(min(g(X,Y), min(g(X-1,Y), g(X+1,Y)))/1, 160)':b='min(min(g(X,Y), min(g(X-1,Y), min(g(X-2,Y), min(g(X+1,Y), g(X+2, Y)))))/1, 160)'
-vf "hue=s=0:b=-1.5:enable='between(t,1255.7,1258.5)', geq=enable='between(t,1255.7,1258.5)':lum='min(min(lum(X,Y), min(lum(X-1,Y), min(lum(X-2,Y), min(lum(X+1,Y), lum(X+2, Y)))))/1, 160)':cr=cr(X\,Y):cb=cb(X\,Y), drawtext=enable='between(t,1255.7,1258.5)':text='redacted lol':x=10:y=140"
ffmpeg -i in.mp3 -c:a copy -metadata TSOT="sort by this string" out.mp3
ffmpeg -hide_banner -i "424745724.mp3" -c:a copy -metadata TITLE="The Title" -metadata ARTIST="Whoever" -metadata DATE=2018 -metadata LYRICS="a windows command line cannot include a newline but a unix one can" -metadata TSOT="Title, The" forarchive.mp3
ffmpeg -hide_banner -i shaky.mp4 -filter:v "deshake" -c:v libx264 -crf 23 -c:a copy lessshaky.mp4
ffmpeg -hide_banner -i shaky.mp4 -filter:v "deshake=edge=original:rx=32:ry=32:blocksize=4" -c:v libx264 -crf 22 -c:a copy -t 15 less_shaky.mp4
ffmpeg -hide_banner -i a.mp4 -i b.mp4 -filter_complex "[0:v]pad=iw*2:ih[int];[int][1:v]overlay=W/2:0[vid]" -map [vid] -c:v libx264 -crf 22 -map 0:a -c:a copy ab.mp4
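if both inputs have the same height, i believe the hstack filter does the same pad-and-overlay dance in one step:
ffmpeg -hide_banner -i a.mp4 -i b.mp4 -filter_complex "[0:v][1:v] hstack [vid]" -map "[vid]" -c:v libx264 -crf 22 -map 0:a -c:a copy ab.mp4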
ffmpeg -i 2020-02-24.mkv -vf palettegen palette.png
you can now edit palette.png if you want to change it, reduce the number of colours and stuff (but keeping it as a 16×16 image)
ffmpeg -i 2020-02-24.mkv -i palette2.png -filter_complex "[0] crop=w=1180:h=720:x=0:y=0,scale=472x288,crop=w=449:h=265:x=10:y=12 [x] ; [x] [1] paletteuse=dither=none:diff_mode=rectangle" -t 3 thing3sec.gif
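the two passes can also be done in one command by splitting the stream, generating the palette from one copy, and applying it to the other (a sketch; the crop/scale chain is omitted, and -t 3 is an input option here so the palette only sees those three seconds):
ffmpeg -t 3 -i 2020-02-24.mkv -filter_complex "[0:v] split [a][b] ; [a] palettegen [p] ; [b][p] paletteuse=dither=none" onepass.gif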
ffmpeg -f lavfi -i anullsrc -f lavfi -i "yuvtestsrc=rate=60:size=1280x720,drawtext=text='lol ┐(´∀\` )┌':fontcolor=white:fontsize=48:bordercolor=black:borderw=3:x=(w-text_w)/2:y=(h-text_h)/2" -c:v libx264 -crf 22 -b:a 256k -t 5 testfoo.mp4
ffplay -f lavfi "yuvtestsrc"
or instead of yuvtestsrc
try testsrc
(which has a built-in seconds counter) or pal75bars
or rgbtestsrc
or smptehdbars
.
ffplay -f lavfi "mandelbrot=size=1024x768:inner=convergence"
; doc section
you can set the size, framerate, audio channels (anullsrc documentation), and audio sample rate as required; likewise for whatever codecs you need. (useful to produce a short separator when concatenating videos; the concat format requires all inputs to be the same codec.)
e.g. ffmpeg -f lavfi -i color=c=black:size=1280x720:rate=30 -f lavfi -i anullsrc -c:v libx264 -c:a aac -t 2 separator.mp4
ffplay -f lavfi -i "color=0x808080:size=640x480,hue=b=sin(PI*t)*1.5"
ffmpeg -i input720p.mkv -f lavfi -i "color=0x696969:size=1280x720,hue=b=sin(PI*t)*2" -f lavfi -i color=0x111111:size=1280x720 -f lavfi -i color=0xDDDDDD:size=1280x720 -lavfi threshold cyclic_threshold.mkv
Doc on the threshold filter, doc on the hue filter. 0x696969, or "DimGray", seems to be a pretty decent threshold value.
ffmpeg -i video.mp4 -vf "subtitles='subs.srt'" (-c:a copy -c:v libx264 etc...) video_sub.mp4
(filter documentation, ffmpeg wiki link)
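if the subtitles are embedded in the container rather than a separate .srt, the same filter should be able to read them via its si (stream index) option:
ffmpeg -i video.mkv -vf "subtitles=video.mkv:si=0" -c:a copy -c:v libx264 video_sub.mp4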
ffplay -i video.mkv -vf "split=3 [v1][v2][v3] ;
[v1] colorchannelmixer=.7:.2:.1:0:.7:.2:.1:0:.7:.2:.1:0[v4] ;
[v2] crop=in_w/3:ih:in_w/3:0, colorchannelmixer=.33:.34:.33:0:.33:.34:.33:0:.33:.34:.33:0 [v5] ;
[v3] crop=in_w/3:ih:2*in_w/3:0, colorchannelmixer=.2:.3:.5:0:.2:.3:.5:0:.2:.3:.5:0 [v6] ;
[v4][v5] overlay=main_w/3:0 [v7] ;
[v7][v6] overlay=2*main_w/3:0,
drawtext=text='70 20 10':x=0:y=10:bordercolor=white:borderw=1,
drawtext=text='33 34 33':x=w/3:y=10:bordercolor=white:borderw=1,
drawtext=text='20 30 50':x=2*w/3:y=10:bordercolor=white:borderw=1"
yt-dlp https://www.youtube.com/watch?v=dQw4w9WgXcQ -f 137+140 -o "%(title)s.%(ext)s" --get-filename
-o "%(upload_date)s %(title)s %(id)s.%(ext)s"
yt-dlp https://www.youtube.com/watch?v=3MqYE2UuN24 --list-subs
yt-dlp https://www.youtube.com/watch?v=3MqYE2UuN24 -f 22 --write-subs --sub-format vtt --sub-langs "en.*,ja,all,-live_chat"
-o "%(upload_date)s_%(title)s_%(id)s.%(ext)s" -f "720p60-0/480p/360p/160p"
-o "%(height)d", --skip-download, --write-auto-subs
ffmpeg -loop 1 -i picture.png -i audio.mp3 -map 0:v -map 1:a -c:v libx264 -crf 24 -b:a 320k -r 30 video.mp4
This'll make the video the resolution of the image, and at 30 frames per second (the -r 30 part). For a different size video, use (for example) -s 1080x1080 after the -map 1:a flag but before -c:v.
Note that this will keep adding soundless frames of the picture to the end of the video until you quit the program (by pressing q); it doesn't stop when the input audio stops. The -shortest output option is supposed to end the output when the shortest input (here, the audio) ends; alternatively, you can first run ffprobe on the audio, get its exact length, then add (for example) -t 03:14.15 right before the output file name to make the video exactly 3 minutes and 14.15 seconds long.
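a sketch with -shortest (plus -tune stillimage, which is meant for mostly-static content like this):
ffmpeg -loop 1 -i picture.png -i audio.mp3 -map 0:v -map 1:a -shortest -c:v libx264 -tune stillimage -crf 24 -b:a 320k -r 30 video.mp4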
Short answer: fps is average frames per second, tbr is a different kind of framerate, tbn is the time base that the time stamps of the video's frames are internally represented in, and tbc (when it existed) was the codec's time base. (tbc was removed in April 2021, in a commit called "remove remnants of codec timebase".) For most purposes, "fps" and "tbr" are the same, and "tbn" and "tbc" are irrelevant.
What does this mean? This StackOverflow answer explains it well; what follows is my own explanation based on it.
(Modern) video codecs don't store the list of frames, then a command of "play these back at (for example) 30 fps". Variable frame rates are annoying but a useful concept – maybe the camera lags, or maybe it's a stream over an unreliable internet connection where frames are received out of order or some of them are dropped. If each frame itself says what time it's supposed to be shown, frames can be buffered out of order and then played in-order, or one frame can be held long enough to skip over missing frames so that the next is shown at the correct time to keep in sync with the separately-transmitted audio. Or maybe you're recording a video game, with exactly one of the game's frames becoming one of the video's frames, but video games (especially 3D ones) rarely run at an exactly constant framerate.
Each frame has a presentation timestamp, PTS, stored with it (this is what the setpts filter modifies, and why it can be used to slow down or speed up video – if you double the PTS, each frame is shown twice as late as it usually would, hence the video's speed is halved). This PTS value is stored as an integer, as floating-point values are notoriously imprecise. It's typically relative to the start of the video, so the first frame has a PTS of zero. The time base, tbn ("tb" for timebase and "n" for "as a number"), indicates the magnitude of these PTS units; for example, if the tbn is a sixtieth of a second, and the PTS of a frame is 181, then that frame ought to be displayed at 3 seconds + 1/60th of a second. (So, in principle, a video could be sped up or slowed down by changing its time base, instead of all of its frames' timestamps, but it appears to be good practice not to mess with the time base.) What FFmpeg/FFprobe shows is the reciprocal of this time base – it's stored as a fraction of a second, for example 0.01666…, but it's then inverted to turn it into 60 tbn.
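to look at the raw numbers, ffprobe can print the stream's time base and the first frames' PTS values (file name made up):
ffprobe -v error -select_streams v:0 -show_entries stream=time_base -of default=noprint_wrappers=1 input.mp4
ffprobe -v error -select_streams v:0 -show_entries frame=pts,pts_time -of csv=p=0 input.mp4 | head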
A tbn of 60 is rather unlikely, though, because it's so low in precision. I've seen tbn values of "1k", "15360", and "90k"; these are better suited, because they have a lot of factors. (1000 doesn't, but it specifies a millisecond, which is probably good enough; 15360 is 512 (a power of two) times 30, so it can neatly specify multiples and quite a few fractions of 30. 90,000 is divisible by 24, 25, and 30, but its reason for existence is that some codecs (M2TS on BluRay) specify it.)
The tbc field has the same idea as tbn, but instead of it being the video file's time base, it's the video codec's. These aren't necessarily the same, as for example the codec might have a default that a file can override. Use of tbc appears to have been phased out.
The tbr field is just the frame rate, but derived from a different source; it is typically but not always the same as the value in the "fps" field.
The code that writes this info in the command-line output is in the FFmpeg code tree's libavformat, in the function dump_stream_format, and has been for years. See it on github.
link to main ffplay documentation.
ffplay -i video.mp4 (-an disables audio) (-vn disables video) (-ss pos: seek to approximately that position) -vf video filter graph -af audio filter graph
Left click: play/pause, also from p and space. Right click: seek to a point according to the horizontal position of where you clicked (i.e., if you click on the right quarter of the screen, you'll seek to around 75% of the file).
s: pause and go to next frame. a: cycle audio channel. v: cycle video channel. t: cycle subtitle channel.
9 or /: decrease volume. 0 or *: increase volume. m: mute.