4 years ago mkv: Update the seek test to match 5d3953a5dc
Luca Barbato [Wed, 22 Feb 2017 08:55:45 +0000 (09:55 +0100)]
mkv: Update the seek test to match 5d3953a5dc

4 years ago fate: Update fate-lavf-mkv after commit 5d3953a5dc
John Stebbins [Tue, 21 Feb 2017 23:47:20 +0000 (16:47 -0700)]
fate: Update fate-lavf-mkv after commit 5d3953a5dc

4 years ago fate: Add webp alpha test
Mark Thompson [Fri, 17 Feb 2017 23:13:14 +0000 (23:13 +0000)]
fate: Add webp alpha test

4 years ago matroskaenc: factor ts_offset into block timecode computation
John Stebbins [Wed, 15 Feb 2017 22:22:40 +0000 (15:22 -0700)]
matroskaenc: factor ts_offset into block timecode computation

ts_offset was added to the cluster timecode, but then effectively
subtracted back out of the block timecode.

When setting initial_padding for an audio stream, the timestamps are
written incorrectly to the mkv file.  cluster timecode gets written
as pts0 + ts_offset which is correct, but then block timecode gets
written as pts - cluster timecode which expanded is
pts - (pts0 + ts_offset).  Adding cluster and block tc back together:
cluster + block = (pts0 + ts_offset) + (pts - (pts0 + ts_offset)) = pts
But the result should be pts + ts_offset since demux will subtract the
CodecDelay element from pts and set initial_padding to CodecDelay.
This patch gives the correct result.
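
The arithmetic above can be checked with a small sketch. The helper
names below are hypothetical and are not taken from matroskaenc; only
the formulas follow the commit message, assuming a non-negative
ts_offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; only the arithmetic follows the commit message. */

/* Cluster timecode: first pts of the cluster plus the offset. */
static int64_t cluster_timecode(int64_t pts0, int64_t ts_offset)
{
    assert(ts_offset >= 0);
    return pts0 + ts_offset;
}

/* Old behaviour: the offset added to the cluster cancels out again. */
static int64_t block_timecode_old(int64_t pts, int64_t cluster_tc)
{
    return pts - cluster_tc;
}

/* Fixed behaviour: ts_offset is factored into the block timecode too. */
static int64_t block_timecode_new(int64_t pts, int64_t ts_offset,
                                  int64_t cluster_tc)
{
    return pts + ts_offset - cluster_tc;
}
```

With pts0 = 0, ts_offset = 7 and pts = 40, the old formula gives
cluster + block = 40, while the fixed one gives 47, so subtracting
CodecDelay on demux yields 40 again.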

4 years ago build: Move cli tool sources to a separate subdirectory
Diego Biurrun [Wed, 4 Jan 2017 14:09:29 +0000 (15:09 +0100)]
build: Move cli tool sources to a separate subdirectory

This unclutters the top-level directory and groups related files together.

4 years ago build: Separate logic for building examples from that for building avtools
Diego Biurrun [Tue, 14 Feb 2017 12:15:25 +0000 (13:15 +0100)]
build: Separate logic for building examples from that for building avtools

4 years ago build: Split logic for building examples off into a separate Makefile
Diego Biurrun [Wed, 15 Feb 2017 12:31:52 +0000 (13:31 +0100)]
build: Split logic for building examples off into a separate Makefile

4 years ago build: Avoid duplication in examples lists
Diego Biurrun [Tue, 14 Feb 2017 11:57:13 +0000 (12:57 +0100)]
build: Avoid duplication in examples lists

4 years ago build: Drop leftover reference to old EXAMPLES logic
Diego Biurrun [Mon, 6 Feb 2017 19:07:02 +0000 (20:07 +0100)]
build: Drop leftover reference to old EXAMPLES logic

4 years ago configure: Restructure the way check_pkg_config() operates
Diego Biurrun [Sat, 11 Feb 2017 12:09:27 +0000 (13:09 +0100)]
configure: Restructure the way check_pkg_config() operates

Have check_pkg_config() enable variables and set cflags and extralibs
instead of relegating that task to require_pkg_config. This simplifies
require_pkg_config(), is consistent with what other helper functions
like check_lib() do and allows getting rid of some manual variable
setting in places where check_pkg_config() is used.

4 years ago configure: Explicitly spell out first require_pkg_config() parameter
Diego Biurrun [Thu, 16 Feb 2017 16:37:25 +0000 (17:37 +0100)]
configure: Explicitly spell out first require_pkg_config() parameter

This is less confusing than encountering "" in the argument list.

4 years ago nvenc: Fix nvec vs. nvenc typo
Diego Biurrun [Fri, 17 Feb 2017 11:40:40 +0000 (12:40 +0100)]
nvenc: Fix nvec vs. nvenc typo

4 years ago dv: Don't return EIO upon EOF
John Stebbins [Wed, 11 Jan 2017 19:17:06 +0000 (12:17 -0700)]
dv: Don't return EIO upon EOF

4 years ago webp: Fix alpha decoding
Mark Thompson [Fri, 17 Feb 2017 23:14:19 +0000 (23:14 +0000)]
webp: Fix alpha decoding

This was broken by 4e528206bc4d968706401206cf54471739250ec7 - the webp
decoder was assuming that it could set the output pixfmt of the vp8
decoder directly, but after that change it no longer could because
ff_get_format() was used instead.  This adds an internal get_format()
callback to the webp use of the vp8 decoder to override the pixfmt.

4 years ago vf_deinterlace_vaapi: Create filter buffer after context
Mark Thompson [Thu, 9 Feb 2017 19:26:11 +0000 (19:26 +0000)]
vf_deinterlace_vaapi: Create filter buffer after context

The Intel proprietary VAAPI driver enforces the restriction that a
buffer must be created inside an existing context, so just ensure
this is always true.

4 years ago vaapi_encode: Discard output buffer if picture submission fails
Mark Thompson [Thu, 16 Feb 2017 00:02:29 +0000 (00:02 +0000)]
vaapi_encode: Discard output buffer if picture submission fails

Previously this was leaking, though it actually hit an assert making
sure that the buffer had already been cleared when freeing the picture.

4 years ago vf_fade: Make sure to not miss the last lines of a frame
Martin Storsjö [Thu, 16 Feb 2017 10:23:20 +0000 (12:23 +0200)]
vf_fade: Make sure to not miss the last lines of a frame

When slice_h is rounded up due to chroma subsampling, there's
a risk that jobnr * slice_h exceeds frame->height.

Prior to a638e9184d63, this wasn't an issue for the last slice
of a frame, since slice_end was set to frame->height for the last
slice.

a638e9184d63 tried to fix the case where other slices than the
last one would exceed frame->height (which can happen where the
number of slices/threads is very large compared to the frame
height).
However, the fix in a638e9184d63 instead broke other cases,
where slice_h * nb_threads < frame->height. Therefore, make
sure the last slice always ends at frame->height.
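
The slice bounds described above can be sketched as follows. This is a
minimal illustration with hypothetical helper names, not the actual
vf_fade code; it assumes the slice height is the per-job share of the
frame height rounded up to a multiple of 1 << vsub.

```c
#include <assert.h>

/* Hypothetical helpers; names and signatures are illustrative only. */

/* Slice height: frame height divided over the jobs, rounded up to a
 * multiple of the chroma subsampling step (1 << vsub). */
static int slice_height(int height, int nb_jobs, int vsub)
{
    int align = 1 << vsub;
    assert(height > 0 && nb_jobs > 0 && vsub >= 0);
    return (height / nb_jobs + align - 1) & ~(align - 1);
}

/* First row of job jobnr, clamped so it never passes the frame. */
static int slice_start(int height, int nb_jobs, int vsub, int jobnr)
{
    int start = jobnr * slice_height(height, nb_jobs, vsub);
    return start < height ? start : height;
}

/* One past the last row of job jobnr. The last job always ends at the
 * frame height; other jobs are clamped to it. */
static int slice_end(int height, int nb_jobs, int vsub, int jobnr)
{
    int end = (jobnr + 1) * slice_height(height, nb_jobs, vsub);
    if (jobnr == nb_jobs - 1 || end > height)
        end = height;
    return end;
}
```

For a frame of height 10 split over 4 jobs with vsub = 1, slice_h is 2,
so slice_h * nb_jobs = 8 < 10; the last job then covers rows 6 to 10
instead of stopping at row 8. For height 4 and 3 jobs, the last job
gets an empty slice rather than one past the frame.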

CC: libav-stable@libav.org
Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago configure: Handle SDL version check through pkg-config
Diego Biurrun [Sat, 11 Feb 2017 15:51:25 +0000 (16:51 +0100)]
configure: Handle SDL version check through pkg-config

4 years ago aarch64: Add parentheses around the offset parameter in movrel
Martin Storsjö [Thu, 16 Feb 2017 07:18:25 +0000 (09:18 +0200)]
aarch64: Add parentheses around the offset parameter in movrel

This fixes building with clang for linux with PIC enabled.

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago avconv: Move rescale to stream timebase before monotonisation
Mark Thompson [Sun, 12 Feb 2017 23:47:58 +0000 (23:47 +0000)]
avconv: Move rescale to stream timebase before monotonisation

If the stream timebase is coarser than the muxing timebase then the
monotonisation process may fail because adding one to the timestamp
need not actually produce a different timestamp after the rescale.
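
A small worked example of the problem, under stated assumptions: the
rescale() helper below is a simplified stand-in (round to nearest,
non-negative values only) for what av_rescale_q() does, and
monotonise() is a hypothetical name for the "add one if not
increasing" step.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for a timebase rescale: converts ts from the
 * timebase src_num/src_den to dst_num/dst_den, rounding to nearest.
 * Only meant for small non-negative values. */
static int64_t rescale(int64_t ts, int src_num, int src_den,
                       int dst_num, int dst_den)
{
    int64_t num, den;
    assert(ts >= 0 && src_num > 0 && src_den > 0 &&
           dst_num > 0 && dst_den > 0);
    num = ts * src_num * dst_den;
    den = (int64_t)src_den * dst_num;
    return (num + den / 2) / den;
}

/* Hypothetical monotonisation step: force ts to be greater than last. */
static int64_t monotonise(int64_t ts, int64_t last)
{
    return ts <= last ? last + 1 : ts;
}
```

Take a muxing timebase of 1/1000 and a stream timebase of 1/25, with
two packets that both carry timestamp 40. Monotonising first turns the
second one into 41, but both 40 and 41 rescale to 1, so the output is
still not increasing. Rescaling first gives 1 and 1, and monotonising
afterwards gives 1 and 2.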

4 years ago libopenh264dec: Let the framework use the h264_mp4toannexb bitstream filter
Martin Storsjö [Wed, 15 Feb 2017 09:06:17 +0000 (11:06 +0200)]
libopenh264dec: Let the framework use the h264_mp4toannexb bitstream filter

This avoids a lot of boilerplate code within the decoder wrapper itself.

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago asfdec: Account for different Format Data sizes
Alexandra Hájková [Wed, 8 Feb 2017 11:51:37 +0000 (12:51 +0100)]
asfdec: Account for different Format Data sizes

Some muxers may use the BMP_HEADER Format Data size instead
of the ASF-specific one.

Bug-Id: 1020
CC: libav-stable@libav.org
Signed-off-by: Diego Biurrun <diego@biurrun.de>

4 years ago configure: Check for xcb as well as xcb-shape before enabling libxcb
Diego Biurrun [Sat, 11 Feb 2017 10:47:34 +0000 (11:47 +0100)]
configure: Check for xcb as well as xcb-shape before enabling libxcb

Newer versions of libxcb have xcb-foo pkg-config files that do not declare
their xcb dependency, so the required linker flags are not generated.

4 years ago mov: Do not try to parse multiple stsd for the same track
Luca Barbato [Sat, 11 Feb 2017 21:44:08 +0000 (21:44 +0000)]
mov: Do not try to parse multiple stsd for the same track

Bug-Id: 1017
CC: libav-stable@libav.org
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>

4 years ago hwcontext_vaapi: Try to support the VDPAU wrapper
Mark Thompson [Mon, 30 Jan 2017 19:11:28 +0000 (19:11 +0000)]
hwcontext_vaapi: Try to support the VDPAU wrapper

The driver is somewhat bitrotten (not updated for years) but is still
usable for decoding with this change.  To support it, this adds a new
driver quirk to indicate no support at all for surface attributes.

Based on a patch by wm4 <nfxjfg@googlemail.com>.

4 years ago vaapi: Implement device-only setup
Mark Thompson [Sat, 11 Feb 2017 15:13:12 +0000 (15:13 +0000)]
vaapi: Implement device-only setup

In this case, the user only supplies a device and the frame context
is allocated internally by lavc.

4 years ago lavc: Add device context field to AVCodecContext
Mark Thompson [Sat, 11 Feb 2017 15:13:04 +0000 (15:13 +0000)]
lavc: Add device context field to AVCodecContext

For use by codec implementations which can allocate frames internally.

4 years ago aarch64: vp9lpf: Fix broken indentation/vertical alignment
Martin Storsjö [Wed, 11 Jan 2017 09:58:02 +0000 (11:58 +0200)]
aarch64: vp9lpf: Fix broken indentation/vertical alignment

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago aarch64: vp9lpf: Interleave the start of flat8in into the calculation above
Martin Storsjö [Tue, 10 Jan 2017 20:08:50 +0000 (22:08 +0200)]
aarch64: vp9lpf: Interleave the start of flat8in into the calculation above

This adds lots of extra .ifs, but speeds it up by a couple cycles,
by avoiding stalls.

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago arm: vp9lpf: Interleave the start of flat8in into the calculation above
Martin Storsjö [Tue, 10 Jan 2017 14:49:13 +0000 (16:49 +0200)]
arm: vp9lpf: Interleave the start of flat8in into the calculation above

This adds lots of extra .ifs, but speeds it up by a couple cycles,
by avoiding stalls.

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago dv: Convert to the new bitstream reader
Luca Barbato [Mon, 11 Apr 2016 17:18:50 +0000 (19:18 +0200)]
dv: Convert to the new bitstream reader

4 years ago aac: Validate the sbr sample rate before using the value
Luca Barbato [Sat, 11 Feb 2017 14:40:20 +0000 (15:40 +0100)]
aac: Validate the sbr sample rate before using the value

Avoid a floating point exception.

Bug-Id: 1027
CC: libav-stable@libav.org

4 years ago configure: Move up the avbuild directory creation
Luca Barbato [Fri, 10 Feb 2017 19:31:34 +0000 (19:31 +0000)]
configure: Move up the avbuild directory creation

The early check for inconsistent in-source vs out-of-source build
cannot generate a config.log otherwise.

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>

4 years ago hwcontext_dxva2: support D3D9Ex
wm4 [Fri, 10 Feb 2017 11:17:24 +0000 (12:17 +0100)]
hwcontext_dxva2: support D3D9Ex

D3D9Ex uses different driver paths. This helps with "headless"
configurations when no user logs in. Plain D3D9 device creation will
fail if no user is logged in, while it works with D3D9Ex.

Signed-off-by: Anton Khirnov <anton@khirnov.net>

4 years ago AVFrame: add an opaque_ref field
wm4 [Thu, 2 Feb 2017 10:27:54 +0000 (11:27 +0100)]
AVFrame: add an opaque_ref field

This is an extended version of the AVFrame.opaque field, which can be
used to attach arbitrary user information to an AVFrame.

The usefulness of the opaque field is rather limited, because it can
store only up to 32 bits of information (or 64 bits on 64-bit systems).
It's not possible to set this field to a memory allocation, because
there is no way to deallocate it correctly.

The opaque_ref field circumvents this by letting the user set an
AVBuffer, which makes the user data refcounted.

Signed-off-by: Anton Khirnov <anton@khirnov.net>

4 years ago frame: allow align=0 (meaning automatic) for av_frame_get_buffer()
Anton Khirnov [Wed, 8 Feb 2017 08:46:04 +0000 (09:46 +0100)]
frame: allow align=0 (meaning automatic) for av_frame_get_buffer()

This will save every caller from hardcoding some specific alignment,
which may break in the future with new instruction sets.

4 years ago lavc: use av_cpu_max_align() instead of hardcoding alignment requirements
Anton Khirnov [Wed, 8 Feb 2017 08:34:58 +0000 (09:34 +0100)]
lavc: use av_cpu_max_align() instead of hardcoding alignment requirements

4 years ago cpu: add a function for querying maximum required data alignment
Anton Khirnov [Wed, 8 Feb 2017 08:32:17 +0000 (09:32 +0100)]
cpu: add a function for querying maximum required data alignment

4 years ago scale_npp: explicitly set the output frames context for passthrough mode
Anton Khirnov [Wed, 1 Feb 2017 09:38:42 +0000 (10:38 +0100)]
scale_npp: explicitly set the output frames context for passthrough mode

This is no longer done automatically for filters marked as

4 years ago Use the new AVIOContext destructor.
Anton Khirnov [Fri, 13 Jan 2017 11:04:16 +0000 (12:04 +0100)]
Use the new AVIOContext destructor.

4 years ago avio: add a destructor for AVIOContext
Anton Khirnov [Fri, 13 Jan 2017 10:53:51 +0000 (11:53 +0100)]
avio: add a destructor for AVIOContext

Before this commit, AVIOContext had to be freed with a plain av_free(),
which prevents us from adding any deeper structure to it.

4 years ago arm: vp9lpf: Use orrs instead of orr+cmp
Martin Storsjö [Fri, 13 Jan 2017 21:42:28 +0000 (23:42 +0200)]
arm: vp9lpf: Use orrs instead of orr+cmp

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago arm/aarch64: vp9lpf: Calculate !hev directly
Martin Storsjö [Thu, 12 Jan 2017 14:52:33 +0000 (16:52 +0200)]
arm/aarch64: vp9lpf: Calculate !hev directly

Previously we first calculated hev, and then negated it.

Since we were able to schedule the negation in the middle
of another calculation, we don't see a gain in every case.

Before:                     Cortex A7      A8      A9     A53  A53/AArch64
vp9_loop_filter_v_4_8_neon:     147.0   129.0   115.8    89.0         88.7
vp9_loop_filter_v_8_8_neon:     242.0   198.5   174.7   140.0        136.7
vp9_loop_filter_v_16_8_neon:    500.0   419.5   382.7   293.0        275.7
vp9_loop_filter_v_16_16_neon:   971.2   825.5   731.5   579.0        453.0
After:
vp9_loop_filter_v_4_8_neon:     143.0   127.7   114.8    88.0         87.7
vp9_loop_filter_v_8_8_neon:     241.0   197.2   173.7   140.0        136.7
vp9_loop_filter_v_16_8_neon:    497.0   419.5   379.7   293.0        275.7
vp9_loop_filter_v_16_16_neon:   965.2   818.7   731.4   579.0        452.0

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago aarch64: vp9itxfm: Optimize 16x16 and 32x32 idct dc by unrolling
Martin Storsjö [Wed, 4 Jan 2017 10:57:56 +0000 (12:57 +0200)]
aarch64: vp9itxfm: Optimize 16x16 and 32x32 idct dc by unrolling

This work is sponsored by, and copyright, Google.

Before:                           Cortex A53
vp9_inv_dct_dct_16x16_sub1_add_neon:   235.3
vp9_inv_dct_dct_32x32_sub1_add_neon:   555.1
After:
vp9_inv_dct_dct_16x16_sub1_add_neon:   180.2
vp9_inv_dct_dct_32x32_sub1_add_neon:   475.3

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago arm: vp9itxfm: Optimize 16x16 and 32x32 idct dc by unrolling
Martin Storsjö [Wed, 4 Jan 2017 11:08:51 +0000 (13:08 +0200)]
arm: vp9itxfm: Optimize 16x16 and 32x32 idct dc by unrolling

This work is sponsored by, and copyright, Google.

Before:                            Cortex A7      A8      A9     A53
vp9_inv_dct_dct_16x16_sub1_add_neon:   273.0   189.5   211.7   235.8
vp9_inv_dct_dct_32x32_sub1_add_neon:   752.0   459.2   862.2   553.9
After:
vp9_inv_dct_dct_16x16_sub1_add_neon:   226.5   145.0   225.1   171.8
vp9_inv_dct_dct_32x32_sub1_add_neon:   721.2   415.7   727.6   475.0

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago aarch64: vp9mc: Calculate less unused data in the 4 pixel wide horizontal filter
Martin Storsjö [Sat, 17 Dec 2016 11:14:38 +0000 (13:14 +0200)]
aarch64: vp9mc: Calculate less unused data in the 4 pixel wide horizontal filter

No measured speedup on a Cortex A53, but other cores might benefit.

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago arm: vp9mc: Calculate less unused data in the 4 pixel wide horizontal filter
Martin Storsjö [Sat, 17 Dec 2016 11:09:50 +0000 (13:09 +0200)]
arm: vp9mc: Calculate less unused data in the 4 pixel wide horizontal filter

Before:                    Cortex A7      A8     A9     A53
vp9_put_8tap_smooth_4h_neon:   378.1   273.2  340.7   229.5
After:
vp9_put_8tap_smooth_4h_neon:   352.1   222.2  290.5   229.5

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago aarch64: vp9mc: Simplify the extmla macro parameters
Martin Storsjö [Fri, 16 Dec 2016 22:55:41 +0000 (00:55 +0200)]
aarch64: vp9mc: Simplify the extmla macro parameters

Fold the field lengths into the macro.

This makes the macro invocations much more readable, since the
lines are shorter.

This also makes it easier to use only half the registers within
the macro.

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago mov: Rework stsc index validation
Vittorio Giovara [Fri, 3 Feb 2017 12:05:27 +0000 (13:05 +0100)]
mov: Rework stsc index validation

To avoid a potential integer overflow, change the comparison
and make sure to use the same unsigned type for both elements.
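
As a rough illustration of this kind of fix (a hypothetical check, not
the actual mov.c code), a comparison that adds one to the index can
wrap around, while comparing against count - 1 with both operands
unsigned cannot once count is known to be non-zero:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical check, not the actual mov.c code: is there an entry
 * after index in a table of count entries? */
static int has_next_entry(unsigned int index, unsigned int count)
{
    /* "index + 1 < count" would wrap to 0 for index == UINT_MAX and
     * wrongly succeed; comparing against count - 1 cannot wrap once
     * count is known to be non-zero. */
    return count > 0 && index < count - 1;
}
```

Both operands share the same unsigned type, so no implicit
signed/unsigned conversion is involved in the comparison.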

4 years ago imgutils: Document av_image_get_buffer_size()
Vittorio Giovara [Tue, 7 Feb 2017 15:01:41 +0000 (10:01 -0500)]
imgutils: Document av_image_get_buffer_size()

4 years ago hlsenc: Correctly write down all 16 bytes in hex
Luca Barbato [Thu, 9 Feb 2017 22:27:41 +0000 (23:27 +0100)]
hlsenc: Correctly write down all 16 bytes in hex

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>

4 years ago utvideodec: Add a missing include
Martin Storsjö [Fri, 10 Feb 2017 07:20:39 +0000 (09:20 +0200)]
utvideodec: Add a missing include

This was missing from 77c23704c76, fixing building.

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago nvenc: make gpu indices independent of supported capabilities
Timo Rothenpieler [Tue, 7 Feb 2017 02:04:39 +0000 (18:04 -0800)]
nvenc: make gpu indices independent of supported capabilities

Do not allocate a CUDA context for every available gpu.

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>

4 years ago avcodec: Mark some codecs with threadsafe init as such
Derek Buitenhuis [Wed, 8 Feb 2017 14:42:16 +0000 (14:42 +0000)]
avcodec: Mark some codecs with threadsafe init as such

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>

4 years ago aarch64: vp9itxfm: Fix incorrect vertical alignment
Martin Storsjö [Tue, 3 Jan 2017 14:11:56 +0000 (16:11 +0200)]
aarch64: vp9itxfm: Fix incorrect vertical alignment

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago aarch64: vp9itxfm: Update a comment to refer to a register with a different name
Martin Storsjö [Tue, 3 Jan 2017 21:11:51 +0000 (23:11 +0200)]
aarch64: vp9itxfm: Update a comment to refer to a register with a different name

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago aarch64: vp9itxfm: Use the right lane sizes in 8x8 for improved readability
Martin Storsjö [Tue, 3 Jan 2017 14:46:17 +0000 (16:46 +0200)]
aarch64: vp9itxfm: Use the right lane sizes in 8x8 for improved readability

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago aarch64: vp9itxfm: Use a single lane ld1 instead of ld1r where possible
Martin Storsjö [Tue, 3 Jan 2017 12:55:46 +0000 (14:55 +0200)]
aarch64: vp9itxfm: Use a single lane ld1 instead of ld1r where possible

The ld1r is a leftover from the arm version, where this trick is
beneficial on some cores.

Use a single-lane load where we don't need the semantics of ld1r.

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago aarch64: vp9itxfm: Share instructions for loading idct coeffs in the 8x8 function
Martin Storsjö [Tue, 3 Jan 2017 14:39:41 +0000 (16:39 +0200)]
aarch64: vp9itxfm: Share instructions for loading idct coeffs in the 8x8 function

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago arm: vp9itxfm: Share instructions for loading idct coeffs in the 8x8 function
Martin Storsjö [Tue, 3 Jan 2017 14:38:56 +0000 (16:38 +0200)]
arm: vp9itxfm: Share instructions for loading idct coeffs in the 8x8 function

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago aarch64: vp9itxfm: Do separate functions for half/quarter idct16 and idct32
Martin Storsjö [Tue, 22 Nov 2016 20:58:35 +0000 (22:58 +0200)]
aarch64: vp9itxfm: Do separate functions for half/quarter idct16 and idct32

This work is sponsored by, and copyright, Google.

This avoids loading and calculating coefficients that we know will
be zero, and avoids filling the temp buffer with zeros in places
where we know the second pass won't read.

This gives a pretty substantial speedup for the smaller subpartitions.

The code size increases from 14740 bytes to 24292 bytes.

The idct16/32_end macros are moved above the individual functions; the
instructions themselves are unchanged, but since new functions are added
at the same place where the code is moved from, the diff looks rather

Before:
vp9_inv_dct_dct_16x16_sub1_add_neon:     236.7
vp9_inv_dct_dct_16x16_sub2_add_neon:    1051.0
vp9_inv_dct_dct_16x16_sub4_add_neon:    1051.0
vp9_inv_dct_dct_16x16_sub8_add_neon:    1051.0
vp9_inv_dct_dct_16x16_sub12_add_neon:   1387.4
vp9_inv_dct_dct_16x16_sub16_add_neon:   1387.6
vp9_inv_dct_dct_32x32_sub1_add_neon:     554.1
vp9_inv_dct_dct_32x32_sub2_add_neon:    5198.5
vp9_inv_dct_dct_32x32_sub4_add_neon:    5198.6
vp9_inv_dct_dct_32x32_sub8_add_neon:    5196.3
vp9_inv_dct_dct_32x32_sub12_add_neon:   6183.4
vp9_inv_dct_dct_32x32_sub16_add_neon:   6174.3
vp9_inv_dct_dct_32x32_sub20_add_neon:   7151.4
vp9_inv_dct_dct_32x32_sub24_add_neon:   7145.3
vp9_inv_dct_dct_32x32_sub28_add_neon:   8119.3
vp9_inv_dct_dct_32x32_sub32_add_neon:   8118.7

After:
vp9_inv_dct_dct_16x16_sub1_add_neon:     236.7
vp9_inv_dct_dct_16x16_sub2_add_neon:     640.8
vp9_inv_dct_dct_16x16_sub4_add_neon:     639.0
vp9_inv_dct_dct_16x16_sub8_add_neon:     842.0
vp9_inv_dct_dct_16x16_sub12_add_neon:   1388.3
vp9_inv_dct_dct_16x16_sub16_add_neon:   1389.3
vp9_inv_dct_dct_32x32_sub1_add_neon:     554.1
vp9_inv_dct_dct_32x32_sub2_add_neon:    3685.5
vp9_inv_dct_dct_32x32_sub4_add_neon:    3685.1
vp9_inv_dct_dct_32x32_sub8_add_neon:    3684.4
vp9_inv_dct_dct_32x32_sub12_add_neon:   5312.2
vp9_inv_dct_dct_32x32_sub16_add_neon:   5315.4
vp9_inv_dct_dct_32x32_sub20_add_neon:   7154.9
vp9_inv_dct_dct_32x32_sub24_add_neon:   7154.5
vp9_inv_dct_dct_32x32_sub28_add_neon:   8126.6
vp9_inv_dct_dct_32x32_sub32_add_neon:   8127.2

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago arm: vp9itxfm: Do a simpler half/quarter idct16/idct32 when possible
Martin Storsjö [Tue, 22 Nov 2016 09:07:38 +0000 (11:07 +0200)]
arm: vp9itxfm: Do a simpler half/quarter idct16/idct32 when possible

This work is sponsored by, and copyright, Google.

This avoids loading and calculating coefficients that we know will
be zero, and avoids filling the temp buffer with zeros in places
where we know the second pass won't read.

This gives a pretty substantial speedup for the smaller subpartitions.

The code size increases from 12388 bytes to 19784 bytes.

The idct16/32_end macros are moved above the individual functions; the
instructions themselves are unchanged, but since new functions are added
at the same place where the code is moved from, the diff looks rather

Before:                              Cortex A7       A8       A9      A53
vp9_inv_dct_dct_16x16_sub1_add_neon:     273.0    189.5    212.0    235.8
vp9_inv_dct_dct_16x16_sub2_add_neon:    2102.1   1521.7   1736.2   1265.8
vp9_inv_dct_dct_16x16_sub4_add_neon:    2104.5   1533.0   1736.6   1265.5
vp9_inv_dct_dct_16x16_sub8_add_neon:    2484.8   1828.7   2014.4   1506.5
vp9_inv_dct_dct_16x16_sub12_add_neon:   2851.2   2117.8   2294.8   1753.2
vp9_inv_dct_dct_16x16_sub16_add_neon:   3239.4   2408.3   2543.5   1994.9
vp9_inv_dct_dct_32x32_sub1_add_neon:     758.3    456.7    864.5    553.9
vp9_inv_dct_dct_32x32_sub2_add_neon:   10776.7   7949.8   8567.7   6819.7
vp9_inv_dct_dct_32x32_sub4_add_neon:   10865.6   8131.5   8589.6   6816.3
vp9_inv_dct_dct_32x32_sub8_add_neon:   12053.9   9271.3   9387.7   7564.0
vp9_inv_dct_dct_32x32_sub12_add_neon:  13328.3  10463.2  10217.0   8321.3
vp9_inv_dct_dct_32x32_sub16_add_neon:  14176.4  11509.5  11018.7   9062.3
vp9_inv_dct_dct_32x32_sub20_add_neon:  15301.5  12999.9  11855.1   9828.2
vp9_inv_dct_dct_32x32_sub24_add_neon:  16482.7  14931.5  12650.1  10575.0
vp9_inv_dct_dct_32x32_sub28_add_neon:  17589.5  15811.9  13482.8  11333.4
vp9_inv_dct_dct_32x32_sub32_add_neon:  18696.2  17049.2  14355.6  12089.7

After:
vp9_inv_dct_dct_16x16_sub1_add_neon:     273.0    189.5    211.7    235.8
vp9_inv_dct_dct_16x16_sub2_add_neon:    1203.5    998.2   1035.3    763.0
vp9_inv_dct_dct_16x16_sub4_add_neon:    1203.5    998.1   1035.5    760.8
vp9_inv_dct_dct_16x16_sub8_add_neon:    1926.1   1610.6   1722.1   1271.7
vp9_inv_dct_dct_16x16_sub12_add_neon:   2873.2   2129.7   2285.1   1757.3
vp9_inv_dct_dct_16x16_sub16_add_neon:   3221.4   2520.3   2557.6   2002.1
vp9_inv_dct_dct_32x32_sub1_add_neon:     753.0    457.5    866.6    554.6
vp9_inv_dct_dct_32x32_sub2_add_neon:    7554.6   5652.4   6048.4   4920.2
vp9_inv_dct_dct_32x32_sub4_add_neon:    7549.9   5685.0   6046.9   4925.7
vp9_inv_dct_dct_32x32_sub8_add_neon:    8336.9   6704.5   6604.0   5478.0
vp9_inv_dct_dct_32x32_sub12_add_neon:  10914.0   9777.2   9240.4   7416.9
vp9_inv_dct_dct_32x32_sub16_add_neon:  11859.2  11223.3   9966.3   8095.1
vp9_inv_dct_dct_32x32_sub20_add_neon:  15237.1  13029.4  11838.3   9829.4
vp9_inv_dct_dct_32x32_sub24_add_neon:  16293.2  14379.8  12644.9  10572.0
vp9_inv_dct_dct_32x32_sub28_add_neon:  17424.3  15734.7  13473.0  11326.9
vp9_inv_dct_dct_32x32_sub32_add_neon:  18531.3  17457.0  14298.6  12080.0

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago aarch64: vp9itxfm: Move the load_add_store macro out from the itxfm16 pass2 function
Martin Storsjö [Sun, 5 Feb 2017 20:53:55 +0000 (22:53 +0200)]
aarch64: vp9itxfm: Move the load_add_store macro out from the itxfm16 pass2 function

This allows reusing the macro for a separate implementation of the
pass2 function.

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago arm: vp9itxfm: Move the load_add_store macro out from the itxfm16 pass2 function
Martin Storsjö [Sun, 5 Feb 2017 20:55:20 +0000 (22:55 +0200)]
arm: vp9itxfm: Move the load_add_store macro out from the itxfm16 pass2 function

This allows reusing the macro for a separate implementation of the
pass2 function.

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago aarch64: vp9itxfm: Make the larger core transforms standalone functions
Martin Storsjö [Wed, 23 Nov 2016 12:03:05 +0000 (14:03 +0200)]
aarch64: vp9itxfm: Make the larger core transforms standalone functions

This work is sponsored by, and copyright, Google.

This reduces the code size of libavcodec/aarch64/vp9itxfm_neon.o from
19496 to 14740 bytes.

This gives a small slowdown of a couple of tens of cycles, but makes
it more feasible to add more optimized versions of these transforms.

Before:
vp9_inv_dct_dct_16x16_sub4_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub16_add_neon:   1372.2
vp9_inv_dct_dct_32x32_sub4_add_neon:    5180.0
vp9_inv_dct_dct_32x32_sub32_add_neon:   8095.7

After:
vp9_inv_dct_dct_16x16_sub4_add_neon:    1051.0
vp9_inv_dct_dct_16x16_sub16_add_neon:   1390.1
vp9_inv_dct_dct_32x32_sub4_add_neon:    5199.9
vp9_inv_dct_dct_32x32_sub32_add_neon:   8125.8

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago arm: vp9itxfm: Make the larger core transforms standalone functions
Martin Storsjö [Wed, 23 Nov 2016 08:56:12 +0000 (10:56 +0200)]
arm: vp9itxfm: Make the larger core transforms standalone functions

This work is sponsored by, and copyright, Google.

This reduces the code size of libavcodec/arm/vp9itxfm_neon.o from
15324 to 12388 bytes.

This gives a small slowdown of a couple tens of cycles, up to around
150 cycles for the full case of the largest transform, but makes
it more feasible to add more optimized versions of these transforms.

Before:                              Cortex A7       A8       A9      A53
vp9_inv_dct_dct_16x16_sub4_add_neon:    2063.4   1516.0   1719.5   1245.1
vp9_inv_dct_dct_16x16_sub16_add_neon:   3279.3   2454.5   2525.2   1982.3
vp9_inv_dct_dct_32x32_sub4_add_neon:   10750.0   7955.4   8525.6   6754.2
vp9_inv_dct_dct_32x32_sub32_add_neon:  18574.0  17108.4  14216.7  12010.2

After:
vp9_inv_dct_dct_16x16_sub4_add_neon:    2060.8   1608.5   1735.7   1262.0
vp9_inv_dct_dct_16x16_sub16_add_neon:   3211.2   2443.5   2546.1   1999.5
vp9_inv_dct_dct_32x32_sub4_add_neon:   10682.0   8043.8   8581.3   6810.1
vp9_inv_dct_dct_32x32_sub32_add_neon:  18522.4  17277.4  14286.7  12087.9

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago configure: Correctly recurse in do_check_deps()
Diego Biurrun [Wed, 8 Feb 2017 17:06:34 +0000 (18:06 +0100)]
configure: Correctly recurse in do_check_deps()

Fixes all sorts of configuration problems introduced by dad7a9c7c0ae
on non-Linux or non-vanilla configs. Also removes a line made redundant
in that commit.

4 years ago omx: Use the EOS flag to handle flushing at the end
Martin Storsjö [Mon, 6 Feb 2017 22:25:19 +0000 (00:25 +0200)]
omx: Use the EOS flag to handle flushing at the end

This avoids having to count the number of frames sent to the codec
and the number of output packets received; instead just wait until
the encoder returns a buffer with the EOS flag set.

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago configure: Rework dependency handling for conflicting components
Diego Biurrun [Fri, 20 Jan 2017 16:17:16 +0000 (17:17 +0100)]
configure: Rework dependency handling for conflicting components

This makes the feature more visible and obvious.

4 years ago configure: Add name parameter to require_pkg_config() helper function
Diego Biurrun [Mon, 23 Jan 2017 10:57:14 +0000 (11:57 +0100)]
configure: Add name parameter to require_pkg_config() helper function

This allows distinguishing between the internal variable name for
external libraries and the pkg-config package name. Having both
names available avoids special-casing outside the helper function
when the two identifiers do not match.

4 years ago Use bitstream_init8() where appropriate
Diego Biurrun [Mon, 6 Jun 2016 11:20:17 +0000 (13:20 +0200)]
Use bitstream_init8() where appropriate

4 years ago configure: Use cppflags check helper functions where appropriate
Diego Biurrun [Fri, 20 Jan 2017 14:29:07 +0000 (15:29 +0100)]
configure: Use cppflags check helper functions where appropriate

4 years ago configure: Add stdlib.h #include to CPPFLAGS check helper functions
Diego Biurrun [Fri, 3 Feb 2017 09:15:40 +0000 (10:15 +0100)]
configure: Add stdlib.h #include to CPPFLAGS check helper functions

This ensures that added CPPFLAGS are validated against libc headers.

4 years ago wma: Convert to the new bitstream reader
Alexandra Hájková [Fri, 15 Apr 2016 08:46:06 +0000 (10:46 +0200)]
wma: Convert to the new bitstream reader

4 years ago aarch64: vp9itxfm: Restructure the idct32 store macros
Martin Storsjö [Thu, 1 Dec 2016 09:10:19 +0000 (11:10 +0200)]
aarch64: vp9itxfm: Restructure the idct32 store macros

This avoids concatenation, which can't be used if the whole macro
is wrapped within another macro.

This is also arguably more readable.

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago arm: vp9itxfm: Avoid .irp when it doesn't save any lines
Martin Storsjö [Sat, 4 Feb 2017 20:16:09 +0000 (22:16 +0200)]
arm: vp9itxfm: Avoid .irp when it doesn't save any lines

This makes it more readable.

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years ago asfdec: Use the ASF stream count when iterating
John Stebbins [Thu, 12 Jan 2017 20:36:26 +0000 (13:36 -0700)]
asfdec: Use the ASF stream count when iterating

The AVFormat stream count can be larger due to external factors, such as
an id3 tag appended.

Avoid an out of bound read.

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>

4 years ago asm: Consistently uppercase SECTION markers
Diego Biurrun [Wed, 1 Feb 2017 12:27:30 +0000 (13:27 +0100)]
asm: Consistently uppercase SECTION markers

4 years ago build: Ignore generated .version files
Diego Biurrun [Tue, 31 Jan 2017 14:46:50 +0000 (15:46 +0100)]
build: Ignore generated .version files

4 years ago rtmp: Correctly handle the Window Acknowledgement Size packets
Martin Storsjö [Tue, 31 Jan 2017 14:15:56 +0000 (16:15 +0200)]
rtmp: Correctly handle the Window Acknowledgement Size packets

This swaps which field is set when the Window Acknowledgement Size
and Set Peer BW packets are received, renames the fields in
order to clarify their role further and adds verbose comments
explaining their respective roles and how well the code currently
does what it is supposed to.

The Set Peer BW packet tells the receiver of the packet (which
can be either client or server) that it should not send more data
if it already has sent more data than the specified number of bytes,
without receiving acknowledgement for them. Actually checking this
limit is currently not implemented.

In order to be able to check that properly, one can send the
Window Acknowledgement Size packet, which tells the receiver of the
packet that it needs to send Acknowledgement packets
(RTMP_PT_BYTES_READ) at least after receiving a given number of bytes
since the last Acknowledgement.

Therefore, when we receive a Window Acknowledgement Size packet, it
sets the maximum number of bytes we can receive without sending an
Acknowledgement, so when handling this packet we should set the
receive_report_size field (previously client_report_size).
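
The receive-side bookkeeping described above could be sketched roughly as
follows. This is an illustration under assumptions, not the rtmpproto.c
implementation: the struct and function names are hypothetical, only the
receive_report_size field name comes from the message, and the exact
threshold at which the real code acknowledges may differ.

```c
#include <stdint.h>

/* Hypothetical receive-side state for one RTMP connection. */
typedef struct ExRtmpState {
    uint32_t receive_report_size; /* set from Window Acknowledgement Size */
    uint64_t bytes_read;          /* total bytes received so far */
    uint64_t last_bytes_read;     /* total at the last Acknowledgement */
} ExRtmpState;

/* Handle an incoming Window Acknowledgement Size packet. */
void ex_on_window_ack_size(ExRtmpState *st, uint32_t window)
{
    st->receive_report_size = window;
}

/*
 * Account for n newly received bytes. Returns 1 when an Acknowledgement
 * (RTMP_PT_BYTES_READ) is due, i.e. when at least receive_report_size
 * bytes have arrived since the last one, and 0 otherwise.
 */
int ex_on_bytes_received(ExRtmpState *st, uint32_t n)
{
    st->bytes_read += n;
    if (st->receive_report_size &&
        st->bytes_read - st->last_bytes_read >= st->receive_report_size) {
        st->last_bytes_read = st->bytes_read;
        return 1;
    }
    return 0;
}
```

Keeping this state independent of the Set Peer BW value reflects the
distinction drawn above: one packet limits what we may send, the other
dictates how often we must acknowledge what we receive.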

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years agortmp: Rename packet types to closer match the spec
Martin Storsjö [Tue, 31 Jan 2017 13:47:00 +0000 (15:47 +0200)]
rtmp: Rename packet types to closer match the spec

Also rename comments and log messages accordingly,
and add clarifying comments for some hardcoded values.

The previous names were taken from older, reverse-engineered references.

These names match the official public RTMP specification and
match the names used by wirecast in annotating captured
streams. These names also avoid hardcoding the roles of server
and client, since their handling is independent of whether
we act as server or client.
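
As an illustration of role-neutral naming, a sketch of such an enum
follows. The numeric type IDs are the ones given in the public RTMP
specification; the EX_-prefixed identifiers and the helper function are
hypothetical and are not claimed to be the names used in the tree.

```c
/* Protocol control message types, named after the spec rather than after
 * "server" or "client". Identifiers are hypothetical; IDs follow the spec. */
typedef enum ExRtmpPacketType {
    EX_RTMP_PT_CHUNK_SIZE      = 1, /* Set Chunk Size */
    EX_RTMP_PT_ACKNOWLEDGEMENT = 3, /* Acknowledgement */
    EX_RTMP_PT_USER_CONTROL    = 4, /* User Control Message */
    EX_RTMP_PT_WINDOW_ACK_SIZE = 5, /* Window Acknowledgement Size */
    EX_RTMP_PT_SET_PEER_BW     = 6  /* Set Peer Bandwidth */
} ExRtmpPacketType;

/* Map a type ID to the spec's name, e.g. for log messages. */
const char *ex_rtmp_type_name(int type)
{
    switch (type) {
    case EX_RTMP_PT_CHUNK_SIZE:      return "Set Chunk Size";
    case EX_RTMP_PT_ACKNOWLEDGEMENT: return "Acknowledgement";
    case EX_RTMP_PT_USER_CONTROL:    return "User Control Message";
    case EX_RTMP_PT_WINDOW_ACK_SIZE: return "Window Acknowledgement Size";
    case EX_RTMP_PT_SET_PEER_BW:     return "Set Peer Bandwidth";
    default:                         return "unknown";
    }
}
```

Because neither endpoint's role appears in the names, the same handler
reads naturally whether the code runs as server or as client.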


The SERVER_BW and CLIENT_BW types are a bit more intertwined;

Signed-off-by: Martin Storsjö <martin@martin.st>

4 years agoconfigure: Add require_cpp_condition() convenience function
Diego Biurrun [Sun, 22 Jan 2017 15:15:38 +0000 (16:15 +0100)]
configure: Add require_cpp_condition() convenience function

Simplifies checking for conditions in external library headers and
aborting if said conditions are not met.
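
A check of this kind typically boils down to compiling a tiny test file
that includes the header and fails preprocessing when the condition does
not hold. The sketch below shows the shape of such a file; the header
(limits.h) and the condition (CHAR_BIT == 8) are stand-ins chosen only so
the example is self-contained, and the function at the end is a
hypothetical marker, not something configure generates.

```c
#include <limits.h>   /* stand-in for the external library header */

/* Compilation stops here if the condition is not met, which is what lets
 * a configure script detect the failure and abort. */
#if !(CHAR_BIT == 8)
#error "unsatisfied condition: CHAR_BIT == 8"
#endif

/* Hypothetical marker: this is only compiled when the condition holds. */
int ex_condition_holds(void)
{
    return 1;
}
```

In a real check the condition would usually be a version comparison
against a macro exported by the library header.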

4 years agoconfigure: Add require_header() convenience function
Diego Biurrun [Sun, 22 Jan 2017 15:04:09 +0000 (16:04 +0100)]
configure: Add require_header() convenience function

Simplifies checking for external library headers and aborting if
the external library support was requested, but is not available.

4 years agoconfigure: Simplify libxcb check
Diego Biurrun [Sun, 22 Jan 2017 15:05:25 +0000 (16:05 +0100)]
configure: Simplify libxcb check

4 years agosvq3: Convert to the new bitstream reader
Alexandra Hájková [Thu, 17 Mar 2016 13:21:24 +0000 (14:21 +0100)]
svq3: Convert to the new bitstream reader

4 years agoconfigure: Drop weak dependencies on external libraries for webm muxer
Diego Biurrun [Mon, 23 Jan 2017 12:17:24 +0000 (13:17 +0100)]
configure: Drop weak dependencies on external libraries for webm muxer

Weak dependencies on external libraries do not obviate having to
explicitly enable these libraries, so the weak dependency does not
simplify the configure command line nor have any real effect.

4 years agoconfigure: Add proper weak dependency of drawtext filter on libfontconfig
Diego Biurrun [Mon, 23 Jan 2017 16:59:56 +0000 (17:59 +0100)]
configure: Add proper weak dependency of drawtext filter on libfontconfig

4 years agoconfigure: Simplify inline asm check with appropriate helper function
Diego Biurrun [Fri, 20 Jan 2017 14:30:36 +0000 (15:30 +0100)]
configure: Simplify inline asm check with appropriate helper function

4 years agoconfigure: Merge compiler/libc/os hacks sections
Diego Biurrun [Fri, 20 Jan 2017 14:29:57 +0000 (15:29 +0100)]
configure: Merge compiler/libc/os hacks sections

4 years agolavc: deprecate refcounted_frames field
wm4 [Mon, 16 Jan 2017 16:32:18 +0000 (17:32 +0100)]
lavc: deprecate refcounted_frames field

No deprecation guards, because the old decode API (for which this field
is needed) doesn't have any either.

This field should be removed together with the old decode calls.

Signed-off-by: Anton Khirnov <anton@khirnov.net>

4 years agohwcontext_cuda: implement frames_get_constraints
wm4 [Mon, 16 Jan 2017 15:42:17 +0000 (16:42 +0100)]
hwcontext_cuda: implement frames_get_constraints

Copied and modified from hwcontext_qsv.c.

Signed-off-by: Anton Khirnov <anton@khirnov.net>

4 years agoMark some arrays that never change as const.
Anton Khirnov [Sun, 3 Jul 2016 08:09:36 +0000 (10:09 +0200)]
Mark some arrays that never change as const.

4 years agoavconv: allow -b to be used with streamcopy
Anton Khirnov [Mon, 30 Jan 2017 20:35:42 +0000 (21:35 +0100)]
avconv: allow -b to be used with streamcopy

In this mode it tells the muxer about the bitrate of the input stream.

4 years agoffv1: Convert to the new bitstream reader
Alexandra Hájková [Sat, 19 Mar 2016 16:32:04 +0000 (17:32 +0100)]
ffv1: Convert to the new bitstream reader

4 years agoh261dec: Convert to the new bitstream reader
Alexandra Hájková [Sun, 10 Apr 2016 09:44:20 +0000 (11:44 +0200)]
h261dec: Convert to the new bitstream reader

4 years agoshorten: Convert to the new bitstream reader
Alexandra Hájková [Tue, 22 Mar 2016 15:09:39 +0000 (16:09 +0100)]
shorten: Convert to the new bitstream reader

4 years agoralf: Convert to the new bitstream reader
Alexandra Hájková [Tue, 22 Mar 2016 09:26:03 +0000 (10:26 +0100)]
ralf: Convert to the new bitstream reader

4 years agoloco: Convert to the new bitstream reader
Alexandra Hájková [Mon, 21 Mar 2016 19:23:36 +0000 (20:23 +0100)]
loco: Convert to the new bitstream reader

4 years agofic: Convert to the new bitstream reader
Alexandra Hájková [Sat, 19 Mar 2016 16:40:55 +0000 (17:40 +0100)]
fic: Convert to the new bitstream reader

4 years agodirac: Convert to the new bitstream reader
Alexandra Hájková [Sat, 19 Mar 2016 14:39:03 +0000 (15:39 +0100)]
dirac: Convert to the new bitstream reader