Vittorio Giovara [Mon, 5 Dec 2016 18:42:27 +0000 (13:42 -0500)]
lavfi: Drop deprecated filter registration
Deprecated in 04/2013.
Vittorio Giovara [Mon, 5 Dec 2016 17:43:19 +0000 (12:43 -0500)]
lavfi: Drop deprecated filter initialization
Deprecated in 03/2013.
Vittorio Giovara [Mon, 5 Dec 2016 17:41:49 +0000 (12:41 -0500)]
lavfi: Drop deprecated functions to open a filter or a filterchain
Deprecated in 03/2013.
Vittorio Giovara [Mon, 5 Dec 2016 17:38:32 +0000 (12:38 -0500)]
lavfi: Drop deprecated way of passing options for a few filters
Deprecated in 02/2013.
Vittorio Giovara [Thu, 16 Mar 2017 19:37:51 +0000 (15:37 -0400)]
Bump major versions of all libraries
This disables everything that was deprecated at least 18 months ago.
Readjust the minimum API version as needed, postponing any
API-incompatible changes until the next bump.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Carl Eugen Hoyos [Fri, 25 Nov 2016 10:06:14 +0000 (11:06 +0100)]
flvdec: Set avg_frame_rate for video streams
Signed-off-by: Martin Storsjö <martin@martin.st>
Martin Storsjö [Tue, 21 Mar 2017 12:36:33 +0000 (14:36 +0200)]
libavutil: Hook up the rest of the gcc specific attributes to clang as well
Hook up all attributes that don't have a MSVC specific version at the
moment.
See f637046d313 for details.
These don't seem to be critical for building with clang in MSVC mode
though, and thus haven't been hooked up until now.
These seem to build fine with clang versions at least as old as 3.3.
(clang 3.3 disguises itself as gcc 4.2 normally, so all of these
have been used for clang before, except for av_cold.)
The clang version numbers themselves are useless for detecting what
attributes are available, since Apple's clang builds use a completely
different versioning (presenting itself as e.g. clang 8.0 instead
of 3.8).
Signed-off-by: Martin Storsjö <martin@martin.st>
Martin Storsjö [Tue, 21 Mar 2017 12:26:27 +0000 (14:26 +0200)]
libavutil: Define the noreturn attribute for clang in MSVC mode as well
This is a follow-up to f637046d313.
Without the noreturn attribute set, avconv_opt.c fails to build after
d2e6dd32a44 with the error "control may reach end of non-void function".
By making sure the noreturn attribute is set properly, this compiles
as intended.
Signed-off-by: Martin Storsjö <martin@martin.st>
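The build failure described in this commit can be illustrated with a small sketch. The macro `SKETCH_NORETURN` and the helpers `die_with_message()` and `checked_div()` are hypothetical names for this example and are not the libavutil definitions; the point is only that the compiler needs the attribute to know the last call never returns.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical macro for this sketch, not the libavutil definition. */
#if defined(__GNUC__) || defined(__clang__)
#    define SKETCH_NORETURN __attribute__((noreturn))
#elif defined(_MSC_VER)
#    define SKETCH_NORETURN __declspec(noreturn)
#else
#    define SKETCH_NORETURN
#endif

/* Prints a message and terminates the program; never returns. */
static SKETCH_NORETURN void die_with_message(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
    exit(1);
}

/* Without the attribute on die_with_message(), a compiler may report
 * "control may reach end of non-void function" for this function. */
static int checked_div(int a, int b)
{
    if (b != 0)
        return a / b;
    die_with_message("division by zero");
}
```

Calling `checked_div(6, 3)` returns 2; only the error path ends in the non-returning call.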
Luca Barbato [Tue, 14 Mar 2017 09:15:30 +0000 (09:15 +0000)]
dca: Refactor dca_filter_channels() a little
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Luca Barbato [Tue, 14 Mar 2017 09:15:29 +0000 (09:15 +0000)]
dca: Validate the channel map
Having a mismatch between the number of channels in the stream and those
in the channel map will lead to a segfault or worse.
Bug-Id: 1016
CC: libav-stable@libav.org
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
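A minimal sketch of this kind of validation, assuming a map that assigns each stream channel one unique output slot. The function name, the limit `SKETCH_MAX_CHANNELS` and the exact rules are assumptions for illustration and are not taken from the decoder.

```c
#include <stddef.h>

/* Hypothetical upper bound for this sketch. */
#define SKETCH_MAX_CHANNELS 8

/* Returns 0 if the map has exactly one valid, unique slot per stream
 * channel, and -1 on any mismatch. */
static int validate_channel_map(const int *map, size_t map_len,
                                size_t stream_channels)
{
    unsigned seen = 0;

    /* The map must describe exactly as many channels as the stream has. */
    if (map_len != stream_channels || stream_channels > SKETCH_MAX_CHANNELS)
        return -1;

    for (size_t i = 0; i < map_len; i++) {
        if (map[i] < 0 || (size_t)map[i] >= stream_channels)
            return -1;               /* slot index out of range */
        if (seen & (1u << map[i]))
            return -1;               /* slot used twice */
        seen |= 1u << map[i];
    }
    return 0;
}
```

Rejecting the stream up front keeps later code from indexing buffers with a channel count the map does not match.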
Konda Raju [Fri, 17 Mar 2017 04:10:05 +0000 (09:40 +0530)]
nvenc: Allow different const qps for I, P and B frames
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Luca Barbato [Tue, 14 Mar 2017 16:44:45 +0000 (17:44 +0100)]
rtsp: Move message parsing to a separate function
Make it easier to handle the polling function before we implement
full threading support.
Luca Barbato [Tue, 14 Mar 2017 22:42:37 +0000 (23:42 +0100)]
configure: Do not treat JACK as a system library
JACK is not commonly installed and should not be picked up as a
dependency unless specifically requested.
Mark Thompson [Sun, 19 Mar 2017 16:25:37 +0000 (16:25 +0000)]
avconv: Document the -init_hw_device option
Mark Thompson [Sat, 4 Mar 2017 23:57:34 +0000 (23:57 +0000)]
avconv: Enable generic hwaccel support for VDPAU
wm4 [Sat, 4 Mar 2017 23:57:33 +0000 (23:57 +0000)]
lavc: vdpau: add support for new hw_frames_ctx and hw_device_ctx API
This supports retrieving the device from a provided hw_frames_ctx, and
automatically creating a hw_frames_ctx if hw_device_ctx is set.
The old API is not deprecated yet. The user can still use
av_vdpau_bind_context() (with or without setting hw_frames_ctx), or use
the API before that by allocating and setting hwaccel_context manually.
wm4 [Sat, 4 Mar 2017 23:57:32 +0000 (23:57 +0000)]
lavc: Add hwaccel_flags field to AVCodecContext
This "reuses" the flags introduced for the av_vdpau_bind_context() API
function, and makes them available to all hwaccels. This does not affect
the current vdpau API, as av_vdpau_bind_context() should obviously
override the AVCodecContext.hwaccel_flags flags for the sake of
compatibility.
Mark Thompson [Sat, 4 Mar 2017 23:57:31 +0000 (23:57 +0000)]
avconv: Enable generic hwaccel support for VAAPI
Mark Thompson [Sat, 4 Mar 2017 23:57:30 +0000 (23:57 +0000)]
avconv: Generic device setup
Not yet enabled for any hwaccels.
Mark Thompson [Sat, 4 Mar 2017 23:57:29 +0000 (23:57 +0000)]
hwcontext: Make it easier to work with device types
Adds functions to convert to/from strings and a function to iterate
over all supported device types. Also adds a new invalid type
AV_HWDEVICE_TYPE_NONE, which acts as a sentinel value.
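The sentinel pattern can be sketched with a self-contained toy enum. All `sketch_*` and `SKETCH_*` names are hypothetical and do not correspond to the hwcontext API; the sketch only shows how a NONE value can serve both as "invalid" and as the end marker of an iteration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical device-type enum; NONE means "invalid" and "end of list". */
enum sketch_device_type {
    SKETCH_DEVICE_NONE,
    SKETCH_DEVICE_VAAPI,
    SKETCH_DEVICE_VDPAU,
    SKETCH_DEVICE_COUNT
};

static const char *const sketch_type_names[SKETCH_DEVICE_COUNT] = {
    [SKETCH_DEVICE_VAAPI] = "vaapi",
    [SKETCH_DEVICE_VDPAU] = "vdpau",
};

/* Type to string; returns NULL for NONE or out-of-range values. */
static const char *sketch_type_name(enum sketch_device_type type)
{
    if ((int)type <= SKETCH_DEVICE_NONE || (int)type >= SKETCH_DEVICE_COUNT)
        return NULL;
    return sketch_type_names[type];
}

/* Returns the type after prev, or NONE when there are no more types.
 * Start the iteration by passing NONE. */
static enum sketch_device_type sketch_type_iterate(enum sketch_device_type prev)
{
    int next = (int)prev + 1;
    return next < SKETCH_DEVICE_COUNT ? (enum sketch_device_type)next
                                      : SKETCH_DEVICE_NONE;
}

/* String to type; returns NONE if the name is unknown. */
static enum sketch_device_type sketch_type_from_name(const char *name)
{
    enum sketch_device_type t = SKETCH_DEVICE_NONE;

    while ((t = sketch_type_iterate(t)) != SKETCH_DEVICE_NONE)
        if (!strcmp(sketch_type_names[t], name))
            return t;
    return SKETCH_DEVICE_NONE;
}
```

Because the iterator starts and ends on the same sentinel, callers need no separate count or end-of-list flag.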
Mark Thompson [Sat, 4 Mar 2017 23:57:28 +0000 (23:57 +0000)]
hwcontext: Add device derivation
Creates a new device context from another of a different type which
refers to the same underlying hardware.
Diego Biurrun [Tue, 14 Mar 2017 15:38:38 +0000 (16:38 +0100)]
rtmp: Move RTMP digest calculation to a separate file
The rtmpcrypt protocol requires it.
Diego Biurrun [Tue, 14 Mar 2017 14:31:04 +0000 (15:31 +0100)]
build: Add missing object dependency for extract_extradata bitstream filter
Martin Storsjö [Sun, 8 Jan 2017 22:04:19 +0000 (00:04 +0200)]
arm/aarch64: vp9: Fix vertical alignment
Align the second/third operands as they usually are.
Due to the wildly varying sizes of the written out operands
in aarch64 assembly, the column alignment is usually not as clear
as in arm assembly.
Signed-off-by: Martin Storsjö <martin@martin.st>
James Almer [Thu, 9 Mar 2017 18:11:49 +0000 (15:11 -0300)]
matroskaenc: add support for Spherical Video elements
Signed-off-by: James Almer <jamrial@gmail.com>
Minor cosmetic changes by committer.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Luca Barbato [Tue, 14 Mar 2017 23:07:23 +0000 (00:07 +0100)]
configure: Replace -no_weak_symbols with -Werror=partial-availability
Jack uses weak symbols on purpose.
Diego Biurrun [Sun, 15 Jul 2012 18:11:55 +0000 (20:11 +0200)]
x86: fft: Port to cpuflags
Diego Biurrun [Wed, 8 Mar 2017 18:49:15 +0000 (19:49 +0100)]
x86: h264: Simplify DEQUANT macro with cpuflags
Diego Biurrun [Fri, 27 Jul 2012 10:09:17 +0000 (12:09 +0200)]
x86: vp8dsp: port FILTER_BILINEAR macro to cpuflags
Diego Biurrun [Sun, 15 Jul 2012 16:01:10 +0000 (18:01 +0200)]
x86util: Port all macros to cpuflags
Also do some small cosmetic changes: Drop the pointless _MMX suffix from the
ABSD2 macro name, drop the pointless check for MMX support (we always assume
MMX is available in our SIMD code), and fix spelling.
Anton Khirnov [Wed, 28 Dec 2016 12:02:02 +0000 (13:02 +0100)]
h264_cavlc: check the value of run_before
Section 9.2.3.2 of the spec implies that run_before must not be larger
than zeros_left.
Fixes invalid reads with corrupted files.
CC: libav-stable@libav.org
Bug-Id: 1000
Found-By: Kamil Frankowicz
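The constraint can be expressed as a one-line range check. This is a sketch only: `apply_run_before()` and `SKETCH_INVALID_DATA` are hypothetical names, and the real decoder structures its loop differently.

```c
/* Hypothetical error code for this sketch. */
#define SKETCH_INVALID_DATA (-1)

/* Consumes one run_before value. Returns the updated zeros_left, or
 * SKETCH_INVALID_DATA if run_before exceeds the zeros still available. */
static int apply_run_before(int zeros_left, int run_before)
{
    if (run_before < 0 || run_before > zeros_left)
        return SKETCH_INVALID_DATA;
    return zeros_left - run_before;
}
```

Without the check, a corrupted stream could drive the remaining-zeros count negative and move the coefficient index outside the block.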
Anton Khirnov [Wed, 28 Dec 2016 10:27:56 +0000 (11:27 +0100)]
h2645_parse: use the bytestream2 API for packet splitting
The code does some nontrivial jumping around in the buffer, so it is
safer to use a checked API rather than do everything manually.
Fixes a bug in nalff parsing, where the length field is currently not
counted in the buffer size check, resulting in possible overreads with
invalid files.
CC: libav-stable@libav.org
Bug-Id: 1002
Found-By: Kamil Frankowicz
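The overread described here comes from checking the payload size without accounting for the length field itself. A minimal sketch of a checked parser follows, assuming 4-byte big-endian length prefixes; `SketchReader` and its helpers are hypothetical and only loosely modelled on the idea of a bounds-checked byte reader, not on the bytestream2 API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal checked reader. */
typedef struct SketchReader {
    const uint8_t *cur;
    const uint8_t *end;
} SketchReader;

static size_t sketch_bytes_left(const SketchReader *r)
{
    return (size_t)(r->end - r->cur);
}

/* Counts length-prefixed units (4-byte big-endian length, then payload).
 * Returns -1 if a length field or a payload would run past the buffer. */
static int count_length_prefixed_units(const uint8_t *buf, size_t size)
{
    SketchReader r = { buf, buf + size };
    int units = 0;

    while (sketch_bytes_left(&r) > 0) {
        uint32_t len;

        if (sketch_bytes_left(&r) < 4)    /* the length field must fit */
            return -1;
        len = (uint32_t)r.cur[0] << 24 | (uint32_t)r.cur[1] << 16 |
              (uint32_t)r.cur[2] <<  8 | (uint32_t)r.cur[3];
        r.cur += 4;

        if (len > sketch_bytes_left(&r))  /* payload must fit in the rest */
            return -1;
        r.cur += len;
        units++;
    }
    return units;
}
```

Both checks are made against the bytes that remain after the previous read, so the length field is always counted.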
Anton Khirnov [Wed, 28 Dec 2016 10:05:25 +0000 (11:05 +0100)]
h264dec: initialize field_started to 0 on each decode call
It might be incorrectly set to 1 if the previous call exited with an
error.
Bug-Id: 1019
CC: libav-stable@libav.org
Martin Storsjö [Sun, 26 Feb 2017 20:13:10 +0000 (22:13 +0200)]
arm/aarch64: vp9itxfm: Skip loading the min_eob pointer when it won't be used
In the half/quarter cases where we don't use the min_eob array, defer
loading the pointer until we know it will be needed.
Signed-off-by: Martin Storsjö <martin@martin.st>
Martin Storsjö [Sun, 26 Feb 2017 12:02:35 +0000 (14:02 +0200)]
arm: vp9itxfm: Template the quarter/half idct32 function
This reduces the number of lines and reduces the duplication.
Also simplify the eob check for the half case.
If we are in the half case, we know we will need to do at least the
first three slices, so we only need to check eob for the fourth one
and can hardcode the value to check against instead of loading it
from the min_eob array.
Since at most one slice can be skipped in the first pass, we can
unroll the loop for filling zeros completely, as it was done for
the quarter case before.
This allows skipping loading the min_eob pointer when using the
quarter/half cases.
Signed-off-by: Martin Storsjö <martin@martin.st>
Diego Biurrun [Mon, 29 Feb 2016 14:39:27 +0000 (15:39 +0100)]
cfhd: Add FATE tests
Kieran Kunhya [Sat, 30 Jan 2016 17:39:48 +0000 (17:39 +0000)]
Add Cineform HD Decoder
Decodes YUV 4:2:2 10-bit and RGB 12-bit files.
Older files with more subbands, skips, Bayer or alpha are not supported.
Further fixes and refactorings by Anton Khirnov <anton@khirnov.net>,
Diego Biurrun <diego@biurrun.de>, Vittorio Giovara <vittorio.giovara@gmail.com>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Konda Raju [Tue, 7 Mar 2017 06:32:14 +0000 (12:02 +0530)]
add initial QP value options
Signed-off-by: Diego Biurrun <diego@biurrun.de>
wm4 [Mon, 6 Mar 2017 10:34:20 +0000 (11:34 +0100)]
avcodec: clarify some decoding/encoding API details
Make it clear that there is no timing-dependent behavior. In particular,
there is no state in which both input and output are denied, and where
you have to wait for a while yourself to make progress (apparently some
hardware decoders like to do this).
Avoid wording that makes references to time. It shouldn't be mistaken
for some kind of asynchronous API (like POSIX read() can return EAGAIN
if there is no new input yet). It's a state machine, so try to use
appropriate terms.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
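The state-machine behaviour can be sketched with a toy codec that holds one pending result. Everything here is hypothetical (`SketchCodec`, `sketch_send()`, `sketch_receive()`, `SKETCH_EAGAIN`) and stands in for the real send/receive calls; the sketch only shows that at any moment exactly one of the two directions makes progress, so the caller never has to wait.

```c
#include <stdbool.h>

/* Hypothetical "call the other function first" code. */
#define SKETCH_EAGAIN (-11)

typedef struct SketchCodec {
    bool has_pending;
    int  pending;
} SketchCodec;

/* Accepts input only when no output is waiting to be collected. */
static int sketch_send(SketchCodec *c, int input)
{
    if (c->has_pending)
        return SKETCH_EAGAIN;   /* state: output must be received first */
    c->pending     = input * 2; /* stand-in for real decoding work */
    c->has_pending = true;
    return 0;
}

/* Returns output only when some is waiting. */
static int sketch_receive(SketchCodec *c, int *output)
{
    if (!c->has_pending)
        return SKETCH_EAGAIN;   /* state: new input must be sent first */
    *output        = c->pending;
    c->has_pending = false;
    return 0;
}

/* Feeds n inputs and collects the outputs; out must hold n elements.
 * Returns the number of outputs produced. */
static int sketch_process_all(SketchCodec *c, const int *in, int n, int *out)
{
    int consumed = 0, produced = 0;

    while (consumed < n) {
        if (sketch_send(c, in[consumed]) == 0)
            consumed++;
        /* Drain until the codec asks for more input. */
        while (sketch_receive(c, &out[produced]) == 0)
            produced++;
    }
    return produced;
}
```

The driver loop contains no sleep or retry timer: a refusal from one function is always resolved by calling the other.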
Vittorio Giovara [Fri, 10 Feb 2017 21:02:22 +0000 (16:02 -0500)]
mkv: Export bounds and padding from spherical metadata
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
James Almer [Tue, 6 Dec 2016 17:48:45 +0000 (14:48 -0300)]
mkv: Add support for Spherical Video elements
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Vittorio Giovara [Fri, 10 Feb 2017 20:36:56 +0000 (15:36 -0500)]
mov: Export bounds and padding from spherical metadata
Update the fate test as needed.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Vittorio Giovara [Fri, 10 Feb 2017 20:26:55 +0000 (15:26 -0500)]
spherical: Add tiled equirectangular type and projection-specific properties
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Vittorio Giovara [Tue, 28 Feb 2017 16:27:02 +0000 (11:27 -0500)]
mov: Validate cubemap layout
Vittorio Giovara [Wed, 15 Feb 2017 15:40:16 +0000 (10:40 -0500)]
mov: Validate spherical metadata version
Vittorio Giovara [Tue, 28 Feb 2017 15:54:36 +0000 (10:54 -0500)]
mov: Ignore old spherical metadata when newer version is present
Aaron Colwell [Fri, 27 Jan 2017 17:33:29 +0000 (09:33 -0800)]
mov: Fix spherical metadata_source parsing
Signed-off-by: James Almer <jamrial@gmail.com>
Luca Barbato [Mon, 6 Mar 2017 19:21:19 +0000 (20:21 +0100)]
configure: Check for -no_weak_imports in ldflags on macOS
Recent versions of macOS provide more POSIX API (in particular,
clock_gettime) than previous versions and recent Apple toolchains
provide all that API, even when targeting older releases without
said API. Disallow linking to functions which might not be available
at runtime.
To actually have an effect, either add
--extra-cflags="-mmacosx-version-min=10.11" (or any other version
prior to 10.12) or set MACOSX_DEPLOYMENT_TARGET=10.11 when running
configure.
As a workaround for libav versions without this fix, one can
also add --extra-cflags="-mmacosx-version-min=10.11
-Werror=partial-availability" while running configure.
The -no_weak_imports flag is new in Xcode 8; in Xcode 7 it is not
supported. This is not an issue since Xcode 7 only ships with the
10.11 macOS SDK, which lacks clock_gettime.
Bug-Id: 1033
CC: libav-stable@libav.org
Signed-off-by: Martin Storsjö <martin@martin.st>
Diego Biurrun [Thu, 13 Oct 2016 18:33:15 +0000 (20:33 +0200)]
build: Prefer NASM assembler over YASM
NASM is more actively maintained and permits generating dependency information
as a side effect of assembling, thus cutting build times in half.
Diego Biurrun [Tue, 28 Feb 2017 18:32:37 +0000 (19:32 +0100)]
build: Make x86 assembler commandline-selectable
Diego Biurrun [Thu, 2 Mar 2017 13:54:28 +0000 (14:54 +0100)]
build: Special-case handling of SDL CFLAGS
SDL adds some "special" CFLAGS that interfere with building normal
binaries. Capture those CFLAGS separately and avoid adding them to
the general CFLAGS.
Diego Biurrun [Mon, 6 Mar 2017 18:35:12 +0000 (19:35 +0100)]
build: Fix logic of clock_gettime() check
We should only check for clock_gettime() if _POSIX_MONOTONIC_CLOCK is
available and do a full link check for clock_gettime() in all cases.
Vittorio Giovara [Thu, 2 Mar 2017 00:45:31 +0000 (19:45 -0500)]
pixlet: Fix architecture-dependent code and values
The constants used in the decoder used floating point precision,
and this caused different values to be generated on different
architectures. Additionally on big endian machines, the fate test
would output bytes in native order, which is different from the one
hardcoded in the test.
So, eradicate floating point numbers and use fixed point (32.32)
arithmetic everywhere, replacing constants with precomputed integer
values, and force the pixel format output to be the same in the fate
test.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
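A minimal sketch of 32.32 fixed-point multiplication with a precomputed integer constant. The names and the example coefficient (0.75) are assumptions for illustration and are not values from the decoder.

```c
#include <stdint.h>

/* Number of fractional bits in the 32.32 format. */
#define SKETCH_FRAC_BITS 32

/* 0.75 in 32.32 fixed point, precomputed as an integer: 0.75 * 2^32. */
#define SKETCH_COEFF_0_75 INT64_C(3221225472)

/* Multiplies a sample by a 32.32 coefficient and rounds to nearest.
 * Assumes |coeff| < 2^32 (a value below 1.0) so the 64-bit product cannot
 * overflow. Right-shifting a negative value is implementation-defined in
 * C; this sketch assumes the usual arithmetic shift. */
static int32_t sketch_fixed_mul(int32_t sample, int64_t coeff)
{
    int64_t prod = (int64_t)sample * coeff;

    prod += INT64_C(1) << (SKETCH_FRAC_BITS - 1);   /* round to nearest */
    return (int32_t)(prod >> SKETCH_FRAC_BITS);
}
```

Since only integer operations are involved, every architecture produces the same bits, which is the property the FATE test relies on.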
Diego Biurrun [Wed, 1 Mar 2017 18:42:21 +0000 (19:42 +0100)]
build: Explicitly set 32-bit/64-bit object formats for nasm/yasm
Consistently use object format names with "32" suffix and set object format
to "win64" on Windows x86_64, which fixes assembling with nasm.
Diego Biurrun [Wed, 1 Mar 2017 18:04:03 +0000 (19:04 +0100)]
x86: Merge align directives into SECTION_RODATA declarations where possible
Ganapathy Kasi [Wed, 1 Mar 2017 23:04:47 +0000 (15:04 -0800)]
nvenc: Remove qmin and qmax constraints for nvenc vbr
qmin and qmax are not necessary for nvenc vbr.
Also fix using 2-pass vbr mode for the slow preset through ctx->flag NVENC_TWO_PASSES.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Paul B Mahol [Mon, 19 Sep 2016 12:53:03 +0000 (08:53 -0400)]
Add Apple Pixlet decoder
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
James Almer [Wed, 22 Feb 2017 17:53:34 +0000 (12:53 -0500)]
libavutil: add av_mod_uintp2
Signed-off-by: James Almer <jamrial@gmail.com>
Ganesh Ajjanagadde [Wed, 22 Feb 2017 17:53:33 +0000 (12:53 -0500)]
intmath: add faster clz support
Diego Biurrun [Wed, 1 Mar 2017 11:02:11 +0000 (12:02 +0100)]
build: Add pthreads to list of avutil extralibs
libavutil uses pthreads in the buffer code (abstracted through a header).
Diego Biurrun [Fri, 5 Oct 2012 12:46:38 +0000 (14:46 +0200)]
fate: Add build-only targets to FATE
Diego Biurrun [Thu, 13 Oct 2016 00:45:09 +0000 (02:45 +0200)]
build: Allow generating dependencies as a side-effect of assembling
Diego Biurrun [Sat, 8 Oct 2016 14:18:33 +0000 (16:18 +0200)]
build: Generalize yasm/nasm-related variable names
None of them are specific to the YASM assembler.
Diego Biurrun [Tue, 28 Feb 2017 21:11:39 +0000 (22:11 +0100)]
build: Add "build" shorthand target that depends on all compile targets
Diego Biurrun [Tue, 28 Feb 2017 21:12:18 +0000 (22:12 +0100)]
build: Skip generating .version files when cleaning
Diego Biurrun [Tue, 28 Feb 2017 18:01:28 +0000 (19:01 +0100)]
configure: Fix typo in objcc default setting
Also drop stray duplicate OBJCC config.mak entry.
Diego Biurrun [Tue, 28 Feb 2017 17:35:10 +0000 (18:35 +0100)]
x86: hevc: Add missing colons after assembly labels
This fixes several warnings of the sort
warning: label alone on a line without a colon might be in error
Diego Biurrun [Sun, 22 Jan 2017 15:42:36 +0000 (16:42 +0100)]
build: Fine-grained link-time dependency settings
Previously, all link-time dependencies were added for all libraries,
resulting in bogus link-time dependencies since not all dependencies
are shared across libraries. Also, in some cases like libavutil, not
all dependencies were taken into account, resulting in some cases of
underlinking.
To address all this mess a machinery is added for tracking which
dependency belongs to which library component and then leveraged
to determine correct dependencies for all individual libraries.
Diego Biurrun [Tue, 24 Jan 2017 12:57:52 +0000 (13:57 +0100)]
configure: Simplify dlopen check
Michael Niedermayer [Wed, 15 Feb 2017 16:34:52 +0000 (11:34 -0500)]
h264_sei: Check actual presence of picture timing SEI message
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Diego Biurrun [Fri, 24 Feb 2017 13:00:24 +0000 (14:00 +0100)]
build: Explicitly disable external libraries when not explicitly enabled
Leaving those variables in an undefined state allows them getting implicitly
enabled when they are declared as weak dependencies of other components.
In that case, the library check is not run and required linker flags are not
added, resulting in a failing build.
Fixes linking when enabling libfreetype without libfontconfig.
Diego Biurrun [Thu, 18 Oct 2012 10:34:23 +0000 (12:34 +0200)]
fate: Rename WMV8_DRM decoder tests to WMV3_DRM
The codec used in those files is WMV3/WMV9, not WMV2/WMV8.
Luca Barbato [Mon, 20 Feb 2017 01:16:28 +0000 (02:16 +0100)]
rtsp: Lazily set up the pollfd array once
Ben Chang [Fri, 24 Feb 2017 22:39:21 +0000 (14:39 -0800)]
nvenc: Fix the preset mapping list
The map is a sparse array and does not need an empty element to terminate
it.
The empty element is stored after the last one inserted in the list,
overwriting whichever element was next with zeros.
Bug-Id: 1029
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
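The pitfall comes from how C places a positional initializer that follows designated ones: it lands at the index after the last designator, not at the end of the array. A sketch of a sparse map without a terminator follows; the enum, struct and lookup function are hypothetical and do not reflect the nvenc tables.

```c
#include <stddef.h>

enum sketch_preset {
    SKETCH_PRESET_DEFAULT,
    SKETCH_PRESET_SLOW,
    SKETCH_PRESET_FAST,     /* deliberately left unmapped below */
    SKETCH_PRESET_COUNT
};

typedef struct SketchPresetInfo {
    int valid;
    int two_passes;
} SketchPresetInfo;

static const SketchPresetInfo sketch_presets[SKETCH_PRESET_COUNT] = {
    [SKETCH_PRESET_SLOW]    = { 1, 1 },
    [SKETCH_PRESET_DEFAULT] = { 1, 0 },
    /* No trailing "{ 0 }" terminator: a positional element here would be
     * stored at SKETCH_PRESET_DEFAULT + 1, on top of SKETCH_PRESET_SLOW. */
};

/* Returns the entry for a preset, or NULL if it is out of range or has no
 * mapping. The array size bounds the lookup, so no terminator is needed. */
static const SketchPresetInfo *sketch_lookup_preset(int preset)
{
    if (preset < 0 || preset >= SKETCH_PRESET_COUNT ||
        !sketch_presets[preset].valid)
        return NULL;
    return &sketch_presets[preset];
}
```

Unmapped slots are zero-initialized by the language, so the `valid` field is enough to tell them apart from real entries.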
Diego Biurrun [Mon, 15 Oct 2012 13:38:29 +0000 (15:38 +0200)]
fate: Make null comparison method more useful
This allows dropping /dev/null as reference value when no output is generated.
Diego Biurrun [Wed, 22 Feb 2017 13:18:47 +0000 (14:18 +0100)]
build: Drop DOC_ prefix from EXAMPLES-related variables
Luca Barbato [Mon, 20 Feb 2017 01:11:58 +0000 (02:11 +0100)]
rtsp: Lazily allocate the pollfd array
And use av_malloc_array.
Luca Barbato [Sun, 19 Feb 2017 23:50:34 +0000 (00:50 +0100)]
rtsp: Move the pollfd setup out of the for loop
Luca Barbato [Sun, 19 Feb 2017 23:04:59 +0000 (00:04 +0100)]
rtsp: Factor out packet reading
Diego Biurrun [Thu, 18 Oct 2012 08:15:07 +0000 (10:15 +0200)]
Use modern avconv syntax for codec selection in documentation and tests
Diego Biurrun [Sat, 25 Feb 2017 16:19:48 +0000 (17:19 +0100)]
fate: Use bitexact optimizations in the svq3-2 test
This fixes the test with mmxext disabled because the current reference
frame hashes correspond to the non-bitexact mmxext optimizations.
Anton Khirnov [Tue, 14 Feb 2017 19:51:06 +0000 (20:51 +0100)]
lavc: make sure not to return EAGAIN from codecs
This error is treated specially by the API.
CC: libav-stable@libav.org
James Almer [Fri, 10 Feb 2017 23:24:27 +0000 (20:24 -0300)]
apetag: account for header size if present when returning the start position
The size field in the header/footer accounts for the entire APE tag
structure except the 32 bytes from header, for compatibility with
APEv1.
Signed-off-by: James Almer <jamrial@gmail.com>
CC: libav-stable@libav.org
Signed-off-by: Anton Khirnov <anton@khirnov.net>
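The arithmetic can be sketched as follows, assuming the caller knows the offset just past the footer and whether a header is present. The function and parameter names are hypothetical.

```c
#include <stdint.h>

/* Size of the APE tag header that the size field does not count. */
#define SKETCH_APE_HEADER_BYTES 32

/* tag_end:    file offset just past the tag footer
 * tag_size:   size field from the footer (everything except the header)
 * has_header: nonzero if the tag carries a header
 * Returns the file offset at which the whole tag structure starts. */
static int64_t sketch_ape_tag_start(int64_t tag_end, uint32_t tag_size,
                                    int has_header)
{
    int64_t start = tag_end - (int64_t)tag_size;

    if (has_header)
        start -= SKETCH_APE_HEADER_BYTES;
    return start;
}
```

For a tag ending at offset 1000 with a size field of 100, the start is 900 without a header and 868 with one.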
James Almer [Fri, 10 Feb 2017 23:24:26 +0000 (20:24 -0300)]
apetag: fix flag value to signal footer presence
According to the spec[1], a value of 0 means the footer is present and a value
of 1 means it's absent, the exact opposite of header presence flag where 1
means present and 0 absent.
The reason for this is compatibility with APEv1 tags, where there's no header,
footer presence was mandatory for all files, and the flags field was a zeroed
reserved field.
[1] http://wiki.hydrogenaud.io/index.php?title=Ape_Tags_Flags
Signed-off-by: James Almer <jamrial@gmail.com>
CC: libav-stable@libav.org
Signed-off-by: Anton Khirnov <anton@khirnov.net>
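The inverted sense of the two flags is easy to get wrong, so a small sketch may help. It assumes the bit positions given in the referenced spec (bit 31 for header presence, bit 30 for footer absence); the macro and function names are hypothetical.

```c
#include <stdint.h>

/* Bit 31 set: the tag contains a header. */
#define SKETCH_APE_FLAG_HAS_HEADER (UINT32_C(1) << 31)
/* Bit 30 set: the tag contains NO footer (note the inverted sense). */
#define SKETCH_APE_FLAG_NO_FOOTER  (UINT32_C(1) << 30)

static int sketch_ape_has_header(uint32_t flags)
{
    return (flags & SKETCH_APE_FLAG_HAS_HEADER) != 0;
}

static int sketch_ape_has_footer(uint32_t flags)
{
    return (flags & SKETCH_APE_FLAG_NO_FOOTER) == 0;
}
```

A zeroed flags field, as found in APEv1 tags, therefore reads as "footer present, no header".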
Anton Khirnov [Wed, 1 Feb 2017 10:50:38 +0000 (11:50 +0100)]
svq3: fix the slice size check
Currently it incorrectly compares bits with bytes.
Also, move the check right before where it's relevant, so that the
correct number of remaining bits is used.
CC: libav-stable@libav.org
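A unit-consistent version of such a check might look like the sketch below. The function name and parameters are hypothetical; the real check operates on the decoder's bit reader.

```c
#include <stddef.h>

/* Returns nonzero if a slice of slice_bytes bytes fits into the bits that
 * are still unread. Both sides are compared in bytes; dividing bits_left
 * avoids any overflow from multiplying slice_bytes by 8. */
static int sketch_slice_fits(size_t slice_bytes, size_t bits_left)
{
    return slice_bytes <= bits_left / 8;
}
```

With 16 bits left, a 2-byte slice fits but a 16-byte slice does not, although a naive comparison of the raw numbers (16 <= 16) would accept it.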
John Stebbins [Thu, 23 Feb 2017 23:47:58 +0000 (16:47 -0700)]
asfdec: fix reading files larger than 2GB
avio_skip returns the file position, which overflows int.
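The fix pattern is to keep the returned position in a 64-bit variable. In the sketch below, `sketch_skip()` is a hypothetical stand-in for a skip function that returns the new position as `int64_t`; it is not the avio API.

```c
#include <stdint.h>

/* Hypothetical stand-in: returns the new 64-bit position after skipping. */
static int64_t sketch_skip(int64_t pos, int64_t offset)
{
    return pos + offset;
}

/* Skips offset bytes and updates *pos. Returns 0 on success and -1 if the
 * skip function reports a negative (error) value. */
static int sketch_skip_checked(int64_t *pos, int64_t offset)
{
    /* Storing this in an int would truncate positions beyond 2 GB. */
    int64_t ret = sketch_skip(*pos, offset);

    if (ret < 0)
        return -1;
    *pos = ret;
    return 0;
}
```

Only the sign of the result is reduced to an `int` status; the position itself never passes through a 32-bit type.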
John Stebbins [Thu, 23 Feb 2017 21:22:56 +0000 (14:22 -0700)]
h264dec: fix dropped initial SEI recovery point
Diego Biurrun [Sat, 6 Apr 2013 10:48:32 +0000 (12:48 +0200)]
fate: Add another SVQ3 test to increase coverage
Martin Storsjö [Sat, 31 Dec 2016 20:27:13 +0000 (22:27 +0200)]
aarch64: vp9itxfm: Reorder iadst16 coeffs
This matches the order they are in the 16 bpp version.
There they are in this order, to make sure we access them in the
same order they are declared, easing loading only half of the
coefficients at a time.
This makes the 8 bpp version match the 16 bpp version better.
Signed-off-by: Martin Storsjö <martin@martin.st>
Martin Storsjö [Sat, 31 Dec 2016 20:27:13 +0000 (22:27 +0200)]
arm: vp9itxfm: Reorder iadst16 coeffs
This matches the order they are in the 16 bpp version.
There they are in this order, to make sure we access them in the
same order they are declared, easing loading only half of the
coefficients at a time.
This makes the 8 bpp version match the 16 bpp version better.
Signed-off-by: Martin Storsjö <martin@martin.st>
Martin Storsjö [Sat, 31 Dec 2016 12:18:31 +0000 (14:18 +0200)]
aarch64: vp9itxfm: Reorder the idct coefficients for better pairing
All elements are used pairwise, except for the first one.
Previously, the 16th element was unused. Move the unused element
to the second slot, to make the later element pairs not split
across registers.
This simplifies loading only parts of the coefficients,
reducing the difference to the 16 bpp version.
Signed-off-by: Martin Storsjö <martin@martin.st>
Martin Storsjö [Sat, 31 Dec 2016 12:05:44 +0000 (14:05 +0200)]
arm: vp9itxfm: Reorder the idct coefficients for better pairing
All elements are used pairwise, except for the first one.
Previously, the 16th element was unused. Move the unused element
to the second slot, to make the later element pairs not split
across registers.
This simplifies loading only parts of the coefficients,
reducing the difference to the 16 bpp version.
Signed-off-by: Martin Storsjö <martin@martin.st>
Martin Storsjö [Mon, 2 Jan 2017 20:08:41 +0000 (22:08 +0200)]
aarch64: vp9itxfm: Avoid reloading the idct32 coefficients
The idct32x32 function actually pushed d8-d15 onto the stack even
though it didn't clobber them; there are plenty of registers that
can be used to allow keeping all the idct coefficients in registers
without having to reload different subsets of them at different
stages in the transform.
After this, we still can skip pushing d12-d15.
Before:
vp9_inv_dct_dct_32x32_sub32_add_neon: 8128.3
After:
vp9_inv_dct_dct_32x32_sub32_add_neon: 8053.3
Signed-off-by: Martin Storsjö <martin@martin.st>
Martin Storsjö [Mon, 2 Jan 2017 20:50:38 +0000 (22:50 +0200)]
arm: vp9itxfm: Avoid reloading the idct32 coefficients
The idct32x32 function actually pushed q4-q7 onto the stack even
though it didn't clobber them; there are plenty of registers that
can be used to allow keeping all the idct coefficients in registers
without having to reload different subsets of them at different
stages in the transform.
Since the idct16 core transform avoids clobbering q4-q7 (but clobbers
q2-q3 instead, to avoid needing to back up and restore q4-q7 at all
in the idct16 function), and the lanewise vmul needs a register in
the q0-q3 range, we move the stored coefficients from q2-q3 into q4-q5
while doing idct16.
While keeping these coefficients in registers, we still can skip pushing
q7.
Before:                              Cortex A7       A8       A9      A53
vp9_inv_dct_dct_32x32_sub32_add_neon:  18553.8  17182.7  14303.3  12089.7
After:
vp9_inv_dct_dct_32x32_sub32_add_neon:  18470.3  16717.7  14173.6  11860.8
Signed-off-by: Martin Storsjö <martin@martin.st>
Martin Storsjö [Sat, 14 Jan 2017 11:22:30 +0000 (13:22 +0200)]
arm: vp9lpf: Implement the mix2_44 function with one single filter pass
For this case, with 8 inputs but only changing 4 of them, we can fit
all 16 input pixels into a q register, and still have enough temporary
registers for doing the loop filter.
The wd=8 filters would require too many temporary registers for
processing all 16 pixels at once though.
Before:                           Cortex A7       A8       A9      A53
vp9_loop_filter_mix2_v_44_16_neon:    289.7    256.2    237.5    181.2
After:
vp9_loop_filter_mix2_v_44_16_neon:    221.2    150.5    177.7    138.0
Signed-off-by: Martin Storsjö <martin@martin.st>
Martin Storsjö [Thu, 23 Feb 2017 21:33:58 +0000 (23:33 +0200)]
aarch64: vp9lpf: Use dup+rev16+uzp1 instead of dup+lsr+dup+trn1
This is one cycle faster in total, and three instructions fewer.
Before:
vp9_loop_filter_mix2_v_44_16_neon: 123.2
After:
vp9_loop_filter_mix2_v_44_16_neon: 122.2
Signed-off-by: Martin Storsjö <martin@martin.st>
Martin Storsjö [Sat, 14 Jan 2017 18:49:19 +0000 (20:49 +0200)]
arm/aarch64: vp9lpf: Keep the comparison to E within 8 bit
The theoretical maximum value of E is 193, so we can just
saturate the addition to 255.
Before:                        Cortex A7     A8     A9    A53  A53/AArch64
vp9_loop_filter_v_4_8_neon:        143.0  127.7  114.8   88.0         87.7
vp9_loop_filter_v_8_8_neon:        241.0  197.2  173.7  140.0        136.7
vp9_loop_filter_v_16_8_neon:       497.0  419.5  379.7  293.0        275.7
vp9_loop_filter_v_16_16_neon:      965.2  818.7  731.4  579.0        452.0
After:
vp9_loop_filter_v_4_8_neon:        136.0  125.7  112.6   84.0         83.0
vp9_loop_filter_v_8_8_neon:        234.0  195.5  171.5  136.0        133.7
vp9_loop_filter_v_16_8_neon:       490.0  417.5  377.7  289.0        271.0
vp9_loop_filter_v_16_16_neon:      951.2  814.7  732.3  571.0        446.7
Signed-off-by: Martin Storsjö <martin@martin.st>
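Why saturation is safe can be checked exhaustively in scalar C. The sketch assumes the usual VP9 form of the edge test, `abs(p0 - q0) * 2 + abs(p1 - q1) / 2 <= E`; all function names are hypothetical and the NEON code is not reproduced here.

```c
#include <stdint.h>
#include <stdlib.h>

/* 8-bit addition that clamps at 255 instead of wrapping. */
static uint8_t sketch_sat_add_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;
    return sum > 255 ? 255 : (uint8_t)sum;
}

/* Reference test computed at full int width. */
static int sketch_edge_ok_wide(uint8_t p1, uint8_t p0, uint8_t q0,
                               uint8_t q1, uint8_t E)
{
    return abs(p0 - q0) * 2 + abs(p1 - q1) / 2 <= E;
}

/* Same test using only 8-bit saturating additions. */
static int sketch_edge_ok_u8(uint8_t p1, uint8_t p0, uint8_t q0,
                             uint8_t q1, uint8_t E)
{
    uint8_t d0  = (uint8_t)abs(p0 - q0);
    uint8_t d1  = (uint8_t)(abs(p1 - q1) / 2);
    uint8_t sum = sketch_sat_add_u8(sketch_sat_add_u8(d0, d0), d1);

    return sum <= E;
}

/* Counts inputs for which the two variants disagree for a given E,
 * covering every possible pair of absolute differences. */
static int sketch_count_mismatches(uint8_t E)
{
    int mismatches = 0;

    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++)
            if (sketch_edge_ok_wide((uint8_t)b, (uint8_t)a, 0, 0, E) !=
                sketch_edge_ok_u8((uint8_t)b, (uint8_t)a, 0, 0, E))
                mismatches++;
    return mismatches;
}
```

A saturated sum of 255 is still greater than any E up to 193, so the comparison gives the same answer; the two variants only diverge if E could reach 255.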
Diego Biurrun [Wed, 22 Feb 2017 10:39:21 +0000 (11:39 +0100)]
Place attribute_deprecated in the right position for struct declarations
libavcodec/vaapi.h:58:1: warning: attribute 'deprecated' is ignored, place it after "struct" to apply attribute to type declaration [-Wignored-attributes]
Luca Barbato [Wed, 22 Feb 2017 08:55:45 +0000 (09:55 +0100)]
mkv: Update the seek test to match 5d3953a5dc
John Stebbins [Tue, 21 Feb 2017 23:47:20 +0000 (16:47 -0700)]
fate: Update fate-lavf-mkv after commit 5d3953a5dc