This is a hack to destroy all HostCounter instances before the base
class destructor is called. The query cache should be redesigned to have
a proper ownership model instead of using shared pointers.
For now, destroy the host counter hierarchy from the derived class
destructor.
This reworks how host<->device synchronization works on the Vulkan
backend. Instead of "protecting" resources with a fence and signalling
these as free when the fence is known to be signalled by the host GPU,
use timeline semaphores.
Vulkan timeline semaphores allow use to work on a subset of D3D12
fences. As far as we are concerned, timeline semaphores are a value set
by the host or the device that can be waited by either of them.
Taking advantange of this, we can have a monolithically increasing
atomic value for each submission to the graphics queue. Instead of
protecting resources with a fence, we simply store the current logical
tick (the atomic value stored in CPU memory). When we want to know if a
resource is free, it can be compared to the current GPU tick.
This greatly simplifies resource management code and the free status of
resources should have less false negatives.
To workaround bugs in validation layers, when these are attached there's
a thread waiting for timeline semaphores.
Now that the GPU is initialized when video backends are initialized,
it's no longer needed to query components once the game is running: it
can be done when yuzu is booting.
This allows us to pass components between constructors and in the
process remove all Core::System references in the video backend.
'driver_id' can only be known on Vulkan 1.1 after creating a logical
device. Move the driver id check to disable
VK_EXT_extended_dynamic_state after the logical device is successfully
initialized.
The Vulkan device will have the extension enabled but it will not be
used.
I made a request on the Xbyak issue tracker to allow some constructors
to be constexpr in order to avoid static constructors from needing to
execute for some of our register constants.
This request was implemented, so this updates Xbyak so that we can make
use of it.
Vertex binding's <stride> is bugged on AMD's proprietary drivers when
using VK_EXT_extended_dynamic_state. Blacklist it for now while we
investigate how to report this issue to AMD.
Add the necessary CMake code to copy the contents in a string source
shader (GLSL or GLASM) to a header file then consumed by video_core
files.
This allows editting GLSL in its own files without having to maintain
them in source files.
For now, only OpenGL presentation shaders are moved, but we can add
GLASM presentation shaders and static SPIR-V generation through
glslangValidator in the future.
State track the current primitive topology with a regular comparison
instead of using dirty flags.
This fixes a bug in dirty flags for this particular state and it also
avoids unnecessary state changes as this property is stored in a
frequently changed bit field.
Migrates a remaining common file over to the Common namespace, making it
consistent with the rest of common files.
This also allows for high-traffic FS related code to alias the
filesystem function namespace as
namespace FS = Common::FS;
for more concise typing.
This was assigning the field to itself, which is a no-op. The size
doesn't change between its initial assignment and this one, so this is a
safe change to make.
Does not allocate more threads than available in the host system for boot-time shader compilation and always allocates at least 1 thread if hardware_concurrency() returns 0.
There were two issues with block linear copies. First the swizzling was
wrong and this commit reimplements them.
The other issue was that these copies are generally used to download
render targets from the GPU and yuzu was not downloading them from
host GPU memory unless the extreme GPU accuracy setting was selected.
This commit enables cached memory reads for all accuracy levels.
- Fixes level thumbnails in Super Mario Maker 2.
The puller register array is made up of u32s however the `NUM_REGS` value is the size in bytes, so switch it to avoid making the struct unnecessary large. Also fix a small typo in a comment.
We can make use of emplace()'s return value to determine whether or not
we need to perform an increment.
emplace() performs no insertion if an element already exist, so this can
eliminate a find() call.
NV_shader_buffer_{load,store} is a 2010 extension that allows GL applications
to use what in Vulkan is known as physical pointers, this is basically C
pointers. On GLASM these is exposed through the LOAD/STORE/ATOM
instructions.
Up until now, assembly shaders were using NV_shader_storage_buffer_object.
These work fine, but have a (probably unintended) limitation that forces
us to have the limit of a single stage for all shader stages. In contrast,
with NV_shader_buffer_{load,store} we can pass GPU addresses to the
shader through local parameters (GLASM equivalent uniform constants, or
push constants on Vulkan). Local parameters have the advantage of being
per stage, allowing us to generate code without worrying about binding
overlaps.