Commit Graph

2790 Commits

Author SHA1 Message Date
Fernando Sahmkow
da91e6e4b6 Apply Const correctness to SwizzleKepler and replace u32 for size_t on iterators. 2019-04-16 12:00:46 -04:00
Fernando Sahmkow
13d626fc21 Use ReadBlockUnsafe for fetyching DMA CommandLists 2019-04-16 11:22:34 -04:00
Fernando Sahmkow
06d1c5a991 Document unsafe versions and add BlockCopyUnsafe 2019-04-16 10:11:35 -04:00
Fernando Sahmkow
6fc562a9aa Use ReadBlockUnsafe for Shader Cache 2019-04-15 23:34:03 -04:00
Fernando Sahmkow
ef381e6924 Use ReadBlockUnsafe on TIC and TSC reading
Use ReadBlockUnsafe on TIC and TSC reading as memory is never flushed
from host GPU there.
2019-04-15 23:10:24 -04:00
Fernando Sahmkow
367704aa82 GPU MemoryManager: Implement ReadBlockUnsafe and WriteBlockUnsafe 2019-04-15 23:01:35 -04:00
Fernando Sahmkow
3e96c367bd Use WriteBlock and ReadBlock. 2019-04-15 22:42:34 -04:00
Fernando Sahmkow
bec28d692d Implement Block Linear copies in Kepler Memory. 2019-04-15 21:22:16 -04:00
ReinUsesLisp
ef8245bed2 vk_shader_decompiler: Add missing operations 2019-04-15 21:32:57 -03:00
ReinUsesLisp
f43995ec53 shader_ir/decode: Fix half float pre-operations and remove MetaHalfArithmetic
Operations done before the main half float operation (like HAdd) were
managing a packed value instead of the unpacked one. Adding an unpacked
operation allows us to drop the per-operand MetaHalfArithmetic entry,
simplifying the code overall.
2019-04-15 21:16:10 -03:00
ReinUsesLisp
abcbcb1b2a gl_shader_decompiler: Fix MrgH0 decompilation
GLSL decompilation for HMergeH0 was wrong. This addresses that issue.
2019-04-15 21:16:10 -03:00
ReinUsesLisp
64613db605 shader_ir/decode: Implement half float saturation 2019-04-15 21:16:10 -03:00
ReinUsesLisp
90cbf89303 shader_ir/decode: Reduce severity of unimplemented half-float FTZ 2019-04-15 21:16:09 -03:00
ReinUsesLisp
acf618afbc renderer_opengl: Implement half float NaN comparisons 2019-04-15 21:13:26 -03:00
ReinUsesLisp
ae46ad48ed shader_ir: Avoid using static on heap-allocated objects
Using static here might be faster at runtime, but it adds a heap
allocation called before main.
2019-04-15 21:12:43 -03:00
Fernando Sahmkow
aa471274d9 Do some corrections in conversion shader instructions.
Corrects encodings for I2F, F2F, I2I and F2I
Implements Immediate variants of all four conversion types.
Add assertions to unimplemented stuffs.
2019-04-15 19:16:27 -04:00
Fernando Sahmkow
8a099ac99f Correct Kepler Memory on Linear Pushes. 2019-04-15 14:51:36 -04:00
Fernando Sahmkow
773d955dfa Support compressed formats on linear textures. 2019-04-15 13:56:09 -04:00
Fernando Sahmkow
bf561e4340 Correct Pitch in Fermi2D 2019-04-15 12:24:29 -04:00
ReinUsesLisp
f15c59a164 gl_shader_decompiler: Use variable AOFFI on supported hardware 2019-04-14 05:13:19 -03:00
ReinUsesLisp
5c280e6ff0 shader_ir: Implement STG, keep track of global memory usage and flush 2019-04-14 00:25:32 -03:00
bunnei
c9454c8422
Merge pull request #2373 from FernandoS27/z32
Set Pixel Format to Z32 if its R32F and depth compare enabled, and Implement format ZF32_X24S8
2019-04-13 22:14:51 -04:00
bunnei
ee2206a1b7
Merge pull request #2386 from ReinUsesLisp/shader-manager
gl_shader_manager: Move code to source file and minor clean up
2019-04-13 22:09:27 -04:00
Lioncash
6d0551196d video_core/gpu: Create threads separately from initialization
Like with CPU emulation, we generally don't want to fire off the threads
immediately after the relevant classes are initialized, we want to do
this after all necessary data is done loading first.

This splits the thread creation into its own interface member function
to allow controlling when these threads in particular get created.
2019-04-11 22:11:40 -04:00
bunnei
ea80e2bc57
Merge pull request #2235 from ReinUsesLisp/spirv-decompiler
vk_shader_decompiler: Implement a SPIR-V decompiler
2019-04-11 21:54:23 -04:00
Fernando Sahmkow
c9305959d3 gl_rasterizer_cache: Relax restrictions on FastCopySurface and FastLayeredCopySurface 2019-04-11 13:14:28 -04:00
bunnei
6951741a94
Merge pull request #2278 from ReinUsesLisp/vc-texture-cache
video_core: Implement API agnostic view based texture cache
2019-04-10 21:17:35 -04:00
bunnei
0371650bd7
Merge pull request #2372 from FernandoS27/fermi-fix
Correct Fermi Copy on Linear Textures.
2019-04-10 21:17:03 -04:00
ReinUsesLisp
93af663683 gl_shader_manager: Move code to source file and minor clean up 2019-04-10 19:29:15 -03:00
ReinUsesLisp
6df25e9c7b gl_rasterizer: Apply just the needed state on Clear 2019-04-10 18:13:15 -03:00
ReinUsesLisp
0032821864 gl_device: Implement interface and add uniform offset alignment 2019-04-10 15:56:12 -03:00
ReinUsesLisp
75d23a3679 vk_shader_decompiler: Implement flow primitives 2019-04-10 14:20:25 -03:00
ReinUsesLisp
58ad8dfac6 vk_shader_decompiler: Implement most common texture primitives 2019-04-10 14:20:25 -03:00
ReinUsesLisp
4667ed8e22 vk_shader_decompiler: Implement texture decompilation helper functions 2019-04-10 14:20:25 -03:00
ReinUsesLisp
676172e20d vk_shader_decompiler: Implement Assign and LogicalAssign 2019-04-10 14:20:25 -03:00
ReinUsesLisp
d316d248ab vk_shader_decompiler: Implement non-OperationCode visits 2019-04-10 14:20:25 -03:00
ReinUsesLisp
b758c861b0 vk_shader_decompiler: Implement OperationCode decompilation interface 2019-04-10 14:20:25 -03:00
ReinUsesLisp
fec4eb9776 vk_shader_decompiler: Implement Visit 2019-04-10 14:20:25 -03:00
ReinUsesLisp
ca51f99840 vk_shader_decompiler: Implement labels tree and flow 2019-04-10 14:20:25 -03:00
ReinUsesLisp
13aa664f3f vk_shader_decompiler: Implement declarations 2019-04-10 14:20:25 -03:00
ReinUsesLisp
ad53b233c5 vk_shader_decompiler: Declare and stub interface for a SPIR-V decompiler 2019-04-10 14:20:25 -03:00
ReinUsesLisp
970d9e57c8 video_core: Add sirit as optional dependency with Vulkan
sirit is a runtime assembler for SPIR-V
2019-04-10 14:20:25 -03:00
bunnei
97648f4841
Merge pull request #2345 from ReinUsesLisp/multibind
gl_rasterizer: Use ARB_multi_bind to update buffers with a single call per drawcall
2019-04-10 11:23:19 -04:00
bunnei
ed9dba89d3
Merge pull request #2375 from FernandoS27/fix-ldc
Remove unnecessary bounding in LD_C
2019-04-09 21:23:24 -04:00
Fernando Sahmkow
c9f35d96be Remove bounding in LD_C 2019-04-09 20:02:11 -04:00
bunnei
2598433f9c
Merge pull request #2354 from lioncash/header
video_core/texures/texture: Remove unnecessary includes
2019-04-09 19:19:41 -04:00
bunnei
353a099481
Merge pull request #2366 from FernandoS27/xmad-fix
Correct XMAD mode, psl and high_b on different encodings.
2019-04-09 19:15:01 -04:00
bunnei
bc7e149835
Merge pull request #2369 from FernandoS27/mip-align
gl_backend: Align Pixel Storage
2019-04-09 17:20:43 -04:00
Fernando Sahmkow
cd91e98dab Correct Fermi Copy on Linear Textures. 2019-04-09 14:13:58 -04:00
Fernando Sahmkow
7c458311d3 Implement Texture Format ZF32_X24S8. 2019-04-09 12:33:46 -04:00
Fernando Sahmkow
b0aa8ad736 Correct depth compare with color formats for R32F 2019-04-09 12:06:59 -04:00
Fernando Sahmkow
9f16833097 gl_backend: Align Pixel Storage
This commit makes sure GL reads on the correct pack size for the
respective texture buffer.
2019-04-08 17:16:02 -04:00
Fernando Sahmkow
5c55ae4e18 Correct LOP_IMN encoding 2019-04-08 13:39:12 -04:00
Fernando Sahmkow
16adc735a5 Correct XMAD mode, psl and high_b on different encodings. 2019-04-08 13:01:17 -04:00
Fernando Sahmkow
ef8be408d3 Adapt Bindless to work with AOFFI 2019-04-08 12:07:56 -04:00
Fernando Sahmkow
492040bd9c Move ConstBufferAccessor to Maxwell3d, correct mistakes and clang format. 2019-04-08 11:36:11 -04:00
Fernando Sahmkow
797e351bf8 Fix bad rebase 2019-04-08 11:35:22 -04:00
Fernando Sahmkow
c60b0b8432 Fix TMML 2019-04-08 11:35:22 -04:00
Fernando Sahmkow
a77e9a27b0 Simplify ConstBufferAccessor 2019-04-08 11:35:19 -04:00
Fernando Sahmkow
fd4e994de3 Refactor GetTextureCode and GetTexCode to use an optional instead of optional parameters 2019-04-08 11:35:18 -04:00
Fernando Sahmkow
4841440382 Implement TXQ_B 2019-04-08 11:29:52 -04:00
Fernando Sahmkow
189bd1980c Implement TMML_B 2019-04-08 11:29:49 -04:00
Fernando Sahmkow
ac3ba9a33e Corrections to TEX_B 2019-04-08 11:28:44 -04:00
Fernando Sahmkow
90d06acfed Fixes to Const Buffer Accessor and Formatting 2019-04-08 11:23:47 -04:00
Fernando Sahmkow
7af82ca022 Implement Bindless Handling on SetupTexture 2019-04-08 11:23:46 -04:00
Fernando Sahmkow
fe392fff24 Unify both sampler types. 2019-04-08 11:23:45 -04:00
Fernando Sahmkow
e28fd3d0a5 Implement Bindless Samplers and TEX_B in the IR. 2019-04-08 11:23:42 -04:00
Fernando Sahmkow
c4ac05c82c Implement Const Buffer Accessor 2019-04-08 11:19:34 -04:00
bunnei
f14328bf0a
Merge pull request #2300 from FernandoS27/null-shader
shader_cache: Permit a Null Shader in case of a bad host_ptr.
2019-04-07 17:58:27 -04:00
bunnei
c2fee0e519
Merge pull request #2355 from ReinUsesLisp/sync-point
maxwell_3d: Reduce severity of ProcessSyncPoint
2019-04-07 17:56:11 -04:00
bunnei
8aaf418bd6
Merge pull request #2306 from ReinUsesLisp/aoffi
shader_ir: Implement AOFFI for TEX and TLD4
2019-04-07 17:52:30 -04:00
bunnei
6b18a1592f
Merge pull request #2321 from ReinUsesLisp/gl-state-rework
gl_state: Rework to enable individual applies
2019-04-07 17:50:07 -04:00
bunnei
21a4e7deea
Merge pull request #2098 from FreddyFunk/disk-cache-zstd
gl_shader_disk_cache: Use Zstandard for compression
2019-04-07 17:48:33 -04:00
bunnei
80162888e6
Merge pull request #2352 from bunnei/mem-manager-fixes
memory_manager: Improved implementation of read/write/copy block.
2019-04-07 17:44:59 -04:00
Fernando Sahmkow
021cd56bc9 Permit a Null Shader in case of a bad host_ptr. 2019-04-07 07:52:01 -04:00
ReinUsesLisp
ddcb711ee8 maxwell_3d: Reduce severity of ProcessSyncPoint 2019-04-06 02:18:20 -03:00
Lioncash
89c106e31b video_core/textures/convert: Replace include with a forward declaration
Avoids dragging in a direct dependency in a header.
2019-04-06 00:14:36 -04:00
Lioncash
fbf452ab0e video_core/texures/texture: Remove unnecessary includes
Nothing in this header relies on common_funcs or the memory manager.

This gets rid of reliance on indirect inclusions in the OpenGL caches.
2019-04-06 00:03:35 -04:00
bunnei
864280fabc
Merge pull request #2317 from FernandoS27/sync
Implement SyncPoint Register in the GPU.
2019-04-05 23:50:54 -04:00
bunnei
e3402d976d
Merge pull request #2346 from lioncash/header
video_core/engines: Remove unnecessary inclusions where applicable
2019-04-05 23:44:27 -04:00
bunnei
20be92d5e6 memory_manager: Improved implementation of read/write/copy block.
- Fixes graphical issues with Chocobo's Mystery Dungeon EVERY BUDDY!
- Fixes a crash with Mario Tennis Aces
2019-04-05 23:43:34 -04:00
bunnei
89b8801a97
Merge pull request #2350 from lioncash/vmem
video_core/memory_manager: Mark a few member functions with the const qualifier
2019-04-05 23:40:54 -04:00
bunnei
41890a84be
Merge pull request #2347 from lioncash/trunc
video_core/gpu_thread: Silence truncation warning in ThreadManager's constructor
2019-04-05 23:39:31 -04:00
bunnei
520e4e5d4b
Merge pull request #2327 from ReinUsesLisp/crash-safe-visit
gl_shader_decompiler: Return early when an operation is invalid
2019-04-05 23:36:18 -04:00
bunnei
cb2209d06a
Merge pull request #2337 from lioncash/temporary
gl_shader_decompiler: Rename GenerateTemporal() to GenerateTemporary()
2019-04-05 23:35:31 -04:00
Lioncash
00e7190e29 video_core/macro_interpreter: Remove assertion within FetchParameter()
We can just use .at(), which essentially does the same thing, but with
less code.
2019-04-05 22:56:58 -04:00
Lioncash
1efdb4897e video_core/macro_interpreter: Simplify GetRegister()
Given we already ensure nothing can set the zeroth register in
SetRegister(), we don't need to check if the index is zero and special
case it. We can just access the register normally, since it's already
going to be zero.

We can also replace the assertion with .at() to perform the equivalent
behavior inline as part of the API.
2019-04-05 22:55:13 -04:00
Lioncash
c13fbe6a41 video_core/memory_manager: Make Read() a const qualified member function
Given this doesn't actually alter internal state, this can be made a
const member function.
2019-04-05 20:30:48 -04:00
Lioncash
76ef6e5c2b video_core/memory_manager: Make ReadBlock() a const qualifier member function
Now, since we have a const qualified variant of GetPointer(), we can put
it to use in ReadBlock() to retrieve the source pointer that is passed
into memcpy.

Now block reading may be done from a const context.
2019-04-05 20:28:44 -04:00
Lioncash
34510bcda8 video_core/memory_manager: Add a const qualified variant of GetPointer()
Allows retrieving read-only pointers from a const context externally.
2019-04-05 20:25:28 -04:00
Lioncash
085b388a7a video_core/memory_manager: Make FindFreeRegion() a const member function
This doesn't modify internal state, so it can be made a const member
function.
2019-04-05 20:22:55 -04:00
Lioncash
9dec087fca video_core/memory_manager: Make GpuToCpuAddress() a const member function
This doesn't modify any internal state, so it can be made a const member
function to allow its use in const contexts.
2019-04-05 20:18:29 -04:00
Fernando Sahmkow
fc91e21206 Implement SyncPoint Register in the GPU. 2019-04-05 19:19:30 -04:00
Lioncash
30ce9b2b5c video_core/gpu_thread: Silence truncation warning in ThreadManager's constructor
Since c5d41fd812 callback parameters were
changed to use an s64 to represent late cycles instead of an int, so
this was causing a truncation warning to occur here. Changing it to s64
is sufficient to silence the warning.
2019-04-05 18:37:37 -04:00
Lioncash
22f02076c6 video_core/engines: Make memory manager members private
These aren't used externally by anything, so they can be made private
data members.
2019-04-05 18:26:43 -04:00
Lioncash
26223f8124 video_core/engines: Remove unnecessary inclusions where applicable
Replaces header inclusions with forward declarations where applicable
and also removes unused headers within the cpp file. This reduces a few
more dependencies on core/memory.h
2019-04-05 18:26:32 -04:00
ReinUsesLisp
34c3e2c786 renderer_opengl/utils: Skip empty binds 2019-04-05 19:19:49 -03:00
ReinUsesLisp
b631c09e72 gl_rasterizer: Use ARB_multi_bind to update SSBOs 2019-04-05 19:18:43 -03:00
ReinUsesLisp
2d1f054c61 gl_rasterizer: Use ARB_multi_bind to update UBOs across stages 2019-04-05 19:10:46 -03:00
bunnei
66be5150d6
Merge pull request #2282 from bunnei/gpu-asynch-v2
gpu_thread: Improve synchronization by using CoreTiming.
2019-04-04 22:38:04 -04:00