Showing posts with label d3d. Show all posts

Monday, October 18, 2021

vkd3d-proton version 2.5 has been released

VKD3D-Proton is a fork of VKD3D, which aims to implement the full Direct3D 12 API on top of Vulkan. The project serves as the development effort for Direct3D 12 support in Proton.


This is a release with a little bit of everything!


DXR progress

DXR has seen significant work in the background.

  • DXR 1.1 is now experimentally exposed. It can be enabled with VKD3D_CONFIG=dxr11.
    Note that DXR 1.1 cannot be fully implemented in VK_KHR_ray_tracing's current form, in particular
    DispatchRays() indirect is not compatible yet,
    although we have not observed a game which requires this API feature.
  • DXR 1.1 inline raytracing support is fully implemented.
  • DXR 1.0 support is more or less feature complete.
    Some weird edge cases remain, but will likely not be implemented unless required by a game.
    VKD3D_CONFIG=dxr will eventually be dropped when it matures.
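
As a small illustration of how the VKD3D_CONFIG switches above might be set from a launcher script, here is a minimal Python sketch. The helper name with_vkd3d_config is hypothetical, and the sketch assumes that several options can be combined in one comma-separated value; check the vkd3d-proton README for the exact syntax your version accepts.

```python
import os


def with_vkd3d_config(options, base_env=None):
    """Return a copy of the environment with the given VKD3D_CONFIG options added.

    Hypothetical helper: assumes VKD3D_CONFIG takes a comma-separated list.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Keep options that were already set, then append new ones without duplicates.
    merged = [opt for opt in env.get("VKD3D_CONFIG", "").split(",") if opt]
    for opt in options:
        if opt not in merged:
            merged.append(opt)
    env["VKD3D_CONFIG"] = ",".join(merged)
    return env


env = with_vkd3d_config(["dxr11"], base_env={})
print(env["VKD3D_CONFIG"])  # prints: dxr11
```

The resulting dictionary can be passed as the env argument of subprocess.run when starting a game through your usual Wine or Proton command.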

Some new DXR games are starting to come alive, especially with DXR 1.1 enabled,
but there are significant bugs as well that we currently cannot easily debug.
Some experimental results on NVIDIA:

  • Control - already worked
  • DEATHLOOP - appears to work correctly
  • Cyberpunk 2077 - DXR can be enabled, but GPU timeouts
  • World of Warcraft - according to a user, it works, but we have not confirmed ourselves
  • Metro Exodus: Enhanced Edition -
    gets ingame and appears to work? Not sure if it looks correct.
    Heavy CPU stutter for some reason ...
  • Metro Exodus (original release) - GPU timeouts when enabling DXR
  • Resident Evil: Village - Appears to work, but the visual difference is subtle.

It's worth experimenting with these and others.
DXR is incredibly complicated, so expect bugs.
From here, DXR support is mostly a case of stamping out issues one by one.


NVIDIA contributed integration APIs to vkd3d-proton which enable DLSS support in D3D12 titles in Proton.
See Proton documentation for how to enable NvAPI support.

Shader models

A fair bit of work went into DXIL translation support to catch up with native drivers.

  • Shader model 6.5 is exposed.
    Shader model 6.6 should be straightforward once that becomes relevant.
  • Shader model 6.4 implementation takes advantage of VK_KHR_shader_integer_dot_product when supported.
  • Proper fallback for FP16 math on GPUs which do not expose native FP16 support (Polaris, Pascal).
    Notably fixes AMD FSR shaders in Resident Evil: Village (and others).
  • Shader model 6.1 SV_Barycentric support implemented (NVIDIA only for now).
  • Support shader model 6.2 FP32 denorm control.


Resizable BAR can improve GPU performance by about 10-15% in the best case, though the gain depends a lot on the game.
Horizon Zero Dawn and Death Stranding in particular improve massively with this change.

By default, vkd3d-proton will now take advantage of PCI-e BAR memory types through heuristics
as D3D12 does not expose direct support for resizable BAR, and native D3D12 drivers are known to use heuristics as well.
Without resizable BAR enabled in BIOS/vBIOS, we only get 256 MiB. That alone can help performance,
but many games improve even more
when we are allowed to use more than that.
There is an upper limit for how much VRAM is dedicated to this purpose.
We also added VKD3D_CONFIG=no_upload_hvv to disable all uses of PCI-e BAR memory.
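
The release notes do not spell out the actual heuristics, so the following Python sketch is only a rough illustration of the idea, not vkd3d-proton's implementation. It classifies a device by the size of its host-visible VRAM window (using the 256 MiB figure from the text) and caps how much of that memory uploads may use. The names classify_bar and UploadHeapBudget, and the fallback label "sysmem", are hypothetical.

```python
MIB = 1024 * 1024
FIXED_BAR_BYTES = 256 * MIB  # window size without resizable BAR, per the text above


def classify_bar(host_visible_vram_bytes):
    """Guess the BAR mode from the size of the host-visible VRAM heap."""
    if host_visible_vram_bytes <= 0:
        return "none"
    if host_visible_vram_bytes <= FIXED_BAR_BYTES:
        return "fixed"
    return "resizable"


class UploadHeapBudget:
    """Hypothetical cap on how much BAR memory upload allocations may consume."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0

    def place(self, size_bytes):
        """Return "bar" if the allocation fits the budget, otherwise "sysmem"."""
        if self.used_bytes + size_bytes <= self.limit_bytes:
            self.used_bytes += size_bytes
            return "bar"
        return "sysmem"


budget = UploadHeapBudget(limit_bytes=FIXED_BAR_BYTES)
print(classify_bar(8192 * MIB), budget.place(200 * MIB), budget.place(100 * MIB))
```

In this sketch, an allocation that would exceed the cap simply falls back to system memory, which is one plausible way to honor an upper limit; the real policy may differ.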

Other performance improvements:

  • Avoid redundant descriptor update work in certain scenarios (NVIDIA contribution).
  • Minor tweaks here and there to reduce CPU overhead.

Fixes and workarounds

  • Fix behavior for swap chain presentation latency HANDLE. Fixes spurious deadlocks in some cases.
  • Fix many issues related to depth-stencil handling, which fixed various issues in DEATHLOOP, F1 2021, WRC 10.
  • Fix DIRT 5 rendering issues and crashes. Should be fully playable now.
  • Fix some Diablo II Resurrected rendering issues.
  • Workaround shader bugs in Psychonauts 2.
  • Workaround some Unreal Engine 4 shader bugs which multiple titles trigger.
  • Fix some stability issues when VRAM is exhausted on NVIDIA.
  • Fix CPU crash in boot-up sequence of Far Cry 6 (game is still kinda buggy though, but gets in-game).
  • Fix various bugs with host visible images. Fixes DEATHLOOP.
  • Fix various DXIL conversion bugs.
  • Add invariant geometry workarounds for specific games which require it.
  • Fix how d3d12.dll exports symbols to be more in line with MSVC.
  • Fix some edge cases in bitfield instructions.
  • Work around extreme CPU memory bloat on the specific NVIDIA driver versions which had this bug.
  • Fix regression in Evil Genius 2: World Domination.
  • Fix crashes in Hitman 3.
  • Fix terrain rendering in Anno 1800.
  • Various correctness and crash fixes.

 Link to source code

Run Microsoft Windows Applications and Games on Mac, Linux or ChromeOS save up to 20% off  CodeWeavers CrossOver+ today.


Wednesday, May 23, 2018

Vkd3d 1.0 Released: the Direct3D 12 to Vulkan translation library

The Wine team is proud to announce that release 1.0 of vkd3d, the Direct3D 12 to Vulkan translation library, is now available.

This is the first release of vkd3d. A lot of Direct3D 12 features are still missing and bugs are expected. The current version was tested mainly with demo applications. A number of features that are being worked on have been deferred to the next development cycle. This includes in particular geometry and tessellation shaders support, various shader translation improvements, as well as various improvements for core Direct3D 12 methods.

The source is available here.


Friday, September 6, 2013

Direct3D Performance Improvements Coming To Wine

Stefan Dösinger of CodeWeavers has been working on Direct3D performance improvements for Wine by creating a separate command stream / worker thread for WineD3D. This work moves OpenGL calls into a separate thread in order to improve performance while also fixing some outstanding bugs. It can yield 50-100% performance improvements, and in some cases makes games run faster under Wine than on Windows.
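
To make the command stream idea concrete, here is a minimal Python sketch of the general pattern: the application thread only records commands, and a single worker thread executes them in order. This is an illustration under simplifying assumptions, not WineD3D's code; the class CommandStream and its methods are hypothetical names, and real commands would be OpenGL calls rather than list appends.

```python
import queue
import threading


class CommandStream:
    """Run submitted commands in order on one dedicated worker thread."""

    def __init__(self):
        self._commands = queue.Queue()
        self._worker = threading.Thread(
            target=self._run, name="command-stream", daemon=True
        )
        self._worker.start()

    def _run(self):
        while True:
            command = self._commands.get()
            try:
                if command is None:  # sentinel: stop the worker
                    return
                command()
            finally:
                self._commands.task_done()

    def submit(self, command):
        """Queue a command and return immediately to the caller."""
        self._commands.put(command)

    def flush(self):
        """Block until every queued command has been executed."""
        self._commands.join()

    def close(self):
        self._commands.put(None)
        self._worker.join()


executed = []
cs = CommandStream()
for i in range(3):
    # Default argument binds the current value of i for each command.
    cs.submit(lambda i=i: executed.append((i, threading.current_thread().name)))
cs.flush()
cs.close()
print(executed)
```

The flush method corresponds to the kind of wait the email below describes for resource destruction, where the caller blocks until the worker has drained all pending commands.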

If you want to help support this work, consider purchasing a copy of CrossOver 12.5 from CodeWeavers. You can use promo code TOM23 and receive an instant 20% discount off the normal selling price.

Stefan's email to the Wine development mailing list:


In the past months I have been working on a command stream / worker
thread for wined3d. It moves most OpenGL calls into a separate thread
to improve performance (bug 11674) and fix some bugs that are
otherwise hard to fix (24684).

You can test the attached patches by applying them (git am
/path/to/patches/*) and setting HKCU/Software/Wine/Direct3D/CSMT =
"enabled". Make sure to disable StrictDrawOrdering. It is no longer
required with those patches and will destroy any performance gains.
(It might be useful for debugging though). The patches apply on top of
Wine 1.7.1.

Please test those patches with your games. I'm interested in any
successes or failures and performance differences. Performance numbers
with plain Wine 1.7.1, this patchset with CSMT off and on, and Wine
1.7.7 + bugzilla attachment 44420 and __GL_THREADED_OPTIMIZATIONS
would be greatly appreciated.

A few notes for non-developers:
*) GPU limited games don't see any improvement. Whether you're GPU limited
heavily depends on your hardware.

*) So far I have not tested anything but Nvidia hardware. It should
work on all GPUs and drivers though.

*) Yes, this is essentially the same as Nvidia's
__GL_THREADED_OPTIMIZATIONS. Just driver independent, under our
control, and thus easier to fix bugs.

*) A lot of games see 50%-100% performance improvements and now run as
fast as on Windows or even faster. Examples are Source-Engine based
games, StarCraft 2, 3DMark 2001.

*) Call of Duty Modern Warfare 2 is improved a lot because you no
longer need StrictDrawOrdering. It's still not as good as it could be,
because it uses dynamic surfaces, which aren't properly implemented in
the patchset yet.

*) Some games have CPU-side bottlenecks outside d3d. Mass Effect 2
seems to be one of those.

*) Some games have CPU-side bottlenecks in the GL driver, and
comparatively little game logic on their own. I think this applies to Civ
V, which doesn't see much improvement with those patches.

Some implementation notes:
*) One of the big design decisions is to do all OpenGL calls from one
thread, including resource creation and buffer maps. This is faster
than using glFlush calls to synchronize anything we do from the main
thread, and easier than trying to sync everything in a performant
fashion with ARB_sync. This means I need the priority command queue.
This is not yet fully implemented though, so you see GL calls from the
main thread as well.

*) There seem to be driver bugs when calling into GL from two threads,
even though those are two different contexts. Remember, we don't have
the GL lock any longer.

*) The other controversial design decision is that the command stream
does not hold any references to objects stored in pending commands or
its own state structure. This prevents the client libraries and
applications from "seeing" the CS via delayed destruction of objects
and freeing of application private data.

*) Currently resource destruction waits for the CS to execute all
pending commands. The goal is to handle private resources and removal
from the device's resource list in the main thread and freeing of GL
resources, freeing of resource->heap_memory and freeing of the main
structure in the worker thread.

*) A big issue that needs fixing is that there isn't a clear
separation between functions that are called from the main thread and
functions that are called from the worker thread. The plan is to
introduce comments similar to those that clarify who is responsible
for context activation.

*) Buffers are double-buffered and use glBufferSubData when the
multithreaded CS is in use. This is necessary because we can't draw
from a mapped buffer. In the long run GL_ARB_buffer_storage should be
able to fix this.

*) You can roughly see how surface and volume handling is going to
work in the volume code. I am not entirely happy with the code yet, I
hacked it together in the past few days...

*) The plan behind wined3d_device_get_bo and wined3d_device_release_bo
is to cache created GL BOs. Before I do that I have to write a
benchmark for dynamic volumes to verify that this is really a
performance improvement.

*) Before this can be merged, surfaces need a cleanup similar to
volumes. It's going to be a lot trickier though.

*) The tests should run with the single-threaded and multi-threaded
command stream.

*) There should not be any temporary regressions with the
single-threaded CS. If something's broken, git bisect should work with
CSMT off.

*) With CSMT on, there are a few known regressions and test failures.
The d3d9 and ddraw tests fail between patch 18 and 71. Occlusion
queries are broken between 22 and 108. In general nothing's working
right between 80 and 99. Some of those problems can be fixed or their
impact reduced, but I will not be able to completely avoid them. The
ddraw test failure is a driver bug and GL occlusion queries break by
design when used from a different thread. So if you try to bisect a
regression in this patch series with CSMT on, YMMV.

*) This work was originally started by Henri. Some patches in the
series are from him and either unmodified or with minor adjustments.
Some patches are based on his work, but with heavy modifications.
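
For readers who want to try the CSMT setting named in the email, the registry value can be written as a .reg file. The Python sketch below only generates the file text; the function name csmt_reg_file is hypothetical, and it assumes the key lives under HKEY_CURRENT_USER\Software\Wine\Direct3D as the email's HKCU path suggests. Passing enabled=False emits the standard .reg syntax for deleting the value rather than guessing at another setting.

```python
def csmt_reg_file(enabled=True):
    """Build .reg file text that sets (or removes) the CSMT value."""
    # '"Name"=-' is the .reg syntax for deleting a value.
    value_line = '"CSMT"="enabled"' if enabled else '"CSMT"=-'
    lines = [
        "REGEDIT4",
        "",
        r"[HKEY_CURRENT_USER\Software\Wine\Direct3D]",
        value_line,
        "",
    ]
    return "\n".join(lines)


print(csmt_reg_file())
```

Saving the output to a file such as csmt.reg and importing it with wine regedit csmt.reg should apply it to the current Wine prefix. The setting only has an effect on a Wine build that includes the CSMT patches.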