Behind the Pretty Frames: Death Stranding


By the time this article is published & you are able to read it, it will have spent 4-5 months as a draft, since i started writing this line in mid-June 2022.


Digging into games to learn about their technology is one cool old hobby that i do enjoy. But unfortunately, due to the limited time & the huge effort that digging-and-understanding takes (not to mention if it’s going to end up as an article like this), in addition to the endless catalogue of games every year, i have to be very, very careful about my choices, even if i dig a game only for myself without saying a word about it. The games i usually dig into are games that i personally enjoyed playing or that are at the top of my list. There are other games that are neither in my favorite genre nor ones i liked or enjoyed, but i do dig into them because of the technology (aka engine) powering them, and today’s game falls under that latter category.

To be honest & fair, the only thing i liked about Death Stranding (hereafter sometimes referred to as DS) for a long time (pre-release) was that it stars Norman & Mads, two actors whose previous works i’ve enjoyed (especially TWD & Hannibal), but not because it’s a Kojima game, this is out of question! One interesting thing about any Kojima game is that it usually relies on good & solid tech, which is something that interests me more than the game itself! (again, Kojima’s quality of gameplay/story is out of question).

The main reason behind this dig was not love for the game, or even for the creator behind it; in fact i’m one of those who found DS a little boring, & it didn’t hook me enough to beat it ASAP. The reason here is more of a personal interest and curiosity! When you ask me “Which engines interest you?”, you will end up with a nice handful list that is a mixed bag of RAGE, Northlight, Glacier, Anvil, Snowdrop, Voyager, Dunia, id, Frostbite…and oh boy….Decima

Those are the engines that have names; others that i can’t name are like Naughty Dog’s, Santa Monica’s, Forza’s, Insomniac’s, Detroit’s, Ghost of Tsushima’s, The Last Guardian’s, Bluepoint’s,…and the list goes on..(Unreal definitely is not & will never be in that list…sorry for the disappointment!)

So, just to be clear, i’m not digging DS because of any deep love, neither for the game nor for the designers behind it…nope…it’s because it is the “most recent” PC game that is running on Decima..i would have gone with one from the original engine creators “Guerrilla”, but Horizon 1 on PC is fairly old compared to DS, and Horizon 2 won’t see the PC horizons, so DS is the only good shot right now….

And if you’re living under a rock or not familiar at all with the name Decima in the Game Engines & Graphics space, i would like to invite you to watch the ~7 minute presentation below (click to play), which is all about the most recent version of Decima that was used in the latest Horizon game; it showcases what it is capable of. This presentation was part of Siggraph 2022 Live; if you have more time, didn’t watch the live already, or are curious about other showcases, you can watch the full video at the Siggraph Channel here.

This downloadable video is hosted on the Guerrilla Games publications pages; if the video is gone someday, poke me so i can update it with another link if possible.


Once more, i still take captures from only one of my PCs! The PC i’m using is still the same one from the Elden Ring and Resident Evil studies, which has an RTX 3080, a Ryzen 5950x and 32G RAM. And the graphics settings are set as in the screenshot below

Despite the fact that i could use the 4k/HDR monitor that i got for Mirage engine development, i still decided to go with 1080p & non-HDR, so i can save myself from the previous hassles with Resident Evil: large capture files, slow processing & navigation in the gpu captures, and eventually large png/jpg files to upload to the article. Plus, i don’t think there is anything interesting in 4K in Death Stranding (no super resolution for example); going from 1080p to 4k felt like just a matter of more pixels!

With that said, there are very few 4k captures (5-6 or so) that i decided to take eventually, for nothing except the sake of variation and checks. Those are not referenced a lot in the article (perhaps around the particles & post-processing sections), so don’t mind if you notice them a few times below; i used them here and there just because each was a “clear shot”, nothing more, and not for any 4k specific reason.

Also i did take a single HDR capture, just to demonstrate the HDR section, but apart from that single frame, everything is still non-HDR.

Behind the Frame

GIFs Note

Keep in mind, almost all gif images below are linked to 4k videos on YT, so you don’t have to narrow your eyes to see details. The gifs were meant to be as tiny as 500px, so they don’t take much time during page loading.


Before starting this game, i was not aware which API it utilizes in the PC version. I did play the game on the PlayStation platform before, so i didn’t care much about the target & status of the PC port when it came out, until i started digging through it. At the time i was taking the captures for Death Stranding, i was already working on yet another Sony exclusive that came to PC, and that other exclusive was ported on Vulkan, so when i launched Death Stranding i thought it would be like that other game (which is delayed for a future article). But nope, i found out that Death Stranding is using D3D12, and it seems to be utilizing the API just “fine”. i did not notice anything very special, WOW, or out of the ordinary, but at least it seems not to be making poor & bad choices. So, i would say Death Stranding is balanced on the API utilization meter.


As i always like to give a note about compute (it’s the golden era of compute shaders anyways), it seems that compute is being utilized across the frame draws pretty well, not only for post-processing or particles; compute is heavily utilized in many other areas that prepare for the frame or just contribute from behind the scenes without any direct draws. Again, i love compute, & i love it when i see a game/engine heavily utilizing it. So, i guess it’s once more time to use that meme from the God of War & Resident Evil studies!

Yet another Game/Engine that deserves my special tier compute meme!

For a quick idea about compute usage in a typical Decima frame of Death Stranding, i’ll leave below just a summary of the utilization in dispatch order. Full details of those usages are left in their correct order below in the Draw Section.

Compute Dispatches Queue (in execution order)

  • Streaming Tiles Clustering
  • Streaming Priority
  • Streaming Check Activation
  • World Data Probe Texture Copy
  • Particles Update
  • Force Field
  • Force Field to Texture
  • Image Blend
  • Linear Buffer
  • Plant Compute
  • Snow Interaction Update
  • Sky Dome Irradiance Contribution
  • Weather Stuff
  • Occlusion 
  • GBuffer Downsample
  • Prepare Lookup Texture
  • Shadow Caster Height Field (Sun Shadowmap Long Distance)
  • Cloud Rendering
  • Volumetric Light
  • Volumetric Clouds
  • Prepare Cubemap 3D Texture
  • Color & Gloss Buffers


Stranding Vertex

Yet another game that got its own share of the vertex descriptions party. Below is definitely not everything; there are quite a few more than that & i might’ve missed some. Again, it is not something that i’m a big fan of, but it is what it is. Below are the ones that i was able to spot; not sure which ones slipped past me, but these are the ones that kept coming at my face again & again during the few months i spent on the journey of this breakdown

POSITION         R32G32B32_FLOAT           0
BLENDINDICES     R8G8B8A8_UINT             12
BLENDINDICES     R8G8B8A8_UINT             16
BLENDWEIGHT      R8G8B8A8_UNORM            20
BLENDWEIGHT      R8G8B8A8_UNORM            24
NORMAL           R32G32B32_FLOAT           0
TANGENT_BFLIP    R32G32B32A32_FLOAT        12
COLOR            R8G8B8A8_UNORM            0
TEXCOORD         R16G16_FLOAT              4
TEXCOORD         R16G16_FLOAT              8
TEXCOORD         R16G16_FLOAT              12

POSITION         R32G32B32_FLOAT           0
BLENDINDICES     R8G8B8A8_UINT             12
BLENDINDICES     R8G8B8A8_UINT             16
BLENDWEIGHT      R8G8B8A8_UNORM            20
BLENDWEIGHT      R8G8B8A8_UNORM            24
NORMAL           R32G32B32_FLOAT           0
TANGENT_BFLIP    R32G32B32A32_FLOAT        12
COLOR            R8G8B8A8_UNORM            0
TEXCOORD         R16G16_FLOAT              4
TEXCOORD         R16G16_FLOAT              8
TEXCOORD         R16G16_FLOAT              12
TEXCOORD         R16G16_FLOAT              16

POSITION         R32G32B32_FLOAT           0
BLENDINDICES     R8_UINT                   12
NORMAL           R32G32B32_FLOAT           0
TANGENT_BFLIP    R32G32B32A32_FLOAT        12
COLOR            R8G8B8A8_UNORM            0
TEXCOORD         R16G16_FLOAT              4
TEXCOORD         R16G16_FLOAT              8
TEXCOORD         R16G16_FLOAT              12

POSITION         R32G32B32_FLOAT           0
BLENDINDICES     R8_UINT                   12
NORMAL           R32G32B32_FLOAT           0
TANGENT_BFLIP    R32G32B32A32_FLOAT        12
COLOR            R8G8B8A8_UNORM            0
TEXCOORD         R16G16_FLOAT              4
TEXCOORD         R16G16_FLOAT              8

POSITION         R32G32B32_FLOAT           0
BLENDINDICES     R16G16B16A16_UINT         12
BLENDINDICES     R16G16B16A16_UINT         20
BLENDWEIGHT      R8G8B8A8_UNORM            28
BLENDWEIGHT      R8G8B8A8_UNORM            32
NORMAL           R32G32B32_FLOAT           0
TANGENT_BFLIP    R32G32B32A32_FLOAT        12
TEXCOORD         R16G16_FLOAT              0

POSITION         R32G32B32_FLOAT           0
BLENDINDICES     R8G8B8A8_UINT             12
BLENDINDICES     R8G8B8A8_UINT             16
BLENDWEIGHT      R8G8B8A8_UNORM            20
BLENDWEIGHT      R8G8B8A8_UNORM            24
NORMAL           R32G32B32_FLOAT           0
TEXCOORD         R32G32B32A32_FLOAT        12
COLOR            R8G8B8A8_UNORM            0
TEXCOORD         R16G16_FLOAT              4

POSITION         R32G32B32_FLOAT           0
NORMAL           R16G16B16A16_FLOAT        12
TANGENT_BFLIP    R16G16B16A16_FLOAT        20
COLOR            R8G8B8A8_UNORM            28

POSITION         R32G32B32_FLOAT           0
COLOR            R8G8B8A8_UNORM            28

POSITION         R32G32B32_FLOAT           0

POSITION         R16G16B16A16_FLOAT        0

POSITION         R16G16B16A16_FLOAT        0
COLOR            R8G8B8A8_UNORM            16
TEXCOORD         R16G16_FLOAT              20

POSITION         R16G16B16A16_SNORM        0
TEXCOORD         R16G16_UNORM              0
COLOR            R8G8B8A8_UNORM            20

POSITION         R16G16B16A16_SNORM        0
TEXCOORD         R16G16_UNORM              0
TANGENT_BFLIP    R16G16B16A16_SNORM        4
NORMAL           R16G16B16A16_SNORM        12
COLOR            R8G8B8A8_UNORM            20

POSITION         R16G16B16A16_FLOAT        0
TEXCOORD         R16G16_UNORM              0
TANGENT_BFLIP    R16G16B16A16_SNORM        4
NORMAL           R16G16B16A16_SNORM        12
COLOR            R8G8B8A8_UNORM            20

POSITION         R16G16B16A16_FLOAT        0
TANGENT_BFLIP    R16G16B16A16_SNORM        0
NORMAL           R16G16B16A16_SNORM        8
COLOR            R8G8B8A8_UNORM            16
TEXCOORD         R16G16_FLOAT              20

POSITION         R16G16B16A16_SNORM        0
TEXCOORD         R16G16_UNORM              0
TANGENT_BFLIP    R16G16B16A16_SNORM        4
NORMAL           R16G16B16A16_SNORM        12

POSITION         R32G32B32_FLOAT           0
TANGENT_BFLIP    R16G16B16A16_SNORM        0
NORMAL           R16G16B16A16_SNORM        8
COLOR            R8G8B8A8_UNORM            16

POSITION         R32G32B32_FLOAT           0
TANGENT_BFLIP    R16G16B16A16_SNORM        0
NORMAL           R16G16B16A16_SNORM        8
COLOR            R8G8B8A8_UNORM            16
TEXCOORD         R16G16_FLOAT              20

POSITION         R32G32B32_FLOAT           0
COLOR            R8G8B8A8_UNORM            0
TEXCOORD         R16G16_FLOAT              4

And much more….

Copy & Clear

Streaming Tiles Clustering [Compute]

A handful of dispatches, in the form of a few compute passes queued one after another, that work on world clustering to prepare for culling by defining visibility & visible instances across the view. A whole lot of data is passed to such dispatches in order to work, such as:

struct mQueryConstants
	uint mFlags;
	int3 mFloatingOrigin;
	uint mBatchCapacity;
	uint mInstanceCapacity;
	uint mPlaneCount;
	uint mSubFrustumCount;
	uint mPackedPerFrustumShadowCastingBits;
	uint mOcclusionMipCount;
	uint mOcclusionWidth;
	uint mOcclusionHeight;
	uint4 mOcclusionViewport;
	float4 mThresholdAndLodScaleSquared;
	float4 mViewPosAndHalfDiag;
	float4[8] mFrustumPlanes;
	float4x4 mPrevViewProj;
	float4[24] mSubFrustumPlanes;
	float4[2] mSubFrustumThresholds;
	float4 mSubFrustumLodDistancesSquared;
	float4 mLodOverrideRangeSquared;
struct mQueryClusterBindings
	uint mInstanceCount;
	uint mTileInstanceOffset;
	uint mGlobalInstanceOffset;
	uint mTileIndex;
	uint mDisableFlags;
	uint mClusterFlags;

And for some reason it takes the downsampled occlusion texture of the previous frame as an input. This texture mostly holds values during cinematics, and is usually a black 1*1 during gameplay (culling takes place under multiple techniques anyways).
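To make the culling idea a bit more concrete, here is a minimal Python sketch of testing instance bounding spheres against a set of frustum planes, which is the kind of work that fields like mFrustumPlanes and mPlaneCount hint at. This is not the game’s shader: the function names, the plane convention (normal pointing inward, so n·p + d >= 0 means inside) and the use of bounding spheres are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Plane = Tuple[float, float, float, float]  # (nx, ny, nz, d), assumed inward-facing


def sphere_visible(center: Vec3, radius: float, planes: Sequence[Plane]) -> bool:
    """Return False as soon as the sphere lies fully behind any plane."""
    for nx, ny, nz, d in planes:
        signed_distance = nx * center[0] + ny * center[1] + nz * center[2] + d
        if signed_distance < -radius:
            return False
    return True


def cull_instances(instances: Sequence[Tuple[Vec3, float]],
                   planes: Sequence[Plane]) -> List[int]:
    """Collect the indices of instances whose bounding sphere survives the test."""
    return [index for index, (center, radius) in enumerate(instances)
            if sphere_visible(center, radius, planes)]


# Hypothetical usage: a slab between x = -1 and x = 1 standing in for a frustum.
slab = [(1.0, 0.0, 0.0, 1.0), (-1.0, 0.0, 0.0, 1.0)]
spheres = [((0.0, 0.0, 0.0), 0.5), ((3.0, 0.0, 0.0), 0.5), ((1.4, 0.0, 0.0), 0.5)]
visible = cull_instances(spheres, slab)
```

A sphere that only partly crosses a plane is kept, which is the usual conservative choice: it is cheaper to draw a few extra instances than to drop one that is actually visible.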

Streaming Priority [Compute]

The exact details of what is going on here are not very clear, but with a given GroupData, these few compute dispatches work on outputting a buffer of int values that represents the streaming’s PriorityOutputData, using the following set of params.

struct mStreamingPriorityGPUComputeJobResources_Constant
	int mViewCount;
	float mMeshSafetyFactor;
	float mBoundingBoxPriority;
	int mPadding;
	float4x4 mViewPositionAndDeprioritizeMultiplier;
	float4x4 mViewDirectionAndLODDistanceScale;
	float4x4 mViewEncodedFOVData;
	float4x4 mDOFNearSettings;
	float4x4 mDOFFarSettings;
	float4 mPriorityComputationSettings;
	int4 mHintDataTypes;

Streaming Check Activation [Compute]

Given a buffer full of ObjectBoundingBoxData and a buffer of ObjectActivationStateAndResuseIndex, this pass fills a RW buffer with some data for the streaming system.

struct mStreamingCheckActivationComputeJobResources_Constant
	int mViewCount;
	int3 mPadding;
	float4x4 mViewPositionAndDistanceScaleSquared;

The mViewCount value is kinda interesting because it always varies, which is not usually the case; at least in gameplay, you’ve got 1 view to decide what is active/visible and what is not. Not super sure about the factor behind that, or which values are meant here, but here are some variations of that value, and yes, it is not something based on being cinematic or gameplay, nope!
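One way to picture why a view count matters: an object could be considered active if it is close enough to any of the active views. The Python sketch below is only a guess at that idea, not what the shader really does. It assumes each row of mViewPositionAndDistanceScaleSquared packs one view as (x, y, z, distanceScaleSquared), and the names is_active and activation_radius are hypothetical.

```python
def is_active(box_center, activation_radius, views, view_count):
    """Return True if the object is within range of any of the first view_count views.

    views is a list of rows (x, y, z, distance_scale_squared); this packing is assumed.
    """
    cx, cy, cz = box_center
    for vx, vy, vz, scale_squared in views[:view_count]:
        distance_squared = (cx - vx) ** 2 + (cy - vy) ** 2 + (cz - vz) ** 2
        # Compare squared values to avoid a square root per object.
        if distance_squared * scale_squared <= activation_radius ** 2:
            return True
    return False


# Hypothetical usage: the same object flips state when a second view is counted.
views = [(0.0, 0.0, 0.0, 1.0), (100.0, 0.0, 0.0, 1.0)]
with_one_view = is_active((95.0, 0.0, 0.0), 10.0, views, 1)
with_two_views = is_active((95.0, 0.0, 0.0), 10.0, views, 2)
```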

World Data Probe Texture Copy [Compute]

A compute pass with a few dispatches that copies some texture tiles from the world into buffers. Those buffers are needed shortly as input for other compute jobs that target a variety of goals, such as simulating force fields, rain, water, clouds, snow interaction,….etc.

struct WorldDataProbeTextureCopyCB
	int4 mTextureChannels;
	int mSourceRectTop;
	int mSourceRectLeft;
	int mTileTextureSize;

World tile textures that get copied are like those ones for example
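As a rough picture of what a copy like this does, here is a small Python sketch that pulls a square tile out of a 2D texture into a flat buffer. The parameters mirror mSourceRectTop, mSourceRectLeft and mTileTextureSize from the constant buffer above, but the row-major layout and the copy_tile name are assumptions, and the channel selection (mTextureChannels) is left out entirely.

```python
def copy_tile(source, rect_top, rect_left, tile_size):
    """Copy a tile_size x tile_size window of a 2D texture into a flat row-major buffer."""
    buffer = []
    for y in range(rect_top, rect_top + tile_size):
        buffer.extend(source[y][rect_left:rect_left + tile_size])
    return buffer


# Hypothetical usage: a 4x4 texture where each texel stores its own linear index.
texture = [[y * 4 + x for x in range(4)] for y in range(4)]
tile = copy_tile(texture, rect_top=1, rect_left=2, tile_size=2)
```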

Particles Update [Compute]

A few dispatches that write simulation data to some buffers

struct mParticleUpdateComputeParams
	float4x4 mViewMatrix;
	float4x4 mInvViewMatrix;
	float4x4 mPrevViewMatrix;
	float4x4 mViewProjMatrix;
	float4x4 mDepthReconstructionMatrix;
	float mTimeStep;
	int mSystemFlags;
struct GPUParticle
	float4 mPositionAndSize;
	float4 mVelocityAndRandom;
	float4 mRotation;
	float4 mPreviousRotation;
	float mPreviousSize;
	uint mPackedFlagsAndAge;
	uint mLifeSpanAndForceFieldIndex;
	uint mColor;
struct GPUParticleSystemStats
	int mBoundsMinX;
	int mBoundsMinY;
	int mBoundsMinZ;
	int mBoundsMaxX;
	int mBoundsMaxY;
	int mBoundsMaxZ;
	int mAliveParticleCount;
	int mDeadParticleCount;
	int mParticleValidationInfo;
struct ForceFieldSample
	float4 mPositionAndMass;
	float4 mSurfaceAreaVector;
	float4 mVelocityAndWaterFlowScale;
	int mCategoryMask;

Force Field [Compute]

A few dispatches to compute the force fields and write the data out into the force field sample results structures.

struct ForceFieldSampleResult
	float4 mFlow;
	float4 mWaterFlow;
	float3 mForce;
struct ForceFieldComputeParams
	float4 mWorldToWaterHeightScaleBias;
	float4 mWorldToWaterFlowScaleBias;
	float mWaterHeightRangeMin;
	float mWaterHeightRangeMax;
	int mEnableWaterForces;
	int mForceFieldCount;
	int mSampleCount;
	float mTime;
	float mTimeStep;

Force Field to Texture [Compute]

This is dispatched later to write the buffer data generated in the previous step into some 3d textures.

struct ForceFieldtoTextureComputeJobParams
	uint mForceFieldCount;
	float mTime;
	float mTimeStep;
	float mBlendAttack;
	float4 mGridOrigin;
	float4 mGridStepSize;
	int4 mGridDisplacement;
	int4 mTextureSize;
	float mSpecialStiffness;
	float mSpecialDrag;
	float mSpecialMass;
	float mSpecialScale;
	float mSpecialClamp;
	int mSpecialMaxPriority;
	uint mSpecialMode;
	uint mSpecialCategoryMask;
	float mGrassStiffness;
	float mGrassDrag;
	float mGrassMass;
	float mGrassScale;
	float mGrassClamp;
	int mGrassMaxPriority;
	uint mGrassMode;
	uint mGrassCategoryMask;
	float mPlantStiffness;
	float mPlantDrag;
	float mPlantMass;
	float mPlantScale;
	float mPlantClamp;
	int mPlantMaxPriority;
	uint mPlantMode;
	uint mPlantCategoryMask;
	float mTreeStiffness;
	float mTreeDrag;
	float mTreeMass;
	float mTreeScale;
	float mTreeClamp;
	int mTreeMaxPriority;
	uint mTreeMode;
	uint mTreeCategoryMask;
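Writing into a 3d texture means every world-space sample has to land in some texel of the grid. Here is a minimal Python sketch of that mapping, built on the idea behind mGridOrigin, mGridStepSize and mTextureSize. The exact math in the game is unknown, so the floor-based mapping and the world_to_grid_cell name are assumptions, and mGridDisplacement is ignored.

```python
import math


def world_to_grid_cell(position, grid_origin, grid_step, texture_size):
    """Map a world-space position to an integer (x, y, z) texel, or None if outside."""
    cell = tuple(math.floor((p - o) / s)
                 for p, o, s in zip(position, grid_origin, grid_step))
    inside = all(0 <= c < n for c, n in zip(cell, texture_size))
    return cell if inside else None


# Hypothetical usage: an 8x8x8 grid with half-unit cells.
cell = world_to_grid_cell((1.5, 0.2, -0.5), (0.0, 0.0, -1.0), (0.5, 0.5, 0.5), (8, 8, 8))
outside = world_to_grid_cell((10.0, 0.0, 0.0), (0.0, 0.0, -1.0), (0.5, 0.5, 0.5), (8, 8, 8))
```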

Image Blend [Compute]

Generates the 3d lookup table (32*32*32 – RGBA8_UNORM) that will be used by the end of the frame as a fragment shader resource in the post-processing apply step.

struct ImageBlendComputeLayerInfo
	int mOperation;
	float mAmount;
	float mPadding0;
	float mPadding1;

struct ImageBlendComputeConstants
	ImageBlendComputeLayerInfo[16] mLayers;
	int mLayerCount;
	int mPreExpose;
	float mExposureScale;
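To give a feel for what building a 32*32*32 lookup table from blended layers can look like, here is a small Python sketch. It assumes the simplest possible operation, a linear blend driven by an amount (in the spirit of mAmount); the real mOperation values are unknown, and identity_lut, blend_luts and to_rgba8 are hypothetical names.

```python
def identity_lut(size=32):
    """Build lut[b][g][r] = (r, g, b) in 0..1, a table that leaves colours unchanged."""
    top = size - 1
    return [[[(r / top, g / top, b / top) for r in range(size)]
             for g in range(size)]
            for b in range(size)]


def blend_luts(base, layer, amount):
    """Blend every texel of two equally sized LUTs: amount 0 keeps base, 1 gives layer."""
    def lerp(a, b):
        return tuple(x + (y - x) * amount for x, y in zip(a, b))
    return [[[lerp(a, b) for a, b in zip(row_a, row_b)]
             for row_a, row_b in zip(slice_a, slice_b)]
            for slice_a, slice_b in zip(base, layer)]


def to_rgba8(colour):
    """Quantise a 0..1 RGB triple to an RGBA8_UNORM-style texel with opaque alpha."""
    return tuple(round(min(max(c, 0.0), 1.0) * 255) for c in colour) + (255,)


# Hypothetical usage: blend a tiny identity table halfway towards its inverse.
small = identity_lut(2)
inverted = [[[tuple(1.0 - c for c in texel) for texel in row] for row in plane]
            for plane in small]
mixed = blend_luts(small, inverted, 0.5)
```

At the end of the frame, a table like this would be sampled with the pixel colour as the 3d coordinate, so the whole grading stack costs a single texture fetch per pixel.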

Linear Buffer [Compute]


Plant Compute [Compute]


Copy & Clear

Yet another ClearRenderTargetView()

Clear Depth Stencil View

Just before jumping to the next steps where snow/mud interaction is handled, a ClearDepthStencilView() runs, after we’ve already copied the previous frame’s data in the previous step..

Snow/Mud Top View Depth Pass

A few sequential DrawIndexedInstanced() calls that output a 1024*1024 – R16_TYPELESS target; this is needed in the next step to compute the snow/mud interaction. In this pass, it seems to draw only a low resolution version of the meshes, and that makes sense considering the camera is far away at the top, and hence a low LOD is utilized. Most of the time this is about Sam, but other times it would be other characters or vehicles.

This always takes place, regardless of whether there is snow or not. Even in the Jungle level, it still takes place.

Snow Interaction Update [Compute]

This pass only takes place when there is snow/mud. So while the top view depth pass always gets kicked off, this one only happens when there are actual surfaces to deform.

The exact details of the compute are something to investigate further, but it works only with the given inputs to produce the new deformation textures/buffer that will be used to deform the tessellated terrain later while drawing the GBuffer.

struct SnowInteractionUpdateParamsConstant
	float4 mProbeTextureScaleBias;
	float4 mProbeTextureHeightRange;
	float2 mWSInteractionCenter;
	float2 mSampleOffset;
	float mSampleDistanceTimes2;
	float mInteractionAreaWorldSize;
	float mInteractionTextureResInv;
	float mInteractionTextureResMinus1;
	float mInteractionTextureResMinus1Inv;
	float mTemporalFilterFactor;
	float mMaxSnowDepth;
	float mMaxSnowDepthInv;
	float mLinNormDepthToWorldScale;
	float mLinNormDepthToWorldBias;
	float mDeltaTime;
	float mNonUniformExponent;
	float mInitialDeformation;
	float mTerrainOffset;
	float mNormalTiling;
	float mNormalIntensity;

And to make it more clear, and because i’ve the luxury of owning a very short nickname that is just one letter & easy to type on snow with a walking delivery CG character, here it is… M

Keep in mind that the rendertarget is always “mirrored” because the top-view camera orientation is different from your third person camera orientation. You can easily observe that below, especially due to the rocky area where no deformable snow is covering it.

Anyways, by the end of this pass, the output Deformation texture gets copied to a buffer, to be utilized by the next frame. So basically anything that will accumulate is copied into buffers, kept aside, and then re-used in the next frame (mostly in the next frame it changes its form from a buffer back to a rendertarget again). And of course, the Deformation texture itself is used shortly during the GBuffer to deform the tessellated terrain mesh.
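The accumulate-and-reuse idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the rule used here (keep the deepest deformation seen so far, clamped to a maximum snow depth in the spirit of mMaxSnowDepth) is a guess, and accumulate_deformation is a hypothetical name. The real shader clearly has many more inputs (temporal filter, exponent, normals,…etc.).

```python
def accumulate_deformation(previous, footprint, max_snow_depth):
    """Merge this frame's footprint depths into the deformation kept from last frame."""
    return [[min(max(prev, new), max_snow_depth)
             for prev, new in zip(prev_row, new_row)]
            for prev_row, new_row in zip(previous, footprint)]


# Hypothetical usage: the result of one frame is fed back in as "previous" next frame.
frame_0 = [[0.0, 0.3], [0.1, 0.0]]
footprint = [[0.2, 0.1], [0.0, 0.9]]
frame_1 = accumulate_deformation(frame_0, footprint, max_snow_depth=0.5)
frame_2 = accumulate_deformation(frame_1, [[0.0, 0.0], [0.0, 0.0]], max_snow_depth=0.5)
```

Because the previous result is always an input, a footprint stays in the texture even on frames where nothing steps on that texel, which is why tracks persist behind the character.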

Copy & Clear

Yet another ClearRenderTargetView()

Cloud Density

With some cloud textures and a RainyMap (this is a compute output that is discussed below), we get the so-called Cloud Density texture. You can think about this step as: wherever there is rain, there are clouds, so let’s consider this area cloudy. The more rain, the denser the clouds at this location. Rain comes from the sky, not from empty space!

Precipitation Occlusion Height

Nothing very detailed here, but at this step there is a rendertarget generated (512*512 – R16_UNORM) that will be used quite a handful of times later during the GBuffer.

Copy & Clear

Yet another ClearRenderTargetView()

Atmospheric Scattering

With the help of plenty of textures, some copied right away from the previous frame (black if invalid) while others are just usual resources, things such as Volume Light Volume, Clouds Textures, Aurora Texture, Light Shaft Texture, In-Scatter and Distant Fog, the shader works on generating the Atmospheric Scattering texture that is needed for the next compute step.

struct AtmosphericScatteringCB
	float4 mSunlightDirection;
	float4 mPrecomputedAtmoInnerSunInfo;
	float4 mPrecomputedAtmoInnerSkyInfo;
	float4 mPrecomputedAtmoHazeRayleighInfo;
	float4 mPrecomputedAtmoHazeMieInfo;
	float mPrecomputedAtmoMieLightShaftIntensity;
	float mPrecomputedAtmoHazeNear;
	float mPrecomputedAtmoHazeFalloff;
	float mPrecomputedAtmoHazeCurvature;

Just take this step as it is for now; you will see its impact, and those resource names, in action soon.

Sky Dome Irradiance Contribution [Compute]

Uses the output of the previous Atmospheric Scattering step to end up with a couple of buffers, one for the Sky Dome Irradiance Contribution and the other for the Sky Dome Color.

struct SkyDomeIrradiance
	float4x4 mInvView;
	float4 mCamPosWS;
	float4 mProjectionParams;
	float3 mSunlightColor;
	float3 mAverageSkyColor;
	float3 mOuterSunIntensity;
	float3 mSkyColorTint;
	float2 mSunShapeInfo;
	float mFarPlaneDistance;
	float mCloudColorSaturation;
	int mCompositeClouds;
	int mCompositeAurora;
	float mVolumeLightDepthRange;
	float mPreExposureScale;
	float4 mQuarterResolutionClipWindow;
	float4 mEightResolutionClipWindow;
	uint mUseKJPVolumetricFog;
	float mUseDithering;
	float mLaccOnceWriteMaxValue;
	uint mCloudShaderType;
	float4x4 mDepthReconstructMatrix;

Hmmm,…. mUseKJPVolumetricFog ….
It seems that Guerrilla’s and Kojima’s volumetric fog both exist!

Weather Stuff [Compute]

During the very early copy of the previous frame’s data, there was a buffer holding the weather info. During this compute, that buffer is copied into 2 separate textures, one of them is the Weather Map, and the other is the Volume Light Volume, which are needed for plenty of reasons, most notably the Integrated Volume Light compute later down the pipe.

Copy & Clear

Yet another ClearRenderTargetView() + ClearDepthStencilView()


1.Depth Passes

2 to 4 depth only passes (3 most of the time), drawing to depth step by step. It’s worth mentioning that the character body is drawn last most of the time. But not here..not during the depth passes (strange!)

Several draws utilize the ForceFields data (from the earlier compute) to affect the draw for “some” meshes, such as hair, strands, grass & tree branches.

Vertex Adjustment [Compute]

Sometimes there are compute dispatches between the different depth passes, which are used to update some position values, possibly for the force fields.

2.Color Passes (PBR)

Between 2 and 4 color passes in normal cases, but this can go up to 10 color passes for complicated or very large OpenWorld frames/views, where the deferred GBuffer is drawn to render targets. The count depends on the complexity of the frame anyways: usually it is a total of 3 or 4 color passes, perhaps in cinematics where the camera has a narrow angle, while many cases during gameplay reach 6 or even 7 color passes in average frames & indoors. Nothing very fancy. At the same time, any remaining meshes that were not drawn to depth in the previous dedicated depth passes (parts of the character body, and sometimes the entire character body) are drawn to depth here during those color passes, before drawing the color values to the deferred rendertargets. The Depth is attached to the color passes anyways!

Eventually we end up with pretty standard stuff like Albedo, Normal, Reflectance, Attributes (motion + roughness) + Emissive (mostly empty) + Depth.

So, the interesting question here is why there are isolated Depth-Only pass[es] to draw the GBuffer’s depth…or to be more accurate, to draw part of the GBuffer’s depth, while the rest of the depth is drawn within the Color pass[es]. I can’t really get the point here. It would have made more sense (like RE Engine, for example) to draw the depth target at the same time during the Color pass[es]. Usually when there is a Depth-Only pass, it is either for shadowmaps (effects) or to generate the “full” GBuffer’s depth target. The split here was confusing, and made no sense!

And some more frames for clarity and coverage of different cases (indoor, night, day, …etc.)


Of course it is PBR rendering, and hence some PBR textures are used to paint the GBuffer. Most of the time there is nothing out of the ordinary, but here is an example of a PBR mesh that uses a little more than other PBR meshes; let’s take Sam’s hoodie from the snow frame above:

The Roughness texture actually holds info about Roughness as well as a Mask.

It’s not only Sam; there is sometimes a Corrosion effect that requires the mesh to get some more textures during the draw

And before leaving this PBR textures section, let’s just say, DS could do better than that.

ii.Terrain & Snow

When the time comes to draw terrain, it is done like any other terrain: just a bunch of world displacement maps (artistically made) passed to the terrain (vertex shader), so vertices can get deformed. But at the same time, the snow deformation texture (if applicable) is set on the terrain too, so it can modify the default artistic height to a new height that matches the Deformation Texture. So, to continue with the same snow M thing from above, here is how it is done…just one more single draw cmd!

And if you are curious about what type of displacement textures are passed to the vertex shader to lay the foundation of the terrain before applying the snow deformation, here are a few that are used with that exact piece of terrain in that gif

And for the entire map landscape, for example, something a little higher in quality like this is used

iii.Emissive, Decals & Emissive Decals

By the end of the GBuffer, and after drawing all the ordinary geometry, comes the time to draw the emissive objects as well as the decals. They are executed in that exact order: Emissive first, then Decals. There are other times where Decals are Emissive, such as the scan effect footprints; in this case they are drawn last.

When there are some Emissive Decals, those are painted at the very last. So you can think about the order as
1 – Emissive Objects
2 – Decals
3 – Decals that are Emissive

In general, the entire process can take between 1 & 2 color passes. Here are some examples

Occlusion [Compute]

i.Occlusion Capture


ii.Occlusion Downsample


Depth Downsample

As the habit goes, downsamples of depth will be needed for quite a few things down the pipe, so it’s good timing to prepare 1/2 & 1/4 depth versions on the side: 1920*1080 to 960*540 to 480*270

Not only that, but it also separates min depth and max depth into their own rendertargets (1/2 res ones)

HalfResMinDepthTexture and HalfResMaxDepthTexture
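Here is a small Python sketch of what a half-res min/max depth split usually looks like: every 2*2 block of the full-res depth collapses into one min texel and one max texel. The 2*2 footprint and the function names are assumptions for illustration; the capture only tells us that the two half-res targets exist.

```python
def mip_sizes(width, height, levels):
    """List the resolution of each level when halving repeatedly."""
    sizes = [(width, height)]
    for _ in range(levels - 1):
        width, height = max(1, width // 2), max(1, height // 2)
        sizes.append((width, height))
    return sizes


def downsample_min_max(depth):
    """Halve a 2D depth image, returning (min_image, max_image) over each 2x2 block."""
    height, width = len(depth), len(depth[0])
    min_image, max_image = [], []
    for y in range(0, height - 1, 2):
        min_row, max_row = [], []
        for x in range(0, width - 1, 2):
            block = (depth[y][x], depth[y][x + 1],
                     depth[y + 1][x], depth[y + 1][x + 1])
            min_row.append(min(block))
            max_row.append(max(block))
        min_image.append(min_row)
        max_image.append(max_row)
    return min_image, max_image


# Hypothetical usage on a tiny 4x2 depth image.
sizes = mip_sizes(1920, 1080, 3)
half_min, half_max = downsample_min_max([[0.1, 0.2, 0.9, 0.8],
                                         [0.3, 0.4, 0.7, 0.6]])
```

Keeping both extremes is handy because later passes can pick the conservative one for their purpose, for example the nearest depth for one effect and the farthest for another.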

All that happens in an off-screen quad, or to be exact, an off-screen larger-than-viewport triangle (yummy, it’s the way i like it)
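For anyone unfamiliar with the larger-than-viewport triangle, here is the common way to build one from nothing but a vertex index, sketched in Python. This is the widely used trick in general, not code pulled from the game (which may well feed the triangle from a vertex buffer instead); the y flip assumes a D3D-style UV origin at the top-left, and the function name is hypothetical.

```python
def fullscreen_triangle_vertex(vertex_id):
    """Return ((x, y), (u, v)) for vertex 0, 1 or 2 of a viewport-covering triangle."""
    u = float((vertex_id << 1) & 2)   # 0, 2, 0
    v = float(vertex_id & 2)          # 0, 0, 2
    x = u * 2.0 - 1.0                 # clip-space x: -1, 3, -1
    y = 1.0 - v * 2.0                 # clip-space y:  1, 1, -3 (assumed top-left UV origin)
    return (x, y), (u, v)


# The three vertices; the visible [-1, 1] square maps to UVs in [0, 1].
triangle = [fullscreen_triangle_vertex(i) for i in range(3)]
```

Compared to a quad made of two triangles, a single oversized triangle has no diagonal seam, so no pixels along the diagonal get shaded twice.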

Copy & Clear

Yet another one!

GBuffer Downsample [Compute]

Same as what has been done previously with the depth, except that this time it’s half-res only, and it’s done in compute for pretty much all the GBuffer targets (Color, Attributes, Depth again).

This step seems random and useless, and is perhaps the remains of some obsolete code, as i see it is not necessary, and the outputs are not utilized in any way. Not to mention that the outputs are solid black rendertargets.

Prepare Lookup Texture [Compute]

//TODO (Examples)


1.Downsample Normals

Takes the full frame GBuffer’s normal target, and scales it to 1/2 with a slight format change. So in this case from 1920*1080 to 960*540

2.Generate SSAO

Using the previously downsampled GBuffer’s normal, in addition to the Depth and a sample depth LUT, the frag shader runs to generate a noisy SSAO image with the same size as the input Normals rendertarget.

3.Blurring & Smoothing SSAO

A couple of invocations of the blurring frag shader, starting with Horizontal blurring, followed by Vertical blurring, using the normals (the downscaled ones) as well as the current noisy SSAO image.
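A separable, normal-aware blur of this kind can be sketched in Python as below. The capture shows that normals are an input to the blur, but the actual weighting is unknown, so the dot-product weight, the radius and the function names here are all assumptions.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def blur_pass(ao, normals, direction, radius=2):
    """One 1D blur pass; direction is (1, 0) for horizontal or (0, 1) for vertical."""
    height, width = len(ao), len(ao[0])
    step_x, step_y = direction
    result = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = weight_sum = 0.0
            for k in range(-radius, radius + 1):
                sx = min(max(x + k * step_x, 0), width - 1)   # clamp to the image edge
                sy = min(max(y + k * step_y, 0), height - 1)
                # Neighbours facing another way get little or no weight,
                # so occlusion does not bleed across geometric edges.
                weight = max(0.0, dot(normals[y][x], normals[sy][sx]))
                total += ao[sy][sx] * weight
                weight_sum += weight
            result[y][x] = total / weight_sum if weight_sum > 0.0 else ao[y][x]
    return result


def blur_ssao(ao, normals):
    """Horizontal pass first, then vertical, mirroring the order described above."""
    return blur_pass(blur_pass(ao, normals, (1, 0)), normals, (0, 1))
```

Running two 1D passes instead of one 2D kernel is the usual reason for the horizontal/vertical split: the cost per pixel grows with the kernel width rather than with its area.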

struct mInUniform_Constant
	float4 mDirSize;
	float4 mParams;

i.Horizontal SSAO Blurring
ii.Vertical SSAO Blurring
iii.Accumulate Smoothing SSAO

When that is done, using the previous frame’s data such as the SSAO image & motion vectors of the previous frame, one more blurring step takes place in order to generate as smooth an SSAO image as possible.
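Temporal accumulation like this usually boils down to looking up where each pixel was in the previous frame and blending the old value with the new one. Here is a minimal Python sketch under assumptions: motion vectors are taken to be in pixels and to point from the previous position to the current one, the blend weight is made up, and accumulate_ao is a hypothetical name.

```python
def accumulate_ao(current, history, motion, history_weight=0.9):
    """Blend this frame's AO with the reprojected AO of the previous frame."""
    height, width = len(current), len(current[0])
    result = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            move_x, move_y = motion[y][x]
            prev_x, prev_y = x - round(move_x), y - round(move_y)
            if 0 <= prev_x < width and 0 <= prev_y < height:
                result[y][x] = (history[prev_y][prev_x] * history_weight
                                + current[y][x] * (1.0 - history_weight))
            else:
                # No valid history (the pixel came from off-screen): keep the new value.
                result[y][x] = current[y][x]
    return result


# Hypothetical usage on a 2x1 image: the right pixel moved one texel to the right.
smoothed = accumulate_ao([[1.0, 0.0]], [[0.0, 0.5]], [[(0, 0), (1, 0)]], 0.5)
```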

4.Transfer SSAO

Now comes the weird thing. This last step of extra blurriness takes place in two steps; previously you’ve seen the 1st step, here is the in & out of the 2nd step.

As you can see, the output of both steps is “exactly” the same; it’s more of a transfer, except it is not actually a “transfer” call. What happens here is that a DrawInstanced() call is first issued to draw a fullscreen TRIANGLE (larger than the viewport of course), and then the output of that call is used to issue another DrawInstanced(), this time drawn as a fullscreen QUAD. So it’s a little strange; why not draw on the target surface, regardless of what it is, the first time?! The output of both steps is exactly the same, it’s just re-drawn! This could be due to some changes before shipping, or due to deprecation of some AO library,…etc. All in all, yes, SSAO quality looks good in this game, but fixing that tiny thing could improve performance, even if only by μs.

Not to mention that each of the last two steps (that have the same output) uses a totally different vertex description!

POSITION   R32G32B32_FLOAT           0
TEXCOORD   R32G32_FLOAT              12

POSITION   R32G32B32_FLOAT           0
COLOR      R32G32B32A32_FLOAT        12
TEXCOORD   R32G32_FLOAT              28

Where the 1st one is used for vertcies drawin the Triangle in Step 5 (the SSAO smoothin), the later one used for the 6th step (transferring). So why?! What is even the need for some vertex color data here?!

And of course, channel utilization is good. It’s an RG image format, but when saved to disk and dropped here in the browser it has to be RGBA, so here is a breakdown (RGBA even though it’s just RG), just in case…

And of course, for the sake of why not, here are some SSAO steps for multiple pretty frames!

Please don’t get upset by the many examples per case. You might’ve noticed since the 1st article that i put many examples per case most of the time. This is not because i’ve plenty of free time or enjoy taking screenshots, but because of an old saying that i & many in the Middle East were raised with. It translates to English as “With examples, it becomes crystal clear”, or in Arabic “بالأمثال يتضح المقال”

When studying complex subjects such as Chemistry, Physics & Math in high school, many teachers kept saying that same phrase again and again after explaining each lesson and before diving into endless examples. As a matter of fact, the teachers who did that were my favorite ones, and the ones i learned most smoothly from. Even 25+ years after high school, i still remember most of the knowledge they explained, thanks to the variety of their examples (even though i do nothing useful with chemistry nowadays). Growing up, i found this very helpful career-wise as well. Since pretty much everything i learned in this career was self-taught, i found that getting the “case” in question covered with more examples (even if i have to make up my own when a book or article doesn’t have enough, because i’ve got no teacher nor professor to explain) makes it digest smoothly in my mind, even if there were ZERO text explaining them!

So, it really is بالأمثال يتضح المقال !

Copy & Clear

Yet another ClearRenderTargetView()

Shadow Caster Height Field (Sun Shadowmap Long Distance) [Compute]

OutTarget / 3DTexture
HeightTerrain, HeightObject, ExtraHeightObject, ShadowCasterHF

Copy & Clear

Yet another ClearRenderTargetView()

Indirect Forward Pass (Light Sampling)

Get out a LightSamplingPositionTexture, LightSamplingNormalTexture, MiniatureLightTexture, and depth one

//TODO (Examples)

Copy & Clear

Yet another ClearRenderTargetView()

Shadow Pass[es]

An average of 10 depth-only shadow passes for direct light sources (can go up to 15 in cinematics). All the passes work together on a handful of square, power-of-two rendertargets/shadowmaps, which most of the time are either 1024*1024 or 2048*2048, in the format R32_TYPELESS.

There is nothing out of the ordinary here, and the only thing worth mentioning is the distribution of the passes’ workload. As you can see, there is a ton of landscape to cover, and this is why things are split into 3 or more shadowmaps, rendered not in 3 passes but in 10 or more. This is done by going back and forth between the shadowmaps, one by one in each pass, and it is quite rare to have a shadowmap that is fully completed in a single renderpass. So the final result is something similar to the following diagram

Now on to cinematics, which are quite a different story. Cinematics in DS come in 3 types: the 1st is prerendered videos; the 2nd is gameplay-like moments with boosted quality, which are not very different from gameplay (in fact the Fragile frame above is from this category); and the last type is the cinematic that is high quality and not prerendered, and for this type the shadowmaps are usually different.

It still uses the same idea of rendering shadowmaps one after another, back and forth, BUT the twist here is that by the end there will be a single final shadowmap that contains everything together (at a lower resolution than all of them combined). So if a shadowmap gets completed in a renderpass, it gets scaled down and then composited into the final big shadowmap (which is most likely not square anymore) before proceeding to complete any of the other shadowmaps.

So basically, once a shadowmap is ready, it gets composited and is never written to in any future renderpass. And to translate that into a real example (with more shadowmaps than the diagram of course), take the frame below

And from another cinematic frame (with perhaps fewer shadowmaps as well)
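The cinematic flow described above can be modelled with a toy scheduler in Python. Everything here is hypothetical (the shadowmap names, the number of passes each one needs, the function name). It only mirrors the observed behaviour: passes hop between shadowmaps, a shadowmap gets composited into the final one the moment it is complete, and it is never written to again after that.

```python
def run_shadow_passes(pass_targets, passes_needed):
    """Simulate interleaved shadow passes.

    pass_targets: the shadowmap each pass renders into, in frame order.
    passes_needed: dict of shadowmap name -> passes required to finish it.
    Returns the order in which shadowmaps get composited into the final one.
    """
    remaining = dict(passes_needed)
    composited = []
    for target in pass_targets:
        if target in composited:
            # A composited shadowmap is never written to again.
            raise ValueError(f"{target} was already composited")
        remaining[target] -= 1
        if remaining[target] == 0:
            # Completed: scale down and composite before moving on.
            composited.append(target)
    return composited


order = run_shadow_passes(
    ["A", "B", "A", "C", "B", "C", "C"],
    {"A": 2, "B": 2, "C": 3},
)
print(order)
```

In this made-up frame, 7 passes fill 3 shadowmaps, and none of them is completed in a single pass, which matches what the captures show most of the time.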

Prepare Volumetric Light

Now while everything looks tidy so far, the volumetric-light-to-texture step is not actually “very” independent, and it does not take place after the Shadow Passes. Most of the time, in between the different shadow passes, the renderer takes a break from working on shadows and runs the fragment shader responsible for the Volumetric Light calculation. Whether this is something done on purpose, or the result of some failed multithreaded rendering hopes,…only the team behind the game can tell.

//TODO (Examples)

Local Light Sources

Yet another step that might overlap with the shadow passes (most of the time) and does not necessarily wait until all shadow passes end. At this step, the deferred lighting gets calculated for the local light sources, such as point & spot lights, using the GBuffer rendertargets. This step is always present, regardless of whether it’s a cinematic or gameplay, and its length depends on the local light source count.

//TODO (Examples)

Cloud Rendering [Compute]

A compute pass of few dispatches to get out a CloudTexture (similar to the ones carried over from the previous frame)…Cloud coverage.

Direct Light & Sun Shadows (23226 in the main target capture)

Apply the sun/moon light (single directional light src), as well as the shadow maps that were generated earlier

Interestingly, this step has an output rendertarget that looks interesting but is never used for the rest of the frame’s life!

//TODO (Examples)

Volumetric Light [Compute]

Outputs an IntegratedVolumeLight with the possibility of format change from R11G11B10_FLOAT to R16G16B16A16_FLOAT

The depth of the volume is basically the slices. You can see examples below where it is used.

Another funny observation was the use of a 3d noise texture to add some “movement” to the volume, which is common. What was not common was finding that the 3d texture is tagged as KJPFogNoiseTexture, and that many of the params passed to the shader are prefixed with KJP, which stands for Kojima Productions ^_^ Not sure if that type of thing is self-pride, or because those parts will possibly be shared back with Guerrilla, so it becomes easier to distinguish the parts that came from the Japan team!

Fun Fact

While writing that part, it came to my attention that the frame’s order of execution has almost never changed since the days of Killzone!

Prepare volume light, Deferred, multiple shadow passes, sun/moon light, volumetrics…

Volumetric Clouds

prepare in few draws a cloud buffer (the green one)
//TODO (Examples)

Volumetric Clouds [Compute]

most likely raymarching
//TODO (Examples)

Water [Not Always]

Only where there is water meshes in the frame

//TODO (Examples)

Prepare Cubemap 3D Texture [Compute]

A needless effort to recreate a new 3d cubemap texture from an existing one, only with fewer mips (7 instead of 9) and a format change from BC6_UFLOAT to R16G16B16A16_FLOAT… Why was it not baked before shipping the game?
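To get a feeling for what that format change costs in memory, here is a rough Python estimate. The 256 base size and the 6 faces are assumptions of mine (the capture only tells us the mip counts and the formats). What i am sure about is the texel cost: BC6 stores each 4x4 block in 16 bytes, while R16G16B16A16_FLOAT needs 8 bytes for every single texel.

```python
def mip_chain_bytes(base_size, mip_count, block_dim, bytes_per_block, faces=6):
    """Total bytes of a square mip chain across all faces.

    block_dim is 4 for block-compressed formats and 1 for uncompressed ones.
    base_size and faces are hypothetical inputs, not captured values.
    """
    total = 0
    size = base_size
    for _ in range(mip_count):
        # Round up to whole blocks, small mips still occupy one block.
        blocks_per_side = (size + block_dim - 1) // block_dim
        total += blocks_per_side * blocks_per_side * bytes_per_block
        size = max(size // 2, 1)
    return total * faces


# BC6: 16 bytes per 4x4 block, 9 mips.
source_bytes = mip_chain_bytes(256, 9, block_dim=4, bytes_per_block=16)
# R16G16B16A16_FLOAT: 8 bytes per texel, 7 mips.
converted_bytes = mip_chain_bytes(256, 7, block_dim=1, bytes_per_block=8)
print(source_bytes, converted_bytes)
```

Under those assumptions the converted texture is roughly 8 times bigger than its source, even with two mips fewer.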
Examples of the cubemap are seen below.

Reflection Texture

This is going to be a step that is broken down to multiple steps, in order to prepare the needed render targets one by one

1.Reflection Mask

With the MotionRoughnessTexture, ReflectanceTexture as well as LinearDepthTexturePyramidMin, the fragment shader ends up with a 960*540 R16_TYPELESS mask texture that is used later (in the next step) in a compute shader to generate the ColorBuffer as well as the GlossBuffer of the frame.

2.Color & Gloss Buffers [Compute]

Using the LinearDepthTexturePyramidMin, NormalTexture, PrevSceneTexture, MotionRoughnessTexture as well as the MaskTexture generated in the previous step, a single compute dispatch runs to generate the OutputColorBuffer as well as the OutputGlossBuffer.

3.Color Buffer to Quarter Texture Mip Chain

Taking the OutputColorBuffer previously generated in the compute dispatch, multiple executions of the blurring shader downscale and blur it in a few steps.

i.Scale & Mip

Scale the Color Buffer that came out of the previous step from 960*540 to 480*270, so now it’s a “Quarter” of the target frame size. With that in mind, the lower mips are generated here as well and stored in that texture.

And of course, it’s a pyramid texture, and it contains just mips,…nothing fancy, but i’ll put its chain details below for the sake of demonstration, no more…

ii.Blurring Mips

As you can see, by now that Quarter Texture Mip Chain is made of multiple mips (5 to be exact), and the shader runs twice per mip: it blurs each mip horizontally, and then vertically….too much work!

Eventually, all that work is there to end up with the so-called Quarter Texture Mip Chain (all blurred, and life is good!)
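The amount of work behind that mip chain is easy to count with a few lines of Python. The halving with rounding down is the usual Direct3D mip sizing rule, while the pass names (`generate`, `blur_x`, `blur_y`) are just labels i made up for the three invocations each mip goes through.

```python
def mip_chain(width, height, count):
    """Sizes of `count` mips, halving (rounding down) at every level."""
    sizes = []
    for _ in range(count):
        sizes.append((width, height))
        width, height = max(width // 2, 1), max(height // 2, 1)
    return sizes


def list_passes(sizes):
    """One generate pass plus a horizontal and a vertical blur per mip."""
    passes = []
    for level, (width, height) in enumerate(sizes):
        for name in ("generate", "blur_x", "blur_y"):
            passes.append((name, level, width, height))
    return passes


quarter_mips = mip_chain(480, 270, 5)
quarter_passes = list_passes(quarter_mips)
print(quarter_mips)
print(len(quarter_passes))
```

That is 15 invocations for a 5-mip chain that starts at 480*270 and ends at 30*16.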

I intended to put all those steps in detail, even though they were very boring to extract. This is not because i’ve a lot of time to waste, & not because i have a lot of online storage on my blog. My main reason is that i always like to take any opportunity that makes me “pause” and wonder…and every time you look at this matrix of ~15 images, you need to think about 15 different commands or invocations, full of instructions, that ran per mip in order to:
1st generate that mip,
2nd blur it on X,
3rd and finally to blur it on Y…
A lot of work, and yet it is done in a tiny fraction of a second! Coming from DOS & Win3.11, expensive RAM in megabytes, no actual graphics card & software rendering…those types of things (the images above) always make me want to wait, wonder, and think about where we came from and where we are heading in terms of computing power. Regardless of whether those steps are needed or not, can be optimized or not, it’s just beautiful to witness…

4.Generate the Reflection Texture

Not sure what is going on here. Taking the GlossTexture, HalfTexture as well as the previously generated QuarterTextureMipChain, we end up with the Reflection Texture, which is 1/2 of the target resolution.

5.Accumulate Reflection Texture

As with everything else, in order to avoid some artifacts, why not just accumulate with the previous frame’s results? So using the MotionVectors, the previous frame’s Reflection Texture, the Reflection Texture that was just generated in the previous step, as well as the Mask Texture, we end up with a smooth, ready-to-use final Reflection Texture for the current frame (soon to be called the previous frame’s Reflection Texture. Life is short!).

Copy & Clear

Yet another ClearRenderTargetView()