#1 2014-11-06 07:48:37

ThaOneDon
Member

tech thread

Updated (from time to time) with tons of tips/techniques not just for messing around in Tesseract but Graphics in General.
Resource thread for anyone who wants to experiment with stuff.



Inside Witcher 3 Doom 
jcgt.org GPU Zen(Source Code) C++ stories The Danger Zone


O3DE ID TECH 3 ID TECH 4 HPL ANKI3D TESSERACT SPARTAN WICKED IOLITE GODOT LUMIX


Rendering
Scalar Cache Arrays Command Buffer Reverse-Z Hi-Z Indirect The Forge Diligent
Thread Pool(Threads, Pool) ParallelFor(More) Instancing Dynamic Tokens Jobs G-Buffers(Packing) PhysicsFS
Code(etc) mem Concurrency Primer(malloc queue serial) SSE(etc,more) SIMD Terathon(math,bit,float,more)

A Primer On Efficient Rendering Another More Deferred Texturing Visibility Buffer Alternative Efficient Renderer
Cull that Cone(Improved/Another) Optimizing tile-based light culling Optimisations Synchronization problems 
(Clustered is compatible/works with both Forward and Deferred, while being less prone to problems)


PBR
BRDF Another Approx. Models(Source Code) Advances Multiple Scattering GGX(Another)     
IBL Reflectance Model Combining Reflection and Diffraction Energy Compensation 
Combined Fresnel/Visibility(Source Code) Accurate Fresnel Solid Angle Area Lights(Scattered)

RAY MARCHING / TRACING
Real-Time Volumetric Lighting/Shadows(Demo,More,etc) Raymarch(etc) Pixel-Projected Reflections(Code) Hi-Z More 
Clustered Volumetric Fog(Vapor) SDFGI Volumetric Clouds(Another,More) Clouds Shadows
Shadows(Another) Hybrid Reflections(Another) Gems 1 2 Traversal Scatter Smoke ReSTIR Packets NanoRT
Raymarching is a 3d-rendering technique, praised by programming-enthusiasts for both its simplicity and speed. It has been used extensively in the demoscene, producing low-size executables and amazing visuals. Ray stream techniques augment the fast single-ray traversal with increased utilization of vector units and leverage memory bandwidth for batches of rays.
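For anyone who wants to try it, here is a minimal sketch of the core raymarching loop (sphere tracing against a signed distance function). The test scene (a single sphere), the function names, and the step/epsilon constants are all assumptions made for illustration; they are not taken from any of the linked articles.

```python
import math

def sdf_sphere(p, center=(0.0, 0.0, 3.0), radius=1.0):
    """Signed distance from point p to a sphere (hypothetical test scene)."""
    return math.dist(p, center) - radius

def raymarch(origin, direction, sdf, max_steps=64, eps=1e-4, max_dist=100.0):
    """Sphere tracing: advance along the ray by the distance the SDF reports.

    Returns the hit distance along the ray, or None if nothing was hit.
    `direction` is assumed to be normalized.
    """
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        d = sdf(p)
        if d < eps:          # close enough to the surface: report a hit
            return t
        t += d               # safe step: nothing is closer than d
        if t > max_dist:
            break
    return None

hit = raymarch((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), sdf_sphere)
miss = raymarch((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), sdf_sphere)
```

In a shader the same loop runs once per pixel, with the ray direction derived from the pixel coordinate.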

VOXELS
Voxel Cone Tracing(Another) Global Illumination Simplified Lightmaps Clipmaps async Marching cubes 
Vertex Pooling(Source Code) Storage More Sandbox Terrain Terrain Deformation   

PROCEDURAL
Tile-based WFC Noise Vertex Generation(Another) L-System inspired 
Studying and solving visual artifacts occurring when procedural texturing Filtering
Terrain synthesis using noise Dijkstra-based Terrain Generation
Real-Time Editing of Procedural Terrains Terrain Modelling from Feature Primitives Procedural placement of 3D objects
Generating, Animating, and Rendering Varied Individuals for Real-Time Crowds



Light/Scattering/Reflections/Ambient Obscurance-Occlusion/GI
(minimal)Precomputed Atmospheric Scattering Fog(More) Another in Screen Space Single Scattering 
Real-time dreamy Cloudscapes with Volumetric Raymarching
Real-time rendering of participating media, such as fog, is an important problem, because such media significantly influence the appearance of the rendered scene. A physically correct solution involves a costly simulation of a very large number of light-particle interactions, especially when considering multiple scattering.
This work briefly examines the existing solutions and then presents improved methods for real-time multiple scattering in quasi-heterogeneous media. Inherent visual artifacts are minimized with several techniques using analytically integrable density functions.
There are also several strategies to stylize volumetric single scattering, overcoming the difficulty that light shafts depend on the layout of an entire environment. These approaches are compatible with animated scenes and rely on very efficient solutions, which makes them ready to be used for real-time applications and enables a quick exploration of the various settings. The techniques are applied at a global scope – i.e., for the whole scene – but can also be used to make local changes to the scattering behavior.

xatlas Modern LightMapping(Source Code) Lightmap Compression Little Lightmap Tricks Bin Packing
Precomputed Global Illumination Shading with Dynamic Lightmaps 
The Baking Lab(Source Code) Alternative Irradiance Caching (Source Code)
Various tests and observations to find most effective methods.
The main challenge of compressing lightmaps is that often they have a wider range than regular diffuse textures. This range is not as large as in typical HDR textures, but it’s large enough that using regular LDR formats results in obvious quantization artifacts. Lightmaps don’t usually have high frequency details, they are often close to greyscale, and only have smooth variations in the chrominance.

Separable Subsurface Scattering(Source Code) Simple Interactive Extending Another(Code)
Subsurface Scattering-Based Object Rendering Techniques Directional Subsurface Scattering Wrap 
Two techniques to generate separable approximations of diffuse reflectance profiles to simulate subsurface scattering for a variety of materials using just two 1D convolutions. Separable models yield state-of-the-art results in less than 0.5 millisecond per frame, which makes high-quality subsurface scattering affordable even in the most challenging real-time contexts such as games, where every desired effect may have a budget of tenths of a millisecond.
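The "two 1D convolutions" idea can be sketched on the CPU as below: a horizontal pass followed by a vertical pass with the same 1D kernel. This only illustrates separability. The kernel weights are placeholders rather than a fitted diffusion profile, and the depth-aware, screen-space parts of the actual technique are left out.

```python
def convolve_rows(image, kernel):
    """Convolve every row with a 1D kernel, clamping at the borders."""
    h, w, r = len(image), len(image[0]), len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = sum(
                k * image[y][min(max(x + i - r, 0), w - 1)]
                for i, k in enumerate(kernel)
            )
    return out

def transpose(image):
    return [list(col) for col in zip(*image)]

def separable_blur(image, kernel):
    """Horizontal pass, then vertical pass: 2*N taps per pixel instead of N*N."""
    horizontal = convolve_rows(image, kernel)
    return transpose(convolve_rows(transpose(horizontal), kernel))

kernel = [0.25, 0.5, 0.25]                 # placeholder weights (assumption)
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 1.0                          # single bright pixel
blurred = separable_blur(image, kernel)
```

Blurring a single bright pixel gives the outer product of the kernel with itself, which is what a full 2D pass with that product kernel would produce.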

Min-Max Hi-Z Screen Space Planar Reflections Nonplanar Stochastic SSR Local Cubemaps Interior Mapping
Screen Space Reflections in Killing Floor 2(Source Code included)
Features screen space ray casting, backface depth buffer for per-pixel geometry thickness, distance attenuated pixel stride, rough/smooth surface reflection blurring, fully customizable for quality/speed and more.

GTAO(bitmask) Optimized SSAO SSDO(optimize) HBIL(Source Code)(More) LSAO/FFAO
Simplifying the rendering equation, they derive an ambient transfer function that expresses the response of the surface and its neighborhood to ambient lighting, taking into account multiple reflection effects. The ambient transfer function is built on the obscurances of the point. If we assume that the material properties are locally homogeneous and incorporate a real-time obscurances algorithm, then the proposed ambient transfer can also be evaluated in real-time. Their model is physically based and thus can not only provide better results than empirical ambient occlusion techniques at the same cost, but also reveals where tradeoffs can be found between accuracy and efficiency.



Occlusion Culling/Caching/Level-of-Detail(LOD)/Terrain Rendering
Approximate projected bounds(Another) Design Patterns Object Pool Sprite Batch Double Precision 
An Adaptive and Hybrid Approach to Revisiting the Visibility Pipeline
In computer graphics you very often want to know how big an object looks on screen, probably measured in pixels. Or at least you want an upper bound on the pixel coverage, because that allows you to choose an intelligent Level of Detail (LOD) for that object. For example, if a character or a tree covers only a couple of pixels on screen, you probably want to render it with less detail. One easy way to get an upper bound of the pixel coverage is to embed your object in a bounding box or sphere, then rasterize the sphere or box and count the number of pixels.
This requires complexity in your engine, and probably some delayed processing, as the result of that rasterization won't be immediately ready.
It would be cool if a tessellation shader or a geometry shader could tessellate or kill geometry on the fly based on the pixel coverage of the object, immediately.
The pixel coverage of a (bounding) sphere happens to have an analytic expression that can be evaluated with no more than one square root, so it's very compact.
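A sketch of one such closed form is below. It assumes a pinhole camera looking down the z axis, a sphere center already transformed to camera space, and a sphere entirely in front of the camera. The function names and the pixel conversion are my own assumptions, and this may not be character-for-character the expression used in the linked article.

```python
import math

def projected_sphere_area(center_cam, radius, focal_length):
    """Area covered by a sphere on an image plane at distance `focal_length`.

    `center_cam` is the sphere center in camera space. Assumes the sphere is
    entirely in front of the camera (z squared greater than radius squared).
    Uses a single square root.
    """
    x, y, z = center_cam
    r2 = radius * radius
    z2 = z * z
    l2 = x * x + y * y + z2
    return (-math.pi * focal_length ** 2 * r2
            * math.sqrt(abs((l2 - r2) / (r2 - z2))) / (r2 - z2))

def pixel_coverage(center_cam, radius, focal_length, viewport_height):
    """Convert to pixels, assuming the image plane spans [-1, 1] vertically."""
    half = viewport_height / 2.0
    return projected_sphere_area(center_cam, radius, focal_length) * half * half
```

For a sphere on the view axis at distance d this reduces to pi * fl^2 * r^2 / (d^2 - r^2), the area of the projected circle, which makes a handy sanity check. The result can then drive LOD selection directly.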

Two-Pass Occlusion Culling GPU-Based(Another) Frustum Culling(Another) Software Occlusion Vertex Discard 
Performing visibility determination in densely occluded environments is essential to avoid rendering unnecessary objects and achieve high frame rates. In this implementation, the image space Occlusion Culling algorithm is done completely in GPU, avoiding the latency introduced by returning the visibility results to the CPU. It utilizes the GPU rendering power to construct the Occlusion Map and then performs the image space visibility test by splitting the region of the screen space occludees into parallelizable blocks. This implementation is especially applicable for low end graphics hardware and the visibility results are accessible by GPU shaders. It can be applied with excellent results in scenes where pixel shaders alter the depth values of the pixels, without interfering with hardware Early-Z culling methods. They demonstrate the benefits and show the results of this method in real-time densely occluded scenes.

Efficient GPU Rendering A System for Rapid, Automatic Shader Level-of-Detail Geometry-Aware Portals
Explores a custom rendering architecture designed to efficiently render procedurally generated geometry.
Focus is on setting up compute shaders and optimizing batch counts by merging them into fewer material passes.
This approach aims to minimize memory allocation for the entire system.
http://graphics.cs.cmu.edu/projects/lodgen/

Height map compression techniques Terrain rendering Real-Time LOD Clipmaps
In practice, many applications handle the real-time rendering well with LOD schemes tailored to their needs.
In such cases, a compression method tied to a concrete LOD scheme is not feasible.
This method handles only the compression, so it can be used as a plug & play component in an existing real-time renderer. Its only job is to compress a block of terrain height samples sized 2^n x 2^n and to provide fast progressive decompression of its mip-maps, while respecting the maximum error bound at every mip-map. The source code of the method is written modularly, so that any representation of the height samples can be compressed - doubles, floats or even arbitrary structures. It is inspired by C-BDAM - the compression method is extracted from the LOD scheme and simplified.
This approach introduces heavy redundancy of the data - a block corresponding to a certain quadtree node contains simplified blocks of its children and all these blocks are stored separately. The reason why this approach is used is that the user can navigate to any area almost immediately - only the data needed for the scene has to be fetched, without having to reconstruct it by traversing from the root. Moreover, this approach enables the user to flexibly extend the terrain data by high-resolution insets.
This algorithm should be able to compress a regular square block of height samples and progressively decompress it in real time, from the smallest mip-map to the largest one. Apart from this, the algorithm should not in any way interfere with the rendering pipeline of the application.



Shadows/Shadow Mapping-Volumes
Virtual Shadow Maps Screen space shadows CHSS(Cascaded,Another) Real-time(fixes) Volumes
Soft bilateral filtering volumetric shadows(Code) Shadow Terminator Optimizations (Source Code)(More) 
This method improves the rendering performance of the Percentage Closer Soft Shadows method by exploiting the temporal coherence between individual frames: the costly soft shadow recalculation is saved whenever possible by storing the old shadow values in a screen-space History Buffer. By extending the shadow map algorithm with a so-called Movement Map, they can not only identify regions disoccluded by camera movement, but also robustly detect and update shadows cast by moving objects: only the shadows in the regions flagged this way have to be re-evaluated. This saves rendering time and doubles the soft shadow rendering performance in real-time 3D scenes with both static and dynamic objects.

Optimized Visibility Functions for Revectorization-Based Shadow Mapping(RBSM)(Source Code included)
Revectorization-based shadow mapping minimizes shadow aliasing by revectorizing the jagged shadow edges generated with shadow mapping, keeping low memory footprint and real-time performance for the shadow computation. However, the current implementation of RBSM is not so well optimized because its visibility functions are composed of a set of 43 cases, each one of them handling a specific revectorization scenario and being implemented as a specific branch in the shader.
Here, they take advantage of the shadow shape patterns to reformulate the RBSM visibility functions, simplifying the implementation of the technique and further providing an optimized version. Results indicate that this implementation runs faster than the original implementation, while keeping its same visual quality and memory consumption.

Non-Linearly Quantized Moment Shadow Maps(Source Code included) Moment-Based Methods
Moment Shadow Maps enable direct filtering to accomplish proper antialiasing of dynamic hard shadows. For each texel, the moment shadow map stores four powers of the depth in either 64 or 128 bits. After filtering, this information enables a heuristic reconstruction. However, the rounding errors introduced at 64 bits per texel necessitate a bias that strengthens light leaking artifacts noticeably. In this paper, they propose a non-linear transform which maps the four moments to four quantities describing the depth distribution more directly.
As a prerequisite for the use of its quantization schemes, they propose a compute shader that applies a resolve for a multisampled shadow map and a 9² two-pass Gaussian filter in shared memory. The quantized moments are written back to device memory only once at the very end. This approach makes the technique roughly as fast as Variance Shadow Mapping without any of its drawbacks.
Since hardware-accelerated bilinear filtering is incompatible with non-linear quantization, they employ blue noise dithering as an inexpensive alternative to manual bilinear filtering.



Animation/Physics
Execution Time Optimization Optimization Strategies box pruning Rigid Bodies and Contacts
Collision Detection Collision Culling Output Sensitive Cubemap based Generic GPU Mapping Errors 
Optimization of various collision detection algorithms using GPU etc and ways to deal with errors.

Forward And Backward Reaching Inverse Kinematics(FABRIK)(Source Code) GPU 
Skinned Animation Textures Fast Non-Uniform 3D Frame iqm Klein Implicit ACL REPLACE quaternions
This mesh editing technique allows users to produce visually pleasing deformations. The linearity of the underlying objective functional makes the processing very efficient and improves the effectiveness of deformable surface computation. The method offers the possibility to encode global topological changes of the shape with respect to local influence and allows animators to re-use the estimation parameterization.

Wave Particles Water using FFT Texture Distortion(Another) Water Simulation(More) GPU Fluid
Real-Time Screen Space Fluid Rendering with Scene Reflections Real-time Interactive Water Waves
To solve the singular problem of water waves obtained with the traditional model, a hybrid deep-shallow-water model is estimated by using an automatic coupling algorithm. It can handle arbitrary water depth and different underwater terrain. As a certain feature of coastal terrain, coastline is detected with the collision detection technology. Then, unnecessary water grid cells are simplified by the automatic simplification algorithm according to the depth. Finally, the model is calculated on CPU and the simulation is implemented on GPU.

Animated Foliage and Cloth Tearing(Cloth Shading) Deformable Snow Boolean operations
GPU Assisted Self-Collisions of Cloths Real Time Cloth Simulation with B-spline Surfaces Mass-spring
The GPU is very effective doing vector math and the vertex program is already looping through all of the vertices to convert them to screen space and send them to the fragment program. Before this we can simply add a value to these positions before sending them down the line.
By supplying properties for the material stiffness, wind direction, and wind speed, a more realistic look can be achieved, one that correlates more closely with what actually happens in nature. A second property, how firmly parts of an object are attached, could be supplied through vertex color.
The second scenario relates to vegetation. By instead building the bending properties into the shader you wouldn’t need any collision detection for each individual plant, instead just look at the distance between vertex and player and scale the bending based on this.
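Roughly, the per-vertex logic described above could look like the sketch below. It is written in Python for readability; in practice it would sit in the vertex program. Every parameter name, the sine sway, and the falloff shape are assumptions for illustration only.

```python
import math

def bend_vertex(position, player_pos, wind_dir, wind_speed, stiffness,
                attachment, time, bend_radius=1.5):
    """Offset one vertex for wind sway and player push.

    attachment: 0.0 = fixed (e.g. the root), 1.0 = free (e.g. a leaf tip),
                as could be painted into vertex color.
    stiffness:  0.0 = fully flexible, 1.0 = rigid.
    """
    flex = attachment * (1.0 - stiffness)

    # Wind: a simple sine sway along the wind direction.
    sway = math.sin(time * wind_speed + position[0] + position[2]) * flex
    offset = [w * sway for w in wind_dir]

    # Player push: bend away from the player, scaled by proximity.
    away = [p - q for p, q in zip(position, player_pos)]
    dist = math.sqrt(sum(a * a for a in away))
    if 0.0 < dist < bend_radius:
        push = (1.0 - dist / bend_radius) * flex
        offset = [o + a / dist * push for o, a in zip(offset, away)]

    return tuple(p + o for p, o in zip(position, offset))

rooted = bend_vertex((1.0, 2.0, 3.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                     2.0, 0.5, 0.0, 1.0)
pushed = bend_vertex((0.5, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                     0.0, 0.0, 1.0, 0.0)
```

No collision detection is involved: the only per-plant input is the distance between the vertex and the player.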

Simple But Effective Verlet Game Physics tinyphysicsengine qu3e 
Very easily implementable but effective method of doing physics for ropes, solid objects, chains, cloth, hair, characters jumping and sliding etc. Gives us big savings in performance too, since everything doesn't need to be physically accurate, just plausible enough.
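A minimal sketch of the idea: position Verlet integration plus a distance constraint solved by relaxation. Two particles joined by a stick stand in for one link of a rope or chain. The function names, time step, and iteration count are assumptions chosen for the example.

```python
def verlet_step(pos, prev, accel, dt):
    """Position Verlet: velocity is implied by (pos - prev)."""
    new = [p + (p - q) + a * dt * dt for p, q, a in zip(pos, prev, accel)]
    return new, pos

def satisfy_distance(a, b, rest_length):
    """Move two equal-mass particles so they end up rest_length apart."""
    delta = [bi - ai for ai, bi in zip(a, b)]
    dist = sum(d * d for d in delta) ** 0.5
    if dist == 0.0:
        return a, b
    correction = 0.5 * (dist - rest_length) / dist
    a = [ai + d * correction for ai, d in zip(a, delta)]
    b = [bi - d * correction for bi, d in zip(b, delta)]
    return a, b

# Two particles joined by a stick of length 1, falling under gravity.
gravity = [0.0, -9.81]
a, a_prev = [0.0, 0.0], [0.0, 0.0]
b, b_prev = [1.5, 0.0], [1.5, 0.0]
for _ in range(10):
    a, a_prev = verlet_step(a, a_prev, gravity, 1.0 / 60.0)
    b, b_prev = verlet_step(b, b_prev, gravity, 1.0 / 60.0)
    for _ in range(4):  # a few relaxation iterations
        a, b = satisfy_distance(a, b, 1.0)
```

Because velocity is never stored, moving a particle to satisfy a constraint automatically changes its implied velocity, which is why the method stays so simple.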

JoltPhysics PositionBasedDynamics PhysX TressFX Newton OPCODE



Audio
Steam Audio - (Requires a License but Free - Valve Corporation) - HRTF etc

mojoAL
An SDL2-based implementation of OpenAL in a single C file.

Efficient Approximation of HRTF in Subbands for Accurate Sound Localization(Source Code)
Prioritized Computation for Numerical Sound Propagation
Results indicate that the proposed algorithms preserve the salience of spatial cues, even for relatively high approximation tolerances, yielding computationally very efficient implementations.



Anti-Aliasing
Bandlimited Pixel Filtering Texture Filtering Downsampling Fast Denoising
Ways to correctly filter pixel art. Goal is to preserve the pixellated nature of the textures, yet have an alias-free result. Takes a signal processing approach to the problem, solving the issue in a mathematically sound way.

Improved Geometric Specular Antialiasing(IGSA) Supplemental
Specific ways to adjust Lighting model and NDF Filtering and few more things in pixel shader.
Limited to specular aliasing mostly.

Temporal AA TRAA(Source Code) Improved Sampling 
Temporal techniques attempt to achieve supersampling by distributing the computations across multiple frames, while addressing all forms of aliasing. Very effective technique but needs a lot of adjustment to avoid artifacts.

QXAA CMAA2(Source)
A very efficient GPU MLAA based implementation, capable of handling subpixel features seamlessly, and featuring an improved and advanced pattern detection & handling mechanism.

Alpha to Coverage (Alpha Mipmaps) Anti-aliased Alpha Test
Efficient Dithering(Another) r11g11b10f Improved box/triangle filtering 
They solve the problems with alphas by sampling the alpha and interpreting how much it covers the pixel, dithering and distributing the result to an appropriate number of multisample samples.



Shaders/Effects
ReShade (Shaders) Shader Minifier(Source Code)(compiled exe) More XeSS FSR

Optimization Tips Advanced tricks

Low-level Thinking(Optimization)(More) Vertex Shader Tricks Compute Shaders(More/Optimize) Specialization

Variable Rate Shading(Another Implementation)(Another)(Another)(More)
Due to complex shaders and high-resolution displays (particularly on mobile graphics platforms), fragment shading often dominates the cost of rendering in games. To improve the efficiency of shading on GPUs, they extend the graphics pipeline to natively support techniques that adaptively sample components of the shading function more sparsely than per-pixel rates.
Extensive study of the challenges of integrating adaptive, multi-rate shading into the graphics pipeline, and evaluation on implementations that they believe are practical evolutions of modern GPU designs.

Particles Optimization Rendering Particles with Compute Shaders Using 3D Vector Fields Particle System API
GPU-based particle simulation(Source Code included) Noise
Particle Simulation with GPUs GParticles(Source Code)
Particle systems and particle effects are used to simulate a realistic and appealing atmosphere in many virtual environments. However, they do occupy a significant amount of computational resources. The demand for more advanced graphics increases with each generation; likewise, particle systems need to become increasingly detailed.
This thesis proposes a texture-based 3D vector field particle system, computed on the GPU, and compares it to an equation-based particle system.
Several tests were conducted comparing different situations and parameters. All of the tests measured the computational time needed to execute the different methods.

High Dynamic Range Imaging Pipeline on the GPU Dynamic Local Exposure(for tonemapping) Gamma Correction
In this article they aim to fill a gap of providing a detailed description of how the HDRI pipeline, from HDR image assembly to tone mapping, can be implemented exclusively on the GPU. They also explain the trade-offs that need to be made for improving efficiency and show timing comparisons for CPU vs GPU implementations.
Another goal of this paper is to demonstrate how both the global and local versions of this operator can be efficiently implemented by using fragment shaders. Different from previous work, they show that the implementation of this operator neither requires expensive convolution nor Fourier transform operations to compute local adaptation luminances.



Meshes
meshoptimizer(Optimize meshes) Seam-aware Decimater Connected Triangles Quadratic Error Metric(Another)
Dynamic Vertex Formats Simplified and Tessellated Mesh 3D Mesh Simplification Quad   
Various methods of mesh simplification and efficient rendering.

simple Catmull-Clark in parallel GPU Tessellation with Compute Shaders(Source Code)
Feature-Adaptive Rendering of Loop Subdivision Surfaces on Modern GPUs Convex Hull Problems
GPU-based refinement scheme that is free from the limitations incurred by tessellation shaders. Specifically, the scheme allows arbitrary subdivision levels at constant memory costs. This is achieved by manipulating an implicit (triangle-based) subdivision scheme for each polygon of the scene in a dedicated compute shader that reads from and writes to a compact, double-buffered array. Performance of the implementation is both fast and stable. Naturally, the average GPU rendering time depends on how the terrain is shaded.

Geometry Batching Using Texture-Arrays
Batching can be used to group and sort geometric primitives into batches to reduce the number of required state changes, whereas the size of the batches determines the number of required draw-calls, and therefore, is critical for rendering performance.
For example, in the case of texture atlases, which provide an approach for efficient texture management, the batch size is limited by the efficiency of the texture-packing algorithm and the texture resolution itself.
This paper presents a pre-processing approach and rendering technique that overcomes these limitations by further grouping textures or texture atlases and thus enables the creation of larger geometry batches. It is based on texture arrays in combination with an additional indexing schema that is evaluated at run-time using shader programs.
Basically, it facilitates a flexible partitioning of geometry.



Textures
Bindless Descriptors Templates Another(Source Code) Virtual Texturing(Source Code)
The Implementation of a Scalable Texture Cache(Source Code) Incremental loading of terrain textures min-max mip
MinLod Mipmap(Source Code)
Virtual texturing is a solution to the problem of real-time rendering of scenes with vast amounts of texture data which does not fit into graphics or main memory. Virtual texturing works by preprocessing the aggregate texture data into equally-sized tiles and determining the necessary tiles for rendering before each frame. These tiles are then streamed to the graphics card and rendering is performed with a special virtual texturing fragment shader that does texture coordinate adjustments to sample from the tile storage texture.
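The "texture coordinate adjustment" step can be sketched as a page-table lookup like the one below. The dictionary page table, the parameter names, and the missing-tile behaviour are assumptions for illustration. Tile borders, mip levels, and the feedback pass that decides which tiles to stream are left out.

```python
def virtual_to_physical(u, v, page_table, pages_per_side, cache_tiles_per_side):
    """Translate a virtual UV in [0, 1) into a UV inside the physical tile cache.

    page_table maps a virtual page (px, py) to the (cx, cy) slot of a resident
    tile in the cache texture. Returns None when the tile is not resident
    (a real system would fall back to a coarser mip and request the tile).
    """
    px, py = int(u * pages_per_side), int(v * pages_per_side)
    slot = page_table.get((px, py))
    if slot is None:
        return None
    # Fractional position inside the virtual page.
    fu = u * pages_per_side - px
    fv = v * pages_per_side - py
    cx, cy = slot
    return ((cx + fu) / cache_tiles_per_side, (cy + fv) / cache_tiles_per_side)

page_table = {(1, 0): (0, 0)}   # virtual page (1, 0) lives in cache slot (0, 0)
resident = virtual_to_physical(0.375, 0.125, page_table, 4, 2)
missing = virtual_to_physical(0.9, 0.9, page_table, 4, 2)
```

In the fragment shader the page table is typically itself a small texture, so the lookup costs one extra fetch per sample.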

bcdec rgbcx.h betsy RGBV Real-time BC6H Compression on GPU fpng
Decompress/compress various texture formats.

Normal Mapping Using the Surface Gradient(paper) For Triplanar Shader Without Precomputed Tangents 
More efficient forms of Normal Maps.

Contact Refinement Parallax Mapping[o3de]
Clear and simple explanation how it works in the article. Parallax techniques for some reason were often really obscure and badly explained in the past, also it was hard to see clear difference that was worth it. This is essentially another revision of how intersection is refined with additional steps that avoid issues with earlier techniques and normal mapping. Has similar performance as other methods but superior in quality.

Bindless Deferred Decals Texture-space Decals Another
There are a few ways to do decals, which are used in games to draw images onto other surfaces, but most of them have different tradeoffs. Rendering into texture space is one of them.

Height-blending Using Lerp in Delta-Time Fixing Texture Seams 
Way to blend between textures, most common example of this is terrain. Explains an effect where we can use additional lerp interpolation and height data to control exactly where the blending should occur.
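One commonly seen formulation of height-based blending is sketched below. It is not necessarily the exact formula from the linked articles. The `depth` parameter and all names are assumptions: `t` is the usual lerp factor and the heights come from each texture's height map.

```python
def height_blend(color_a, height_a, color_b, height_b, t, depth=0.2):
    """Blend two texture samples, letting the taller surface win near the
    transition. t = 0 gives all A, t = 1 gives all B; `depth` controls how
    wide the blended band is.
    """
    ha = height_a + (1.0 - t)
    hb = height_b + t
    top = max(ha, hb) - depth
    wa = max(ha - top, 0.0)      # weight of A: how far it pokes above the band
    wb = max(hb - top, 0.0)
    return tuple((ca * wa + cb * wb) / (wa + wb)
                 for ca, cb in zip(color_a, color_b))

red, green = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
only_a = height_blend(red, 0.5, green, 0.5, 0.0)
tall_a = height_blend(red, 0.9, green, 0.1, 0.5)   # plain lerp would give 50/50
```

With a plain lerp the halfway point is always a 50/50 mix. Here the material that is taller at that texel dominates, so sand fills the cracks between stones instead of fading evenly over them.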



AI/Scripting
Generic A* in C
Compromise-free Pathfinding on a Navigation Mesh(Source Code) Navigation Mesh Generator
Adaptive Layered Goal Oriented Action Planning(GOAP)(Source Code)
Dynamic and Robust Local Clearance Triangulations
A optimization of A* algorithm to make it close to human pathfinding behavior
Time-Bounded Best-First Search for Reversible and Non-reversible Search Graphs
GOAP refers to a simplified STRIPS-like planning architecture specifically designed for real-time control of autonomous character behavior in games, with the aim of creating the most dynamic AI.
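To make "STRIPS-like" concrete, here is a toy planner: actions have preconditions and effects, and a search finds a sequence of actions that reaches the goal. This is only a sketch. It uses plain breadth-first search instead of the cost-driven A* that GOAP implementations normally use, effects only add facts, and the action set is made up.

```python
from collections import deque

# Each action: (name, preconditions, effects). A world state is a set of facts.
ACTIONS = [
    ("get_axe",   frozenset(),             frozenset({"has_axe"})),
    ("chop_wood", frozenset({"has_axe"}),  frozenset({"has_wood"})),
    ("make_fire", frozenset({"has_wood"}), frozenset({"warm"})),
]

def plan(start, goal, actions=ACTIONS):
    """Breadth-first search over world states. Returns the shortest list of
    action names that makes every goal fact true, or None if unreachable."""
    start, goal = frozenset(start), frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:
            return steps
        for name, pre, effects in actions:
            if pre <= state:                 # action is applicable
                nxt = state | effects
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [name]))
    return None

steps = plan(set(), {"warm"})
```

The character is never told the sequence explicitly. It falls out of the preconditions, which is what makes the behavior adapt when the world state changes.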




State-of-the-Art/Comparisons/Roundups/Surveys/Analysis
Software optimization resources
Optimization Techniques for 3D Graphics Deployment More Debris: Opening the box
Rendering massive 3D scenes in real-time
Software-based approximate computing for mathematical functions
Bounding volume hierarchy
Specialization Opportunities in Graphical Workloads
Continuity and Interpolation Techniques More in Detail(Code/Samples/Algorithms)
Feature Aware Sampling and Reconstruction
Theory and Numerical Integration of Subsurface Light Transport
Real-Time Rendering Fourth Edition, Real-Time Ray Tracing
Hardware Accelerators for Animated Ray Tracing
Variance of integral approximation methods in ray tracing
Global Illumination in Participating Media
Scalable Algorithms for Height Field Illumination
Ambient Occlusion on Mobile: an empirical comparison (Source Code - last pages)
Transparency and Anti-Aliasing Techniques for Real-Time Rendering
Filtering Approaches for Real-Time Anti-Aliasing
Algorithms for Efficient Computation of Convolution
Kernel optimization by layout restructuring
A Bigger Mathematical Picture for Computer Graphics Intersection
3D mesh compression: survey, comparisons and emerging trends
On Some Interactive Mesh Deformations
Adaptive Physically Based Models in Computer Graphics
Efficient encoding of texture coordinates guided by mesh geometry
Fundamental computational geometry on the GPU
Real-time Rendering Techniques with Hardware Tessellation
Combining displacement mapping methods on the GPU for real-time terrain visualization
Comparison of spherical cube map projections used in planet-sized terrain rendering
Course/Book/Presentations that Provide Useful Info/Analyses of most of the most useful Shadow Map/Shadow Volume, Hard/Soft/Volumetric Shadow Techniques Shadow Mapping Algorithms
An evaluation of moving shadow detection techniques
A Comprehensive Study on Pathfinding Techniques for Robotics and Video Games





??? ... hmm
Radiosity, GPU
Radiosity for Real-Time Simulations of Highly Tessellated Models
Real-Time Dynamic Radiosity for High Quality Global Illumination Large Scale Scenes
Techniques based around Radiosity that provide unique advantages.

Intrinsic Triangulations
Different way to represent the geometry of a triangle mesh: by edge lengths rather than vertex positions. This should avoid a whole set of problems that can appear when adapting meshes from various sources for your game. After the last step you end up with regular, ready-to-use meshes that have far fewer problems.

Importance Sampling for Lights(MIS) Generalized A Fresh Look at Generalized Sampling Path Space Filtering
Combining Reprojection and Adaptive Sampling Forced Random Sampling (Code)
Low-Discrepancy Blue Noise Filtering Animating Noise Non-Linear Transfer Functions etc
It decomposes a filter into two parts: a compactly supported continuous-domain function and a digital filter. This broadly summarizes the key aspects of the framework, and delves into specific applications in graphics. Using new notation, concisely presents and extends several key techniques.
In addition, demonstrates benefits for prefiltering in image downscaling, supersample-based rendering, and analyzes the effect that generalized sampling has on noise.

Acceleration OIT(Source Code) Depth Sorting OIT_Optimized with MSAA with Linked List
Real-Time Deep Image Rendering and Order Independent Transparency Guarded
Memory-Efficient Order-Independent Transparency with Dynamic Fragment Buffer Faster Transparency(Source Code)
Many recent graphics hardware features, namely atomic operations and dynamic memory location writes, now make it possible to capture and store all per-pixel fragment data from the rasterizer in a single pass. A core and driving application is order-independent transparency(OIT).
A number of image sorting improvements are presented, significantly advancing the ability to perform transparency rendering in real time.

A Temporal Stable Distance To Edge Anti-aliasing
Improved Geometry Buffer Anti-Aliasing(GBAA+)(Source Code)
Triangle-based Geometry Anti-Aliasing(TGAA)
The implementation can, without any sub-pixel information and by storing extra geometrical data in a pre-render pass, prevent temporal instability and solve aliasing artifacts during a post-render pass. This makes it a real alternative to the state-of-the-art post-processing Anti-Aliasing solutions, in terms of performance and quality, in high-end game engines and systems.
Reliance on hardware features for solving triangle edges can easily be removed from the solution making it implementable on a large variety of hardware. If this is the case, prototype 1 can be an excellent complement to Anti-Aliasing solutions such as Multi Sampling which can not solve alpha clipped edges.

AABO Zero-byte AABB-trees Dynamic BVH TSS BVH Another Hashed Shading Octree Quadtree
Generic Hybrid CPU-GPU Parallelization Dynamic Data Structures for Scheduling

A Modification of the Fast Inverse Square Root Algorithm A Non-linear GPU Thread Map for Triangular Domains
There is a stage in the GPU computing pipeline where a grid of thread-blocks, in parallel space, is mapped onto the problem domain, in data space. Threads that fall inside the domain perform computations while threads that fall outside are discarded at runtime.
In this work they study the case of mapping threads efficiently onto triangular domain problems and propose a block-space linear map λ(ω), based on the properties of the lower triangular matrix, that reduces the number of unnecessary threads from O(n^2) to O(n).
This study is about the performance of algorithms with a similar purpose to the Carmack and Lomont implementation of the inverse square root, using three iterations of the Newton-Raphson method and the magic number “0x5f3759df”.
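For reference, the classic bit trick behind that magic number can be sketched as follows. The integer step works on the 32-bit float representation, while the Newton-Raphson refinement here runs in Python's double precision, so results will differ slightly from a pure 32-bit C version. The function name and the `iterations` parameter are my own.

```python
import struct

def fast_inv_sqrt(x, iterations=1):
    """Approximate 1/sqrt(x) for positive x using the 0x5f3759df bit trick
    on a 32-bit float, refined with Newton-Raphson steps."""
    i = struct.unpack("<I", struct.pack("<f", x))[0]   # float bits as uint32
    i = 0x5F3759DF - (i >> 1)                          # initial guess
    y = struct.unpack("<f", struct.pack("<I", i))[0]   # back to float
    for _ in range(iterations):
        y = y * (1.5 - 0.5 * x * y * y)                # one Newton-Raphson step
    return y
```

One iteration already gets within a fraction of a percent. Each extra iteration roughly squares the relative error, which is the accuracy/speed tradeoff such studies measure.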

NNAO(Source Code) BNAO ESRGAN
Various implementations of techniques based around AI, Neural Networks. Mostly generated ahead as assets to use ingame.

DOD based ECS
The use of object oriented paradigms to model simulation objects in class hierarchies has been reported as incompatible with constantly changing demands during game development, resulting in anti-patterns and eventual, messy re-factoring. With the explicit goals to be simple, inherently compatible with data oriented design, this thesis describes the development of an architecture of software to manage large amounts of simulation objects in real-time while dealing with “crosscutting concerns” between subsystems.

Point Clouds
The basic idea is to spawn a compute shader that transforms points to screen space, encodes depth and color into a single 64-bit integer, and uses atomicMin to compute the closest point for each pixel. The color value is then extracted from the interleaved depth+color buffer and converted into a regular OpenGL texture for display. This allows several batch-level optimizations such as frustum culling, LOD rendering, and adaptive precision. Adaptive precision picks a sufficient coordinate precision (typically just 10 bits per axis) depending on the projected batch size, which boosts brute-force performance due to lower memory bandwidth requirements.
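The depth+color packing can be sketched on the CPU in Python as follows. This is an illustration under assumptions, not the original shader: depth is taken to be a positive float stored in the high 32 bits (positive IEEE floats order the same way as their bit patterns), color is a 32-bit RGBA value in the low 32 bits, and a plain `min` stands in for the GPU's atomicMin. The names `pack_point` and `render` are made up for this example.

```python
import struct

EMPTY = 0xFFFFFFFFFFFFFFFF  # larger than any packed point

def pack_point(depth, rgba):
    """Depth bits in the high 32 bits, color in the low 32 bits."""
    depth_bits = struct.unpack("<I", struct.pack("<f", depth))[0]
    return (depth_bits << 32) | (rgba & 0xFFFFFFFF)

def render(points, width, height):
    """points: iterable of (x, y, depth, rgba) already in screen space."""
    buffer = [EMPTY] * (width * height)
    for x, y, depth, rgba in points:
        if 0 <= x < width and 0 <= y < height:
            idx = y * width + x
            # Stand-in for atomicMin: the smallest packed value is the closest point.
            buffer[idx] = min(buffer[idx], pack_point(depth, rgba))
    # Resolve pass: strip the depth bits and keep only the color.
    return [None if v == EMPTY else v & 0xFFFFFFFF for v in buffer]

image = render(
    [(0, 0, 5.0, 0xFF0000FF), (0, 0, 2.0, 0x00FF00FF), (1, 0, 3.0, 0x0000FFFF), (9, 9, 1.0, 0x1)],
    width=2, height=1,
)
```

Because depth occupies the most significant bits, a single integer comparison does the depth test and carries the color along with it.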

Fast Data Parallel Radix Sort Implementation by Avoiding Zero Bits Based on Divide and Conquer Technique Another
The algorithms use several optimization techniques to take advantage of the hardware architecture:
a kernel fusion strategy; the synchronous execution of threads in a warp/wavefront, which removes the need for barrier synchronization; shared memory across the threads of a group; management of bank conflicts; elimination of divergence by avoiding branch conditions and fully unrolling loops; group/thread dimensions chosen to increase hardware occupancy; and highly data-parallel algorithms to accelerate the scan operations.
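As a rough serial illustration of the building blocks (histogram, exclusive scan, stable scatter), here is an LSD radix sort in Python. The zero-bit skipping shown, stopping once every remaining high bit of every key is zero, is only one plausible reading of the title and is an assumption, not necessarily the method of the linked work. The name `radix_sort` and the `bits_per_pass` parameter are hypothetical.

```python
def radix_sort(keys, bits_per_pass=4):
    """LSD radix sort for non-negative integers; skips passes over all-zero high bits."""
    keys = list(keys)
    if not keys:
        return keys
    radix = 1 << bits_per_pass
    mask = radix - 1
    # Bits above the highest set bit of the largest key are zero in every key.
    used_bits = max(keys).bit_length()
    shift = 0
    while shift < used_bits:
        # Histogram of the current digit.
        counts = [0] * radix
        for k in keys:
            counts[(k >> shift) & mask] += 1
        # Exclusive prefix sum (the scan step that GPUs parallelize).
        offsets = [0] * radix
        total = 0
        for d in range(radix):
            offsets[d] = total
            total += counts[d]
        # Stable scatter into the output array.
        out = [0] * len(keys)
        for k in keys:
            d = (k >> shift) & mask
            out[offsets[d]] = k
            offsets[d] += 1
        keys = out
        shift += bits_per_pass
    return keys

sorted_keys = radix_sort([170, 45, 75, 90, 802, 24, 2, 66])
```

With 4 bits per pass, 32-bit keys that never exceed 802 need only 3 passes instead of 8.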

Efficient FFT Algorithms Reduction Stockham radix-4 
Convolution is a mathematical tool used in filtering, correlation, compression and many other applications. Although the concept of convolution is not new, its efficient computation is still an open topic. As the volume of data constantly increases, so does the demand for fast manipulation of large data sets.
The fast convolution is computed recursively when one new signal sample, or a small new portion of samples, appears in the given period N of a realization x(n) and replaces the oldest sample or portion of samples, respectively. The original recursive expression substantially reduces the number of operations compared with the ordinary FFT procedure, which is suited only to the case of fixed sample values.
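One standard example of this kind of recursive update is the sliding DFT, sketched below in Python. It is offered as an illustration of the idea (update the spectrum when one sample leaves the window and one enters) and is an assumption about what the linked paper generalizes, not its actual formula. The names `dft` and `slide` are made up here.

```python
import cmath

def dft(window):
    """Direct N-point DFT, used only as a reference."""
    n = len(window)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * m / n) for m, x in enumerate(window))
        for k in range(n)
    ]

def slide(spectrum, oldest, newest):
    """Update an N-point DFT after the window drops `oldest` and appends `newest`."""
    n = len(spectrum)
    # X_k(new) = (X_k(old) - oldest + newest) * exp(+2*pi*i*k/N): O(N) work per new sample.
    return [
        (X - oldest + newest) * cmath.exp(2j * cmath.pi * k / n)
        for k, X in enumerate(spectrum)
    ]

signal = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0]
N = 4
spectrum = dft(signal[0:N])
spectrum = slide(spectrum, oldest=signal[0], newest=signal[N])
expected = dft(signal[1:N + 1])
```

The update costs O(N) per incoming sample, while recomputing the transform from scratch with an FFT costs O(N log N).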

Last edited by ThaOneDon (2024-03-02 04:14:15)

Offline

#2 2014-11-06 08:05:09

ImNotQ009
Moderator

Re: tech thread

We already have parallax occlusion mapping

Offline

#3 2014-11-06 11:10:13

ThaOneDon
Member

Re: tech thread

I guess I should always take a look at the Cube engine's docs first before making suggestions?

:)

Still SSDO is quite interesting...

Offline

#4 2014-11-09 06:11:26

ThaOneDon
Member

Re: tech thread

Update 7

Offline

#5 2014-11-09 10:35:48

Calinou
Moderator

Re: tech thread

Now, add all this by yourself.

Last edited by Calinou (2014-11-09 10:35:52)

Offline

#6 2014-11-09 15:54:16

ThaOneDon
Member

Re: tech thread

Wish it was that easy. Still, there's tons of work to look forward to.
I hope all of this is useful.

Last edited by ThaOneDon (2016-01-29 16:23:36)

Offline

#7 2014-11-10 08:52:57

ThaOneDon
Member

Re: tech thread

Update 9

Offline

#8 2014-11-11 04:32:16

ThaOneDon
Member

Re: tech thread

I'm absorbing more research papers at the moment.

UPDATE: Added UPDATE 10

Last edited by ThaOneDon (2016-08-29 05:51:46)

Offline

#9 2014-11-11 07:04:44

eihrul
Administrator

Re: tech thread

Unless you've actually read all the stuff in here and can give actual descriptions of why each of these papers is individually interesting, I am going to have to delete this thread as spam.

Offline

#10 2014-11-11 07:34:39

ThaOneDon
Member

Re: tech thread

OK. It's going to take some time though, there's a lot to cover.

Last edited by ThaOneDon (2016-08-29 05:51:17)

Offline

#11 2014-11-12 11:47:47

noman222
Member

Re: tech thread

Tesseract is a great game, and bots are fun, but what's an online based game without a good sized player base?
A game like this would have a good chance to fly if it got into steam greenlight. Add the facts: that it's free, include steam workshop support for sharing maps and making mods, it's a tribute to old school gaming (a bit), and there'll be an awesome player base for lots of fun.
What's more, we'll be tapping into a huge source of ideas and good map designers, and, if it's not too hard, let tesseract use steam servers for multiplayer.
What do you think? Is it worth a try?







_______________________
Noman

Offline

#12 2014-11-12 11:57:10

spikeymikey0196
Member

Re: tech thread

noman222 wrote:

Tesseract is a great game, and bots are fun, but what's an online based game without a good sized player base?
A game like this would have a good chance to fly if it got into steam greenlight. Add the facts: that it's free, include steam workshop support for sharing maps and making mods, it's a tribute to old school gaming (a bit), and there'll be an awesome player base for lots of fun.
What's more, we'll be tapping into a huge source of ideas and good map designers, and, if it's not too hard, let tesseract use steam servers for multiplayer.
What do you think? Is it worth a try?







_______________________
Noman

As much as I get what you're saying, you have to realise that Tesseract isn't anywhere near ready to be put onto Greenlight. The developers know this, otherwise it would be on there.
You're also missing the fact that this is in very early stages and it would take ages to get it to actually be successful on Greenlight. There's just too little at the moment to work with. Although the userbase does need to excel, it won't happen just yet :3

Offline

#13 2014-11-12 17:29:48

ThaOneDon
Member

Re: tech thread

There are a few conditions needed for that to happen.

Greenlight also implies a short timeframe to make the game, and that of course would put stress on development.

Right now the engine/game is moving at a steady and precise pace. That's what I, and I'm sure the team, want: to make small but meaningful changes.

If anyone is willing to use the engine to make something interesting for Steam Greenlight, license-wise it shouldn't be a problem. Don't use the stuff from "media", everything else is A-OK.

:)

Offline

#14 2014-11-18 15:11:02

ThaOneDon
Member

Re: tech thread

Massive Updates 18/11/2014

Offline

#15 2014-11-19 11:22:58

ThaOneDon
Member

Re: tech thread

More Updates 19/11/2014

Tech
*Line Space Gathering for Single Scattering in Large Scenes
*ManyLoDs: Parallel Many-View Level-of-Detail Selection for Real-Time Global Illumination
*Improving Performance and Accuracy of Local PCA

Performance saving
*Importance Caching for Complex Illumination
*Fast Parallel GPU-Sorting Using a Hybrid Algorithm

Shaders
*3D Unsharp Masking for Scene Coherent Enhancement
*Precision Selection for Energy-Efficient Pixel Shaders
*Bidirectional Light Transport with Vertex Merging

Offline

#16 2014-11-20 14:41:33

ThaOneDon
Member

Re: tech thread

Updates 20/11/2014

Tech
*Sample Distribution Shadow Maps
*Depth Interval Grid Displacement Mapping
*Frostbite Engine Tech (incredibly advanced and performance friendly)

Performance Saving
*Parallel View-Dependent Level-of-Detail Control
*Efficient Interactive Rendering of Detailed Models with Hierarchical Levels of Detail

Last edited by ThaOneDon (2014-11-20 16:57:07)

Offline

#17 2014-11-20 16:13:34

spikeymikey0196
Member

Re: tech thread

Gonna add slightly to the list with this:
DOT Engine AI: https://github.com/MatrixCompSci/DOT

Offline

#18 2014-11-21 06:29:12

ThaOneDon
Member

Re: tech thread

Updates 21/11/2014

Tech
*Deep Opacity Maps

Performance Saving
*Frame Sequential Interpolation for Discrete Level-of-Detail Rendering

Shaders
*An Optimizing Compiler for Automatic Shader Bounding

Last edited by ThaOneDon (2014-11-21 06:47:16)

Offline

#19 2014-11-22 05:53:35

ThaOneDon
Member

Re: tech thread

Updates 22/11/2014

Tech
*PMAO (Photometric Ambient Occlusion)
*C-BDAM - Compressed Batched Dynamic Adaptive Meshes for Terrain Rendering
*Tile-Trees

Performance Saving
*Tuning Catmull-Clark Subdivision Surfaces (OpenSubDiv is based on these)
*An Interactive Perceptual Rendering Pipeline using Contrast and Spatial Masking

Shaders
*Implementing the Render Cache and the Edge-and-Point Image

Last edited by ThaOneDon (2014-11-22 07:09:15)

Offline

#20 2014-11-22 11:50:34

RaZgRiZ
Moderator

Re: tech thread

Maybe you should spend some time to categorize all of them and make the list more readable. It's a total mess.

Offline

#21 2014-11-22 18:51:54

ThaOneDon
Member

Re: tech thread

I'll see what I can do with the stuff that's related, but a lot of it isn't, so there's really no good way of categorizing it.

Offline

#22 2014-11-22 21:55:07

RaZgRiZ
Moderator

Re: tech thread

ThaOneDon wrote:

I'll see what I can do with the stuff that's related, but a lot of it isn't, so there's really no good way of categorizing it.

At least make it prettier. Hide the links inside the URL tag and use just the text instead. That's one way to do it.

don't click me

Last edited by RaZgRiZ (2014-11-22 21:55:38)

Offline

#23 2014-11-23 01:18:46

ThaOneDon
Member

Re: tech thread

Working on it

DONE

Last edited by ThaOneDon (2014-11-23 08:31:51)

Offline

#24 2014-11-23 13:28:08

RaZgRiZ
Moderator

Re: tech thread

ThaOneDon wrote:

Working on it

DONE

Err, add a little spacing too and some title sizing. It's not mentioned, but I think this forum supports basic BBCode, so it should be possible.

Offline

#25 2014-11-23 22:49:25

ThaOneDon
Member

Re: tech thread

Done

Updates 24/11/2014
Tech
*Highlight Microdisparity for Improved Gloss Depiction
*Implicit Skinning: Real-Time Skin Deformation with Contact Modeling

Last edited by ThaOneDon (2014-11-24 01:57:35)

Offline
