#1 2014-11-06 07:48:37

ThaOneDon
Member

tech thread

Updated from time to time with tons of tips and techniques, not just for messing around in Tesseract but for graphics in general.
Resource thread for anyone who wants to experiment with stuff.



Inside Supraleiter Uncharted 4 Far Cry 5 Witcher 3 Titanfall Doom
jcgt.org GPU Zen(Source Code) The ryg’s The Danger Zone

CRYENGINE(License) UNREAL ENGINE(License) UNITY(License)
SOURCE SDK(License) X-RAY(License) ID TECH 3 ID TECH 4+(Licenses/GPL)

ANKI3D TESSERACT WICKED GODOT LUMIX The Forge Diligent



Rendering/Shading
Reducing Driver Overhead code Scalar Concurrency Multi-Core(Cache) SSE(etc) math(bit) limits queue
Dynamic Resources Command Buffer ParallelFor Indirect Rendering Checkerboard Rendering
Reversed-Z Logarithmic Hi-Z G-Buffers(Packing) Triangles(Another)

- Clustered, Forward+, Deferred -
A Primer On Efficient Rendering Forward Clustered Shading Another Triangle Visibility Buffer Simple Alternative
Cull that Cone(Another) Optimizing tile-based light culling Optimisations(More) 
(Clustered is compatible/works with both Forward and Deferred, while being less prone to problems)


PBR
PBR(Source Code) Approximate Models For Physically Based Rendering(Source Code) Multiple Scattering   
Reflectance Model Combining Reflection and Diffraction Distant Lighting Energy Compensation optimization   
Combined Fresnel/Visibility(Source Code) Area Lights(Another) Sampling, GGX Importance Sampling

RAY MARCHING
Volumetric Lights Raymarching(Another) shader
Pixel-Projected Reflections(Implementation)(Source Code) Screen Space HIZ Tracing(Reflections)
Clustered Volumetric Fog Vapor Volumetric Fog Volumetric Clouds(Another) Tiny Clouds Shadows Occlusion 
Raymarching is a 3D rendering technique, praised by programming enthusiasts for both its simplicity and speed.
It has been used extensively in the demoscene, producing small executables and amazing visuals.
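
To show how simple the core loop is, here is a minimal sphere-tracing sketch in Python. It is not taken from any of the linked resources; the scene (a single sphere), the function names and the step limits are all assumptions chosen for illustration.

```python
import math

def sphere_sdf(p, center=(0.0, 0.0, 5.0), radius=1.0):
    """Signed distance from point p to a sphere (negative inside). Hypothetical scene."""
    return math.dist(p, center) - radius

def raymarch(origin, direction, sdf, max_steps=64, max_dist=100.0, eps=1e-4):
    """Sphere tracing: advance along the ray by the distance the SDF reports.

    direction is assumed to be normalized.
    Returns the hit distance along the ray, or None on a miss.
    """
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < eps:          # close enough to the surface: report a hit
            return t
        t += dist               # safe step: nothing is closer than dist
        if t > max_dist:
            break
    return None
```

A shader version does the same loop per pixel, with the ray direction derived from the pixel coordinate.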

VOXELS
Voxel Cone Tracing Global Illumination(Demo/Source Code) 3D Textures Probes (Source Code) Clipmaps Another
Cascaded Voxel Cone Tracing in The Tomorrow Children(Source Code Another) async Marching cubes Another
Storage Infinite Sparse Volumes Isosurface Contouring More (Source Code Sandbox Terrain) Geometry shader   

OIT
Real Time Depth Sorting of Transparent Fragments OIT_Optimized with MSAA with Linked List/Shared Fragment Pool
Real-Time Deep Image Rendering and Order Independent Transparency Guarded
Memory-Efficient Order-Independent Transparency with Dynamic Fragment Buffer Faster Transparency(Source Code)
Many recent graphics hardware features, namely atomic operations and dynamic memory location writes, now make it possible to capture and store all per-pixel fragment data from the rasterizer in a single pass. A core and driving application is order-independent transparency (OIT).
A number of image sorting improvements are presented, significantly advancing the ability to perform transparency rendering in real time.
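
A rough sketch of the resolve step, assuming the fragments for one pixel have already been captured into a list (on the GPU this would be a linked list or fragment pool). The tuple layout and function name are made up for the example; the linked papers are mostly about doing the capture and the sort efficiently.

```python
def resolve_pixel(fragments, background=(0.0, 0.0, 0.0)):
    """Blend one pixel's captured fragments in depth order.

    fragments: list of (depth, (r, g, b), alpha) in arbitrary submission order.
    """
    color = background
    # Sort far-to-near, then apply the standard "over" operator per fragment.
    for _depth, rgb, alpha in sorted(fragments, key=lambda f: f[0], reverse=True):
        color = tuple(alpha * c + (1.0 - alpha) * dst for c, dst in zip(rgb, color))
    return color
```

Because the sort happens per pixel, the result does not depend on the order the geometry was drawn in.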

PROCEDURAL
Vertex Generation(Another)
Studying and solving visual artifacts occurring when procedural texturing Filtering
Dijkstra-based Terrain Generation Using Advanced Weight Functions LOD Terrain Generation(Source Code)
Real-Time Editing of Procedural Terrains Terrain Modelling from Feature Primitives Procedural placement of 3D objects
Generating, Animating, and Rendering Varied Individuals for Real-Time Crowds



Light/Scattering/Reflections/Ambient Obscurance-Occlusion/GI
Real-Time Light Transport in Analytically Integrable Participating Media in Screen Space Single Scattering 
Precomputed Atmospheric Scattering: a New Implementation Analytic Fog Density(More)
Real-time rendering of participating media, such as fog, is an important problem, because such media significantly influence the appearance of the rendered scene. A physically correct solution involves a costly simulation of a very large number of light-particle interactions, especially when considering multiple scattering.
This work briefly examines the existing solutions and then presents an improved method for real-time multiple scattering in quasi-heterogeneous media. Inherent visual artifacts are minimized with several techniques using analytically integrable density functions.
There are also several strategies to stylize volumetric single scattering, overcoming the difficulty that light shafts depend on the layout of an entire environment. These approaches are compatible with animated scenes and rely on very efficient solutions, which makes them ready to be used for real-time applications and enables a quick exploration of the various settings. The techniques are applied at a global scope (i.e., for the whole scene) but can also be used to make local changes to the scattering behavior.
Image-based occluder manipulations modify the complexity of the scattering appearance and are controlled by only a few parameters. Transfer functions allow us to interactively design a general mood and the result can even be transferred to other scenes.

Separable Subsurface Scattering(Source Code) Extending Another Implementation(Source Code)
Subsurface Scattering-Based Object Rendering Techniques Directional Subsurface Scattering
Wrap Shading Extension to Energy-Conserving Wrapped Diffuse
Radiance Scaling for Versatile Surface Enhancement
Addressing Grazing Angle Reflections in Phong Models
These are two techniques for generating separable approximations of diffuse reflectance profiles, simulating subsurface scattering for a variety of materials using just two 1D convolutions. Separable models yield state-of-the-art results in less than 0.5 milliseconds per frame, which makes high-quality subsurface scattering affordable even in the most challenging real-time contexts such as games, where every desired effect may have a budget of tenths of a millisecond.
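
The "two 1D convolutions" part can be sketched like this. Note the assumptions: a plain Gaussian kernel stands in for the fitted diffusion-profile kernel from the papers, the image is a list of rows of floats, and all function names are invented for the example.

```python
import math

def gaussian_kernel(radius, sigma):
    """Normalized 1D Gaussian weights (a stand-in for a fitted SSS kernel)."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_rows(image, kernel):
    """1D convolution along each row, clamping sample positions at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [sum(k * row[min(max(x + i - radius, 0), width - 1)] for i, k in enumerate(kernel))
         for x in range(width)]
        for row in image
    ]

def transpose(image):
    return [list(col) for col in zip(*image)]

def separable_blur(image, kernel):
    """Horizontal pass followed by a vertical pass with the same 1D kernel."""
    horizontal = convolve_rows(image, kernel)
    return transpose(convolve_rows(transpose(horizontal), kernel))
```

The point of separability is cost: a kernel of width N needs 2N taps per pixel instead of N*N.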

Modern LightMapping(Source Code) artifact-free lightmaps lightmapper
Lightmap Compression Little Lightmap Tricks Bin Packing
Shading with Dynamic Lightmaps
The Baking Lab(Source Code) Ambient Dice Irradiance Caching Part 2 (Source Code)
Adaptive Texturing and Geometry Processing for High-Detail Real-Time Rendering
Various tests and observations to find the most effective methods.
The main challenge of compressing lightmaps is that they often have a wider range than regular diffuse textures. This range is not as large as in typical HDR textures, but it’s large enough that using regular LDR formats results in obvious quantization artifacts. Lightmaps don’t usually have high-frequency details; they are often close to greyscale and have only smooth variations in chrominance.

Screen Space Planar Reflections Stochastic SSR Local Cubemaps
Screen Space Reflections in Killing Floor 2(Source Code included) Screen Space Reflections(Source Code)
kode80's screen space reflections implementation for Unity3D 5. Features screen space ray casting, backface depth buffer for per-pixel geometry thickness, distance attenuated pixel stride, rough/smooth surface reflection blurring, fully customizable for quality/speed and more.

BNAO NNAO(Source Code) HBIL(Source Code) Accurate Indirect Occlusion(More) SAO, Temporal Reprojection   
LSAO/FFAO MiniEngineAO SSDO(Deferred) Volumetric SSAO(Another) 
By simplifying the rendering equation, they derive an ambient transfer function that expresses the response of the surface and its neighborhood to ambient lighting, taking into account multiple reflection effects. The ambient transfer function is built on the obscurances of the point. If we assume that the material properties are locally homogeneous and incorporate a real-time obscurances algorithm, then the proposed ambient transfer can also be evaluated in real time. Their model is physically based and thus can not only provide better results than empirical ambient occlusion techniques at the same cost, but also reveals where tradeoffs can be found between accuracy and efficiency.

Real-time Rendering of Translucent Material by Contrast-Reversing Procedure
The conventional method of rendering the translucence of an object is difficult to implement in real time, since the translucency is accompanied by complicated light behavior such as scattering and absorption. To simplify this rendering process, they focus on the contrast-reversing stimulant property in vision science. This property is based on the perception that we can recognize a luminance histogram compatible between scattering and absorption. According to this property, they propose a simple rendering method to reverse the light path between reflection and transmission.
Their method adopts an additional function for selecting a front or back scattering process in the calculation of each pixel value. Because this improvement makes only slight alterations in the conventional reflection model, it can reproduce a translucent appearance in real time while inheriting the advantages of various reflection models.



Shadows/Shadow Mapping-Volumes
Real-time Shadows Shadows(Source Code)(More) Contact-hardening Soft Shadows Made Fast(Another)
Soft bilateral filtering volumetric shadows using cube shadow maps(Code) Realistic Local Lighting
This method improves the rendering performance of the Percentage Closer Soft Shadows method by exploiting the temporal coherence between individual frames: the costly soft shadow recalculation is saved whenever possible by storing the old shadow values in a screen-space History Buffer. By extending the shadow map algorithm with a so-called Movement Map, they can not only identify regions disoccluded by camera movement, but also robustly detect and update shadows cast by moving objects, so only the shadows in those marked areas have to be re-evaluated. This saves rendering time and doubles the soft shadow rendering performance in real-time 3D scenes with both static and dynamic objects.

Optimized Visibility Functions for Revectorization-Based Shadow Mapping(RBSM)(Source Code included)
Revectorization-based shadow mapping minimizes shadow aliasing by revectorizing the jagged shadow edges generated with shadow mapping, keeping a low memory footprint and real-time performance for the shadow computation. However, the current implementation of RBSM is not well optimized, because its visibility functions are composed of a set of 43 cases, each one handling a specific revectorization scenario and implemented as a separate branch in the shader.
Here, they take advantage of the shadow shape patterns to reformulate the RBSM visibility functions, simplifying the implementation of the technique and providing an optimized version. Results indicate that this implementation runs faster than the original while keeping the same visual quality and memory consumption.

Non-Linearly Quantized Moment Shadow Maps(Source Code included) Moment-Based Methods
Moment Shadow Maps enable direct filtering to accomplish proper antialiasing of dynamic hard shadows. For each texel, the moment shadow map stores four powers of the depth in either 64 or 128 bits. After filtering, this information enables a heuristic reconstruction. However, the rounding errors introduced at 64 bits per texel necessitate a bias that strengthens light leaking artifacts noticeably. In this paper, they propose a non-linear transform which maps the four moments to four quantities describing the depth distribution more directly.
As a prerequisite for the use of its quantization schemes, they propose a compute shader that applies a resolve for a multisampled shadow map and a 9² two-pass Gaussian filter in shared memory. The quantized moments are written back to device memory only once at the very end. This approach makes the technique roughly as fast as Variance Shadow Mapping without any of its drawbacks.
Since hardware-accelerated bilinear filtering is incompatible with non-linear quantization, they employ blue noise dithering as an inexpensive alternative to manual bilinear filtering.

Irregular Adaptive Shadow Maps
A tile based partitioning scheme is provided to facilitate dynamic customizability and allow for adaptive distribution of resources at run time which leads to more efficient use of memory resources.

Efficient High-Quality Shadow Maps
This thesis provides an efficient GPU implementation of various optimizations to basic shadow mapping. The optimizations, which echo the idea of making full use of the available resolution and precision, are simple to implement, provide a great deal of improvement and allow for some amount of dynamic refinement of shadows with change in the camera view.



Terrain Rendering/Level-of-Detail(LOD)/Occlusion Culling/Caching
GPU-Based Occlusion Culling Frustum Culling Software Occlusion Culling
Vertex Discard Occlusion Culling Clip Space Sample Culling Stochastic Light Culling
Performing visibility determination in densely occluded environments is essential to avoid rendering unnecessary objects and achieve high frame rates. In this implementation, the image-space Occlusion Culling algorithm runs entirely on the GPU, avoiding the latency introduced by returning the visibility results to the CPU. It uses the GPU's rendering power to construct the Occlusion Map and then performs the image-space visibility test by splitting the screen-space region of the occludees into parallelizable blocks. This implementation is especially applicable to low-end graphics hardware, and the visibility results are accessible by GPU shaders. It can be applied with excellent results in scenes where pixel shaders alter the depth values of the pixels, without interfering with hardware Early-Z culling methods. They demonstrate the benefits and show the results of this method in real-time densely occluded scenes.

Sphere Projection
An Adaptive and Hybrid Approach to Revisiting the Visibility Pipeline
In computer graphics you very often want to know how big an object looks on screen, probably measured in pixels. Or at least you want an upper bound of the pixel coverage, because that allows you to perform intelligent Level of Detail (LOD) for that object. For example, if a character or a tree is no more than a couple of pixels on screen, you probably want to render it with less detail. One easy way to get an upper bound of the pixel coverage is to embed your object in a bounding box or sphere, then rasterize the sphere or box and count the number of pixels.
This requires complexity in your engine, and probably some delayed processing, as the result of that rasterization won't be immediately ready.
It would be cool if a tessellation shader or a geometry shader were able to tessellate or kill geometry on the fly based on the pixel coverage of the object, immediately.
The pixel coverage of a (bounding) sphere happens to have an analytic expression that can be evaluated with no more than one square root; it's very compact.
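
For reference, the closed form can be written as area = pi * f^2 * r^2 * sqrt((l^2 - r^2) / (z^2 - r^2)) / (z^2 - r^2), where r is the sphere radius, z the camera-space depth of its center, l the distance from the camera to its center and f the focal length. A small Python sketch follows; the function names are made up, and it assumes a pinhole camera at the origin looking down +z with the focal length given in pixels, so the result comes out in pixels.

```python
import math

def focal_length_pixels(vertical_fov_radians, image_height):
    """Focal length in pixel units for a pinhole camera (assumed camera model)."""
    return 0.5 * image_height / math.tan(0.5 * vertical_fov_radians)

def sphere_pixel_coverage(center_cam, radius, focal_px):
    """Area in pixels of the ellipse a sphere projects onto the image plane.

    center_cam: (x, y, z) sphere center in camera space, camera looking down +z.
    Returns None when the sphere is not fully in front of the camera (z <= radius),
    where the projection is no longer a finite ellipse.
    """
    x, y, z = center_cam
    if z <= radius:
        return None
    r2 = radius * radius
    z2 = z * z
    l2 = x * x + y * y + z2
    # The single square root mentioned in the text.
    return math.pi * focal_px * focal_px * r2 * math.sqrt((l2 - r2) / (z2 - r2)) / (z2 - r2)
```

The value ignores clipping against the screen edges, so it is an upper bound on visible coverage, which is what you want for LOD selection.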

Height map compression techniques
In practice, many applications handle the real-time rendering well with LOD schemes tailored to their needs.
In such cases, a compression method tied to a concrete LOD scheme is not feasible.
This method handles only the compression, so it can be used as a plug & play component in an existing real-time renderer. Its only job is to compress a block of terrain height samples sized 2^n x 2^n and to provide fast progressive decompression of its mip-maps, while respecting the maximum error bound at every mip-map. The source code of the method is written modularly, so that any representation of the height samples can be compressed: doubles, floats or even arbitrary structures. It is inspired by C-BDAM; the compression method is extracted from the LOD scheme and simplified.
This approach introduces heavy redundancy of the data - a block corresponding to a certain quadtree node contains simplified blocks of its children and all these blocks are stored separately. The reason why this approach is used is that the user can navigate to any area almost immediately - only the data needed for the scene has to be fetched, without having to reconstruct it by traversing from the root. Moreover, this approach enables the user to flexibly extend the terrain data by high-resolution insets.
This algorithm should be able to compress a regular square block of height samples and progressively decompress it in real time, from the smallest mip-map to the largest one. Apart from this, the algorithm should not interfere with the rendering pipeline of the application in any way.

Fast Terrain Rendering with Continuous Detail on a Modern GPU Real-Time LOD (Algorithms, Code etc)
Applying Tessellation to Clipmap Terrain Rendering More(solutions for clipmap/heightmap issues)
Implementation Details of Sample App Using Hybrid Terrain Representation(voxels+heightmap, and deformation)
How to achieve fast terrain rendering using a combination of the best techniques available (that they knew of) at the time this was posted.

Downsampling Scattering Parameters for Rendering Anisotropic Media
A new approach to compute scattering parameters at reduced resolutions. Many detailed appearance models involve high-resolution volumetric representations. Such level of detail leads to high storage but is usually unnecessary especially when the object is rendered at a distance. However, naïve downsampling often loses intrinsic shadowing structures and brightens resulting images.
This method computes scaled phase functions, a combined representation of single-scattering albedo and phase function, and provides significantly better accuracy while reducing the data size by almost three orders of magnitude.
They also show that modularity can be exploited to greatly reduce the amortized optimization overhead by allowing multiple synthesized models to share one set of downsampled parameters. Optimized parameters generalize well to novel lighting and viewing configurations.



Animation/Physics
Forward And Backward Reaching Inverse Kinematics(FABRIK)(Source Code) RBDL(Source Code)
Klein Skinning REPLACE quaternions Animated Models Implicit ACL
This mesh editing technique allows users to produce visually pleasing deformations. The linearity of the underlying objective functional makes the processing very efficient and improves the effectiveness of deformable surface computation. The method offers the possibility to encode global topological changes of the shape with respect to local influence and allows animators to re-use the estimated parameterization.

Texture Distortion(Another,Tessellation) Optimizing a Water Simulation(More) Scalable GPU Fluid Simulation
Real-Time Screen Space Fluid Rendering with Scene Reflections Real-time Interactive Water Waves
To solve the singular problem of water waves obtained with the traditional model, a hybrid deep-shallow-water model is estimated by using an automatic coupling algorithm. It can handle arbitrary water depth and different underwater terrain. As a certain feature of coastal terrain, coastline is detected with the collision detection technology. Then, unnecessary water grid cells are simplified by the automatic simplification algorithm according to the depth. Finally, the model is calculated on CPU and the simulation is implemented on GPU.

Animated Foliage and Cloth(gif of this) Tearing(Cloth Shading)
GPU Assisted Self-Collisions of Cloths Real Time Cloth Simulation with B-spline Surfaces(Source Code)
The GPU is very effective doing vector math and the vertex program is already looping through all of the vertices to convert them to screen space and send them to the fragment program. Before this we can simply add a value to these positions before sending them down the line.
By supplying properties for the material stiffness, wind direction and wind speed, a more realistic look can be achieved, one that correlates better with what actually happens in nature. A second property, how firmly attached each part of an object is, could be supplied through vertex color.
The second scenario relates to vegetation. By building the bending properties into the shader, you wouldn’t need any collision detection for each individual plant; you just look at the distance between vertex and player and scale the bending based on this.
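
A rough sketch of that distance-based bending, written as plain Python rather than a vertex program. Everything here is an assumption for illustration: the function name, the linear falloff, the parameter defaults, and the use of a per-vertex weight in [0, 1] (e.g. read from vertex color, 0 = rooted).

```python
import math

def bend_vertex(vertex, player_pos, bend_radius=2.0, max_offset=0.5, stiffness=1.0):
    """Push a foliage vertex horizontally away from the player, scaled by proximity.

    vertex: (x, y, z, weight) with weight in [0, 1] saying how free the vertex is.
    player_pos: (x, y, z). Returns the displaced (x, y, z).
    """
    x, y, z, weight = vertex
    dx, dz = x - player_pos[0], z - player_pos[2]
    dist = math.hypot(dx, dz)
    if dist >= bend_radius or dist == 0.0:
        return (x, y, z)                     # out of range, or no defined direction
    falloff = 1.0 - dist / bend_radius       # 1 at the player, 0 at the radius
    scale = max_offset * falloff * weight / stiffness
    return (x + dx / dist * scale, y, z + dz / dist * scale)
```

In a real shader the player position would be a uniform and a wind offset would be added in the same place.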

OPCODE box pruning revisited Optimization Collision Culling
Collision Detection Output Sensitive Cubemap based Generic GPU Mapping Errors 
Various collision detection algorithms using GPU and ways to deal with errors.

PhysX TressFX Newton

A Cracking Algorithm for Destructible 3D Objects
Simulating Rigid Body Fracture with Surface Meshes
A Novel GPU-Based Deformation Pipeline
Fracturable Surface Model Fast Algorithm to Split and Reconstruct Triangular Meshes
Parallel explicit FEM algorithms using GPU's 
By combining an indirect boundary integral formulation, explicit surface tracking and a kernel-independent fast multipole method, the presented method is effective for rigid body brittle fracture using the boundary surface mesh only.
Existing explicit mesh tracking methods are modified to support evolving cracks directly in the triangle mesh representation, giving highly detailed fractures with sharp features, independent of any volumetric sampling (unlike tetrahedral mesh or level set approaches) and avoids the need for calculations; the triangle mesh representation also allows simple integration into rigid body engines.
It is accurate, and at the same time computationally economical, and it successfully resolves crack evolution in various settings.



Audio
TinyOAL
Apache License v2.0. Open-Source. Cross-platform. Written for C/C++ and .NET.
A minimalist OpenAL Soft audio engine for quick implementation.

Steam Audio - (Requires a License but Free - Valve Corporation) - HRTF etc

Sound occlusion for virtual 3D environments(Source Code)
Efficient HRTF-based Spatial Audio for Area and Volumetric Sources
Efficient Approximation of HRTF in Subbands for Accurate Sound Localization(Source Code)
Prioritized Computation for Numerical Sound Propagation
HRTF for XAudio2 + X3DAudio(Source Code)
Results indicate that the proposed algorithms preserve the salience of spatial cues, even for relatively high approximation tolerances, yielding computationally very efficient implementations.



Anti-Aliasing
Improved Geometric Specular Antialiasing(IGSA) Supplemental
Specific ways to adjust the lighting model and NDF filtering, plus a few more things in the pixel shader.
Mostly limited to specular aliasing.

Temporal Reprojection Anti-Aliasing(TRAA)(Source Code) Improved Sampling Anti-Ghosting
Source Code/Paper for TRAA used in Playdead's INSIDE.

Enhanced Subpixel Morphological Antialiasing(SMAA)(Detail) CMAA2
A very efficient GPU-based MLAA implementation, capable of handling subpixel features seamlessly, and featuring an improved and advanced pattern detection & handling mechanism.

Alpha to Coverage (Alpha Mipmaps) Anti-aliased Alpha Test
Efficient Dithering(Another) Improved box/triangle filtering 
They solve the problems with alphas by sampling the alpha and interpreting how much it covers the pixel, dithering and distributing the result to an appropriate number of multisample samples.
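
A small sketch of the alpha-to-sample-count step, under stated assumptions: the function name is invented, the dither value is a per-pixel threshold in [0, 1) (from a noise or Bayer pattern, for instance), and the mask simply sets the lowest bits. Real hardware and the linked articles pick which samples to cover in their own ways.

```python
def alpha_to_coverage_mask(alpha, num_samples=4, dither=0.5):
    """Convert an alpha value into a bitmask of covered MSAA samples.

    With dither spread uniformly over [0, 1), the average number of covered
    samples equals alpha * num_samples.
    """
    alpha = min(max(alpha, 0.0), 1.0)
    covered = min(int(alpha * num_samples + dither), num_samples)
    return (1 << covered) - 1   # lowest `covered` bits set
```

For example, with 4 samples an alpha of 0.3 covers 1 sample for low dither values and 2 samples for high ones, so neighbouring pixels average out to the right opacity.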

Controlling and Sampling Visibility Information on the Image Plane
Visibility-induced aliasing can be reduced substantially by, first, choosing a suitable function space that admits a sampling theorem for the given locations; second, determining the pre-filtering of the step function for this space; third, constructing a sampling theorem with the given locations; and fourth, deriving the quadrature weights from the sampling theorem.
They applied their methodology to the classical setting of bandlimited functions but also considered shift-invariant spaces. They also demonstrated that the better spatial localization of the kernel functions in the latter setting, compared to the sinc function, yields lower error rates.



Shaders/Effects
ReShade (Shaders)
Shader Minifier(Source Code)(compiled exe) More
Shader Live-Reloading

Vertex Shader Tricks Compute Shaders(More) Low-level Shader Optimization(More)

Particle effect system using the GPU Particle Systems Using 3D Vector Fields with Compute Shaders
GPU-based particle simulation(Source Code included) Practical Particle Lighting
Particle Simulation with GPUs GParticles(Source Code)
Particle systems and particle effects are used to simulate a realistic and appealing atmosphere in many virtual environments. However, they occupy a significant amount of computational resources. The demand for more advanced graphics increases with each generation, and particle systems likewise need to become increasingly detailed.
This thesis proposes a texture-based 3D vector field particle system, computed on the GPU, and compares it to an equation-based particle system.
Several tests were conducted comparing different situations and parameters. All of the tests measured the computational time needed to execute the different methods.

A System for Rapid, Automatic Shader Level-of-Detail
Geometry-Aware Framebuffer Level of Detail
Using an optimized greedy search algorithm, adding parameter binding time processing capabilities (parameter shader), and a simple but general simplification rule (ACSE) yields a system that can process complex game-style shaders to produce policies featuring simplified shaders similar to those created by hand.
http://graphics.cs.cmu.edu/projects/lodgen/

High Dynamic Range Imaging Pipeline on the GPU
In this article they aim to fill a gap by providing a detailed description of how the HDRI pipeline, from HDR image assembly to tone mapping, can be implemented exclusively on the GPU. They also explain the trade-offs that need to be made to improve efficiency and show timing comparisons for CPU vs GPU implementations.
Another goal of this paper is to demonstrate how both the global and local versions of this operator can be efficiently implemented by using fragment shaders. Different from previous work, they show that the implementation of this operator neither requires expensive convolution nor Fourier transform operations to compute local adaptation luminances.

Adaptive Multi-Rate Shading Variable Rate Shading Bandwidth prediction Precision selection
Due to complex shaders and high-resolution displays (particularly on mobile graphics platforms), fragment shading often dominates the cost of rendering in games. To improve the efficiency of shading on GPUs, they extend the graphics pipeline to natively support techniques that adaptively sample components of the shading function more sparsely than per-pixel rates.
They perform an extensive study of the challenges of integrating adaptive, multi-rate shading into the graphics pipeline, and evaluate two- and three-rate implementations that they believe are practical evolutions of modern GPU designs.



Meshes
meshoptimizer GPU Tessellation with Compute Shaders(Source Code) Adaptive Catmull-Clark Skeleton-guided
Simplified and Tessellated Mesh for Realtime High Quality Rendering 3D Mesh Simplification 
Evaluating the visibility threshold for a local geometric distortion on a 3D mesh and its applications
A GPU-based refinement scheme that is free from the limitations incurred by tessellation shaders. Specifically, the scheme allows arbitrary subdivision levels at constant memory cost. This is achieved by manipulating an implicit (triangle-based) subdivision scheme for each polygon of the scene in a dedicated compute shader that reads from and writes to a compact, double-buffered array. The implementation is both fast and stable. Naturally, the average GPU rendering time depends on how the terrain is shaded.

Connected Triangles Multi-Resolution Meshes for Feature-Aware Hardware Tessellation More(Source Code - 34-37)
Feature-Adaptive Rendering of Loop Subdivision Surfaces on Modern GPUs Parametric Surfaces
Quadratic Error Metric Mesh Simplification Algorithm Based on Discrete Curvature Convex Hull Problems
A general framework for the construction and rendering of non-uniform LODs suitable for hardware tessellation.
Its key component is a novel hierarchical representation of multiresolution meshes that allows fine control over the topological locations of vertex splits and merges. They thus managed to relax the regularity of fractional tessellation, while retaining the efficiency of the respective GPU’s units.
Within the framework, they presented a dedicated mesh decimation scheme that can be driven by any edge-based error metric. In particular, by applying it with a feature-preserving geometric error, they leveraged hardware tessellation for feature-aware LOD rendering of meshes.

Geometry Batching Using Texture-Arrays
Batching can be used to group and sort geometric primitives into batches to reduce the number of required state changes, whereas the size of the batches determines the number of required draw-calls, and therefore, is critical for rendering performance.
For example, in the case of texture atlases, which provide an approach for efficient texture management, the batch size is limited by the efficiency of the texture-packing algorithm and the texture resolution itself.
This paper presents a pre-processing approach and rendering technique that overcomes these limitations by further grouping textures or texture atlases and thus enables the creation of larger geometry batches. It is based on texture arrays in combination with an additional indexing schema that is evaluated at run-time using shader programs.
Basically, it facilitates a flexible partitioning of geometry.



Textures
MinLod Mipmap(Source Code) Texture tiling/swizzling ESRGAN
Bindless(Source Code) Virtual Texturing(VT)(Source Code) GPU Driven Adaptive VT
The Implementation of a Scalable Texture Cache(Source Code) Incremental loading of terrain textures min-max mip
Virtual texturing is a solution to the problem of real-time rendering of scenes with vast amounts of texture data which does not fit into graphics or main memory. Virtual texturing works by preprocessing the aggregate texture data into equally-sized tiles and determining the necessary tiles for rendering before each frame. These tiles are then streamed to the graphics card and rendering is performed with a special virtual texturing fragment shader that does texture coordinate adjustments to sample from the tile storage texture.
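
The "texture coordinate adjustment" can be sketched as a page-table lookup. This is a simplified illustration, not the scheme of any linked implementation: it ignores mip levels and tile borders, the page table is a plain dict, and all names are hypothetical.

```python
def virtual_to_physical_uv(u, v, page_table, tiles_per_side, cache_tiles_per_side):
    """Map a virtual-texture UV to a UV inside the physical tile cache.

    page_table: dict mapping (tile_x, tile_y) -> (cache_x, cache_y) for resident tiles.
    Returns None when the tile is not resident; a real system would request the
    tile and fall back to a coarser mip level in the meantime.
    """
    tx = min(int(u * tiles_per_side), tiles_per_side - 1)
    ty = min(int(v * tiles_per_side), tiles_per_side - 1)
    slot = page_table.get((tx, ty))
    if slot is None:
        return None
    # Fractional position inside the virtual tile carries over to the cache slot.
    fu = u * tiles_per_side - tx
    fv = v * tiles_per_side - ty
    return ((slot[0] + fu) / cache_tiles_per_side,
            (slot[1] + fv) / cache_tiles_per_side)
```

In the fragment shader the page table is usually itself a small texture that is sampled first.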

bc7enc RGBV Real-time BC6H Compression on GPU(Source Code) ASTC(Codec)
GST(GPU-decodable Supercompressed Textures)(Source Code)
Modern GPUs supporting compressed textures allow interactive application developers to save scarce GPU resources such as VRAM and bandwidth. Compressed textures use fixed compression ratios whose lossy representations are of significantly poorer quality than traditional image compression formats such as JPEG. They present a new method in the class of supercompressed textures that provides an additional layer of compression to already compressed textures. The texture representation is designed for endpoint compressed formats such as DXT and PVRTC and for decoding on commodity GPUs. They apply this algorithm to commonly used formats by separating their representation into two parts that are processed independently and then entropy encoded. The method preserves the CPU-GPU bandwidth during the decoding phase and exploits the parallelism of GPUs to provide up to 3X faster decode compared to prior texture supercompression algorithms. Along with the gains in decoding speed, the method maintains both the compression size and quality of current state-of-the-art supercompressed texture representations.

Normal Mapping Using the Surface Gradient(paper) For Triplanar Shader Without Precomputed Tangents 
Horizon Occlusion for Normal Mapped Reflections
More efficient forms of Normal Maps.

Texture-space Decals Another
There are a few ways to do decals, which games use to draw images onto other surfaces, and each of them has different tradeoffs. Rendering into texture space is one of them.

Height-blending
A way to blend between textures; the most common example of this is terrain. Explains an effect where additional lerp interpolation and height data are used to control exactly where the blending should occur.
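
A minimal sketch of one commonly used height-blend formulation, for two layers. Whether it matches the linked article exactly is an assumption; the function name and the `depth` parameter (width of the transition band, must be greater than 0) are chosen for the example.

```python
def height_blend(color1, height1, weight1, color2, height2, weight2, depth=0.2):
    """Blend two texture samples so the 'taller' layer wins near the transition.

    height*: per-texel height map values; weight*: the usual splat/lerp weights.
    depth: thickness of the band in which both layers still contribute.
    """
    top = max(height1 + weight1, height2 + weight2) - depth
    b1 = max(height1 + weight1 - top, 0.0)
    b2 = max(height2 + weight2 - top, 0.0)
    # b1 + b2 >= depth > 0, so the division is safe.
    return tuple((c1 * b1 + c2 * b2) / (b1 + b2) for c1, c2 in zip(color1, color2))
```

With a small depth you get sand sitting in the cracks between stones instead of a uniform cross-fade.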



AI/Scripting
Compromise-free Pathfinding on a Navigation Mesh(Source Code)
Adaptive Layered Goal Oriented Action Planning(GOAP)(Source Code)
Dynamic and Robust Local Clearance Triangulations
A optimization of A* algorithm to make it close to human pathfinding behavior
Time-Bounded Best-First Search for Reversible and Non-reversible Search Graphs
GOAP refers to a simplified STRIPS-like planning architecture designed specifically for real-time control of autonomous character behavior in games, with the aim of producing more dynamic AI.
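A STRIPS-like planner of this kind can be sketched in a few lines. This is a toy illustration, not the code from any of the linked projects: the action tuple layout, the fact names and the use of a uniform-cost search are all assumptions made for the example.

```python
import heapq
from itertools import count

def plan(start, goal, actions):
    """Return the cheapest list of action names that reaches the goal, or None.

    start, goal: dicts of facts. actions: list of
    (name, cost, preconditions, effects) tuples (hypothetical layout).
    """
    def satisfied(state, conditions):
        return all(state.get(k) == v for k, v in conditions.items())

    tie = count()  # tie-breaker so the heap never compares states
    frontier = [(0, next(tie), frozenset(start.items()), [])]
    seen = set()
    while frontier:
        cost, _, key, steps = heapq.heappop(frontier)
        state = dict(key)
        if satisfied(state, goal):
            return steps
        if key in seen:
            continue
        seen.add(key)
        for name, action_cost, pre, eff in actions:
            if satisfied(state, pre):
                nxt = {**state, **eff}  # apply the action's effects
                heapq.heappush(frontier, (cost + action_cost, next(tie),
                                          frozenset(nxt.items()), steps + [name]))
    return None

# Made-up example world: chopping is cheaper overall than gathering.
ACTIONS = [
    ("get_axe", 2, {"has_axe": False}, {"has_axe": True}),
    ("chop_wood", 1, {"has_axe": True}, {"has_wood": True}),
    ("gather_sticks", 5, {}, {"has_wood": True}),
]
```

Calling `plan({"has_axe": False, "has_wood": False}, {"has_wood": True}, ACTIONS)` searches forward from the current world state. Production GOAP implementations often search backward from the goal and use A* with a heuristic instead.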




State-of-the-Art/Comparisons/Roundups/Surveys/Analysis
Optimization Techniques for 3D Graphics Deployment More Debris: Opening the box
Rendering massive 3D scenes in real-time
Specialization Opportunities in Graphical Workloads
Continuity and Interpolation Techniques More in Detail(Code/Samples/Algorithms)
Feature Aware Sampling and Reconstruction
Recent Advances in Adaptive Sampling and Reconstruction for Monte Carlo Rendering
Anti-Aliased Low Discrepancy Samplers for Monte Carlo Estimators in Physically Based Rendering Another
A survey of photon mapping state-of-the-art research and future challenges
Theory and Numerical Integration of Subsurface Light Transport
Real-Time Rendering Fourth Edition, Real-Time Ray Tracing
Global Illumination in Participating Media
Scalable Algorithms for Height Field Illumination
Ambient Occlusion on Mobile: an empirical comparison (Source Code - last pages)
Temporal Coherence Methods in Real-Time Rendering(Warping)
Transparency and Anti-Aliasing Techniques for Real-Time Rendering
Filtering Approaches for Real-Time Anti-Aliasing
Algorithms for Efficient Computation of Convolution
Kernel optimization by layout restructuring
A Bigger Mathematical Picture for Computer Graphics Intersection
3D mesh compression: survey, comparisons and emerging trends
On Some Interactive Mesh Deformations
Adaptive Physically Based Models in Computer Graphics
Efficient encoding of texture coordinates guided by mesh geometry
Methods for Avoiding Round-off Errors on 2D and 3D Geometric Simplification
Fundamental computational geometry on the GPU
Real-time Rendering Techniques with Hardware Tessellation
Combining displacement mapping methods on the GPU for real-time terrain visualization
Comparison of spherical cube map projections used in planet-sized terrain rendering
Course/Book/Presentations that Provide Useful Info/Analyses of the most useful Shadow Map/Shadow Volume, Hard/Soft/Volumetric Shadow Techniques Shadow Mapping Algorithms
An evaluation of moving shadow detection techniques
A Comprehensive Study on Pathfinding Techniques for Robotics and Video Games
Variance of integral approximation methods in ray tracing




??? ... hmm
Smarter Screen Space Shading
Presents a general approach, called deep screen space, with which a variety of light transport aspects can be simulated.
This approach is then further extended to additionally handle scenes containing participating media like clouds.
It shows how to improve the correctness of screen space and related algorithms by accounting for the mutual visibility of points in a scene. It then takes a completely different point of view on image generation, using a learning-based approach to approximate a rendering function. Neural networks can hallucinate shading effects which would otherwise have to be computed using costly analytic computations. Finally, it presents a holistic framework for dealing with phosphorescent materials in computer graphics, covering all aspects from acquisition of real materials, to easy editing, to image synthesis.

Parallel Computing and Optimization for Radiosity(Source Code etc) GPU
Radiosity for Real-Time Simulations of Highly Tessellated Models
Real-Time Dynamic Radiosity for High Quality Global Illumination Large Scale Scenes
Techniques based around Radiosity that provide unique advantages.

Alternative definition of Spherical Harmonics for Lighting Renderer More Converting SH Radiance to Irradiance
Various ways to do Spherical Harmonics.

Shadows and Reflections Part2(Reflections Shadows) Denoising Surfaces and Light Scattering Fire and Smoke GI   
Annotated Realtime Raytracing(Source Code) Q2 Realtime Pathtracer(Project) GPU RT(Code) NanoRT Gideon
Ray Tracing Gems Ray Marching Sampling Intersection(Source) Another À-Trous PSVGF Batching(Packets)
Ray stream techniques augment the fast single-ray traversal with increased utilization of vector units and leverage memory bandwidth for batches of rays. Despite their success, the proposed implementations suffer from high bookkeeping cost and batch fragmentation, especially for small batch sizes.
Various contributions here make it all the way to real-time.

Multiple Importance Sampling Generalized A Fresh Look at Generalized Sampling Path Space Filtering
Combining Reprojection and Adaptive Sampling Forced Random Sampling (Code)
Low-Discrepancy Blue Noise Filtering(More) Advancing Front Animating Noise Non-Linear Transfer Functions etc
Generalized sampling decomposes a filter into two parts: a compactly supported continuous-domain function and a digital filter. The work broadly summarizes the key aspects of the framework and delves into specific applications in graphics. Using new notation, it concisely presents and extends several key techniques.
In addition, it demonstrates benefits for prefiltering in image downscaling and supersample-based rendering, and analyzes the effect that generalized sampling has on noise.

A Temporal Stable Distance To Edge Anti-aliasing
Improved Geometry Buffer Anti-Aliasing(GBAA+)(Source Code)
Triangle-based Geometry Anti-Aliasing(TGAA)
By storing extra geometrical data in a pre-render pass, the implementation can prevent temporal instability and solve aliasing artifacts during a post-render pass without any sub-pixel information. This makes it a real alternative to state-of-the-art post-processing anti-aliasing solutions, in terms of performance and quality, in high-end game engines and systems.
The reliance on hardware features for solving triangle edges can easily be removed from the solution, making it implementable on a large variety of hardware. In that case, prototype 1 can be an excellent complement to anti-aliasing solutions such as multisampling, which cannot solve alpha-clipped edges.

Light Propagation Volumes
Stores lighting information from a light in a 3D grid. Each light records which points in the world it lights up.
These points have a coordinate in the world, which means you can stratify those coordinates in a grid. In that way you save lit points (Virtual Point Lights) in a 3D grid and can use those initial points to spread light across the scene.
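The two steps described above, binning lit points into a grid and then spreading light from cell to cell, can be sketched as below. This is a heavily simplified illustration: it stores one scalar per cell, whereas real Light Propagation Volumes store directional data (typically spherical harmonics), and the function names and the even six-way split are assumptions for the example.

```python
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def in_bounds(cell, dims):
    return all(0 <= c < d for c, d in zip(cell, dims))

def inject(vpls, cell_size, dims):
    """Bin virtual point lights ((x, y, z), intensity) into a sparse 3D grid."""
    grid = {}
    for (x, y, z), intensity in vpls:
        cell = (int(x // cell_size), int(y // cell_size), int(z // cell_size))
        if in_bounds(cell, dims):
            grid[cell] = grid.get(cell, 0.0) + intensity
    return grid

def propagate_once(grid, dims, share=1.0 / 6.0):
    """One propagation step: each cell passes `share` of its light to each
    of its six face neighbours that lies inside the grid."""
    spread = {}
    for (cx, cy, cz), energy in grid.items():
        for dx, dy, dz in NEIGHBOURS:
            n = (cx + dx, cy + dy, cz + dz)
            if in_bounds(n, dims):
                spread[n] = spread.get(n, 0.0) + energy * share
    return spread
```

Running `propagate_once` repeatedly and accumulating the results gives a rough picture of how the initial lit points spread across the scene.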

AABO Zero-byte AABB-trees Dynamic BVH BVH splitting More TSS BVH Another Hashed Shading Octree Quadtree
Generic Hybrid CPU-GPU Parallelization Dynamic Data Structures for Scheduling

A Non-linear GPU Thread Map for Triangular Domains Improving the accuracy
There is a stage in the GPU computing pipeline where a grid of thread-blocks, in parallel space, is mapped onto the problem domain, in data space. Threads that fall inside the domain perform computations while threads that fall outside are discarded at runtime.
In this work they study the case of mapping threads efficiently onto triangular domain problems and propose a block-space linear map λ(ω), based on the properties of the lower triangular matrix, that reduces the number of unnecessary threads from O(n^2) to O(n).
This study is about the performance of algorithms with a purpose similar to the Carmack and Lomont implementation of the (inverse) square root, using three iterations of the Newton-Raphson method and the magic number “0x5f3759df”.
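The block-space map for triangular domains can be illustrated with the standard inversion of the triangular numbers. The sketch below is assumed to correspond to the paper's λ(ω) in spirit only: the function name is made up, and it uses an exact integer square root where a GPU kernel would use a floating-point one.

```python
import math

def triangular_map(block_index):
    """Map a linear block index to (row, col) in a lower-triangular domain.

    Row i holds i + 1 blocks, so row i starts at index i * (i + 1) / 2.
    Inverting that gives i = floor((sqrt(8k + 1) - 1) / 2).
    """
    row = (math.isqrt(8 * block_index + 1) - 1) // 2
    col = block_index - row * (row + 1) // 2
    return row, col

def blocks_needed(n):
    """Blocks launched for an n x n lower triangle with this map."""
    return n * (n + 1) // 2
```

For `n = 4` this launches 10 blocks instead of the 16 a bounding-box grid would use, and none of the 10 falls outside the triangle.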

Fast Data Parallel Radix Sort Implementation by Avoiding Zero Bits Based on Divide and Conquer Technique Another
The algorithms implement several optimization techniques to take advantage of the hardware architecture, such as:
a kernel fusion strategy, relying on the synchronous execution of threads in a warp/wavefront to eliminate the need for barrier synchronization, using shared memory across threads within a group, managing bank conflicts, eliminating divergence by avoiding branch conditions and completely unrolling loops, choosing adequate group/thread dimensions to increase hardware occupancy, and applying highly data-parallel algorithms to accelerate the scan operations.
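For orientation, the structure those GPU optimizations are applied to looks like the serial sketch below: a histogram, an exclusive scan and a scatter per digit. It is a plain CPU illustration under stated assumptions (non-negative integer keys, made-up parameter names), and the early exit when the remaining high bits are all zero is only a guess at what "avoiding zero bits" means in the linked paper.

```python
def radix_sort(keys, bits_per_pass=4, key_bits=32):
    """LSD radix sort for non-negative integers below 2**key_bits."""
    keys = list(keys)
    radix = 1 << bits_per_pass
    mask = radix - 1
    max_key = max(keys, default=0)
    for shift in range(0, key_bits, bits_per_pass):
        if max_key >> shift == 0:
            break  # every remaining digit is zero, so later passes are no-ops
        # Histogram of the current digit.
        counts = [0] * radix
        for k in keys:
            counts[(k >> shift) & mask] += 1
        # Exclusive prefix sum (the "scan") gives each digit's start offset.
        offsets, running = [0] * radix, 0
        for d in range(radix):
            offsets[d] = running
            running += counts[d]
        # Stable scatter into the output buffer.
        out = [0] * len(keys)
        for k in keys:
            d = (k >> shift) & mask
            out[offsets[d]] = k
            offsets[d] += 1
        keys = out
    return keys
```

On a GPU the histogram and scatter run per thread group and the scan is the data-parallel step the description refers to.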

Revised fast convolution Efficient FFT Algorithms Reduction
Convolution is a mathematical tool used in filtering, correlation, compression and many other applications. Although the concept of convolution is not new, its efficient computation is still an open topic. As the volume of data constantly increases, so does the demand for fast manipulation of large data sets.
Fast convolution has been proposed to update the result recursively when one new signal sample, or a small new portion of samples, arrives in the given period N of a realization x(n), replacing the oldest sample or portion of samples respectively. The recursive expression substantially reduces the number of operations compared with the ordinary FFT procedure, which is suited only to the case of fixed sample values.
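A well-known instance of this kind of recursive update is the sliding DFT, sketched below. Whether it matches the linked paper's exact recursion is an assumption; the function names are made up for the example.

```python
import cmath

def dft(x):
    """Direct O(N^2) DFT, used here only as the reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def slide(spectrum, oldest, newest):
    """Update the DFT of a length-N window in O(N) when the window advances
    by one sample: `oldest` leaves at the front, `newest` enters at the back."""
    n = len(spectrum)
    return [(X - oldest + newest) * cmath.exp(2j * cmath.pi * k / n)
            for k, X in enumerate(spectrum)]
```

Starting from `dft([1, 2, 3, 4])`, the call `slide(spectrum, 1, 5)` yields the spectrum of `[2, 3, 4, 5]` up to rounding error, without recomputing the whole transform.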

Last edited by ThaOneDon (2020-05-31 19:33:19)

Offline

#2 2014-11-06 08:05:09

ImNotQ009
Moderator

Re: tech thread

We already have parallax occlusion mapping

Offline

#3 2014-11-06 11:10:13

ThaOneDon
Member

Re: tech thread

I guess I should always take a look at the Cube engine's docs first before making suggestions?

:)

Still SSDO is quite interesting...

Offline

#4 2014-11-09 06:11:26

ThaOneDon
Member

Re: tech thread

Update 7

Offline

#5 2014-11-09 10:35:48

Calinou
Moderator

Re: tech thread

Now, add all this by yourself.

Last edited by Calinou (2014-11-09 10:35:52)

Offline

#6 2014-11-09 15:54:16

ThaOneDon
Member

Re: tech thread

Wish it was that easy. Still, there's tons of work to look forward to.
I hope all of this is useful.

Last edited by ThaOneDon (2016-01-29 16:23:36)

Offline

#7 2014-11-10 08:52:57

ThaOneDon
Member

Re: tech thread

Update 9

Offline

#8 2014-11-11 04:32:16

ThaOneDon
Member

Re: tech thread

I'm absorbing more research papers at the moment.

UPDATE: Added UPDATE 10

Last edited by ThaOneDon (2016-08-29 05:51:46)

Offline

#9 2014-11-11 07:04:44

eihrul
Administrator

Re: tech thread

Unless you've actually read all the stuff in here and can give actual descriptions of why each of these papers is individually interesting, I am going to have to delete this thread as spam.

Offline

#10 2014-11-11 07:34:39

ThaOneDon
Member

Re: tech thread

OK. It's going to take some time though, there's a lot to cover.

Last edited by ThaOneDon (2016-08-29 05:51:17)

Offline

#11 2014-11-12 11:47:47

noman222
Member

Re: tech thread

Tesseract is a great game, and bots are fun, but what's an online based game without a good sized player base?
A game like this would have a good chance to fly if it got into steam greenlight. Add the facts: that it's free, include steam workshop support for sharing maps and making mods, it's a tribute to old school gaming (a bit), and there'll be an awesome player base for lots of fun.
What's more, we'll be tapping into a huge source of ideas and good map designers, and, if it's not too hard, let tesseract use steam servers for multiplayer.
What do you think? Is it worth a try?







_______________________
Noman

Offline

#12 2014-11-12 11:57:10

spikeymikey0196
Member

Re: tech thread

noman222 wrote:

Tesseract is a great game, and bots are fun, but what's an online based game without a good sized player base?
A game like this would have a good chance to fly if it got into steam greenlight. Add the facts: that it's free, include steam workshop support for sharing maps and making mods, it's a tribute to old school gaming (a bit), and there'll be an awesome player base for lots of fun.
What's more, we'll be tapping into a huge source of ideas and good map designers, and, if it's not too hard, let tesseract use steam servers for multiplayer.
What do you think? Is it worth a try?







_______________________
Noman

As much as I get what you're saying, you have to realise that Tesseract isn't anywhere near ready to be put onto Greenlight.. The developers know this, otherwise it would be on there.
You're also missing the fact that this is in very early stages and it would take ages to get it to actually be successful on Greenlight.. there's just too little at the moment to work with.. although the userbase does need to grow, it won't happen just yet :3

Offline

#13 2014-11-12 17:29:48

ThaOneDon
Member

Re: tech thread

There are a few conditions needed for that to happen.

Greenlight also implies a short timeframe to make the game, and that of course would stress the development.

Right now the engine/game is in a steady and precise phase. That's what I, and I'm sure the team, want: to make small but meaningful changes.

If anyone is willing to use the engine to make something interesting for steam Greenlight, license wise it shouldn't be a problem. Don't use the stuff from "media", everything else is A-OK.

:)

Offline

#14 2014-11-18 15:11:02

ThaOneDon
Member

Re: tech thread

Massive Updates 18/11/2014

Offline

#15 2014-11-19 11:22:58

ThaOneDon
Member

Re: tech thread

More Updates 19/11/2014

Tech
*Line Space Gathering for Single Scattering in Large Scenes
*ManyLoDs: Parallel Many-View Level-of-Detail Selection for Real-Time Global Illumination
*Improving Performance and Accuracy of Local PCA

Performance saving
*Importance Caching for Complex Illumination
*Fast Parallel GPU-Sorting Using a Hybrid Algorithm

Shaders
*3D Unsharp Masking for Scene Coherent Enhancement
*Precision Selection for Energy-Efficient Pixel Shaders
*Bidirectional Light Transport with Vertex Merging

Offline

#16 2014-11-20 14:41:33

ThaOneDon
Member

Re: tech thread

Updates 20/11/2014

Tech
*Sample Distribution Shadow Maps
*Depth Interval Grid Displacement Mapping
*Frostbite Engine Tech (incredibly advanced and performance friendly)

Performance Saving
*Parallel View-Dependent Level-of-Detail Control
*Efficient Interactive Rendering of Detailed Models with Hierarchical Levels of Detail

Last edited by ThaOneDon (2014-11-20 16:57:07)

Offline

#17 2014-11-20 16:13:34

spikeymikey0196
Member

Re: tech thread

Gonna add slightly to the list with this:
DOT Engine AI: https://github.com/MatrixCompSci/DOT

Offline

#18 2014-11-21 06:29:12

ThaOneDon
Member

Re: tech thread

Updates 21/11/2014

Tech
*Deep Opacity Maps

Performance Saving
*Frame Sequential Interpolation for Discrete Level-of-Detail Rendering

Shaders
*An Optimizing Compiler for Automatic Shader Bounding

Last edited by ThaOneDon (2014-11-21 06:47:16)

Offline

#19 2014-11-22 05:53:35

ThaOneDon
Member

Re: tech thread

Updates 22/11/2014

Tech
*PMAO (Photometric Ambient Occlusion)
*C-BDAM - Compressed Batched Dynamic Adaptive Meshes for Terrain Rendering
*Tile-Trees

Performance Saving
*Tuning Catmull-Clark Subdivision Surfaces (OpenSubDiv is based on these)
*An Interactive Perceptual Rendering Pipeline using Contrast and Spatial Masking

Shaders
*Implementing the Render Cache and the Edge-and-Point Image

Last edited by ThaOneDon (2014-11-22 07:09:15)

Offline

#20 2014-11-22 11:50:34

RaZgRiZ
Moderator

Re: tech thread

Maybe you should spend some time to categorize all of them and make the list more readable.. It's a total mess.

Offline

#21 2014-11-22 18:51:54

ThaOneDon
Member

Re: tech thread

I'll see what I can do with the stuff that's related, but a lot of it isn't, so there's really no good way of categorizing it.

Offline

#22 2014-11-22 21:55:07

RaZgRiZ
Moderator

Re: tech thread

ThaOneDon wrote:

I'll see what I can do with the stuff that's related, but a lot of it isn't, so there's really no good way of categorizing it.

At least make it prettier. Hide the links inside the URL tag and use just the text instead. That's one way to do it.

don't click me

Last edited by RaZgRiZ (2014-11-22 21:55:38)

Offline

#23 2014-11-23 01:18:46

ThaOneDon
Member

Re: tech thread

Working on it

DONE

Last edited by ThaOneDon (2014-11-23 08:31:51)

Offline

#24 2014-11-23 13:28:08

RaZgRiZ
Moderator

Re: tech thread

ThaOneDon wrote:

Working on it

DONE

Err, add a little spacing too and some title sizing. It's not mentioned but I think this forum supports basic BBCode so it should be possible.

Offline

#25 2014-11-23 22:49:25

ThaOneDon
Member

Re: tech thread

Done

Updates 24/11/2014
Tech
*Highlight Microdisparity for Improved Gloss Depiction
*Implicit Skinning: Real-Time Skin Deformation with Contact Modeling

Last edited by ThaOneDon (2014-11-24 01:57:35)

Offline
