
Neural Graphics Texture Compression
A novel approach to texture set compression that integrates traditional GPU texture representation and Neural Image Compression techniques, designed to enable random access and support many-channel texture sets.
Introduction
This research by Farzad Farhadzadeh et al. [1] addresses a central challenge in computer graphics: compressing texture assets efficiently while maintaining quality and enabling random access during rendering.
"Advances in rendering have led to tremendous growth in texture assets, including resolution, complexity, and novel textures components, but this growth in data volume has not been matched by advances in its compression." - Farhadzadeh et al. [1]
Motivation
Modern renderers utilize a broad range of material properties beyond just color channels. Conventional texture compression methods like ASTC [2] can only compress textures with up to four channels and compress each mip level separately, failing to capture correlations across all channels and mip levels.
The research identifies significant redundancy within feature pyramids used in previous neural texture compression methods [3, 4]. As texture resolution increases, this redundancy becomes more pronounced, adversely affecting compression performance.

Figure 1: Visualization of grid features showing the dual-bank representation that captures different frequency information.
Method
The paper introduces an asymmetric autoencoder framework with four key components:
Global Transformer
Maps a texture set to a bottleneck latent representation, capturing spatial-channel-resolution redundancy.
Grid Constructor
Two grid constructors map the latent representation to grid pairs that store quantized features.
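Storing quantized features while still training the model end to end usually requires a differentiable stand-in for rounding. The sketch below shows one common option, uniform quantization with a straight-through estimator. The helper name `quantize_ste` and the sigmoid squashing to (0, 1) are illustrative assumptions, not details confirmed by the paper.

```python
import torch


def quantize_ste(features: torch.Tensor, bits: int = 4) -> torch.Tensor:
    """Uniformly quantize features to 2**bits levels in [0, 1].

    Hypothetical helper: the forward pass returns quantized values, while the
    backward pass treats rounding as the identity (straight-through estimator).
    """
    levels = 2 ** bits - 1
    squashed = torch.sigmoid(features)  # assumption: squash to (0, 1) first
    rounded = torch.round(squashed * levels) / levels
    # Forward value equals `rounded`; the gradient flows through `squashed`.
    return squashed + (rounded - squashed).detach()


grid = torch.randn(1, 16, 8, 8, requires_grad=True)
quantized = quantize_ste(grid, bits=4)  # values lie on a 16-level lattice
```

With 4 bits, each stored feature takes one of 16 values, so two features fit in one byte.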
Grid Sampler
Samples the grids based on texture coordinates and mip level, facilitating texel reconstruction from different mip levels by sampling features with varying strides.
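As a rough illustration of sampling with varying strides, the sketch below reads a single feature grid at the texel centers of a chosen mip level, so that level `m` uses a stride of `2**m` grid cells. It assumes the grid has the resolution of mip level 0 and uses bilinear interpolation via `torch.nn.functional.grid_sample`. The helper `sample_grid_at_mip` is hypothetical, and the paper's sampler may differ.

```python
import torch
import torch.nn.functional as F


def sample_grid_at_mip(grid: torch.Tensor, mip: int) -> torch.Tensor:
    """Bilinearly sample a [1, C, H, W] grid at the texel centers of `mip`.

    Hypothetical helper; assumes H and W are divisible by 2**mip.
    """
    _, _, h, w = grid.shape
    stride = 2 ** mip
    out_h, out_w = h // stride, w // stride
    # Texel centers of the coarser level in normalized [-1, 1] coordinates.
    ys = (torch.arange(out_h, dtype=grid.dtype) + 0.5) * stride / h * 2 - 1
    xs = (torch.arange(out_w, dtype=grid.dtype) + 0.5) * stride / w * 2 - 1
    yy, xx = torch.meshgrid(ys, xs, indexing="ij")
    coords = torch.stack([xx, yy], dim=-1).unsqueeze(0)  # [1, out_h, out_w, 2]
    return F.grid_sample(grid, coords, mode="bilinear", align_corners=False)


features = torch.randn(1, 16, 64, 64)
print(sample_grid_at_mip(features, mip=2).shape)  # torch.Size([1, 16, 16, 16])
```

At `mip=0` the sample positions coincide with the grid cell centers, so the grid is returned unchanged. Coarser levels read the same grid at sparser positions rather than from a separate lower-resolution grid.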
Texture Synthesizer
Reconstructs texels at specific positions and mip levels.
import torch.nn as nn

# GlobalTransformer, GridConstructor, GridSampler and TextureSynthesizer
# are assumed to be defined elsewhere in the implementation.


class CompressionModel(nn.Module):
    """
    Complete neural texture compression model.

    This integrates all components of the pipeline:
    1. Global Transformer (Encoder)
    2. Grid Constructor
    3. Grid Sampler
    4. Texture Synthesizer (Decoder)
    """

    def __init__(self, in_channels, encoder_channels=[64, 128, 256],
                 grid_channels=[16, 16], quantization_bits=4, hidden_dim=32,
                 num_residual_blocks=4, positional_encoding_levels=10,
                 use_attention=False):
        super().__init__()

        # Global Transformer (Encoder): texture set -> bottleneck latent
        self.global_transformer = GlobalTransformer(
            in_channels=in_channels,
            channels=encoder_channels,
            use_attention=use_attention
        )

        # Grid Constructor: latent -> pair of quantized feature grids
        self.grid_constructor = GridConstructor(
            in_channels=encoder_channels[-1],
            grid_channels=grid_channels,
            quantization_bits=quantization_bits
        )

        # Grid Sampler: reads grid features at a texture coordinate and mip level
        self.grid_sampler = GridSampler()

        # Texture Synthesizer (Decoder): sampled features -> texel values
        self.texture_synthesizer = TextureSynthesizer(
            g0_channels=grid_channels[0],
            g1_channels=grid_channels[1],
            out_channels=in_channels,
            hidden_dim=hidden_dim,
            num_residual_blocks=num_residual_blocks,
            positional_encoding_levels=positional_encoding_levels
        )
Multi-Resolution Support
A key innovation in this work is the support for multi-resolution mip levels, which is essential for texture filtering in real-time rendering. The method can reconstruct textures at any mip level from the same compressed representation.

Figure 2: Comparison of original (top) and reconstructed (bottom) textures at mip level 0 (full resolution), mip level 2 (medium resolution), and mip level 4 (low resolution), showing the method's ability to maintain quality across resolutions.
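For reference, the mip levels that reconstructions are compared against can be produced by repeatedly halving the texture. The sketch below builds such a chain with a 2x2 box filter. The helper `build_mip_chain`, the box filter, and the 9-channel example texture are assumptions for illustration, since this summary does not state which downsampling filter the paper uses.

```python
import torch
import torch.nn.functional as F


def build_mip_chain(texture: torch.Tensor, num_levels: int) -> list:
    """Return [mip0, mip1, ...] for a [C, H, W] texture set.

    Hypothetical helper: each level averages 2x2 blocks of the previous one.
    """
    levels = [texture]
    for _ in range(num_levels - 1):
        previous = levels[-1].unsqueeze(0)  # avg_pool2d expects [N, C, H, W]
        levels.append(F.avg_pool2d(previous, kernel_size=2).squeeze(0))
    return levels


texture_set = torch.rand(9, 256, 256)  # e.g. a 9-channel material
mips = build_mip_chain(texture_set, num_levels=5)
print([tuple(m.shape[-2:]) for m in mips])
# [(256, 256), (128, 128), (64, 64), (32, 32), (16, 16)]
```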
Experimental Results
Reconstruction quality is reported per texture using PSNR and SSIM:

Figure 3: PSNR comparison across different textures

Figure 4: SSIM comparison across different textures
Performance varies across texture types, with some textures reaching PSNR values above 31 dB. Compared with conventional methods such as ASTC [2], the approach reports a BD-rate of -88.67%, that is, an average bitrate saving of 88.67% at equal quality.
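To make the PSNR figures concrete, the sketch below computes PSNR from the mean squared error between a reference and a reconstruction. It assumes texel values normalized to [0, 1], and the function name `psnr` is illustrative rather than taken from the paper's code.

```python
import math

import torch


def psnr(reference: torch.Tensor, reconstruction: torch.Tensor,
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    mse = torch.mean((reference - reconstruction) ** 2).item()
    if mse == 0.0:
        return float("inf")  # identical tensors
    return 10.0 * math.log10(max_value ** 2 / mse)


reference = torch.zeros(3, 64, 64)
reconstruction = torch.full((3, 64, 64), 0.1)  # constant error of 0.1
print(round(psnr(reference, reconstruction), 2))  # 20.0
```

On this scale, 31 dB corresponds to a root-mean-square error of roughly 2.8% of the value range.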
Implementation Details
The positional encoding used in the texture synthesizer is a critical component that enables high-quality reconstruction:
import torch
import torch.nn as nn


class PositionalEncoding(nn.Module):
    """
    Positional encoding as described in the paper.

    This is based on the encoding used in NeRF and similar methods,
    which maps coordinates to a higher-dimensional space using
    sinusoidal functions at different frequencies.
    """

    def __init__(self, num_levels=10, include_identity=True):
        super().__init__()
        self.num_levels = num_levels
        self.include_identity = include_identity
        # Frequency multipliers: 2^0, 2^1, 2^2, ...
        self.freq_bands = 2 ** torch.arange(num_levels).float()

    def forward(self, x, y):
        # Ensure inputs are float tensors
        x = x.float()
        y = y.float()

        # Add a trailing axis: [batch_size, num_points, 1]
        x = x.unsqueeze(-1)
        y = y.unsqueeze(-1)

        # Apply frequency bands (broadcasts to [..., num_levels])
        x_enc = x * self.freq_bands.to(x.device)
        y_enc = y * self.freq_bands.to(y.device)

        # Apply sin and cos to each frequency
        x_sin = torch.sin(x_enc)
        x_cos = torch.cos(x_enc)
        y_sin = torch.sin(y_enc)
        y_cos = torch.cos(y_enc)

        # Concatenate all encodings: [..., 4 * num_levels]
        out = torch.cat([x_sin, x_cos, y_sin, y_cos], dim=-1)

        # Optionally include the original coordinates
        if self.include_identity:
            out = torch.cat([x, y, out], dim=-1)
        return out
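The width of the encoding follows directly from the concatenation: sine and cosine for each of the two coordinates at every frequency, plus the two raw coordinates when the identity is included, giving 4 * num_levels + 2 values per point. The sketch below restates the same computation as a standalone function to check that count. The name `encode_xy` is hypothetical and exists only for this illustration.

```python
import torch


def encode_xy(x: torch.Tensor, y: torch.Tensor, num_levels: int = 10,
              include_identity: bool = True) -> torch.Tensor:
    """Functional restatement of the sinusoidal encoding, for shape checking."""
    freqs = 2.0 ** torch.arange(num_levels, dtype=torch.float32)
    x = x.float().unsqueeze(-1)
    y = y.float().unsqueeze(-1)
    parts = [torch.sin(x * freqs), torch.cos(x * freqs),
             torch.sin(y * freqs), torch.cos(y * freqs)]
    if include_identity:
        parts = [x, y] + parts
    return torch.cat(parts, dim=-1)


u = torch.rand(2, 1024)  # [batch_size, num_points]
v = torch.rand(2, 1024)
print(encode_xy(u, v).shape)  # torch.Size([2, 1024, 42])
```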
Analysis
The research analyzes several aspects of the compression method:
- Interpolation in grid samplers: The grid pair captures different frequency information, with G0 focusing on high-frequency details and G1 on low-frequency features.
- Global transformer impact: Models without a global transformer struggle to capture high-frequency information.
- Sampling with stride: Using a single resolution grid-pair with stride sampling outperforms multi-resolution grid-pairs.
- Synthesizer depth: The residual blocks in the texture synthesizer are critical for performance.
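To illustrate what the synthesizer-depth ablation varies, the sketch below builds a small per-texel decoder from residual MLP blocks. The class names, the two-layer block layout with ReLU, and the 9-channel output are assumptions for illustration. Only the `hidden_dim` and `num_residual_blocks` parameters and their defaults mirror the model code in the Method section.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Two-layer MLP with a skip connection (layout is an assumption)."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.net(x)


class SketchSynthesizer(nn.Module):
    """Hypothetical per-texel decoder: sampled features in, texel channels out."""

    def __init__(self, in_dim: int, out_channels: int,
                 hidden_dim: int = 32, num_residual_blocks: int = 4):
        super().__init__()
        self.inp = nn.Linear(in_dim, hidden_dim)
        self.blocks = nn.Sequential(
            *[ResidualBlock(hidden_dim) for _ in range(num_residual_blocks)]
        )
        self.out = nn.Linear(hidden_dim, out_channels)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.out(self.blocks(self.inp(features)))


# 16 + 16 sampled grid features plus a 42-dimensional positional encoding.
decoder = SketchSynthesizer(in_dim=74, out_channels=9)
texels = decoder(torch.randn(1024, 74))
print(texels.shape)  # torch.Size([1024, 9])
```

Setting `num_residual_blocks=0` leaves only the input and output layers, which is the kind of shallower variant a depth ablation would compare against.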
Conclusion
This research introduces an effective method for texture compression in photorealistic rendering, leveraging multiple levels of redundancy:
- Among different channels of a texture
- Across various resolutions of the same texture
- Among the pixels within each channel
The method achieves state-of-the-art performance, significantly outperforming conventional texture compression methods and competing neural compression methods.
References
[1] Farhadzadeh, F., et al. "Neural Graphics Texture Compression: Supporting Random Access." arXiv:2407.00021, 2024.
[2] Nystad, J., et al. "Adaptive scalable texture compression." In Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on High Performance Graphics, 2012.
[3] Mentzer, F., et al. "High-Fidelity Generative Image Compression." NeurIPS, 2020.
[4] Ballé, J., et al. "Variational image compression with a scale hyperprior." ICLR, 2018.