Software is rescuing chips from die-size limits

With die sizes bumping up against physical limits, vendors are leaning on software techniques like DLSS and AI upscaling to extend performance without larger silicon. The approach is being framed as the industry’s workaround for die-size ceilings. (xda-developers.com)

NVIDIA’s datacenter GH100 (H100) die measures about 814 mm² with roughly 80 billion transistors, showing how current flagship GPUs already occupy nearly the maximum single-die area used in production. (developer.nvidia.com) Consumer-class Ada AD102 dies measure roughly 609 mm² with ~76.3 billion transistors, while reporting on NVIDIA’s newest Blackwell-family silicon places the RTX 5090 die near ~750 mm². (techpowerup.com) The practical photomask/reticle exposure field for current EUV scanners constrains single-die areas to roughly 830–858 mm², a hard manufacturing ceiling that forces alternative approaches to keep scaling performance. (tomshardware.com) Foundry and packaging workarounds already in production let designers assemble much larger systems: TSMC’s CoWoS interposers can reach about 2,831 mm², enabling multi-chip SiPs that exceed the reticle limit by more than 3×. (techspot.com)

NVIDIA has engineered successive DLSS releases to multiply effective frame rates at the software level: DLSS 4’s Multi-Frame Generation was advertised as multiplying frame rates up to 8× in supported workloads. (nvidia.com) Independent testing points the same way: Digital Foundry reported that DLSS 4’s transformer and ray-reconstruction upgrades produced radical frame-rate and quality gains in multiple titles, supporting the software-first route to higher throughput. (digitalfoundry.net) NVIDIA is pairing software gains with multi-chip hardware: Blackwell’s MCM approach uses NVIDIA’s NV-HBI to stitch dies together at roughly 10 TB/s bidirectional bandwidth, effectively scaling a GPU without a larger monolithic reticle print. (exxactcorp.com)

That software-first strategy has limits and friction. XDA’s analysis framed recent DLSS advances as letting workloads that would otherwise need ~1,000 W of brute-force rendering run on ~450 W of modern silicon. Community tests, however, show DLSS 4.5 can be ~20% slower on older RTX 20/30 series hardware, underscoring both the power of upscaling fixes and their dependence on newer hardware. (xda-developers.com)
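The size comparisons above come down to simple area ratios. As a rough sketch, the Python below recomputes them from the approximate figures quoted in this briefing; the constant and function names are illustrative assumptions, and the results are area-only comparisons that say nothing about yield, packaging overhead, or performance.

```python
# Approximate figures quoted in the briefing (all areas in mm^2).
# Names below are illustrative, not taken from any vendor tool.
RETICLE_LIMIT_MM2 = (830.0, 858.0)  # quoted range for the EUV exposure field
COWOS_INTERPOSER_MM2 = 2831.0       # quoted CoWoS interposer area

DIE_AREAS_MM2 = {
    "GH100 (H100)": 814.0,
    "AD102 (Ada)": 609.0,
    "RTX 5090 die": 750.0,
}


def reticle_fraction(die_mm2: float, limit_mm2: float) -> float:
    """Fraction of the reticle limit that a single die occupies."""
    return die_mm2 / limit_mm2


def interposer_multiple(interposer_mm2: float, limit_mm2: float) -> float:
    """Interposer area expressed as a multiple of the reticle limit."""
    return interposer_mm2 / limit_mm2


if __name__ == "__main__":
    low, high = RETICLE_LIMIT_MM2
    for name, area in DIE_AREAS_MM2.items():
        # A larger assumed limit gives a smaller fraction, so print that first.
        print(
            f"{name}: {reticle_fraction(area, high):.0%} to "
            f"{reticle_fraction(area, low):.0%} of the reticle limit"
        )
    print(
        f"CoWoS interposer: {interposer_multiple(COWOS_INTERPOSER_MM2, high):.1f}x to "
        f"{interposer_multiple(COWOS_INTERPOSER_MM2, low):.1f}x the reticle limit"
    )
```

With the quoted numbers, the interposer works out to about 3.3 to 3.4 times the reticle area, consistent with the "more than 3×" figure, and GH100 fills roughly 95% to 98% of the single-die ceiling.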
