ReFTA claims cheaper, faster fine-tuning

A new fine‑tuning method called ReFTA claims to beat LoRA by more than 5% in accuracy while using about 96% fewer parameters, by skipping slow weight reconstruction during tensor fine‑tuning. The paper frames this as a big efficiency win for adapter-style tuning that could lower the cost of task-specific models (x.com).

Fine-tuning is the cheap way to retask a big model without retraining all of it. Instead of replacing the whole engine, you bolt on a small part and teach only that part, which is why methods like low-rank adaptation became standard for adapting large language models. (arxiv.org)

Low-rank adaptation works by approximating a weight update with two skinny matrices instead of one full-size matrix. That cuts the number of trainable values sharply, but each layer still gets its own adapter, so the count keeps growing as models get deeper. (arxiv.org)

Some newer papers try to go one step further by treating many similar layers as one stack, like bundling dozens of pages into a single 3D block instead of editing each page alone. In machine learning papers, that stack is called a tensor, and the goal is to share structure across layers instead of paying for separate edits everywhere. (openaccess.thecvf.com)

That tensor trick saves parameters, but it usually creates a new bottleneck during training. Many tensor methods repeatedly rebuild a large weight tensor on every forward pass and backward pass, which adds extra compute, extra memory use, and extra implementation complexity. (finance.sina.cn)

ReFTA, short for Reconstruction-Free Tensor Adaptation, is a CVPR 2026 paper that says you can skip that rebuild step entirely. The authors are Jingjing Zheng, Anda Tang, Qiangqiang Mao, Zhouchen Lin, and Yankai Cao, and the paper is listed on the authors’ publication pages as accepted to the 2026 Conference on Computer Vision and Pattern Recognition. (jzheng20.github.io) (zhouchenlin.github.io)

The core move is simple to describe even if the math is not: ReFTA changes the order of operations so the model combines input features first and only then fuses them across the tensor structure. That means training no longer has to explicitly construct the full weight tensor inside the computation graph. (finance.sina.cn)

The paper’s claim is that this shifts the heavy work away from giant weight objects and toward intermediate features tied to batch size. In the common case where those feature objects are smaller than the reconstructed weights, training uses less peak memory and less computation. (finance.sina.cn)

The headline numbers are aggressive. A report summarizing the paper says ReFTA beats low-rank adaptation and PiSSA on common Vision Transformer baselines under the same low-parameter budget, and on RoBERTa tasks it stays competitive while using far fewer trainable parameters; the social post attached to the paper claims more than 5% better accuracy than low-rank adaptation with about 96% fewer parameters. (finance.sina.cn) (x.com)

There is still one big asterisk: this is a fresh conference paper, not an industry standard yet. The GitHub repository says “Code will be released soon,” so outside researchers still need a full public implementation and broader replication before anyone can say the gains hold across the messy mix of real production models. (github.com)

If the results survive that test, ReFTA points to a different way of making custom models cheap. Instead of squeezing each layer harder with ever-smaller adapters, it tries to remove a hidden tax in tensor fine-tuning itself: the repeated rebuild of weights that the model never needed to materialize in the first place. (finance.sina.cn)
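
For readers who want to see the reordering idea in concrete terms, here is a minimal Python sketch. It is not ReFTA's algorithm, since the paper's factorization is not described in this briefing. It assumes a simple stand-in: every layer shares two factor matrices, U and V, and owns only a short coefficient vector, so a layer's update is U diag(c) V^T. All function names, variable names, and sizes are hypothetical and chosen for illustration, and plain Python lists are used so the sketch needs no extra libraries. It compares building the full update and then applying it against projecting the inputs first and expanding afterward, and it counts trainable values for per-layer low-rank adapters versus the shared-factor layout.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns of a matrix stored as lists of rows."""
    return [list(col) for col in zip(*m)]


def rand_matrix(rows, cols, rng):
    """Random matrix with entries in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def lora_param_count(d_out, d_in, rank):
    """Trainable values in one low-rank update: B is d_out x r, A is r x d_in."""
    return rank * (d_out + d_in)


def forward_with_reconstruction(x, u, v, coeffs):
    """Materialize the full d_in x d_out update, then apply it to the inputs."""
    scaled_u = [[val * c for val, c in zip(row, coeffs)] for row in u]
    delta_w = matmul(scaled_u, transpose(v))  # the large object: d_in x d_out
    return matmul(x, delta_w)


def forward_reconstruction_free(x, u, v, coeffs):
    """Project the inputs first, mix per layer, then expand. No delta_w is built."""
    h = matmul(x, u)  # small intermediate: batch x rank
    h = [[val * c for val, c in zip(row, coeffs)] for row in h]
    return matmul(h, transpose(v))  # batch x d_out


# Parameter arithmetic for illustrative sizes (assumed, not taken from the paper).
layers, width, rank = 12, 768, 8
per_layer_total = layers * lora_param_count(width, width, rank)
shared_total = lora_param_count(width, width, rank) + layers * rank

# Numerical check on tiny random matrices: both orderings give the same output.
rng = random.Random(0)
batch, d_in, d_out, small_rank, small_layers = 2, 6, 5, 3, 4
x = rand_matrix(batch, d_in, rng)
u = rand_matrix(d_in, small_rank, rng)
v = rand_matrix(d_out, small_rank, rng)
layer_coeffs = rand_matrix(small_layers, small_rank, rng)

max_gap = 0.0
for coeffs in layer_coeffs:
    y_full = forward_with_reconstruction(x, u, v, coeffs)
    y_free = forward_reconstruction_free(x, u, v, coeffs)
    for row_full, row_free in zip(y_full, y_free):
        for a, b in zip(row_full, row_free):
            max_gap = max(max_gap, abs(a - b))

print("per-layer adapters:", per_layer_total)
print("shared factors:", shared_total)
print("largest output difference:", max_gap)
```

With these assumed sizes, twelve separate rank-8 adapters on 768-wide layers need 12 x 8 x (768 + 768) = 147,456 trainable values, while the shared-factor layout needs 12,288 + 96 = 12,384. Those figures are only arithmetic on the toy setup and say nothing about the paper's reported 96% reduction. The two forward functions agree up to floating-point rounding, but the second never builds the 768 x 768 update; its largest intermediate is batch x rank, which is the kind of batch-sized feature object the article describes.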
