# Awesome Neural Tangent Kernel Papers
This list contains papers that adopt the Neural Tangent Kernel (NTK) as a main theme or core idea.
NOTE: If there are any papers I've missed, please feel free to raise an issue.
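For readers new to the topic: the object these papers study is, in its empirical (finite-width) form, the Gram matrix of parameter gradients, $\Theta(x, x') = \nabla_\theta f(x;\theta)^\top \nabla_\theta f(x';\theta)$. The snippet below is a minimal, generic sketch of that definition in JAX; it is not taken from any paper in this list, and the names (`f`, `empirical_ntk`, the toy two-layer network) are placeholders chosen for illustration only.

```python
import jax
import jax.numpy as jnp

def init_params(key, d_in=3, width=64):
    # Toy two-layer network with (roughly) NTK-style 1/sqrt(fan-in) scaling.
    k1, k2 = jax.random.split(key)
    return {
        "W1": jax.random.normal(k1, (width, d_in)) / jnp.sqrt(d_in),
        "W2": jax.random.normal(k2, (width,)) / jnp.sqrt(width),
    }

def f(params, x):
    # Scalar-output two-layer ReLU network.
    return params["W2"] @ jax.nn.relu(params["W1"] @ x)

def empirical_ntk(params, x1, x2):
    # One Jacobian row per example: gradient of the scalar output w.r.t. all parameters.
    grad_f = jax.grad(f)  # differentiates w.r.t. params (first argument)
    flatten = lambda tree: jnp.concatenate(
        [leaf.ravel() for leaf in jax.tree_util.tree_leaves(tree)]
    )
    jac = lambda xs: jax.vmap(lambda x: flatten(grad_f(params, x)))(xs)
    return jac(x1) @ jac(x2).T  # (n1, n2) empirical NTK matrix

key = jax.random.PRNGKey(0)
params = init_params(key)
x = jax.random.normal(key, (5, 3))
print(empirical_ntk(params, x, x).shape)  # -> (5, 5)
```

Several entries below, such as Neural Tangents and TorchNTK, provide optimized finite-width and infinite-width versions of this computation.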
## 2024
Title | Venue | CODE | |
---|---|---|---|
Faithful and Efficient Explanations for Neural Networks via Neural Tangent Kernel Surrogate Models | ICLR | CODE | |
PINNACLE: PINN Adaptive ColLocation and Experimental points selection | ICLR | - | |
On the Foundations of Shortcut Learning | ICLR | - | |
Understanding Reconstruction Attacks with the Neural Tangent Kernel and Dataset Distillation | ICLR | - | |
Sample Relationship from Learning Dynamics Matters for Generalisation | ICLR | - | |
Robust NAS benchmark under adversarial training: assessment, theory, and beyond | ICLR | - | |
Theoretical Analysis of Robust Overfitting for Wide DNNs: An NTK Approach | ICLR | CODE | |
Heterogeneous Personalized Federated Learning by Local-Global Updates Mixing via Convergence Rate | ICLR | - | |
Neural Network-Based Score Estimation in Diffusion Models: Optimization and Generalization | ICLR | - | |
Grokking as the Transition from Lazy to Rich Training Dynamics | ICLR | - | |
Generalization of Deep ResNets in the Mean-Field Regime | ICLR | - |
## 2023
Title | Venue | CODE | |
---|---|---|---|
Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models | NeurIPS | CODE | |
Deep Learning with Kernels through RKHM and the Perron–Frobenius Operator | NeurIPS | - | |
A Theoretical Analysis of the Test Error of Finite-Rank Kernel Ridge Regression | NeurIPS | - | |
Fixing the NTK: From Neural Network Linearizations to Exact Convex Programs | NeurIPS | - | |
Beyond NTK with Vanilla Gradient Descent: A Mean-Field Analysis of Neural Networks with Polynomial Width, Samples, and Time | NeurIPS | - | |
Feature-Learning Networks Are Consistent Across Widths At Realistic Scales | NeurIPS | - | |
Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks | NeurIPS | CODE | |
Spectral Evolution and Invariance in Linear-width Neural Networks | NeurIPS | - | |
Analyzing Generalization of Neural Networks through Loss Path Kernels | NeurIPS | - | |
Neural (Tangent Kernel) Collapse | NeurIPS | - | |
Mind the spikes: Benign overfitting of kernels and neural networks in fixed dimension | NeurIPS | CODE | |
A PAC-Bayesian Perspective on the Interpolating Information Criterion | NeurIPS-W | - | |
A Kernel Perspective of Skip Connections in Convolutional Networks | ICLR | - | |
Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel | ICLR | - | |
Symmetric Pruning in Quantum Neural Networks | ICLR | - | |
The Influence of Learning Rule on Representation Dynamics in Wide Neural Networks | ICLR | - | |
Few-shot Backdoor Attacks via Neural Tangent Kernels | ICLR | - | |
Analyzing Tree Architectures in Ensembles via Neural Tangent Kernel | ICLR | - | |
Supervision Complexity and its Role in Knowledge Distillation | ICLR | - | |
NTK-SAP: Improving Neural Network Pruning By Aligning Training Dynamics | ICLR | CODE | |
Tuning Frequency Bias in Neural Network Training with Nonuniform Data | ICLR | - | |
Simple initialization and parametrization of sinusoidal networks via their kernel bandwidth | ICLR | - | |
Characterizing the spectrum of the NTK via a power series expansion | ICLR | CODE | |
Adaptive Optimization in the $\infty$-Width Limit | ICLR | - | |
Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization | ICLR | - | |
The Onset of Variance-Limited Behavior for Networks in the Lazy and Rich Regimes | ICLR | - | |
Restricted Strong Convexity of Deep Learning Models with Smooth Activations | ICLR | - | |
Feature selection and low test error in shallow low-rotation ReLU networks | ICLR | - | |
Exploring Active 3D Object Detection from a Generalization Perspective | ICLR | CODE | |
On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks | AISTATS | - | |
Adversarial Noises Are Linearly Separable for (Nearly) Random Neural Networks | AISTATS | - | |
Regularize Implicit Neural Representation by Itself | CVPR | - | |
WIRE: Wavelet Implicit Neural Representations | CVPR | CODE | |
Regularizing Second-Order Influences for Continual Learning | CVPR | CODE | |
Multiplicative Fourier Level of Detail | CVPR | - | |
KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection | ICCV | CODE | |
TKIL: Tangent Kernel Approach for Class Balanced Incremental Learning | ICCV-W | - | |
A Fast, Well-Founded Approximation to the Empirical Neural Tangent Kernel | ICML | - | |
Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels | ICML | CODE | |
Graph Neural Tangent Kernel: Convergence on Large Graphs | ICML | - | |
Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels | ICML | CODE | |
Analyzing Convergence in Quantum Neural Networks: Deviations from Neural Tangent Kernels | ICML | - | |
Benign Overfitting in Deep Neural Networks under Lazy Training | ICML | - | |
Gradient Descent in Neural Networks as Sequential Learning in Reproducing Kernel Banach Space | ICML | - | |
A Kernel-Based View of Language Model Fine-Tuning | ICML | - | |
Combinatorial Neural Bandits | ICML | - | |
What Can Be Learnt With Wide Convolutional Neural Networks? | ICML | CODE | |
Reward-Biased Maximum Likelihood Estimation for Neural Contextual Bandits | AAAI | - | |
Neural tangent kernel at initialization: linear width suffices | UAI | - | |
Kernel Ridge Regression-Based Graph Dataset Distillation | SIGKDD | CODE | |
Analyzing Deep PAC-Bayesian Learning with Neural Tangent Kernel: Convergence, Analytic Generalization Bound, and Efficient Hyperparameter Selection | TMLR | - | |
The Eigenlearning Framework: A Conservation Law Perspective on Kernel Regression and Wide Neural Networks | TMLR | CODE | |
Empirical Limitations of the NTK for Understanding Scaling Laws in Deep Learning | TMLR | - | |
Analysis of Convolutions, Non-linearity and Depth in Graph Neural Networks using Neural Tangent Kernel | TMLR | - | |
A Framework and Benchmark for Deep Batch Active Learning for Regression | JMLR | CODE | |
A Continual Learning Algorithm Based on Orthogonal Gradient Descent Beyond Neural Tangent Kernel Regime | IEEE | - | |
The Quantum Path Kernel: A Generalized Neural Tangent Kernel for Deep Quantum Machine Learning | QE | - | |
NeuralBO: A Black-box Optimization Algorithm using Deep Neural Networks | NC | - | |
Deep Learning in Random Neural Fields: Numerical Experiments via Neural Tangent Kernel | NN | CODE | |
Physics-informed radial basis network (PIRBN): A local approximating neural network for solving nonlinear partial differential equations | CMAME | - | |
A non-gradient method for solving elliptic partial differential equations with deep neural networks | JoCP | - | |
Self-Adaptive Physics-Informed Neural Networks using a Soft Attention Mechanism | JoCP | - | |
Towards a phenomenological understanding of neural networks: data | MLST | - | |
Weighted Neural Tangent Kernel: A Generalized and Improved Network-Induced Kernel | ML | CODE | |
Tensor Programs IVb: Adaptive Optimization in the ∞-Width Limit | arXiv | - |
## 2022
Title | Venue | CODE | |
---|---|---|---|
Generalization Properties of NAS under Activation and Skip Connection Search | NeurIPS | - | |
Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study | NeurIPS | CODE | |
Graph Neural Network Bandits | NeurIPS | - | |
Lossless Compression of Deep Neural Networks: A High-dimensional Neural Tangent Kernel Approach | NeurIPS | - | |
GraphQNTK: Quantum Neural Tangent Kernel for Graph Data | NeurIPS | CODE | |
Evolution of Neural Tangent Kernels under Benign and Adversarial Training | NeurIPS | CODE | |
TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent Kernels | NeurIPS | CODE | |
Making Look-Ahead Active Learning Strategies Feasible with Neural Tangent Kernels | NeurIPS | CODE | |
Disentangling the Predictive Variance of Deep Ensembles through the Neural Tangent Kernel | NeurIPS | CODE | |
On the Generalization Power of the Overfitted Three-Layer Neural Tangent Kernel Model | NeurIPS | - | |
What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness? | NeurIPS | - | |
On the Spectral Bias of Convolutional Neural Tangent and Gaussian Process Kernels | NeurIPS | - | |
Fast Neural Kernel Embeddings for General Activations | NeurIPS | CODE | |
Bidirectional Learning for Offline Infinite-width Model-based Optimization | NeurIPS | - | |
Infinite Recommendation Networks: A Data-Centric Approach | NeurIPS | CODE1 <br> CODE2 | |
Distribution-Informed Neural Networks for Domain Adaptation Regression | NeurIPS | - | |
Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks | NeurIPS | - | |
Spectral Bias Outside the Training Set for Deep Networks in the Kernel Regime | NeurIPS | CODE | |
Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization) | NeurIPS | - | |
Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture | NeurIPS | - | |
A Neural Pre-Conditioning Active Learning Algorithm to Reduce Label Complexity | NeurIPS | - | |
NFT-K: Non-Fungible Tangent Kernels | ICASSP | CODE | |
Label Propagation Across Graphs: Node Classification Using Graph Neural Tangent Kernels | ICASSP | - | |
A Neural Tangent Kernel Perspective of Infinite Tree Ensembles | ICLR | - | |
Neural Networks as Kernel Learners: The Silent Alignment Effect | ICLR | - | |
Towards Deepening Graph Neural Networks: A GNTK-based Optimization Perspective | ICLR | - | |
Overcoming The Spectral Bias of Neural Value Approximation | ICLR | CODE | |
Efficient Computation of Deep Nonlinear Infinite-Width Neural Networks that Learn Features | ICLR | CODE | |
Learning Neural Contextual Bandits Through Perturbed Rewards | ICLR | - | |
Learning Curves for Continual Learning in Neural Networks: Self-knowledge Transfer and Forgetting | ICLR | - | |
The Spectral Bias of Polynomial Neural Networks | ICLR | - | |
On Feature Learning in Neural Networks with Global Convergence Guarantees | ICLR | - | |
Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks | ICLR | - | |
Eigenspace Restructuring: A Principle of Space and Frequency in Neural Networks | COLT | - | |
Neural Networks can Learn Representations with Gradient Descent | COLT | - | |
Neural Contextual Bandits without Regret | AISTATS | - | |
Finding Dynamics Preserving Adversarial Winning Tickets | AISTATS | - | |
Embedded Ensembles: Infinite Width Limit and Operating Regimes | AISTATS | - | |
Global Convergence of MAML and Theory-Inspired Neural Architecture Search for Few-Shot Learning | CVPR | CODE | |
Demystifying the Neural Tangent Kernel from a Practical Perspective: Can it be trusted for Neural Architecture Search without training? | CVPR | CODE | |
A Structured Dictionary Perspective on Implicit Neural Representations | CVPR | CODE | |
NL-FFC: Non-Local Fast Fourier Convolution for Image Super Resolution | CVPR-W | CODE | |
Intrinsic Neural Fields: Learning Functions on Manifolds | ECCV | - | |
Random Gegenbauer Features for Scalable Kernel Methods | ICML | - | |
Fast Finite Width Neural Tangent Kernel | ICML | CODE | |
A Neural Tangent Kernel Perspective of GANs | ICML | CODE | |
Neural Tangent Kernel Empowered Federated Learning | ICML | - | |
Reverse Engineering the Neural Tangent Kernel | ICML | CODE | |
How to Train Your Wide Neural Network Without Backprop: An Input-Weight Alignment Perspective | ICML | CODE | |
Bounding the Width of Neural Networks via Coupled Initialization – A Worst Case Analysis – | ICML | - | |
Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time | ICML | - | |
Lazy Estimation of Variable Importance for Large Neural Networks | ICML | - | |
DAVINZ: Data Valuation using Deep Neural Networks at Initialization | ICML | - | |
Neural Tangent Kernel Beyond the Infinite-Width Limit: Effects of Depth and Initialization | ICML | CODE | |
NeuralEF: Deconstructing Kernels by Deep Neural Networks | ICML | CODE | |
Feature Learning and Signal Propagation in Deep Neural Networks | ICML | - | |
More Than a Toy: Random Matrix Models Predict How Real-World Neural Representations Generalize | ICML | CODE | |
Fast Graph Neural Tangent Kernel via Kronecker Sketching | AAAI | - | |
Rethinking Influence Functions of Neural Networks in the Over-parameterized Regime | AAAI | - | |
On the Empirical Neural Tangent Kernel of Standard Finite-Width Convolutional Neural Network Architectures | UAI | - | |
Feature Learning and Random Features in Standard Finite-Width Convolutional Neural Networks: An Empirical Study | UAI | - | |
Out of Distribution Detection via Neural Network Anchoring | ACML | CODE | |
Learning Neural Ranking Models Online from Implicit User Feedback | WWW | - | |
Trust Your Robots! Predictive Uncertainty Estimation of Neural Networks with Sparse Gaussian Processes | CoRL | - | |
When and why PINNs fail to train: A neural tangent kernel perspective | JoCP | CODE | |
How Neural Architectures Affect Deep Learning for Communication Networks? | ICC | - | |
Loss landscapes and optimization in over-parameterized non-linear systems and neural networks | ACHA | - | |
Feature Purification: How Adversarial Training Performs Robust Deep Learning | FOCS | - | |
Kernel-Based Smoothness Analysis of Residual Networks | MSML | - | |
Analyzing Finite Neural Networks: Can We Trust Neural Tangent Kernel Theory? | MSML | - | |
The Training Response Law Explains How Deep Neural Networks Learn | IoP | - | |
Simple, Fast, and Flexible Framework for Matrix Completion with Infinite Width Neural Networks | PNAS | CODE | |
Representation Learning via Quantum Neural Tangent Kernels | PRX Quantum | - | |
TorchNTK: A Library for Calculation of Neural Tangent Kernels of PyTorch Models | arXiv | CODE | |
Neural Tangent Kernel Analysis of Shallow α-Stable ReLU Neural Networks | arXiv | - | |
Neural Tangent Kernel: A Survey | arXiv | - |
## 2021
Title | Venue | CODE | |
---|---|---|---|
Neural Tangent Kernel Maximum Mean Discrepancy | NeurIPS | - | |
DNN-based Topology Optimisation: Spatial Invariance and Neural Tangent Kernel | NeurIPS | - | |
Stability & Generalisation of Gradient Descent for Shallow Neural Networks without the Neural Tangent Kernel | NeurIPS | - | |
Scaling Neural Tangent Kernels via Sketching and Random Features | NeurIPS | - | |
Dataset Distillation with Infinitely Wide Convolutional Networks | NeurIPS | - | |
On the Equivalence between Neural Network and Support Vector Machine | NeurIPS | CODE | |
Local Signal Adaptivity: Provable Feature Learning in Neural Networks Beyond Kernels | NeurIPS | CODE | |
Explicit Loss Asymptotics in the Gradient Descent Training of Neural Networks | NeurIPS | - | |
Kernelized Heterogeneous Risk Minimization | NeurIPS | CODE | |
An Empirical Study of Neural Kernel Bandits | NeurIPS-W | - | |
The Curse of Depth in Kernel Regime | NeurIPS-W | - | |
Wearing a MASK: Compressed Representations of Variable-Length Sequences Using Recurrent Neural Tangent Kernels | ICASSP | CODE | |
The Dynamics of Gradient Descent for Overparametrized Neural Networks | L4DC | - | |
The Recurrent Neural Tangent Kernel | ICLR | - | |
Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS | ICLR | - | |
Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime | ICLR | - | |
Meta-Learning with Neural Tangent Kernels | ICLR | - | |
How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks | ICLR | - | |
Deep Networks and the Multiple Manifold Problem | ICLR | - | |
Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective | ICLR | CODE | |
Neural Thompson Sampling | ICLR | - | |
Deep Equals Shallow for ReLU Networks in Kernel Regimes | ICLR | - | |
A Recipe for Global Convergence Guarantee in Deep Neural Networks | AAAI | - | |
A Deep Conditioning Treatment of Neural Networks | ALT | - | |
Nonparametric Regression with Shallow Overparameterized Neural Networks Trained by GD with Early Stopping | COLT | - | |
Learning with invariances in random features and kernel models | COLT | - | |
Implicit Regularization via Neural Feature Alignment | AISTATS | CODE | |
Regularization Matters: A Nonparametric Perspective on Overparametrized Neural Network | AISTATS | - | |
One-pass Stochastic Gradient Descent in Overparametrized Two-layer Neural Networks | AISTATS | - | |
Fast Adaptation with Linearized Neural Networks | AISTATS | CODE | |
Fast Learning in Reproducing Kernel Kreĭn Spaces via Signed Measures | AISTATS | - | |
Stable ResNet | AISTATS | - | |
A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks | AISTATS | - | |
Can We Characterize Tasks Without Labels or Features? | CVPR | CODE | |
The Neural Tangent Link Between CNN Denoisers and Non-Local Filters | CVPR | CODE | |
Nerfies: Deformable Neural Radiance Fields | ICCV | CODE | |
Kernel Methods in Hyperbolic Spaces | ICCV | - | |
Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks | ICML | - | |
On the Generalization Power of Overfitted Two-Layer Neural Tangent Kernel Models | ICML | - | |
Tensor Programs IIb: Architectural Universality of Neural Tangent Kernel Training Dynamics | ICML | - | |
Tensor Programs IV: Feature Learning in Infinite-Width Neural Networks | ICML | CODE | |
FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Convergence Analysis | ICML | - | |
On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent | ICML | - | |
Feature Learning in Infinite-Width Neural Networks | ICML | CODE | |
On Monotonic Linear Interpolation of Neural Network Parameters | ICML | - | |
Uniform Convergence, Adversarial Spheres and a Simple Remedy | ICML | - | |
Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels | ICML | - | |
Efficient Statistical Tests: A Neural Tangent Kernel Approach | ICML | - | |
Neural Tangent Generalization Attacks | ICML | CODE | |
On the Random Conjugate Kernel and Neural Tangent Kernel | ICML | - | |
Generalization Guarantees for Neural Architecture Search with Train-Validation Split | ICML | - | |
Tilting the playing field: Dynamical loss functions for machine learning | ICML | CODE | |
PHEW : Constructing Sparse Networks that Learn Fast and Generalize Well Without Training Data | ICML | - | |
On the Neural Tangent Kernel of Deep Networks with Orthogonal Initialization | IJCAI | CODE | |
Towards Understanding the Spectral Bias of Deep Learning | IJCAI | - | |
On Random Kernels of Residual Architectures | UAI | - | |
How Shrinking Gradient Noise Helps the Performance of Neural Networks | ICBD | - | |
Unsupervised Shape Completion via Deep Prior in the Neural Tangent Kernel Perspective | ACM TOG | - | |
Benefits of Jointly Training Autoencoders: An Improved Neural Tangent Kernel Analysis | TIT | - | |
Reinforcement Learning via Gaussian Processes with Neural Network Dual Kernels | CoG | - | |
Mathematical Models of Overparameterized Neural Networks | IEEE | - | |
A Feature Fusion Based Indicator for Training-Free Neural Architecture Search | IEEE | - | |
Pathological spectra of the Fisher information metric and its variants in deep neural networks | NC | - | |
Linearized two-layers neural networks in high dimension | Ann. Statist. | - | |
Geometric compression of invariant manifolds in neural nets | J. Stat. Mech. | CODE | |
A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks | arXiv | - | |
Learning with Neural Tangent Kernels in Near Input Sparsity Time | arXiv | - | |
Spectral Analysis of the Neural Tangent Kernel for Deep Residual Networks | arXiv | - | |
Properties of the After Kernel | arXiv | CODE |
## 2020
Title | Venue | CODE | |
---|---|---|---|
Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations | ECCV | - | |
Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? — A Neural Tangent Kernel Perspective | NeurIPS | - | |
Label-Aware Neural Tangent Kernel: Toward Better Generalization and Local Elasticity | NeurIPS | CODE | |
Finite Versus Infinite Neural Networks: an Empirical Study | NeurIPS | - | |
On the linearity of large non-linear models: when and why the tangent kernel is constant | NeurIPS | - | |
On the Similarity between the Laplace and Neural Tangent Kernels | NeurIPS | - | |
A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks | NeurIPS | - | |
Generalization bound of globally optimal non-convex neural network training: Transportation map estimation by infinite dimensional Langevin dynamics | NeurIPS | - | |
Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains | NeurIPS | CODE | |
Network size and weights size for memorization with two-layers neural networks | NeurIPS | - | |
Neural Networks Learning and Memorization with (almost) no Over-Parameterization | NeurIPS | - | |
Towards Understanding Hierarchical Learning: Benefits of Neural Representations | NeurIPS | - | |
Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher | NeurIPS | - | |
On Infinite-Width Hypernetworks | NeurIPS | - | |
Predicting Training Time Without Training | NeurIPS | - | |
Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel | NeurIPS | - | |
Spectra of the Conjugate Kernel and Neural Tangent Kernel for Linear-Width Neural Networks | NeurIPS | - | |
Kernel and Rich Regimes in Overparametrized Models | COLT | - | |
Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK | COLT | - | |
Finite Depth and Width Corrections to the Neural Tangent Kernel | ICLR | - | |
Neural tangent kernels, transportation mappings, and universal approximation | ICLR | - | |
Neural Tangents: Fast and Easy Infinite Neural Networks in Python | ICLR | CODE | |
Picking Winning Tickets Before Training by Preserving Gradient Flow | ICLR | CODE | |
Truth or Backpropaganda? An Empirical Investigation of Deep Learning Theory | ICLR | - | |
Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee | ICLR | - | |
The asymptotic spectrum of the Hessian of DNN throughout training | ICLR | - | |
Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks | ICLR | CODE | |
Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks | ICLR | - | |
Asymptotics of Wide Networks from Feynman Diagrams | ICLR | - | |
The equivalence between Stein variational gradient descent and black-box variational inference | ICLR-W | - | |
Neural Kernels Without Tangents | ICML | CODE | |
The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization | ICML | - | |
Dynamics of Deep Neural Networks and Neural Tangent Hierarchy | ICML | - | |
Disentangling Trainability and Generalization in Deep Neural Networks | ICML | - | |
Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks | ICML | CODE | |
Finding trainable sparse networks through Neural Tangent Transfer | ICML | CODE | |
Associative Memory in Iterated Overparameterized Sigmoid Autoencoders | ICML | - | |
Neural Contextual Bandits with UCB-based Exploration | ICML | - | |
Optimization Theory for ReLU Neural Networks Trained with Normalization Layers | ICML | - | |
Towards a General Theory of Infinite-Width Limits of Neural Classifiers | ICML | - | |
Generalisation guarantees for continual learning with orthogonal gradient descent | ICML-W | CODE | |
Neural Spectrum Alignment: Empirical Study | ICANN | - | |
A type of generalization error induced by initialization in deep neural networks | MSML | - | |
Disentangling feature and lazy training in deep neural networks | J. Stat. Mech. | CODE | |
Scaling description of generalization with number of parameters in deep learning | J. Stat. Mech. | CODE | |
Any Target Function Exists in a Neighborhood of Any Sufficiently Wide Random Network: A Geometrical Perspective | NC | - | |
Kolmogorov Width Decay and Poor Approximation in Machine Learning: Shallow Neural Networks, Random Feature Models and Neural Tangent Kernels | RMS | - | |
On the infinite width limit of neural networks with a standard parameterization | arXiv | CODE | |
On the Empirical Neural Tangent Kernel of Standard Finite-Width Convolutional Neural Network Architectures | arXiv | - | |
Infinite-Width Neural Networks for Any Architecture: Reference Implementations | arXiv | CODE | |
Every Model Learned by Gradient Descent Is Approximately a Kernel Machine | arXiv | - | |
Analyzing Finite Neural Networks: Can We Trust Neural Tangent Kernel Theory? | arXiv | - | |
Scalable Neural Tangent Kernel of Recurrent Architectures | arXiv | CODE | |
Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning | arXiv | - |
## 2019
Title | Venue | CODE | |
---|---|---|---|
Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel | NeurIPS | - | |
Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent | NeurIPS | CODE | |
On Exact Computation with an Infinitely Wide Neural Net | NeurIPS | CODE | |
Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels | NeurIPS | CODE | |
On the Inductive Bias of Neural Tangent Kernels | NeurIPS | CODE | |
Convergence of Adversarial Training in Overparametrized Neural Networks | NeurIPS | - | |
Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks | NeurIPS | - | |
Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers | NeurIPS | - | |
Limitations of Lazy Training of Two-layers Neural Networks | NeurIPS | - | |
The Convergence Rate of Neural Networks for Learned Functions of Different Frequencies | NeurIPS | CODE | |
On Lazy Training in Differentiable Programming | NeurIPS | - | |
Information in Infinite Ensembles of Infinitely-Wide Neural Networks | AABI | - | |
Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation | arXiv | - | |
Gradient Descent can Learn Less Over-parameterized Two-layer Neural Networks on Classification Problems | arXiv | - | |
Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems | arXiv | - | |
Mean-field Behaviour of Neural Tangent Kernel for Deep Neural Networks | arXiv | - | |
Order and Chaos: NTK views on DNN Normalization, Checkerboard and Boundary Artifacts | arXiv | - | |
A Fine-Grained Spectral Perspective on Neural Networks | arXiv | CODE | |
Enhanced Convolutional Neural Tangent Kernels | arXiv | - |
## 2018
Title | Venue | CODE | |
---|---|---|---|
Neural Tangent Kernel: Convergence and Generalization in Neural Networks | NeurIPS | - |