# Neural Tangent Kernel Papers

This list contains papers that adopt the Neural Tangent Kernel (NTK) as a main theme or core idea.
NOTE: If there are any papers I've missed, please feel free to raise an issue.
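
For context, the NTK of a network `f(x; θ)` is the kernel `Θ(x, x') = ⟨∇_θ f(x; θ), ∇_θ f(x'; θ)⟩`, which becomes constant during training in the infinite-width limit (Jacot et al., NeurIPS 2018, the last entry below). As a quick reference, here is a minimal sketch of the *empirical* (finite-width) NTK in JAX; the toy two-layer MLP and every name in it are illustrative assumptions, not code from any paper in this list.

```python
# Minimal sketch: empirical NTK of a toy two-layer MLP (illustrative only).
import jax
import jax.numpy as jnp

def init_params(key, d_in=4, width=64):
    k1, k2 = jax.random.split(key)
    # NTK parameterization: O(1) weights; the forward pass rescales by 1/sqrt(fan_in).
    return {"W1": jax.random.normal(k1, (d_in, width)),
            "W2": jax.random.normal(k2, (width, 1))}

def f(params, x):
    h = jnp.tanh(x @ params["W1"] / jnp.sqrt(x.shape[-1]))
    return (h @ params["W2"] / jnp.sqrt(h.shape[-1]))[..., 0]

def empirical_ntk(params, x1, x2):
    # Theta(x1, x2) = J(x1) J(x2)^T, with J the Jacobian of f w.r.t. the parameters.
    j1, j2 = jax.jacobian(f)(params, x1), jax.jacobian(f)(params, x2)
    flat = lambda j: jnp.concatenate(
        [a.reshape(a.shape[0], -1) for a in jax.tree_util.tree_leaves(j)], axis=1)
    return flat(j1) @ flat(j2).T  # (n1, n2) kernel matrix

key = jax.random.PRNGKey(0)
params = init_params(key)
x = jax.random.normal(key, (8, 4))
print(empirical_ntk(params, x, x).shape)  # (8, 8)
```

Several of the papers below study how this finite-width kernel behaves relative to its infinite-width limit; for exact infinite-width kernels, see the Neural Tangents library (ICLR 2020, listed under 2020).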

## 2024

| Title | Venue | PDF | CODE |
|-------|-------|-----|------|
| Faithful and Efficient Explanations for Neural Networks via Neural Tangent Kernel Surrogate Models | ICLR | PDF | CODE |
| PINNACLE: PINN Adaptive ColLocation and Experimental points selection | ICLR | PDF | - |
| On the Foundations of Shortcut Learning | ICLR | PDF | - |
| Understanding Reconstruction Attacks with the Neural Tangent Kernel and Dataset Distillation | ICLR | PDF | - |
| Sample Relationship from Learning Dynamics Matters for Generalisation | ICLR | PDF | - |
| Robust NAS benchmark under adversarial training: assessment, theory, and beyond | ICLR | PDF | - |
| Theoretical Analysis of Robust Overfitting for Wide DNNs: An NTK Approach | ICLR | PDF | CODE |
| Heterogeneous Personalized Federated Learning by Local-Global Updates Mixing via Convergence Rate | ICLR | PDF | - |
| Neural Network-Based Score Estimation in Diffusion Models: Optimization and Generalization | ICLR | PDF | - |
| Grokking as the Transition from Lazy to Rich Training Dynamics | ICLR | PDF | - |
| Generalization of Deep ResNets in the Mean-Field Regime | ICLR | PDF | - |

## 2023

| Title | Venue | PDF | CODE |
|-------|-------|-----|------|
| Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models | NeurIPS | PDF | CODE |
| Deep Learning with Kernels through RKHM and the Perron–Frobenius Operator | NeurIPS | PDF | - |
| A Theoretical Analysis of the Test Error of Finite-Rank Kernel Ridge Regression | NeurIPS | PDF | - |
| Fixing the NTK: From Neural Network Linearizations to Exact Convex Programs | NeurIPS | PDF | - |
| Beyond NTK with Vanilla Gradient Descent: A Mean-Field Analysis of Neural Networks with Polynomial Width, Samples, and Time | NeurIPS | PDF | - |
| Feature-Learning Networks Are Consistent Across Widths At Realistic Scales | NeurIPS | PDF | - |
| Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks | NeurIPS | PDF | CODE |
| Spectral Evolution and Invariance in Linear-width Neural Networks | NeurIPS | PDF | - |
| Analyzing Generalization of Neural Networks through Loss Path Kernels | NeurIPS | PDF | - |
| Neural (Tangent Kernel) Collapse | NeurIPS | PDF | - |
| Mind the spikes: Benign overfitting of kernels and neural networks in fixed dimension | NeurIPS | PDF | CODE |
| A PAC-Bayesian Perspective on the Interpolating Information Criterion | NeurIPS-W | PDF | - |
| A Kernel Perspective of Skip Connections in Convolutional Networks | ICLR | PDF | - |
| Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel | ICLR | PDF | - |
| Symmetric Pruning in Quantum Neural Networks | ICLR | PDF | - |
| The Influence of Learning Rule on Representation Dynamics in Wide Neural Networks | ICLR | PDF | - |
| Few-shot Backdoor Attacks via Neural Tangent Kernels | ICLR | PDF | - |
| Analyzing Tree Architectures in Ensembles via Neural Tangent Kernel | ICLR | PDF | - |
| Supervision Complexity and its Role in Knowledge Distillation | ICLR | PDF | - |
| NTK-SAP: Improving Neural Network Pruning By Aligning Training Dynamics | ICLR | PDF | CODE |
| Tuning Frequency Bias in Neural Network Training with Nonuniform Data | ICLR | PDF | - |
| Simple initialization and parametrization of sinusoidal networks via their kernel bandwidth | ICLR | PDF | - |
| Characterizing the spectrum of the NTK via a power series expansion | ICLR | PDF | CODE |
| Adaptive Optimization in the $\infty$-Width Limit | ICLR | PDF | - |
| Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization | ICLR | PDF | - |
| The Onset of Variance-Limited Behavior for Networks in the Lazy and Rich Regimes | ICLR | PDF | - |
| Restricted Strong Convexity of Deep Learning Models with Smooth Activations | ICLR | PDF | - |
| Feature selection and low test error in shallow low-rotation ReLU networks | ICLR | PDF | - |
| Exploring Active 3D Object Detection from a Generalization Perspective | ICLR | PDF | CODE |
| On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks | AISTATS | PDF | - |
| Adversarial Noises Are Linearly Separable for (Nearly) Random Neural Networks | AISTATS | PDF | - |
| Regularize Implicit Neural Representation by Itself | CVPR | PDF | - |
| WIRE: Wavelet Implicit Neural Representations | CVPR | PDF | CODE |
| Regularizing Second-Order Influences for Continual Learning | CVPR | PDF | CODE |
| Multiplicative Fourier Level of Detail | CVPR | PDF | - |
| KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection | ICCV | PDF | CODE |
| TKIL: Tangent Kernel Approach for Class Balanced Incremental Learning | ICCV-W | PDF | - |
| A Fast, Well-Founded Approximation to the Empirical Neural Tangent Kernel | ICML | PDF | - |
| Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels | ICML | PDF | CODE |
| Graph Neural Tangent Kernel: Convergence on Large Graphs | ICML | PDF | - |
| Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels | ICML | PDF | CODE |
| Analyzing Convergence in Quantum Neural Networks: Deviations from Neural Tangent Kernels | ICML | PDF | - |
| Benign Overfitting in Deep Neural Networks under Lazy Training | ICML | PDF | - |
| Gradient Descent in Neural Networks as Sequential Learning in Reproducing Kernel Banach Space | ICML | PDF | - |
| A Kernel-Based View of Language Model Fine-Tuning | ICML | PDF | - |
| Combinatorial Neural Bandits | ICML | PDF | - |
| What Can Be Learnt With Wide Convolutional Neural Networks? | ICML | PDF | CODE |
| Reward-Biased Maximum Likelihood Estimation for Neural Contextual Bandits | AAAI | PDF | - |
| Neural tangent kernel at initialization: linear width suffices | UAI | PDF | - |
| Kernel Ridge Regression-Based Graph Dataset Distillation | SIGKDD | PDF | CODE |
| Analyzing Deep PAC-Bayesian Learning with Neural Tangent Kernel: Convergence, Analytic Generalization Bound, and Efficient Hyperparameter Selection | TMLR | PDF | - |
| The Eigenlearning Framework: A Conservation Law Perspective on Kernel Regression and Wide Neural Networks | TMLR | PDF | CODE |
| Empirical Limitations of the NTK for Understanding Scaling Laws in Deep Learning | TMLR | PDF | - |
| Analysis of Convolutions, Non-linearity and Depth in Graph Neural Networks using Neural Tangent Kernel | TMLR | PDF | - |
| A Framework and Benchmark for Deep Batch Active Learning for Regression | JMLR | PDF | CODE |
| A Continual Learning Algorithm Based on Orthogonal Gradient Descent Beyond Neural Tangent Kernel Regime | IEEE | PDF | - |
| The Quantum Path Kernel: A Generalized Neural Tangent Kernel for Deep Quantum Machine Learning | QE | PDF | - |
| NeuralBO: A Black-box Optimization Algorithm using Deep Neural Networks | NC | PDF | - |
| Deep Learning in Random Neural Fields: Numerical Experiments via Neural Tangent Kernel | NN | PDF | CODE |
| Physics-informed radial basis network (PIRBN): A local approximating neural network for solving nonlinear partial differential equations | CMAME | PDF | - |
| A non-gradient method for solving elliptic partial differential equations with deep neural networks | JoCP | PDF | - |
| Self-Adaptive Physics-Informed Neural Networks using a Soft Attention Mechanism | JoCP | PDF | - |
| Towards a phenomenological understanding of neural networks: data | MLST | PDF | - |
| Weighted Neural Tangent Kernel: A Generalized and Improved Network-Induced Kernel | ML | PDF | CODE |
| Tensor Programs IVb: Adaptive Optimization in the ∞-Width Limit | arXiv | PDF | - |

## 2022

| Title | Venue | PDF | CODE |
|-------|-------|-----|------|
| Generalization Properties of NAS under Activation and Skip Connection Search | NeurIPS | PDF | - |
| Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study | NeurIPS | PDF | CODE |
| Graph Neural Network Bandits | NeurIPS | PDF | - |
| Lossless Compression of Deep Neural Networks: A High-dimensional Neural Tangent Kernel Approach | NeurIPS | PDF | - |
| GraphQNTK: Quantum Neural Tangent Kernel for Graph Data | NeurIPS | PDF | CODE |
| Evolution of Neural Tangent Kernels under Benign and Adversarial Training | NeurIPS | PDF | CODE |
| TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent Kernels | NeurIPS | PDF | CODE |
| Making Look-Ahead Active Learning Strategies Feasible with Neural Tangent Kernels | NeurIPS | PDF | CODE |
| Disentangling the Predictive Variance of Deep Ensembles through the Neural Tangent Kernel | NeurIPS | PDF | CODE |
| On the Generalization Power of the Overfitted Three-Layer Neural Tangent Kernel Model | NeurIPS | PDF | - |
| What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness? | NeurIPS | PDF | - |
| On the Spectral Bias of Convolutional Neural Tangent and Gaussian Process Kernels | NeurIPS | PDF | - |
| Fast Neural Kernel Embeddings for General Activations | NeurIPS | PDF | CODE |
| Bidirectional Learning for Offline Infinite-width Model-based Optimization | NeurIPS | PDF | - |
| Infinite Recommendation Networks: A Data-Centric Approach | NeurIPS | PDF | CODE1 <br> CODE2 |
| Distribution-Informed Neural Networks for Domain Adaptation Regression | NeurIPS | PDF | - |
| Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks | NeurIPS | PDF | - |
| Spectral Bias Outside the Training Set for Deep Networks in the Kernel Regime | NeurIPS | PDF | CODE |
| Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization) | NeurIPS | PDF | - |
| Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture | NeurIPS | PDF | - |
| A Neural Pre-Conditioning Active Learning Algorithm to Reduce Label Complexity | NeurIPS | PDF | - |
| NFT-K: Non-Fungible Tangent Kernels | ICASSP | PDF | CODE |
| Label Propagation Across Graphs: Node Classification Using Graph Neural Tangent Kernels | ICASSP | PDF | - |
| A Neural Tangent Kernel Perspective of Infinite Tree Ensembles | ICLR | PDF | - |
| Neural Networks as Kernel Learners: The Silent Alignment Effect | ICLR | PDF | - |
| Towards Deepening Graph Neural Networks: A GNTK-based Optimization Perspective | ICLR | PDF | - |
| Overcoming The Spectral Bias of Neural Value Approximation | ICLR | PDF | CODE |
| Efficient Computation of Deep Nonlinear Infinite-Width Neural Networks that Learn Features | ICLR | PDF | CODE |
| Learning Neural Contextual Bandits Through Perturbed Rewards | ICLR | PDF | - |
| Learning Curves for Continual Learning in Neural Networks: Self-knowledge Transfer and Forgetting | ICLR | PDF | - |
| The Spectral Bias of Polynomial Neural Networks | ICLR | PDF | - |
| On Feature Learning in Neural Networks with Global Convergence Guarantees | ICLR | PDF | - |
| Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks | ICLR | PDF | - |
| Eigenspace Restructuring: A Principle of Space and Frequency in Neural Networks | COLT | PDF | - |
| Neural Networks can Learn Representations with Gradient Descent | COLT | PDF | - |
| Neural Contextual Bandits without Regret | AISTATS | PDF | - |
| Finding Dynamics Preserving Adversarial Winning Tickets | AISTATS | PDF | - |
| Embedded Ensembles: Infinite Width Limit and Operating Regimes | AISTATS | PDF | - |
| Global Convergence of MAML and Theory-Inspired Neural Architecture Search for Few-Shot Learning | CVPR | PDF | CODE |
| Demystifying the Neural Tangent Kernel from a Practical Perspective: Can it be trusted for Neural Architecture Search without training? | CVPR | PDF | CODE |
| A Structured Dictionary Perspective on Implicit Neural Representations | CVPR | PDF | CODE |
| NL-FFC: Non-Local Fast Fourier Convolution for Image Super Resolution | CVPR-W | PDF | CODE |
| Intrinsic Neural Fields: Learning Functions on Manifolds | ECCV | PDF | - |
| Random Gegenbauer Features for Scalable Kernel Methods | ICML | PDF | - |
| Fast Finite Width Neural Tangent Kernel | ICML | PDF | CODE |
| A Neural Tangent Kernel Perspective of GANs | ICML | PDF | CODE |
| Neural Tangent Kernel Empowered Federated Learning | ICML | PDF | - |
| Reverse Engineering the Neural Tangent Kernel | ICML | PDF | CODE |
| How to Train Your Wide Neural Network Without Backprop: An Input-Weight Alignment Perspective | ICML | PDF | CODE |
| Bounding the Width of Neural Networks via Coupled Initialization – A Worst Case Analysis – | ICML | PDF | - |
| Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time | ICML | PDF | - |
| Lazy Estimation of Variable Importance for Large Neural Networks | ICML | PDF | - |
| DAVINZ: Data Valuation using Deep Neural Networks at Initialization | ICML | PDF | - |
| Neural Tangent Kernel Beyond the Infinite-Width Limit: Effects of Depth and Initialization | ICML | PDF | CODE |
| NeuralEF: Deconstructing Kernels by Deep Neural Networks | ICML | PDF | CODE |
| Feature Learning and Signal Propagation in Deep Neural Networks | ICML | PDF | - |
| More Than a Toy: Random Matrix Models Predict How Real-World Neural Representations Generalize | ICML | PDF | CODE |
| Fast Graph Neural Tangent Kernel via Kronecker Sketching | AAAI | PDF | - |
| Rethinking Influence Functions of Neural Networks in the Over-parameterized Regime | AAAI | PDF | - |
| On the Empirical Neural Tangent Kernel of Standard Finite-Width Convolutional Neural Network Architectures | UAI | PDF | - |
| Feature Learning and Random Features in Standard Finite-Width Convolutional Neural Networks: An Empirical Study | UAI | PDF | - |
| Out of Distribution Detection via Neural Network Anchoring | ACML | PDF | CODE |
| Learning Neural Ranking Models Online from Implicit User Feedback | WWW | PDF | - |
| Trust Your Robots! Predictive Uncertainty Estimation of Neural Networks with Sparse Gaussian Processes | CoRL | PDF | - |
| When and why PINNs fail to train: A neural tangent kernel perspective | JoCP | PDF | CODE |
| How Neural Architectures Affect Deep Learning for Communication Networks? | ICC | PDF | - |
| Loss landscapes and optimization in over-parameterized non-linear systems and neural networks | ACHA | PDF | - |
| Feature Purification: How Adversarial Training Performs Robust Deep Learning | FOCS | PDF | - |
| Kernel-Based Smoothness Analysis of Residual Networks | MSML | PDF | - |
| Analyzing Finite Neural Networks: Can We Trust Neural Tangent Kernel Theory? | MSML | PDF | - |
| The Training Response Law Explains How Deep Neural Networks Learn | IoP | PDF | - |
| Simple, Fast, and Flexible Framework for Matrix Completion with Infinite Width Neural Networks | PNAS | PDF | CODE |
| Representation Learning via Quantum Neural Tangent Kernels | PRX Quantum | PDF | - |
| TorchNTK: A Library for Calculation of Neural Tangent Kernels of PyTorch Models | arXiv | PDF | CODE |
| Neural Tangent Kernel Analysis of Shallow α-Stable ReLU Neural Networks | arXiv | PDF | - |
| Neural Tangent Kernel: A Survey | arXiv | PDF | - |

## 2021

| Title | Venue | PDF | CODE |
|-------|-------|-----|------|
| Neural Tangent Kernel Maximum Mean Discrepancy | NeurIPS | PDF | - |
| DNN-based Topology Optimisation: Spatial Invariance and Neural Tangent Kernel | NeurIPS | PDF | - |
| Stability & Generalisation of Gradient Descent for Shallow Neural Networks without the Neural Tangent Kernel | NeurIPS | PDF | - |
| Scaling Neural Tangent Kernels via Sketching and Random Features | NeurIPS | PDF | - |
| Dataset Distillation with Infinitely Wide Convolutional Networks | NeurIPS | PDF | - |
| On the Equivalence between Neural Network and Support Vector Machine | NeurIPS | PDF | CODE |
| Local Signal Adaptivity: Provable Feature Learning in Neural Networks Beyond Kernels | NeurIPS | PDF | CODE |
| Explicit Loss Asymptotics in the Gradient Descent Training of Neural Networks | NeurIPS | PDF | - |
| Kernelized Heterogeneous Risk Minimization | NeurIPS | PDF | CODE |
| An Empirical Study of Neural Kernel Bandits | NeurIPS-W | PDF | - |
| The Curse of Depth in Kernel Regime | NeurIPS-W | PDF | - |
| Wearing a MASK: Compressed Representations of Variable-Length Sequences Using Recurrent Neural Tangent Kernels | ICASSP | PDF | CODE |
| The Dynamics of Gradient Descent for Overparametrized Neural Networks | L4DC | PDF | - |
| The Recurrent Neural Tangent Kernel | ICLR | PDF | - |
| Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS | ICLR | PDF | - |
| Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime | ICLR | PDF | - |
| Meta-Learning with Neural Tangent Kernels | ICLR | PDF | - |
| How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks | ICLR | PDF | - |
| Deep Networks and the Multiple Manifold Problem | ICLR | PDF | - |
| Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective | ICLR | PDF | CODE |
| Neural Thompson Sampling | ICLR | PDF | - |
| Deep Equals Shallow for ReLU Networks in Kernel Regimes | ICLR | PDF | - |
| A Recipe for Global Convergence Guarantee in Deep Neural Networks | AAAI | PDF | - |
| A Deep Conditioning Treatment of Neural Networks | ALT | PDF | - |
| Nonparametric Regression with Shallow Overparameterized Neural Networks Trained by GD with Early Stopping | COLT | PDF | - |
| Learning with invariances in random features and kernel models | COLT | PDF | - |
| Implicit Regularization via Neural Feature Alignment | AISTATS | PDF | CODE |
| Regularization Matters: A Nonparametric Perspective on Overparametrized Neural Network | AISTATS | PDF | - |
| One-pass Stochastic Gradient Descent in Overparametrized Two-layer Neural Networks | AISTATS | PDF | - |
| Fast Adaptation with Linearized Neural Networks | AISTATS | PDF | CODE |
| Fast Learning in Reproducing Kernel Kreĭn Spaces via Signed Measures | AISTATS | PDF | - |
| Stable ResNet | AISTATS | PDF | - |
| A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks | AISTATS | PDF | - |
| Can We Characterize Tasks Without Labels or Features? | CVPR | PDF | CODE |
| The Neural Tangent Link Between CNN Denoisers and Non-Local Filters | CVPR | PDF | CODE |
| Nerfies: Deformable Neural Radiance Fields | ICCV | PDF | CODE |
| Kernel Methods in Hyperbolic Spaces | ICCV | PDF | - |
| Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks | ICML | PDF | - |
| On the Generalization Power of Overfitted Two-Layer Neural Tangent Kernel Models | ICML | PDF | - |
| Tensor Programs IIb: Architectural Universality of Neural Tangent Kernel Training Dynamics | ICML | PDF | - |
| Tensor Programs IV: Feature Learning in Infinite-Width Neural Networks | ICML | PDF | CODE |
| FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Convergence Analysis | ICML | PDF | - |
| On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent | ICML | PDF | - |
| Feature Learning in Infinite-Width Neural Networks | ICML | PDF | CODE |
| On Monotonic Linear Interpolation of Neural Network Parameters | ICML | PDF | - |
| Uniform Convergence, Adversarial Spheres and a Simple Remedy | ICML | PDF | - |
| Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels | ICML | PDF | - |
| Efficient Statistical Tests: A Neural Tangent Kernel Approach | ICML | PDF | - |
| Neural Tangent Generalization Attacks | ICML | PDF | CODE |
| On the Random Conjugate Kernel and Neural Tangent Kernel | ICML | PDF | - |
| Generalization Guarantees for Neural Architecture Search with Train-Validation Split | ICML | PDF | - |
| Tilting the playing field: Dynamical loss functions for machine learning | ICML | PDF | CODE |
| PHEW : Constructing Sparse Networks that Learn Fast and Generalize Well Without Training Data | ICML | PDF | - |
| On the Neural Tangent Kernel of Deep Networks with Orthogonal Initialization | IJCAI | PDF | CODE |
| Towards Understanding the Spectral Bias of Deep Learning | IJCAI | PDF | - |
| On Random Kernels of Residual Architectures | UAI | PDF | - |
| How Shrinking Gradient Noise Helps the Performance of Neural Networks | ICBD | PDF | - |
| Unsupervised Shape Completion via Deep Prior in the Neural Tangent Kernel Perspective | ACM TOG | PDF | - |
| Benefits of Jointly Training Autoencoders: An Improved Neural Tangent Kernel Analysis | TIT | PDF | - |
| Reinforcement Learning via Gaussian Processes with Neural Network Dual Kernels | CoG | PDF | - |
| Mathematical Models of Overparameterized Neural Networks | IEEE | PDF | - |
| A Feature Fusion Based Indicator for Training-Free Neural Architecture Search | IEEE | PDF | - |
| Pathological spectra of the Fisher information metric and its variants in deep neural networks | NC | PDF | - |
| Linearized two-layers neural networks in high dimension | Ann. Statist. | PDF | - |
| Geometric compression of invariant manifolds in neural nets | J. Stat. Mech. | PDF | CODE |
| A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks | arXiv | PDF | - |
| Learning with Neural Tangent Kernels in Near Input Sparsity Time | arXiv | PDF | - |
| Spectral Analysis of the Neural Tangent Kernel for Deep Residual Networks | arXiv | PDF | - |
| Properties of the After Kernel | arXiv | PDF | CODE |

## 2020

| Title | Venue | PDF | CODE |
|-------|-------|-----|------|
| Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations | ECCV | PDF | - |
| Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? — A Neural Tangent Kernel Perspective | NeurIPS | PDF | - |
| Label-Aware Neural Tangent Kernel: Toward Better Generalization and Local Elasticity | NeurIPS | PDF | CODE |
| Finite Versus Infinite Neural Networks: an Empirical Study | NeurIPS | PDF | - |
| On the linearity of large non-linear models: when and why the tangent kernel is constant | NeurIPS | PDF | - |
| On the Similarity between the Laplace and Neural Tangent Kernels | NeurIPS | PDF | - |
| A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks | NeurIPS | PDF | - |
| Generalization bound of globally optimal non-convex neural network training: Transportation map estimation by infinite dimensional Langevin dynamics | NeurIPS | PDF | - |
| Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains | NeurIPS | PDF | CODE |
| Network size and weights size for memorization with two-layers neural networks | NeurIPS | PDF | - |
| Neural Networks Learning and Memorization with (almost) no Over-Parameterization | NeurIPS | PDF | - |
| Towards Understanding Hierarchical Learning: Benefits of Neural Representations | NeurIPS | PDF | - |
| Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher | NeurIPS | PDF | - |
| On Infinite-Width Hypernetworks | NeurIPS | PDF | - |
| Predicting Training Time Without Training | NeurIPS | PDF | - |
| Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel | NeurIPS | PDF | - |
| Spectra of the Conjugate Kernel and Neural Tangent Kernel for Linear-Width Neural Networks | NeurIPS | PDF | - |
| Kernel and Rich Regimes in Overparametrized Models | COLT | PDF | - |
| Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK | COLT | PDF | - |
| Finite Depth and Width Corrections to the Neural Tangent Kernel | ICLR | PDF | - |
| Neural tangent kernels, transportation mappings, and universal approximation | ICLR | PDF | - |
| Neural Tangents: Fast and Easy Infinite Neural Networks in Python | ICLR | PDF | CODE |
| Picking Winning Tickets Before Training by Preserving Gradient Flow | ICLR | PDF | CODE |
| Truth or Backpropaganda? An Empirical Investigation of Deep Learning Theory | ICLR | PDF | - |
| Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee | ICLR | PDF | - |
| The asymptotic spectrum of the Hessian of DNN throughout training | ICLR | PDF | - |
| Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks | ICLR | PDF | CODE |
| Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks | ICLR | PDF | - |
| Asymptotics of Wide Networks from Feynman Diagrams | ICLR | PDF | - |
| The equivalence between Stein variational gradient descent and black-box variational inference | ICLR-W | PDF | - |
| Neural Kernels Without Tangents | ICML | PDF | CODE |
| The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization | ICML | PDF | - |
| Dynamics of Deep Neural Networks and Neural Tangent Hierarchy | ICML | PDF | - |
| Disentangling Trainability and Generalization in Deep Neural Networks | ICML | PDF | - |
| Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks | ICML | PDF | CODE |
| Finding trainable sparse networks through Neural Tangent Transfer | ICML | PDF | CODE |
| Associative Memory in Iterated Overparameterized Sigmoid Autoencoders | ICML | PDF | - |
| Neural Contextual Bandits with UCB-based Exploration | ICML | PDF | - |
| Optimization Theory for ReLU Neural Networks Trained with Normalization Layers | ICML | PDF | - |
| Towards a General Theory of Infinite-Width Limits of Neural Classifiers | ICML | PDF | - |
| Generalisation guarantees for continual learning with orthogonal gradient descent | ICML-W | PDF | CODE |
| Neural Spectrum Alignment: Empirical Study | ICANN | PDF | - |
| A type of generalization error induced by initialization in deep neural networks | MSML | PDF | - |
| Disentangling feature and lazy training in deep neural networks | J. Stat. Mech. | PDF | CODE |
| Scaling description of generalization with number of parameters in deep learning | J. Stat. Mech. | PDF | CODE |
| Any Target Function Exists in a Neighborhood of Any Sufficiently Wide Random Network: A Geometrical Perspective | NC | PDF | - |
| Kolmogorov Width Decay and Poor Approximation in Machine Learning: Shallow Neural Networks, Random Feature Models and Neural Tangent Kernels | RMS | PDF | - |
| On the infinite width limit of neural networks with a standard parameterization | arXiv | PDF | CODE |
| On the Empirical Neural Tangent Kernel of Standard Finite-Width Convolutional Neural Network Architectures | arXiv | PDF | - |
| Infinite-Width Neural Networks for Any Architecture: Reference Implementations | arXiv | PDF | CODE |
| Every Model Learned by Gradient Descent Is Approximately a Kernel Machine | arXiv | PDF | - |
| Analyzing Finite Neural Networks: Can We Trust Neural Tangent Kernel Theory? | arXiv | PDF | - |
| Scalable Neural Tangent Kernel of Recurrent Architectures | arXiv | PDF | CODE |
| Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning | arXiv | PDF | - |

## 2019

| Title | Venue | PDF | CODE |
|-------|-------|-----|------|
| Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel | NeurIPS | PDF | - |
| Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent | NeurIPS | PDF | CODE |
| On Exact Computation with an Infinitely Wide Neural Net | NeurIPS | PDF | CODE |
| Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels | NeurIPS | PDF | CODE |
| On the Inductive Bias of Neural Tangent Kernels | NeurIPS | PDF | CODE |
| Convergence of Adversarial Training in Overparametrized Neural Networks | NeurIPS | PDF | - |
| Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks | NeurIPS | PDF | - |
| Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers | NeurIPS | PDF | - |
| Limitations of Lazy Training of Two-layers Neural Networks | NeurIPS | PDF | - |
| The Convergence Rate of Neural Networks for Learned Functions of Different Frequencies | NeurIPS | PDF | CODE |
| On Lazy Training in Differentiable Programming | NeurIPS | PDF | - |
| Information in Infinite Ensembles of Infinitely-Wide Neural Networks | AABI | PDF | - |
| Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation | arXiv | PDF | - |
| Gradient Descent can Learn Less Over-parameterized Two-layer Neural Networks on Classification Problems | arXiv | PDF | - |
| Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems | arXiv | PDF | - |
| Mean-field Behaviour of Neural Tangent Kernel for Deep Neural Networks | arXiv | PDF | - |
| Order and Chaos: NTK views on DNN Normalization, Checkerboard and Boundary Artifacts | arXiv | PDF | - |
| A Fine-Grained Spectral Perspective on Neural Networks | arXiv | PDF | CODE |
| Enhanced Convolutional Neural Tangent Kernels | arXiv | PDF | - |

## 2018

| Title | Venue | PDF | CODE |
|-------|-------|-----|------|
| Neural Tangent Kernel: Convergence and Generalization in Neural Networks | NeurIPS | PDF | - |