papers-I-read

I am trying a new initiative: a paper a week. This repository holds those papers along with related summaries and notes.

List of papers

Toolformer - Language Models Can Teach Themselves to Use Tools
Hints for Computer System Design
Synthesized Policies for Transfer and Adaptation across Tasks and Environments
Deep Neural Networks for YouTube Recommendations
The Tail at Scale
Practical Lessons from Predicting Clicks on Ads at Facebook
Ad Click Prediction - a View from the Trenches
Anatomy of Catastrophic Forgetting - Hidden Representations and Task Semantics
When Do Curricula Work?
Continual learning with hypernetworks
Zero-shot Learning by Generating Task-specific Adapters
HyperNetworks
Energy-based Models for Continual Learning
GPipe - Easy Scaling with Micro-Batch Pipeline Parallelism
Compositional Explanations of Neurons
Design patterns for container-based distributed systems
Cassandra - a decentralized structured storage system
CAP twelve years later - How the rules have changed
Consistency Tradeoffs in Modern Distributed Database System Design
Exploring Simple Siamese Representation Learning
Data Management for Internet-Scale Single-Sign-On
Searching for Build Debt - Experiences Managing Technical Debt at Google
One Solution is Not All You Need - Few-Shot Extrapolation via Structured MaxEnt RL
Learning Explanations That Are Hard To Vary
Remembering for the Right Reasons - Explanations Reduce Catastrophic Forgetting
A Foliated View of Transfer Learning
Harvest, Yield, and Scalable Tolerant Systems
MONet - Unsupervised Scene Decomposition and Representation
Revisiting Fundamentals of Experience Replay
Deep Reinforcement Learning and the Deadly Triad
Alpha Net: Adaptation with Composition in Classifier Space
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
Gradient Surgery for Multi-Task Learning
GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks
TaskNorm: Rethinking Batch Normalization for Meta-Learning
Averaging Weights Leads to Wider Optima and Better Generalization
Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions
When to use parametric models in reinforcement learning?
Network Randomization - A Simple Technique for Generalization in Deep Reinforcement Learning
On the Difficulty of Warm-Starting Neural Network Training
Supervised Contrastive Learning
CURL - Contrastive Unsupervised Representations for Reinforcement Learning
Competitive Training of Mixtures of Independent Deep Generative Models
What Does Classifying More Than 10,000 Image Categories Tell Us?
mixup - Beyond Empirical Risk Minimization
ELECTRA - Pre-training Text Encoders as Discriminators Rather Than Generators
Gradient based sample selection for online continual learning
Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One
Massively Multilingual Neural Machine Translation in the Wild - Findings and Challenges
Observational Overfitting in Reinforcement Learning
Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML
Accurate, Large Minibatch SGD - Training ImageNet in 1 Hour
Superposition of many models into one
Towards a Unified Theory of State Abstraction for MDPs
ALBERT - A Lite BERT for Self-supervised Learning of Language Representations
Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
Contrastive Learning of Structured World Models
Gossip based Actor-Learner Architectures for Deep RL
How to train your MAML
PHYRE - A New Benchmark for Physical Reasoning
Large Memory Layers with Product Keys
Abductive Commonsense Reasoning
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
Assessing Generalization in Deep Reinforcement Learning
Quantifying Generalization in Reinforcement Learning
Set Transformer: A Framework for Attention-based Permutation-Invariant Neural Networks
Measuring abstract reasoning in neural networks
Hamiltonian Neural Networks
Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations
Meta-Reinforcement Learning of Structured Exploration Strategies
Relational Reinforcement Learning
Good-Enough Compositional Data Augmentation
Multiple Model-Based Reinforcement Learning
Towards a natural benchmark for continual learning
Meta-Learning Update Rules for Unsupervised Representation Learning
GNN Explainer - A Tool for Post-hoc Explanation of Graph Neural Networks
To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks
Model Primitive Hierarchical Lifelong Reinforcement Learning
TuckER - Tensor Factorization for Knowledge Graph Completion
Linguistic Knowledge as Memory for Recurrent Neural Networks
Diversity is All You Need - Learning Skills without a Reward Function
Modular meta-learning
Hierarchical RL Using an Ensemble of Proprioceptive Periodic Policies
Efficient Lifelong Learning with A-GEM
Pre-training Graph Neural Networks with Kernels
Smooth Loss Functions for Deep Top-k Classification
Hindsight Experience Replay
Representation Tradeoffs for Hyperbolic Embeddings
Learned Optimizers that Scale and Generalize
One-shot Learning with Memory-Augmented Neural Networks
BabyAI - First Steps Towards Grounded Language Learning With a Human In the Loop
Poincaré Embeddings for Learning Hierarchical Representations
When Recurrent Models Don’t Need To Be Recurrent
HoME - a Household Multimodal Environment
Emergence of Grounded Compositional Language in Multi-Agent Populations
A Semantic Loss Function for Deep Learning with Symbolic Knowledge
Hierarchical Graph Representation Learning with Differentiable Pooling
Imagination-Augmented Agents for Deep Reinforcement Learning
Kronecker Recurrent Units
Learning Independent Causal Mechanisms
Memory-based Parameter Adaptation
Born Again Neural Networks
Net2Net - Accelerating Learning via Knowledge Transfer
Learning to Count Objects in Natural Images for Visual Question Answering
Neural Message Passing for Quantum Chemistry
Unsupervised Learning by Predicting Noise
The Lottery Ticket Hypothesis - Training Pruned Neural Networks
Cyclical Learning Rates for Training Neural Networks
Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning
An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks
Learning a SAT Solver from Single-Bit Supervision
Neural Relational Inference for Interacting Systems
Stylistic Transfer in Natural Language Generation Systems Using Recurrent Neural Networks
Get To The Point: Summarization with Pointer-Generator Networks
StarSpace - Embed All The Things!
Emotional Chatting Machine - Emotional Conversation Generation with Internal and External Memory
Exploring Models and Data for Image Question Answering
How transferable are features in deep neural networks?
Distilling the Knowledge in a Neural Network
Revisiting Semi-Supervised Learning with Graph Embeddings
Two-Stage Synthesis Networks for Transfer Learning in Machine Comprehension
Higher-order organization of complex networks
Network Motifs - Simple Building Blocks of Complex Networks
Word Representations via Gaussian Embedding
HARP - Hierarchical Representation Learning for Networks
Swish - a Self-Gated Activation Function
Reading Wikipedia to Answer Open-Domain Questions
Task-Oriented Query Reformulation with Reinforcement Learning
Refining Source Representations with Relation Networks for Neural Machine Translation
Pointer Networks
Learning to Compute Word Embeddings On the Fly
R-NET - Machine Reading Comprehension with Self-matching Networks
ReasoNet - Learning to Stop Reading in Machine Comprehension
Principled Detection of Out-of-Distribution Examples in Neural Networks
Ask Me Anything: Dynamic Memory Networks for Natural Language Processing
One Model To Learn Them All
Two/Too Simple Adaptations of Word2Vec for Syntax Problems
A Decomposable Attention Model for Natural Language Inference
A Fast and Accurate Dependency Parser using Neural Networks
Neural Module Networks
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Conditional Similarity Networks
Simple Baseline for Visual Question Answering
VQA: Visual Question Answering
Learning to Generate Reviews and Discovering Sentiment
Seeing the Arrow of Time
End-to-end optimization of goal-driven and visually grounded dialogue systems
GuessWhat?! Visual object discovery through multi-modal dialogue
Semantic Parsing via Paraphrasing
Traversing Knowledge Graphs in Vector Space
PPDB: The Paraphrase Database
NewsQA: A Machine Comprehension Dataset
A Persona-Based Neural Conversation Model
“Why Should I Trust You?” Explaining the Predictions of Any Classifier
Conditional Generative Adversarial Nets
Addressing the Rare Word Problem in Neural Machine Translation
Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models
Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
Improving Word Representations via Global Context and Multiple Word Prototypes
Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation
Skip-Thought Vectors
Deep Convolutional Generative Adversarial Nets
Generative Adversarial Nets
A Roadmap towards Machine Intelligence
Smart Reply: Automated Response Suggestion for Email
Convolutional Neural Networks for Sentence Classification
Conditional Image Generation with PixelCNN Decoders
Pixel Recurrent Neural Networks
Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps
Bag of Tricks for Efficient Text Classification
GloVe: Global Vectors for Word Representation
SimRank: A Measure of Structural-Context Similarity
How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation
Neural Generation of Regular Expressions from Natural Language with Minimal Domain Knowledge
WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia
WikiQA: A challenge dataset for open-domain question answering
Teaching Machines to Read and Comprehend
Evaluating Prerequisite Qualities for Learning End-to-end Dialog Systems
Recurrent Neural Network Regularization
Deep Math: Deep Sequence Models for Premise Selection
A Neural Conversational Model
Key-Value Memory Networks for Directly Reading Documents
Advances In Optimizing Recurrent Networks
Query Regression Networks for Machine Comprehension
Sequence to Sequence Learning with Neural Networks
The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training
Question Answering with Subgraph Embeddings
Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks
Visualizing Large-scale and High-dimensional Data
Visualizing Data using t-SNE
Curriculum Learning
End-To-End Memory Networks
Memory Networks
Learning To Execute
Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud
Large Scale Distributed Deep Networks
Efficient Estimation of Word Representations in Vector Space
Regularization and variable selection via the elastic net
Fractional Max-Pooling
TAO: Facebook’s Distributed Data Store for the Social Graph
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
The Unified Logging Infrastructure for Data Analytics at Twitter
A Few Useful Things to Know about Machine Learning
Hive – A Petabyte Scale Data Warehouse Using Hadoop
Kafka: a Distributed Messaging System for Log Processing
Power-law distributions in empirical data
Pregel: A System for Large-Scale Graph Processing
GraphX: Unifying Data-Parallel and Graph-Parallel Analytics
Pig Latin: A Not-So-Foreign Language for Data Processing
Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing
MapReduce: Simplified Data Processing on Large Clusters
BigTable: A Distributed Storage System for Structured Data
Spark SQL: Relational Data Processing in Spark
Spark: Cluster Computing with Working Sets
Fast Data in the Era of Big Data: Twitter’s Real-Time Related Query Suggestion Architecture
Scaling Memcache at Facebook
Dynamo: Amazon’s Highly Available Key-value Store
f4: Facebook's Warm BLOB Storage System
A Theoretician’s Guide to the Experimental Analysis of Algorithms
Cuckoo Hashing
Never Ending Learning