Category: Research

Technical review of the newest machine intelligence research.

AI Computer Vision & Graphics Machine Learning & Data Science Research

NVIDIA’s Minimal Video Instance Segmentation Framework Achieves SOTA Performance Without Video-Based Training

In the new paper MinVIS: A Minimal Video Instance Segmentation Framework Without Video-based Training, an NVIDIA research team presents MinVIS, a minimal video instance segmentation framework that outperforms state-of-the-art VIS approaches without requiring video-based training.

AI Machine Learning & Data Science Research

Microsoft & Arizona U’s TextWorldExpress Simulates Text Games at 1M SPS, a Speedup of 3 Orders of Magnitude

In the new paper TextWorldExpress: Simulating Text Games at One Million Steps Per Second, a research team from the University of Arizona and Microsoft Research Montréal presents TextWorldExpress, a high-performance text-game simulator that boosts throughput by approximately three orders of magnitude, reaching one million steps per second.

AI Machine Learning & Data Science Research

OpenAI Presents a Simple and Efficient Training Strategy to Boost Language Models’ Text-Infilling Capabilities

In the new paper Efficient Training of Language Models to Fill in the Middle, an OpenAI research team shows that causal decoder-based autoregressive (AR) language models can learn to infill text via a simple transformation of the training data, without any architectural modifications.
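As a rough illustration of what such a data transformation can look like, the sketch below cuts a document into a prefix, middle, and suffix and moves the middle to the end, so that an ordinary left-to-right model learns to generate the missing span after seeing both sides. The sentinel strings, the character-level random split, and the function names are assumptions made for this example, not details taken from the summary above.

```python
import random

# Placeholder sentinel strings chosen for this sketch (assumed, not the paper's tokens).
PRE, SUF, MID = "<PRE>", "<SUF>", "<MID>"


def to_fim(document: str, rng: random.Random) -> str:
    """Cut a document at two random points and move the middle span to the end."""
    i, j = sorted(rng.randint(0, len(document)) for _ in range(2))
    prefix, middle, suffix = document[:i], document[i:j], document[j:]
    # The model still reads left to right; only the order of the pieces changes.
    return f"{PRE}{prefix}{SUF}{suffix}{MID}{middle}"


def from_fim(example: str) -> str:
    """Invert to_fim, assuming the sentinels never occur in the document itself."""
    rest = example[len(PRE):]
    prefix, rest = rest.split(SUF, 1)
    suffix, middle = rest.split(MID, 1)
    return prefix + middle + suffix


if __name__ == "__main__":
    doc = "def add(a, b):\n    return a + b\n"
    example = to_fim(doc, random.Random(0))
    print(repr(example))
    print(from_fim(example) == doc)
```

Because the change is confined to how training strings are assembled, the model architecture and the next-token objective stay exactly as they were.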

AI Computer Vision & Graphics Machine Learning & Data Science Research

IITM & UT Austin’s Generalizable NeRF Transformer Demonstrates Transformers’ Capabilities for Graphical Rendering

In the new paper Is Attention All NeRF Needs?, a research team from the Indian Institute of Technology Madras and the University of Texas at Austin proposes Generalizable NeRF Transformer (GNT), a pure and universal transformer-based architecture for efficient on-the-fly reconstruction of NeRFs. The work demonstrates that a pure attention mechanism suffices for learning a physically-grounded rendering process.

AI Machine Learning & Data Science Nature Language Tech Research

Fancy a Friendly Chat? Stanford NLP’s Chirpy Cardinal Enables Open-Domain and Humanlike Conversations

In the new paper Neural Generation Meets Real People: Building a Social, Informative Open-Domain Dialogue Agent, a Stanford NLP research team presents Chirpy Cardinal, an open-domain conversational social chatbot with emotional and social intelligence that enables authentic and engaging interactions with real people.

AI Machine Learning & Data Science Research

Google & DeepMind Study the Interactions Between Scaling Laws and Neural Network Architectures

In the new paper Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?, a research team from Google and DeepMind posits that understanding the connections between neural network architectures and scaling laws is essential for designing and evaluating new models. The team pretrains and finetunes over 100 models to reveal useful insights on the scaling behaviours of ten diverse model architectures.

AI Machine Learning & Data Science Research

DeepMind & UCL’s Stochastic MuZero Achieves SOTA Results in Complex Stochastic Environments

In the new paper Planning in Stochastic Environments with a Learned Model, a research team from DeepMind and University College London extends the deterministic MuZero model to Stochastic MuZero for stochastic model learning, achieving performance comparable or superior to state-of-the-art methods in complex single- and multi-agent environments.

AI Machine Learning & Data Science Research

SYSU and UBTECH Propose Big Learning for Justifying, Analyzing, and Improving Foundation Models

A research team from Sun Yat-sen University and UBTECH proposes a unified approach for justifying, analyzing, and improving foundation models in the new paper Big Learning: A Universal Machine Learning Paradigm? The team’s big learning framework can model many-to-all joint/conditional/marginal data distributions and delivers extraordinary data and task flexibility.

AI Computer Vision & Graphics Machine Learning & Data Science Popular Research

Academia Sinica’s YOLOv7 Outperforms All Object Detectors, Reduces Costs by 50%

In the new paper YOLOv7: Trainable Bag-Of-Freebies Sets New State-Of-The-Art for Real-Time Object Detectors, an Academia Sinica research team releases YOLOv7. This latest YOLO version introduces novel “extend” and “compound scaling” methods that effectively utilize parameters and computation, and it surpasses all known real-time object detectors in speed and accuracy.

AI Machine Learning & Data Science Research

Salesforce’s CodeRL Achieves SOTA Code Generation Results With Strong Zero-Shot Transfer Capabilities

In the new paper CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning, a Salesforce Research team presents CodeRL, a novel framework for program synthesis tasks that employs pretrained language models (LMs) and deep reinforcement learning (RL) and achieves state-of-the-art performance on the challenging APPS benchmark while also demonstrating impressive zero-shot transfer capabilities.

AI Machine Learning & Data Science Research

Learning Without Simulations? UC Berkeley’s DayDreamer Establishes a Strong Baseline for Real-World Robotic Training

In the new paper DayDreamer: World Models for Physical Robot Learning, researchers from the University of California, Berkeley leverage recent advances in the Dreamer world model to enable online reinforcement learning for robot training without simulators or demonstrations, establishing a strong baseline for efficient real-world robotic learning.

AI Computer Vision & Graphics Machine Learning & Data Science Research

NVIDIA’s Global Context ViT Achieves SOTA Performance on CV Tasks Without Expensive Computation

In the new paper Global Context Vision Transformers, an NVIDIA research team proposes the Global Context Vision Transformer, a novel yet simple hierarchical ViT architecture comprising global self-attention and token generation modules that enables the efficient modelling of both short- and long-range dependencies without costly compute operations while achieving SOTA results across various computer vision tasks.

AI Machine Learning & Data Science Nature Language Tech Research

CMU’s Novel ‘ReStructured Pre-training’ NLP Approach Scores 40 Points Above Student Average on a Standard English Exam

In the new paper ReStructured Pre-training, a Carnegie Mellon University research team proposes “reStructured Pre-training” (RST), a novel NLP paradigm that pretrains models over valuable restructured data. The team’s resulting QIN system scores 40 points higher than the student average on the Gaokao-English Exam and 15 points higher than GPT-3 with 1/16 of the parameters.

AI Machine Learning & Data Science Research

Allen AI & UW Propose Unified-IO: A High-Performance, Task-Agnostic Model for CV, NLP, and Multi-Modal Tasks

In the new paper Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks, a research team from the Allen Institute for AI and the University of Washington introduces UNIFIED-IO, a neural model that achieves strong performance across a wide variety of vision, language, and multi-modal tasks without task- or modality-specific branches or fine-tuning.

AI Machine Learning & Data Science Research

A WaveNet Rival? Stanford U Study Models Raw Audio Waveforms Over Contexts of 500k Samples

In the new paper GoodBye WaveNet — A Language Model for Raw Audio with Context of 1/2 Million Samples, Stanford University researcher Prateek Verma presents a generative auto-regressive architecture that models audio waveforms over contexts greater than 500,000 samples and outperforms state-of-the-art WaveNet baselines.

AI Machine Learning & Data Science Research

DeepMind Boosts RL Agents’ Retrieval Capability to Tens of Millions of Pieces of Information

In the new paper Large-Scale Retrieval for Reinforcement Learning, a DeepMind research team dramatically expands the information accessible to reinforcement learning (RL) agents, enabling them to attend to tens of millions of pieces of information, incorporate new information without retraining, and learn decision making in an end-to-end manner.

AI Machine Learning & Data Science Research

Google Leverages Transformers to Vastly Simplify Neural Video Compression With SOTA Results

In the new paper VCT: A Video Compression Transformer, a Google Research team presents an elegantly simple but powerful video compression transformer (VCT) that does not require architectural biases and priors and learns entirely from data without any hand-crafting. VCT is easy to implement and outperforms conventional video compression approaches.

AI Machine Learning & Data Science Research

Wav2Vec 2.0 Learns Brain-Like Representations From Just 600 Hours of Unlabeled Speech Data in New Study

In the new paper Toward a Realistic Model of Speech Processing in the Brain with Self-supervised Learning, researchers show that self-supervised architectures such as Wav2Vec 2.0 can learn brain-like representations from as little as 600 hours of unlabelled speech, and can also learn sound-generic, speech-specific, and language-specific representations similar to those of the prefrontal and temporal cortices.

AI Machine Learning & Data Science Research

444 Authors From 132 Institutions Release BIG-bench: A 204-Task ‘Extremely Difficult and Diverse’ Benchmark for Large Language Models

In the new paper Beyond the Imitation Game: Quantifying and Extrapolating the Capabilities of Language Models, 444 authors from 132 institutions introduce Beyond the Imitation Game (BIG-bench), a large-scale, extremely difficult and diverse benchmark that includes 204 tasks for predicting the potentially transformative effects of large language models.

AI Machine Learning & Data Science Research

Cambridge, Google & Secondmind’s Neural Diffusion Processes Challenge Gaussian Processes for Describing Rich Distributions Over Functions

In the new paper Neural Diffusion Processes, a research team from the University of Cambridge, Secondmind, and Google Research presents Neural Diffusion Processes (NDPs), a novel framework that learns to sample from rich distributions over functions at a lower computational cost than the true Bayesian posterior of a conventional Gaussian process.

AI Machine Learning & Data Science Research

Yoshua Bengio Team’s Large-Scale Analysis Reveals the Benefits of Modularity and Sparsity for DNNs

In the new paper Is a Modular Architecture Enough?, a research team from Mila and the Université de Montréal conducts a rigorous and thorough quantitative assessment of common modular architectures that reveals the benefits of modularity and sparsity for deep neural networks and the sub-optimality of existing end-to-end learned modular systems.

AI Machine Learning & Data Science Research

Microsoft’s XTC Extreme Lightweight Compression Method for Pretrained Transformers Achieves SOTA Results and 50x Smaller Model Sizes

In the new paper Extreme Compression for Pre-trained Transformers Made Simple and Efficient, a Microsoft research team introduces XTC, a simple yet effective extreme compression pipeline for pretrained transformers that can achieve state-of-the-art results while reducing model size by 50x.

AI Machine Learning & Data Science Research

Gem-Miner: Finding Lottery Tickets at Initialization and Bettering All Baselines at 19x Faster Speeds

In the new paper Rare Gems: Finding Lottery Tickets at Initialization, a research team from Carnegie Mellon University, MBZUAI, Petuum, Inc. and the University of Wisconsin-Madison proposes GEM-MINER, an algorithm that finds sparse subnetworks at initialization that can be trained to an accuracy comparable to or better than that of iterative magnitude pruning (IMP) with warm-up.

AI Machine Learning & Data Science Research

NVIDIA & UW Introduce Factory: A Set of Physics Simulation Methods and Learning Tools for Contact-Rich Robotic Assembly

In the new paper Factory: Fast Contact for Robotic Assembly, a research team from NVIDIA and the University of Washington introduces Factory, a set of physics simulation methods and robot learning tools for simulating contact-rich interactions in assembly with high accuracy, efficiency, and robustness.