Author: Synced

AI Machine Learning & Data Science Research

Wider, Not Deeper: Cambridge, Oxford & ICL Challenge Conventional Transformer Design Approaches

In the new paper Wide Attention Is The Way Forward For Transformers, a research team from the University of Cambridge, Imperial College London, and the University of Oxford challenges the commonly held belief that deeper is better for transformer architectures, demonstrating that wider layers result in superior performance on natural language processing tasks.

AI Machine Learning & Data Science Research

Embedding Training With 1% GPU Memory and 100 Times Less Budget, an Open Source Solution for Super-Large Recommendation Model Training on a Single GPU

Colossal-AI has successfully used a heterogeneous training strategy to increase the trainable parameter capacity of NLP models by hundreds of times on the same hardware. Experimental results show that it needs to keep only 1~5 percent of the embedding parameters on the GPU while still maintaining excellent end-to-end training speed.
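
The core idea behind this strategy is a software cache: the full embedding table stays in host memory, and only the most frequently accessed rows are kept on the GPU. Below is a minimal PyTorch sketch of that pattern; the class name, the round-robin eviction, and the lookup interface are illustrative stand-ins rather than Colossal-AI's actual API, which adds frequency-aware caching and pipelined transfers.

import torch

class CachedEmbedding(torch.nn.Module):
    # Sketch only: the full table lives in host memory; a small cache of
    # hot rows lives on the GPU, with naive round-robin eviction.
    def __init__(self, num_embeddings, dim, cache_rows, device="cuda"):
        super().__init__()
        self.cpu_weight = torch.randn(num_embeddings, dim)  # full table, host side
        self.gpu_cache = torch.zeros(cache_rows, dim, device=device)
        self.slot_of_row = {}   # embedding row id -> cache slot
        self.next_slot = 0
        self.device = device

    def lookup(self, ids):
        for i in ids.tolist():
            if i not in self.slot_of_row:
                slot = self.next_slot % self.gpu_cache.shape[0]
                # evict whatever row currently occupies this slot
                self.slot_of_row = {r: s for r, s in self.slot_of_row.items() if s != slot}
                self.gpu_cache[slot] = self.cpu_weight[i].to(self.device)
                self.slot_of_row[i] = slot
                self.next_slot += 1
        slots = torch.tensor([self.slot_of_row[i] for i in ids.tolist()], device=self.device)
        return self.gpu_cache[slots]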

AI Machine Learning & Data Science Research

Stanford U & Google Brain’s Distillation Technique for Classifier-Free Guided Diffusion Models Reduces Sampling Steps by 256x

In the new paper On Distillation of Guided Diffusion Models, researchers from Google Brain and Stanford University propose a novel approach for distilling classifier-free guided diffusion models with high sampling efficiency. The resulting models achieve performance comparable to the original model but with sampling steps reduced by up to 256 times.
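
For readers unfamiliar with the setting, classifier-free guidance combines a conditional and an unconditional denoiser prediction at every sampling step, which is what makes guided sampling expensive. The sketch below shows that standard combination and, in rough form, the paper's first-stage idea of training a single student to match the guided teacher; the function signatures are hypothetical.

import torch
import torch.nn.functional as F

def guided_eps(eps_cond, eps_uncond, w):
    # Classifier-free guidance: extrapolate the conditional prediction away
    # from the unconditional one; w = 0 recovers the conditional model.
    return (1 + w) * eps_cond - w * eps_uncond

def distill_step(student, teacher_cond, teacher_uncond, z_t, t, w):
    # Roughly the paper's first stage: a single w-conditioned student learns
    # to match the output of the two-network guided teacher, halving the
    # model evaluations per step before any step-count distillation.
    with torch.no_grad():
        target = guided_eps(teacher_cond(z_t, t), teacher_uncond(z_t, t), w)
    return F.mse_loss(student(z_t, t, w), target)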

AI Machine Learning & Data Science Natural Language Tech Research

‘Ask Me Anything’: Stanford U, Numbers Station & UW Madison’s Novel Prompting Strategy Enables LLMs With 30x Fewer Parameters to Outperform Few-Shot GPT3-175B

In the new paper Ask Me Anything: A Simple Strategy for Prompting Language Models, a research team from Stanford University, Numbers Station, and the University of Wisconsin-Madison presents Ask Me Anything Prompting (AMA), a simple large language model prompting strategy that enables a 30x smaller language model to outperform few-shot GPT3-175B.
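
At a high level, AMA reformats each input into open-ended questions via several prompt chains and then aggregates the model's noisy answers. The snippet below illustrates the aggregation step with a simple majority vote; note that the paper actually aggregates with weak supervision to weight the prompts, and the llm and chain objects here are hypothetical.

from collections import Counter

def ama_predict(llm, x, prompt_chains):
    # Each chain reformats input x as an open-ended question, queries the
    # model, and maps the free-text answer back to a task label; the noisy
    # votes are then aggregated (majority vote here for brevity).
    votes = [chain.to_label(llm(chain.to_question(x))) for chain in prompt_chains]
    return Counter(votes).most_common(1)[0][0]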

AI Computer Vision & Graphics Machine Learning & Data Science Research

Maximizing FLOPS Utilization: DeepMind & NYU Propose Efficiency Evaluations for Visual Pretraining Methods

In the new paper Where Should I Spend My FLOPS? Efficiency Evaluations of Visual Pre-training Methods, DeepMind and NYU Center for Neural Systems researchers introduce computational efficiency evaluation approaches designed to aid in the selection of optimal methods, datasets, and models for visual pretraining under a fixed FLOP budget.

AI Machine Learning & Data Science Research

UNC Chapel Hill’s Textless Vision-Language Transformer: Comparable Performance to Text-Based Approaches but 28x Faster

In the new paper TVLT: Textless Vision-Language Transformer, researchers from UNC Chapel Hill present the Textless Vision-Language Transformer (TVLT) for vision-and-language representation learning. TVLT uses only raw visual and audio inputs and performs comparably to its text-based counterparts but requires only 1/3 the parameters and achieves 28x faster inference speeds.

AI Machine Learning & Data Science Research

DeepMind, Oxford U, IDSIA, Mila & Purdue U’s General Neural Algorithmic Learner Matches Task-Specific Expert Performance

In the new paper A Generalist Neural Algorithmic Learner, a research team from DeepMind, University of Oxford, IDSIA, Mila, and Purdue University presents a novel generalist neural algorithmic learner — a single graph neural network (GNN) capable of performing a variety of classical algorithmic tasks at single-task expert level.

AI Machine Learning & Data Science Research

Transformers on Edge Devices? Monash U’s Energy-Saving Attention With Linear Complexity Reduces Compute Cost by 73%

In the new paper EcoFormer: Energy-Saving Attention with Linear Complexity, a Monash University research team presents EcoFormer, an attention mechanism with linear complexity that replaces expensive multiply-accumulate operations with simple accumulations and achieves a 73 percent energy footprint reduction on ImageNet.
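
A rough way to picture the mechanism: if queries and keys are hashed to binary codes, their similarity reduces to counting shared bits, and the linear-attention factorization avoids forming the full attention matrix. The sketch below uses a plain sign function where EcoFormer learns its hash functions, so it is illustrative only.

import torch

def binary_linear_attention(q, k, v):
    # Illustrative stand-in: binarize queries and keys so similarity becomes
    # a count of shared active bits (pure accumulation in a dedicated kernel;
    # a dense matmul of 0/1 codes stands in here), and use the
    # linear-attention factorization phi(q) @ (phi(k).T @ v) to avoid the
    # quadratic attention matrix.
    phi_q = (q.sign() + 1) / 2                            # codes in {0, 1}
    phi_k = (k.sign() + 1) / 2
    num = phi_q @ (phi_k.transpose(-2, -1) @ v)           # accumulate values
    den = phi_q @ phi_k.sum(dim=-2, keepdim=True).transpose(-2, -1)
    return num / den.clamp(min=1e-6)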

AI Machine Learning & Data Science Natural Language Tech Research

Google Brain’s Vec2Text Models for Sentence Generation Excel in Universality, Diversity, Fluency & Semantic Structure

In the new paper Vec2text With Round-Trip Translations, Google Brain researchers explore large language models’ capabilities for generating arbitrary natural language text from inputs of fixed-size vectors — a vec2text setting — and propose a simple data augmentation approach based on round-trip translations to improve vec2text model performance.
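
The augmentation itself is straightforward: translating a sentence into a pivot language and back produces paraphrases that should map to nearby vectors, enlarging the vec2text training set. A minimal sketch, assuming a generic translate(text, src, tgt) machine-translation function:

def round_trip_augment(translate, text, pivots=("fr", "de")):
    # Round-trip translation as data augmentation: out to a pivot language
    # and back yields paraphrases of the input. translate(text, src, tgt)
    # is an assumed generic MT function, not an API from the paper.
    return [translate(translate(text, "en", p), p, "en") for p in pivots]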

AI Machine Learning & Data Science Research

DeepMind’s ‘Expert-Aware’ Data Augmentation Technique Enables Data-Efficient Learning from Parametric Experts

The new DeepMind paper Data Augmentation for Efficient Learning from Parametric Experts proposes Augmented Policy Cloning (APC), a simple yet effective data-augmentation approach designed to support data-efficient learning from parametric experts. The method significantly improves data efficiency across various control and reinforcement learning settings.

AI Machine Learning & Data Science Natural Language Tech Research

Peking U & Microsoft’s Knowledge Attribution Method Enables Editing Factual Knowledge in Pretrained Transformers Without Fine-Tuning

In the new paper Knowledge Neurons in Pretrained Transformers, a research team from Peking University and Microsoft Research introduces a knowledge attribution method that identifies the neurons that store factual knowledge in pretrained transformers and leverages these neurons to edit factual knowledge in transformers without any fine-tuning.
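
The attribution method is in the spirit of integrated gradients applied to FFN activations: scale a layer's activation from zero up to its observed value and accumulate the gradient of the correct answer's logit along the way. A minimal sketch follows; logit_fn, which must rerun the model with the activation overridden (e.g. via a forward hook), is left to the caller and is model-specific.

import torch

def knowledge_attribution(logit_fn, activation, steps=20):
    # Riemann approximation of integrated gradients over one FFN layer's
    # activation. logit_fn(a) must rerun the model with the layer's
    # activation replaced by tensor a and return the correct answer's logit;
    # that wiring is model-specific and omitted here.
    total = torch.zeros_like(activation)
    for s in range(1, steps + 1):
        a = (activation.detach() * (s / steps)).requires_grad_(True)
        logit_fn(a).backward()
        total += a.grad
    # High scores flag candidate knowledge neurons.
    return activation.detach() * total / steps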

AI Machine Learning & Data Science Research

DeepMind’s Model-Based Offline Options Framework Supports Automatic Skill & Behaviour Discovery, Boosts Transfer Capabilities

In the new paper MO2: Model-Based Offline Options, a DeepMind research team introduces Model-Based Offline Options (MO2), an offline hindsight bottleneck options framework that supports sample-efficient option discovery over continuous state-action spaces for efficient skill transfer to new tasks.

AI Machine Learning & Data Science Research

Toward a Turing Machine? Microsoft & Harvard Propose Neural Networks That Discover Learning Algorithms Themselves

A research team from Microsoft and Harvard University demonstrates that neural networks can discover succinct learning algorithms on their own in polynomial time, and presents an architecture that combines recurrent weight-sharing between layers with convolutional weight-sharing to reduce the number of distinct parameters, even for networks with trillions of nodes, down to a constant.
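
The parameter-count claim follows from sharing one set of weights everywhere: the same small convolution is reused at every position (convolutional sharing) and at every unrolled step (recurrent sharing). The toy module below makes that concrete; it is a sketch of the sharing idea, not the paper's architecture.

import torch

class SharedLoopNet(torch.nn.Module):
    # Toy illustration: one small convolution supplies the weights for every
    # position (convolutional sharing) and for every unrolled step (recurrent
    # sharing), so the parameter count is constant regardless of the
    # unrolled network's depth or width.
    def __init__(self, channels=16, steps=32):
        super().__init__()
        self.step = torch.nn.Conv1d(channels, channels, 3, padding=1)
        self.steps = steps

    def forward(self, x):                # x: (batch, channels, length)
        for _ in range(self.steps):      # more depth adds zero parameters
            x = torch.relu(self.step(x))
        return x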

AI Machine Learning & Data Science Research

Meta AI & Inria Saclay Advance BCIs to Enable Natural Speech Decoding From Non-Invasive Brain Recordings

In the new paper Decoding Speech From Non-Invasive Brain Recordings, a research team from Meta AI and the Inria Saclay Centre presents a single end-to-end architecture for decoding natural speech from non-invasive magnetoencephalography (MEG) or electroencephalography (EEG) recordings, which can capture macroscopic brain signals in real time.

AI Machine Learning & Data Science Natural Language Tech Research

Plan, Edit, Explain and Repeat: The PEER Collaborative Language Model Brings a Humanlike Process to Text Generation

In the new paper PEER: A Collaborative Language Model, a research team from Meta AI, Carnegie Mellon University, PSL University, and University College London presents PEER, a collaborative language model that performs a humanlike writing process — composing drafts, adding suggestions, proposing edits and providing explanations for its actions.

AI Computer Vision & Graphics Machine Learning & Data Science Research

Princeton U & Adobe’s 3D-FM GAN Enables Precise 3D-Controllable Face Manipulation

In the new paper 3D-FM GAN: Towards 3D-Controllable Face Manipulation, a team from Princeton University and Adobe Research presents 3D-FM GAN, a novel conditional GAN framework that enables precise 3D-controllable face manipulation with high photorealism and strong identity preservation without requiring any manual tuning or optimizations.

AI Computer Vision & Graphics Machine Learning & Data Science Popular Research

Microsoft’s BEiT-3 Foundation Model: A ‘Big Convergence of Language, Vision, and Multimodal Pretraining’ That Achieves SOTA Results on Popular Benchmarks

In the new paper Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks, a Microsoft research team presents BEiT-3, a general-purpose state-of-the-art multimodal foundation model for both vision and vision-language tasks that advances the big convergence of backbone architectures, pretraining tasks, and model scaling.

AI Machine Learning & Data Science Natural Language Tech Research

CMU Details 6 Years of Contributions to the National Science Foundation-Funded DialPort Project for Dialog Research

In their new paper The DialPort Tools, Carnegie Mellon University researchers provide background on and detail their contributions to the DialPort project over the last six years. These tools — such as the DialPort Portal and DialCrowd — will be demoed at the SIGDIAL 2022 conference next month in Edinburgh.

AI Machine Learning & Data Science Natural Language Tech Research

Microsoft’s Parameter-Efficient Z-Code++ Language Model Beats the 200x Larger GPT3-175B on Abstractive Text Summarization

In the new paper Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization, a research team from Microsoft Azure AI and Microsoft Research presents Z-Code++, a novel encoder-decoder pretrained language model optimized for abstractive summarization that significantly improves performance on low-resource summarization tasks.

AI Computer Vision & Graphics Machine Learning & Data Science Research

Adobe and ANU’s Paint2Pix: Intent-Accurate Image Synthesis from Simple Brushstroke Inputs

In the new paper Paint2Pix: Interactive Painting based Progressive Image Synthesis and Editing, a research team from Adobe Research and Australian National University presents paint2pix, a novel model that learns to predict users’ intentions and produce photorealistic images from primitive and coarse human brushstroke inputs.

AI Machine Learning & Data Science Research

Microsoft, Penn U & UC San Diego’s TiCoder Framework Generates Code With 90.4% Consistency to User Intent

In the new paper Interactive Code Generation via Test-Driven User-Intent Formalization, a team from Microsoft Research, the University of Pennsylvania, and the University of California, San Diego proposes a workflow for test-driven user-intent formalization that leverages user feedback to generate code that is 90.40 percent consistent with user intent.
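
The workflow can be pictured as a pruning loop: generate candidate programs and candidate tests, ask the user only about tests that discriminate among the candidates, and keep the programs consistent with the approved tests. A simplified sketch, with ask_user and run as hypothetical caller-supplied callbacks:

def ticoder_loop(code_candidates, test_candidates, ask_user, run):
    # Simplified sketch: only tests that discriminate among the remaining
    # candidates are shown to the user; approving a test keeps the programs
    # that pass it. run(code, test) -> bool and ask_user(test) -> bool are
    # caller-supplied callbacks, not APIs from the paper.
    approved = []
    for test in test_candidates:
        outcomes = {run(c, test) for c in code_candidates}
        if len(outcomes) > 1 and ask_user(test):
            approved.append(test)
            code_candidates = [c for c in code_candidates if run(c, test)]
    return code_candidates, approved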

AI Machine Learning & Data Science Research

Georgia Tech & Google Propose a Novel Discrete Variational Autoencoder for Automatically Improving Code Efficiency

In the new paper Learning to Improve Code Efficiency, a research team from the Georgia Institute of Technology and Google Research presents a novel discrete generative latent-variable model designed to help programmers identify more computationally efficient code variants, taking a step toward automating the process of code performance optimization.

AI Machine Learning & Data Science Natural Language Tech Research

Meet Atlas: A Pretrained Retrieval Augmented Language Model That Outperforms a 540B Parameter Model But Requires 50x Fewer Parameters

In the new paper Few-shot Learning With Retrieval Augmented Language Models, a research team from Meta AI, PSL University, Inria, and University College London presents Atlas, a pretrained retrieval augmented language model that effectively learns new knowledge-intensive tasks under few-shot settings. Atlas outperforms the 540B parameter PaLM model on QA tasks while using 50x fewer parameters.
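
Atlas follows the retrieve-then-read pattern: knowledge-intensive questions are answered by conditioning a modest-size language model on retrieved passages, so knowledge lives mostly in the index rather than in the parameters. The sketch below shows that inference path only; Atlas's key contribution, jointly training the retriever and the reader, is not captured here, and the retriever and lm objects are hypothetical.

def retrieval_augmented_answer(retriever, lm, question, k=5):
    # Retrieve-then-read: condition the language model on the top-k
    # retrieved passages. retriever and lm are assumed objects; the joint
    # retriever-reader training that defines Atlas is not shown.
    passages = retriever.search(question, k=k)
    context = "\n".join(p.text for p in passages)
    return lm.generate(f"context: {context}\nquestion: {question}\nanswer:")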

AI Machine Learning & Data Science Research

Meta AI & Mila Publicly Release BlenderBot 3: A 175B SOTA Chatbot That Continually Improves via Human Interactions

In the new paper BlenderBot 3: A Deployed Conversational Agent That Continually Learns to Responsibly Engage, researchers from Meta AI and Mila/McGill University release BlenderBot 3, a 175B parameter state-of-the-art open-domain dialogue model deployed on a public website. BlenderBot 3 is designed for continual learning via its user interactions.