Showing 1–17 of 17 results for author: Co-Reyes, J

  1. arXiv:2404.11018  [pdf, other]

    cs.LG cs.AI cs.CL

    Many-Shot In-Context Learning

    Authors: Rishabh Agarwal, Avi Singh, Lei M. Zhang, Bernd Bohnet, Luis Rosias, Stephanie Chan, Biao Zhang, Ankesh Anand, Zaheer Abbas, Azade Nova, John D. Co-Reyes, Eric Chu, Feryal Behbahani, Aleksandra Faust, Hugo Larochelle

    Abstract: Large language models (LLMs) excel at few-shot in-context learning (ICL) -- learning from a few examples provided in context at inference, without any weight updates. Newly expanded context windows allow us to investigate ICL with hundreds or thousands of examples -- the many-shot regime. Going from few-shot to many-shot, we observe significant performance gains across a wide variety of generative…

    Submitted 22 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

  2. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  3. arXiv:2402.05821  [pdf, other]

    cs.LG cs.NE

    Guided Evolution with Binary Discriminators for ML Program Search

    Authors: John D. Co-Reyes, Yingjie Miao, George Tucker, Aleksandra Faust, Esteban Real

    Abstract: How to automatically design better machine learning programs is an open problem within AutoML. While evolution has been a popular tool to search for better ML programs, using learning itself to guide the search has been less successful and less understood on harder problems but has the promise to dramatically increase the speed and final performance of the optimization process. We propose guiding…

    Submitted 8 February, 2024; originally announced February 2024.

  4. arXiv:2312.06585  [pdf, other]

    cs.LG

    Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

    Authors: Avi Singh, John D. Co-Reyes, Rishabh Agarwal, Ankesh Anand, Piyush Patil, Xavier Garcia, Peter J. Liu, James Harrison, Jaehoon Lee, Kelvin Xu, Aaron Parisi, Abhishek Kumar, Alex Alemi, Alex Rizkowsky, Azade Nova, Ben Adlam, Bernd Bohnet, Gamaleldin Elsayed, Hanie Sedghi, Igor Mordatch, Isabelle Simpson, Izzeddin Gur, Jasper Snoek, Jeffrey Pennington, Jiri Hron, et al. (16 additional authors not shown)

    Abstract: Fine-tuning language models (LMs) on human-generated data remains a prevalent practice. However, the performance of such models is often limited by the quantity and diversity of high-quality human data. In this paper, we explore whether we can go beyond human data on tasks where we have access to scalar feedback, for example, on math problems where one can verify correctness. To do so, we investig…

    Submitted 17 April, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: Accepted to TMLR. Camera-ready version. First three authors contributed equally

  5. arXiv:2311.07587  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?"

    Authors: C. Daniel Freeman, Laura Culp, Aaron Parisi, Maxwell L Bileschi, Gamaleldin F Elsayed, Alex Rizkowsky, Isabelle Simpson, Alex Alemi, Azade Nova, Ben Adlam, Bernd Bohnet, Gaurav Mishra, Hanie Sedghi, Igor Mordatch, Izzeddin Gur, Jaehoon Lee, JD Co-Reyes, Jeffrey Pennington, Kelvin Xu, Kevin Swersky, Kshiteej Mahajan, Lechao Xiao, Rosanne Liu, Simon Kornblith, Noah Constant, et al. (5 additional authors not shown)

    Abstract: We introduce and study the problem of adversarial arithmetic, which provides a simple yet challenging testbed for language model alignment. This problem is comprised of arithmetic questions posed in natural language, with an arbitrary adversarial string inserted before the question is complete. Even in the simple setting of 1-digit addition problems, it is easy to find adversarial prompts that mak…

    Submitted 15 November, 2023; v1 submitted 8 November, 2023; originally announced November 2023.

  6. arXiv:2310.10047  [pdf, other]

    cs.CL

    Improving Large Language Model Fine-tuning for Solving Math Problems

    Authors: Yixin Liu, Avi Singh, C. Daniel Freeman, John D. Co-Reyes, Peter J. Liu

    Abstract: Despite their success in many natural language tasks, solving math problems remains a significant challenge for large language models (LLMs). A large gap exists between LLMs' pass-at-one and pass-at-N performance in solving math problems, suggesting LLMs might be close to finding correct solutions, motivating our exploration of fine-tuning methods to unlock LLMs' performance. Using the challenging…

    Submitted 16 October, 2023; originally announced October 2023.

  7. arXiv:2310.08710  [pdf, other]

    cs.RO cs.LG

    Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research

    Authors: Cole Gulino, Justin Fu, Wenjie Luo, George Tucker, Eli Bronstein, Yiren Lu, Jean Harb, Xinlei Pan, Yan Wang, Xiangyu Chen, John D. Co-Reyes, Rishabh Agarwal, Rebecca Roelofs, Yao Lu, Nico Montali, Paul Mougin, Zoey Yang, Brandyn White, Aleksandra Faust, Rowan McAllister, Dragomir Anguelov, Benjamin Sapp

    Abstract: Simulation is an essential tool to develop and benchmark autonomous vehicle planning software in a safe and cost-effective manner. However, realistic simulation requires accurate modeling of nuanced and complex multi-agent interactive behaviors. To address these challenges, we introduce Waymax, a new data-driven simulator for autonomous driving in multi-agent scenes, designed for large-scale simul…

    Submitted 12 October, 2023; originally announced October 2023.

  8. arXiv:2309.14322  [pdf, other]

    cs.LG

    Small-scale proxies for large-scale Transformer training instabilities

    Authors: Mitchell Wortsman, Peter J. Liu, Lechao Xiao, Katie Everett, Alex Alemi, Ben Adlam, John D. Co-Reyes, Izzeddin Gur, Abhishek Kumar, Roman Novak, Jeffrey Pennington, Jascha Sohl-dickstein, Kelvin Xu, Jaehoon Lee, Justin Gilmer, Simon Kornblith

    Abstract: Teams that have trained large Transformer-based models have reported training instabilities at large scale that did not appear when training with the same hyperparameters at smaller scales. Although the causes of such instabilities are of scientific interest, the amount of resources required to reproduce them has made investigation difficult. In this work, we seek ways to reproduce and study train…

    Submitted 16 October, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

  9. arXiv:2204.04292  [pdf, other]

    cs.LG

    Evolving Pareto-Optimal Actor-Critic Algorithms for Generalizability and Stability

    Authors: Juan Jose Garau-Luis, Yingjie Miao, John D. Co-Reyes, Aaron Parisi, Jie Tan, Esteban Real, Aleksandra Faust

    Abstract: Generalizability and stability are two key objectives for operating reinforcement learning (RL) agents in the real world. Designing RL algorithms that optimize these objectives can be a costly and painstaking process. This paper presents MetaPG, an evolutionary method for automated design of actor-critic loss functions. MetaPG explicitly optimizes for generalizability and performance, and implicit…

    Submitted 24 April, 2023; v1 submitted 8 April, 2022; originally announced April 2022.

  10. arXiv:2112.03899  [pdf, other]

    cs.LG cs.AI

    Information is Power: Intrinsic Control via Information Capture

    Authors: Nicholas Rhinehart, Jenny Wang, Glen Berseth, John D. Co-Reyes, Danijar Hafner, Chelsea Finn, Sergey Levine

    Abstract: Humans and animals explore their environment and acquire useful skills even in the absence of clear goals, exhibiting intrinsic motivation. The study of intrinsic motivation in artificial agents is concerned with the following question: what is a good general-purpose objective for an agent? We study this question in dynamic partially-observed environments, and argue that a compact and general lear…

    Submitted 7 December, 2021; originally announced December 2021.

    Comments: NeurIPS 2021

  11. arXiv:2106.02229  [pdf, other]

    cs.LG cs.AI cs.CV

    Differentiable Architecture Search for Reinforcement Learning

    Authors: Yingjie Miao, Xingyou Song, John D. Co-Reyes, Daiyi Peng, Summer Yue, Eugene Brevdo, Aleksandra Faust

    Abstract: In this paper, we investigate the fundamental question: To what extent are gradient-based neural architecture search (NAS) techniques applicable to RL? Using the original DARTS as a convenient baseline, we discover that the discrete architectures found can achieve up to 250% performance compared to manual architecture designs on both discrete and continuous action space environments across off-pol…

    Submitted 15 November, 2022; v1 submitted 3 June, 2021; originally announced June 2021.

    Comments: Published as a conference paper at the first Automated Machine Learning Conference (AutoML-Conf) 2022. Code can be found at https://github.com/google/brain_autorl/tree/main/rl_darts

  12. arXiv:2101.03958  [pdf, other]

    cs.LG cs.AI cs.NE

    Evolving Reinforcement Learning Algorithms

    Authors: John D. Co-Reyes, Yingjie Miao, Daiyi Peng, Esteban Real, Sergey Levine, Quoc V. Le, Honglak Lee, Aleksandra Faust

    Abstract: We propose a method for meta-learning reinforcement learning algorithms by searching over the space of computational graphs which compute the loss function for a value-based model-free RL agent to optimize. The learned algorithms are domain-agnostic and can generalize to new environments not seen during training. Our method can both learn from scratch and bootstrap off known existing algorithms, l…

    Submitted 10 November, 2022; v1 submitted 8 January, 2021; originally announced January 2021.

    Comments: ICLR 2021 Oral. See project website at https://sites.google.com/view/evolvingrl

  13. arXiv:2006.12478  [pdf, other]

    cs.LG cs.AI stat.ML

    Ecological Reinforcement Learning

    Authors: John D. Co-Reyes, Suvansh Sanjeev, Glen Berseth, Abhishek Gupta, Sergey Levine

    Abstract: Much of the current work on reinforcement learning studies episodic settings, where the agent is reset between trials to an initial state distribution, often with well-shaped reward functions. Non-episodic settings, where the agent must learn through continuous interaction with the world without resets, and where the agent receives only delayed and sparse reward signals, is substantially more diff…

    Submitted 22 June, 2020; originally announced June 2020.

    Comments: Preprint. Website at: https://sites.google.com/view/ecological-rl/home

  14. arXiv:1910.12827  [pdf, other]

    cs.LG cs.CV cs.NE stat.ML

    Entity Abstraction in Visual Model-Based Reinforcement Learning

    Authors: Rishi Veerapaneni, John D. Co-Reyes, Michael Chang, Michael Janner, Chelsea Finn, Jiajun Wu, Joshua B. Tenenbaum, Sergey Levine

    Abstract: This paper tests the hypothesis that modeling a scene in terms of entities and their local interactions, as opposed to modeling the scene globally, provides a significant benefit in generalizing to physical tasks in a combinatorial space the learner has not encountered before. We present object-centric perception, prediction, and planning (OP3), which to the best of our knowledge is the first full…

    Submitted 6 May, 2020; v1 submitted 28 October, 2019; originally announced October 2019.

    Comments: Accepted at CoRL 2019

  15. arXiv:1811.07882  [pdf, other]

    cs.LG cs.AI cs.CL cs.HC

    Guiding Policies with Language via Meta-Learning

    Authors: John D. Co-Reyes, Abhishek Gupta, Suvansh Sanjeev, Nick Altieri, Jacob Andreas, John DeNero, Pieter Abbeel, Sergey Levine

    Abstract: Behavioral skills or policies for autonomous agents are conventionally learned from reward functions, via reinforcement learning, or from demonstrations, via imitation learning. However, both modes of task specification have their disadvantages: reward functions require manual engineering, while demonstrations require a human expert to be able to actually perform the task in order to generate the…

    Submitted 29 January, 2019; v1 submitted 19 November, 2018; originally announced November 2018.

    Comments: Accepted at ICLR 2019

  16. arXiv:1806.02813  [pdf, other]

    cs.LG cs.AI stat.ML

    Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings

    Authors: John D. Co-Reyes, YuXuan Liu, Abhishek Gupta, Benjamin Eysenbach, Pieter Abbeel, Sergey Levine

    Abstract: In this work, we take a representation learning perspective on hierarchical reinforcement learning, where the problem of learning lower layers in a hierarchy is transformed into the problem of learning trajectory-level generative models. We show that we can learn continuous latent representations of trajectories, which are effective in solving temporally extended and multi-stage problems. Our prop…

    Submitted 7 June, 2018; originally announced June 2018.

    Comments: Accepted at ICML 2018

  17. arXiv:1703.01260  [pdf, other]

    cs.LG

    EX2: Exploration with Exemplar Models for Deep Reinforcement Learning

    Authors: Justin Fu, John D. Co-Reyes, Sergey Levine

    Abstract: Deep reinforcement learning algorithms have been shown to learn complex tasks using highly general policy classes. However, sparse reward problems remain a significant challenge. Exploration methods based on novelty detection have been particularly successful in such settings but typically require generative or predictive models of the observations, which can be difficult to train when the observa…

    Submitted 27 May, 2017; v1 submitted 3 March, 2017; originally announced March 2017.