Web

  1. World of Bits: An Open-Domain Platform for Web-Based Agents. ICML 2017

    Tianlin (Tim) Shi, Andrej Karpathy, Linxi (Jim) Fan, Jonathan Hernandez, Percy Liang pdf, 2017

  2. Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration. ICLR 2018

    Evan Zheran Liu, Kelvin Guu, Panupong Pasupat, Tianlin Shi, Percy Liang pdf, 2018.2

  3. WebNav: A New Large-Scale Task for Natural Language based Sequential Decision Making. Arxiv 2018

    Daniel Khashabi, Tushar Khot, Ashish Sabharwal, Peter Clark, Oyvind Tafjord, Eric Gribko, Michal Guerquin, Aniruddha T. Kembhavi pdf, 2018.12

  4. WebGPT: Large-Scale Web Navigation with Reinforcement Learning. Arxiv

    Kelvin Guu, Michael Tung, Panupong Pasupat, Ming-Wei Chang pdf, 2021.7

  5. MINOS: Multimodal Intelligence for Navigation in Open Spaces. Arxiv

    Barbara Menta, Amos Storkey, Artur d’Avila Garcez, Jeffery Kinnison, Judy Hoffman pdf, 2022.1

  6. A data-driven approach for learning to control computers. PMLR

    Peter C Humphreys, David Raposo, Toby Pohlen, Gregory Thornton, Rachita Chhaparia, Alistair Muldal, Josh Abramson, Petko Georgiev, Alex Goldin, Adam Santoro, Timothy Lillicrap pdf, 2022.2

  7. WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents. Arxiv

    Shunyu Yao, Howard Chen, John Yang, Karthik Narasimhan pdf, 2022.7

  8. Multimodal Web Navigation with Instruction-Finetuned Foundation Models. ICLR 2023 Workshop ME-FoMo

    Hiroki Furuta, Ofir Nachum, Kuang-Huei Lee, Yutaka Matsuo, Shixiang Shane Gu, Izzeddin Gur pdf, 2023.5

  9. Hierarchical Prompting Assists Large Language Model on Web Navigation. ACL 2023 NLRSE workshop

    Abishek Sridhar, Robert Lo, Frank F. Xu, Hao Zhu, Shuyan Zhou pdf, 2023.5

  10. Mind2Web: Towards a Generalist Agent for the Web. Arxiv

    Xiang Deng, Yu Gu, Boyuan Zheng, Shijie Chen, Samuel Stevens, Boshi Wang, Huan Sun, Yu Su pdf, 2023.6

  11. A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis. Arxiv

    Izzeddin Gur, Hiroki Furuta, Austin Huang, Mustafa Safdari, Yutaka Matsuo, Douglas Eck, Aleksandra Faust pdf, 2023.7

  12. WebArena: A Realistic Web Environment for Building Autonomous Agents. Arxiv

    Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig pdf, 2023.7

  13. LASER: LLM Agent with State-Space Exploration for Web Navigation. Arxiv

    Kaixin Ma, Hongming Zhang, Hongwei Wang, Xiaoman Pan, Dong Yu pdf, 2023.9

  14. GPT-4V(ision) is a Generalist Web Agent, if Grounded. Arxiv

    Boyuan Zheng, Boyu Gou, Jihyung Kil, Huan Sun, Yu Su pdf, 2024.1

  15. WebLINX: Real-World Website Navigation with Multi-Turn Dialogue. Arxiv

    Xing Han Lù, Zdeněk Kasner, Siva Reddy pdf, 2024.2

  16. VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding? Arxiv

    Junpeng Liu, Yifan Song, Bill Yuchen Lin, Wai Lam, Graham Neubig, Yuanzhi Li, Xiang Yue pdf, 2024.4

  17. SteP: Stacked LLM Policies for Web Actions. Arxiv

    Paloma Sodhi, S.R.K. Branavan, Yoav Artzi, Ryan McDonald pdf, 2024.4

  18. WILBUR: Adaptive In-Context Learning for Robust and Accurate Web Agents. Arxiv

    Michael Lutz, Arth Bohra, Manvel Saroyan, Artem Harutyunyan, Giovanni Campagna pdf, 2024.4

  19. MMInA: Benchmarking Multihop Multimodal Internet Agents. Arxiv

    Ziniu Zhang, Shulin Tian, Liangyu Chen, Ziwei Liu pdf, 2024.4

  20. WebCanvas: Benchmarking Web Agents in Online Environments. Arxiv

    Yichen Pan, Dehan Kong, Sida Zhou, Cheng Cui, Yifei Leng, Bing Jiang, Hangyu Liu, Yanyi Shang, Shuyan Zhou, Tongshuang Wu, Zhengyang Wu pdf, 2024.6