OSWorld

Projects using OSWorld

We thank the following projects for trusting and using OSWorld to accelerate progress on multimodal agents!

  • Cradle: Empowering Foundation Agents Towards General Computer Control

  • Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?

  • Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding

  • CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents

  • Windows Agent Arena: a benchmark for AI agents acting on your computer

  • Agent S: An Open Agentic Framework that Uses Computers Like a Human

  • Model Card Addendum: Claude 3.5 Haiku and Upgraded Claude 3.5 Sonnet

  • OS-ATLAS: A Foundation Action Model For Generalist GUI Agents

  • Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

  • Ponder & Press: Advancing Visual GUI Agent towards General Computer Control

  • Aria-UI: Visual Grounding for GUI Instructions

…

© Copyright 2024, XLANG NLP Lab.
