Projects using OSWorld
======================

We thank the following projects for trusting and using OSWorld to accelerate progress on multimodal agents!

- Cradle: Empowering Foundation Agents Towards General Computer Control
- Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
- Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding
- CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents
- Windows Agent Arena: a benchmark for AI agents acting on your computer
- Agent S: An Open Agentic Framework that Uses Computers Like a Human
- Model Card Addendum: Claude 3.5 Haiku and Upgraded Claude 3.5 Sonnet
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents
- Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
- Ponder & Press: Advancing Visual GUI Agent towards General Computer Control
- Aria-UI: Visual Grounding for GUI Instructions

...