@xlang-ai

XLANG NLP Lab

Developing embodied AI agents that empower users to use language to interact with digital and physical environments to carry out real-world tasks.

Welcome to the Executable Language Grounding (XLANG) Lab! We are part of the HKU NLP Group at the University of Hong Kong. XLang focuses on building language model agents that ground language instructions, that is, transform them into code or actions executable in real-world environments such as databases (data agents), web applications (plugin/web agents), and the physical world (robotic agents). This grounding lies at the heart of language model agents and natural language interfaces that interact with and learn from these environments, helping people carry out data analysis, use web applications, and instruct robots through conversation. Recent advances in XLang incorporate techniques such as LLMs with external tools, code generation, semantic parsing, and dialog or interactive systems.

Pinned

  1. OSWorld Public

    [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

    Python · 1.6k stars · 170 forks

  2. aguvis Public

    Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

    Python · 204 stars · 12 forks

  3. OpenAgents Public

    [COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild

    Python · 4.1k stars · 456 forks

  4. instructor-embedding Public

    [ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings

    Python · 1.9k stars · 143 forks

  5. text2reward Public

    [ICLR 2024 Spotlight] Code for the paper "Text2Reward: Reward Shaping with Language Models for Reinforcement Learning"

    Jupyter Notebook · 146 stars · 8 forks

  6. DS-1000 Public

    [ICML 2023] Data and code release for the paper "DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation".

    Python · 233 stars · 26 forks

Repositories

Showing 10 of 21 repositories
  • BRIGHT Public

    BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval

    Python · 73 stars · CC-BY-4.0 · 6 forks · 1 issue · 1 pull request · Updated Feb 12, 2025
  • Spider2 Public

    [ICLR 2025] Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

    HTML · 321 stars · Apache-2.0 · 36 forks · 28 issues · 0 pull requests · Updated Feb 12, 2025
  • OSWorld Public

    [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

    Python · 1,601 stars · Apache-2.0 · 170 forks · 40 issues · 0 pull requests · Updated Feb 10, 2025
  • computer-agent-arena-hub Public

    Computer Agent Arena Hub: Compare & Test AI Agents on Crowdsourced Real-World Computer Use Tasks

    Python · 10 stars · 0 forks · 0 issues · 0 pull requests · Updated Feb 10, 2025
  • verl Public Forked from volcengine/verl

    veRL: Volcano Engine Reinforcement Learning for LLM

    Python · 1 star · Apache-2.0 · 251 forks · 0 issues · 0 pull requests · Updated Jan 27, 2025
  • xlang-ai.github.io Public

    The official website of xlang.ai

    TypeScript · 2 stars · 0 forks · 0 issues · 0 pull requests · Updated Jan 25, 2025
  • instructor-embedding Public

    [ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings

    Python · 1,910 stars · Apache-2.0 · 143 forks · 31 issues · 4 pull requests · Updated Jan 15, 2025
  • aguvis Public

    Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

    Python · 204 stars · 12 forks · 13 issues · 0 pull requests · Updated Jan 14, 2025
  • text2reward Public

    [ICLR 2024 Spotlight] Code for the paper "Text2Reward: Reward Shaping with Language Models for Reinforcement Learning"

    Jupyter Notebook · 146 stars · 8 forks · 2 issues · 0 pull requests · Updated Dec 17, 2024
  • EVOR Public
    Python · 55 stars · Apache-2.0 · 6 forks · 3 issues · 0 pull requests · Updated Dec 15, 2024
