LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent
The article presents LLM-Grounder, a new method for 3D visual grounding that uses a large language model (LLM) to decompose complex natural language queries into semantic constituents. The LLM then…