Attended NeurIPS 2024 in Vancouver and presented two papers:

  • SpatialEval: We took a fresh look at how language models and vision-language models handle spatial reasoning. The twist? We tested them on our new benchmark SpatialEval under three input settings: text-only (TQA), vision-only (VQA), and combined vision-text (VTQA). Some of the results genuinely surprised us!
  • GAD: Ever wondered whether constrained decoding changes how LLMs actually behave? We show that it does: forcing outputs to follow a grammar distorts the model's distribution, and we propose the first solution to this distortion problem (toy illustration of the problem below).
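
For the curious, here is a tiny self-contained sketch of the distortion problem GAD tackles. It is my own toy illustration, not code from the paper: a hypothetical two-token vocabulary and a made-up "grammar" that accepts only two strings. Masking ungrammatical tokens and renormalizing at every step (the standard grammar-constrained decoding recipe) ends up sampling from a different distribution than the LLM's own distribution conditioned on grammaticality.

```python
# Toy illustration of distribution distortion under grammar-constrained
# decoding. Everything here (the tiny LM, the two-string "grammar") is
# hypothetical and for intuition only; it is not the paper's code.

# Joint probabilities a toy LM assigns to each length-2 sequence
# over the vocabulary {"a", "b"}.
lm = {
    ("a", "a"): 0.4,
    ("a", "b"): 0.1,
    ("b", "a"): 0.3,
    ("b", "b"): 0.2,
}
vocab = ["a", "b"]
grammatical = {("a", "b"), ("b", "a")}  # the only strings the grammar accepts

# Ideal target: the LM's distribution conditioned on grammaticality.
z = sum(p for s, p in lm.items() if s in grammatical)
ideal = {s: lm[s] / z for s in grammatical}

def next_token_probs(prefix):
    """The LM's next-token distribution given a prefix (from the joint table)."""
    mass = {t: sum(p for s, p in lm.items() if s[:len(prefix) + 1] == prefix + (t,))
            for t in vocab}
    total = sum(mass.values())
    return {t: m / total for t, m in mass.items()}

def constrained_step(prefix):
    """Grammar-constrained step: mask tokens with no grammatical completion,
    then renormalize the surviving next-token probabilities."""
    probs = next_token_probs(prefix)
    allowed = {t: probs[t] for t in vocab
               if any(s[:len(prefix) + 1] == prefix + (t,) for s in grammatical)}
    total = sum(allowed.values())
    return {t: p / total for t, p in allowed.items()}

# Probability that per-step masking assigns to each grammatical string.
masked = {}
for s in grammatical:
    p = 1.0
    for i in range(len(s)):
        p *= constrained_step(s[:i])[s[i]]
    masked[s] = p

print("LM conditioned on grammar:", ideal)   # ('a','b'): 0.25, ('b','a'): 0.75
print("per-step masked decoding :", masked)  # ('a','b'): 0.50, ('b','a'): 0.50
```

In this toy case the LM conditioned on grammaticality puts 0.75 of its mass on "ba", but per-step masking flattens the choice to a 50/50 split; that mismatch is exactly the kind of distortion the paper is about.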