September 2024 - May 2025

Computer Vision Research

at UCSD Early Research Scholars Program (ERSP)

Working with two other undergraduate researchers at UCSD, I helped develop methods for 3D scene understanding, evaluation, and synthesis that integrate large language models (LLMs) and vision-language models (VLMs). Our work focused on using these models to address spatial reasoning challenges and optimize 3D scene layouts.

Key Contributions

  • Enhanced the SceneProgLLM framework by integrating the Anthropic and Ollama APIs and adding support for local images in scene rendering pipelines.
  • Collaborated on designing domain-specific languages (DSLs) for scene synthesis that give LLMs a structured way to interact with 3D databases such as 3D-FRONT.
  • Designed and implemented prompts to evaluate how ceiling light position affects 3D scene illumination, using LLMs to score rendered images and explain their reasoning.
  • Implemented functions in Blender for light placement optimization and explored both heuristic-based approaches and reinforcement learning techniques.
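
To illustrate the heuristic side of the light placement work, here is a minimal sketch that picks a ceiling position by maximizing the illuminance of the darkest floor sample. It is a simplified stand-in, not the project's code: the function names, the grid sampling, and the inverse-square scoring rule are all assumptions, and it runs in plain Python without Blender.

```python
import itertools
import math


def illuminance(light, point, intensity=1.0):
    """Point-light illuminance on an upward-facing floor: inverse-square
    falloff scaled by the cosine of the incidence angle (assumed model)."""
    dx = light[0] - point[0]
    dy = light[1] - point[1]
    dz = light[2] - point[2]
    dist_sq = dx * dx + dy * dy + dz * dz
    cos_theta = dz / math.sqrt(dist_sq)  # floor normal is (0, 0, 1)
    return intensity * cos_theta / dist_sq


def grid(width, depth, steps):
    """Evenly spaced (x, y) samples inside a width x depth rectangle."""
    xs = [width * (i + 0.5) / steps for i in range(steps)]
    ys = [depth * (j + 0.5) / steps for j in range(steps)]
    return list(itertools.product(xs, ys))


def best_light_position(width, depth, height, steps=5):
    """Return the candidate ceiling position whose darkest floor sample
    is brightest (a max-min heuristic, chosen here for illustration)."""
    floor_points = [(x, y, 0.0) for x, y in grid(width, depth, steps)]
    candidates = [(x, y, height) for x, y in grid(width, depth, steps)]

    def worst_case(light):
        return min(illuminance(light, p) for p in floor_points)

    return max(candidates, key=worst_case)


# Hypothetical 4 m x 4 m room with a 2.5 m ceiling.
print(best_light_position(4.0, 4.0, 2.5))
```

In an empty rectangular room this heuristic settles on the center of the ceiling; furniture, occlusion, and multiple lights are what make the real problem harder.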

Methodology & Approach

Our research involved designing strategies to communicate global and local frames of reference to LLMs for tasks like object localization and lighting direction analysis. We integrated computer vision tools like SAM2, Detectron2, and CLIP into a visual evaluator pipeline to improve semantic filtering and depth estimation for scene evaluation. This work included experimenting with structured outputs under varying conditions and exploring the trade-offs between engineered DSLs and LLM-designed DSLs.
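
As a rough illustration of the global versus local frame-of-reference idea, the sketch below converts a world-space position into an object's own frame and turns it into a phrase that could be placed in a prompt. The coordinate convention (yaw measured counterclockwise from world +x), the function names, and the wording are assumptions made for this example, not the prompts or code used in the project.

```python
import math


def world_to_local(point, origin, yaw_deg):
    """Express a world-space (x, y) point in the frame of an object at
    `origin` that faces `yaw_deg` degrees counterclockwise from world +x.
    Returns (forward, right) components."""
    theta = math.radians(yaw_deg)
    dx = point[0] - origin[0]
    dy = point[1] - origin[1]
    forward = dx * math.cos(theta) + dy * math.sin(theta)
    right = dx * math.sin(theta) - dy * math.cos(theta)
    return forward, right


def describe_relative(point, origin, yaw_deg):
    """Map the dominant local axis to a spatial phrase."""
    forward, right = world_to_local(point, origin, yaw_deg)
    if abs(forward) >= abs(right):
        return "in front of" if forward > 0 else "behind"
    return "to the right of" if right > 0 else "to the left of"


def local_frame_sentence(name, point, ref_name, origin, yaw_deg):
    """Build one prompt sentence stated from the reference object's view."""
    relation = describe_relative(point, origin, yaw_deg)
    return f"From the {ref_name}'s point of view, the {name} is {relation} it."


# Hypothetical scene: a sofa at the origin facing world +x, a lamp at (0, 1).
print(local_frame_sentence("lamp", (0.0, 1.0), "sofa", (0.0, 0.0), 0.0))
```

Stating the relation from the reference object's viewpoint, rather than in world axes, is one way to make explicit which frame a description belongs to.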

Research Showcase

Final Research Poster

We presented our findings at the Undergrad Engineering Research Symposium, summarizing our methodology for using agentic VLMs for 3D scene evaluation.

Final Research Poster for ERSP

Symposium & Team Photos

Our work involved regular team meetings, brainstorming sessions, and a presentation at the final symposium.

  • Research Discussion (ERSP team meeting screenshot 1)
  • Planning Research Direction (ERSP team meeting screenshot 2)
  • Analyzing Results (ERSP team meeting screenshot 3)
  • Whiteboard Brainstorming (ERSP team meeting screenshot 4)
  • Symposium Completion (ERSP team with certificates)
  • Poster Presentation (presenting research poster)

Technologies & Tools

  • Python
  • Blender
  • LangChain
  • CLIP
  • SAM2
  • Detectron2
  • LLaMA
  • OpenAI APIs
  • Docker
  • GitHub