Projects
A web app that clones the voice of a target Nepali speaker and synthesizes input text into speech in that speaker's voice.
An embodied agent that navigates a 3D Gaussian splat environment using Unity's physics engine, guided by natural-language instructions processed by a combined LLM-VLM model.