Vision

Topic related to vision

Simulating the Real World: Survey & Resources, which contains our survey "Simulating the Real World: A Unified Survey of Multimodal Generative Models" and Awesome-Text2X-Resources. Watch this repository for the latest updates! 🔥

25814

Let Home Assistant see!

86368
Python

Plugin that lets you ask questions about your documents including audio and video files.

34144
Python

OpenKAI: A modern framework for unmanned vehicle and robot control

24595
C

PhotonVision is the free, fast, and easy-to-use computer vision solution for the FIRST Robotics Competition.

346240
Java

Stream-Omni is a GPT-4o-like language-vision-speech chatbot that simultaneously supports interaction across various modality combinations.

1716
Python

This repository contains the code for SFT, RLHF, and DPO, designed for vision-based LLMs, including the LLaVA models and the LLaMA-3.2-vision models.

1108
Python

PyTorch implementation of the models RT-1-X and RT-2-X from the paper: "Open X-Embodiment: Robotic Learning Datasets and RT-X Models"

21323
Python

Recrafting Video Ads with Generative AI

12638
TypeScript

🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data for end-to-end AI benchmarking

19958
Python
