Multimodal Learning

Humans intuitively understand and reason across multiple types of data, such as images, video, and text, which are often represented as vectors that follow very different distributions. Integrating multimodal data to understand the world is critical to advancing general AI.

Zhiqian Chen
Ph.D. Candidate in Computer Science

Zhiqian Chen is a Ph.D. candidate in the Department of Computer Science at Virginia Tech, focusing on AI and interdisciplinary research.