M2TR: Multi-modal multi-scale transformers for deepfake detection J Wang, Z Wu, W Ouyang, X Han, J Chen, YG Jiang, SN Lim Proceedings of the 2022 international conference on multimedia retrieval …, 2022 | 237 | 2022 |
OmniVL: One foundation model for image-language and video-language tasks J Wang, D Chen, Z Wu, C Luo, L Zhou, Y Zhao, Y Xie, C Liu, YG Jiang, ... Advances in neural information processing systems 35, 5696-5710, 2022 | 131 | 2022 |
ObjectFormer for image manipulation detection and localization J Wang, Z Wu, J Chen, X Han, A Shrivastava, SN Lim, YG Jiang Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 114 | 2022 |
Depth guided adaptive meta-fusion network for few-shot video recognition Y Fu, L Zhang, J Wang, Y Fu, YG Jiang Proceedings of the 28th ACM International Conference on Multimedia, 1142-1151, 2020 | 87 | 2020 |
Efficient video transformers with spatial-temporal token selection J Wang, X Yang, H Li, L Liu, Z Wu, YG Jiang European Conference on Computer Vision, 69-86, 2022 | 62 | 2022 |
To see is to believe: Prompting GPT-4V for better visual instruction tuning J Wang, L Meng, Z Weng, B He, Z Wu, YG Jiang arXiv preprint arXiv:2311.07574, 2023 | 45 | 2023 |
ChatVideo: A tracklet-centric multimodal and versatile video understanding system J Wang, D Chen, C Luo, X Dai, L Yuan, Z Wu, YG Jiang arXiv preprint arXiv:2304.14407, 2023 | 37 | 2023 |
Look Before You Match: Instance Understanding Matters in Video Object Segmentation J Wang, D Chen, Z Wu, C Luo, C Tang, X Dai, Y Zhao, Y Xie, L Yuan, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 31 | 2023 |
FT-TDR: Frequency-guided transformer and top-down refinement network for blind face inpainting J Wang, S Chen, Z Wu, YG Jiang IEEE Transactions on Multimedia 25, 2382-2392, 2022 | 24 | 2022 |
OmniTracker: Unifying object tracking by tracking-with-detection J Wang, D Chen, Z Wu, C Luo, X Dai, L Yuan, YG Jiang arXiv preprint arXiv:2303.12079, 2023 | 14 | 2023 |
Fighting malicious media data: A survey on tampering detection and deepfake detection J Wang, Z Li, C Zhang, J Chen, Z Wu, LS Davis, YG Jiang arXiv preprint arXiv:2212.05667, 2022 | 6 | 2022 |
MouSi: Poly-Visual-Expert Vision-Language Models X Fan, T Ji, C Jiang, S Li, S Jin, S Song, J Wang, B Hong, L Chen, ... arXiv preprint arXiv:2401.17221, 2024 | 5 | 2024 |
OmniVid: A Generative Framework for Universal Video Understanding J Wang, D Chen, C Luo, B He, L Yuan, Z Wu, YG Jiang Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 4 | 2024 |
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation J Wang, Y Jiang, Z Yuan, B Peng, Z Wu, YG Jiang arXiv preprint arXiv:2406.09399, 2024 | 2 | 2024 |