Renrui

ZrrSkywalker

https://github.com/ZrrSkywalker

AI & ML interests

Computer Vision

Recent Activity

upvoted a paper 4 days ago

Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark

commented on a paper 4 days ago

Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark

upvoted a paper 25 days ago

Artificial Hippocampus Networks for Efficient Long-Context Modeling

authored 20 papers over 1 year ago

MAVIS: Mathematical Visual Instruction Tuning

Paper • 2407.08739 • Published Jul 11, 2024 • 33

LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models

Paper • 2407.07895 • Published Jul 10, 2024 • 42

LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention

Paper • 2303.16199 • Published Mar 28, 2023 • 4

Retrieving-to-Answer: Zero-Shot Video Question Answering with Frozen Large Language Models

Paper • 2306.11732 • Published Jun 15, 2023

Not All Features Matter: Enhancing Few-shot CLIP with Adaptive Prior Refinement

Paper • 2304.01195 • Published Apr 3, 2023

MonoDETR: Depth-guided Transformer for Monocular 3D Object Detection

Paper • 2203.13310 • Published Mar 24, 2022

ImageBind-LLM: Multi-modality Instruction Tuning

Paper • 2309.03905 • Published Sep 7, 2023 • 17

Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners

Paper • 2303.02151 • Published Mar 3, 2023

PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning

Paper • 2211.11682 • Published Nov 21, 2022

Parameter is Not All You Need: Starting from Non-Parametric Networks for 3D Point Cloud Analysis

Paper • 2303.08134 • Published Mar 14, 2023

Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders

Paper • 2212.06785 • Published Dec 13, 2022

Decorate the Newcomers: Visual Domain Prompt for Continual Test Time Adaptation

Paper • 2212.04145 • Published Dec 8, 2022

EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding

Paper • 2209.14941 • Published Sep 29, 2022

MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning

Paper • 2310.03731 • Published Oct 5, 2023 • 29

Mimic before Reconstruct: Enhancing Masked Autoencoders with Feature Mimicking

Paper • 2303.05475 • Published Mar 9, 2023

SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models

Paper • 2311.07575 • Published Nov 13, 2023 • 15

A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise

Paper • 2312.12436 • Published Dec 19, 2023 • 15
