vllm.transformers_utils.configs.colmodernvbert

Configuration for the ColModernVBERT visual document retrieval model.

ColModernVBERT combines a SigLIP vision encoder and a ModernBERT text encoder through a pixel shuffle connector, and produces ColBERT-style 128-dimensional per-token embeddings.

Reference: https://huggingface.co/ModernVBERT/colmodernvbert-merged
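
To illustrate what "ColBERT-style per-token embeddings" implies for retrieval, the following is a minimal sketch of late-interaction (MaxSim) scoring over 128-dimensional token vectors. It is not taken from vLLM or from the model repository: the names `EMBED_DIM`, `l2_normalize`, `maxsim_score`, and `one_hot` are hypothetical, and the L2 normalization step is an assumption based on common ColBERT practice.

```python
import math
from typing import List, Sequence

# Per-token embedding width stated in the module description.
EMBED_DIM = 128


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (assumed preprocessing, not confirmed by the source)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def maxsim_score(
    query_tokens: Sequence[Sequence[float]],
    doc_tokens: Sequence[Sequence[float]],
) -> float:
    """Late-interaction score: for each query token, take the best-matching
    document token by dot product, then sum those maxima."""
    if not query_tokens or not doc_tokens:
        return 0.0
    q = [l2_normalize(v) for v in query_tokens]
    d = [l2_normalize(v) for v in doc_tokens]
    total = 0.0
    for qv in q:
        total += max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
    return total


def one_hot(index: int) -> List[float]:
    """Hypothetical helper: a toy 128-dim embedding with a single non-zero entry."""
    vec = [0.0] * EMBED_DIM
    vec[index] = 1.0
    return vec


# Toy example: two query tokens, two document (image patch) tokens.
query = [one_hot(0), one_hot(1)]
document = [one_hot(0), one_hot(5)]
score = maxsim_score(query, document)
```

In this toy case the first query token matches a document token exactly and the second matches none, so only one query token contributes to the score. Real embeddings would come from the model itself; the sketch only shows how per-token vectors of this width are typically compared.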