This is a simple baseline (ESRGAN) trained using synthetic data from our CVPR paper MARCONet. This model is trained on Chinese and English Characters. When the degradation is not severe, it may also ...
Four graduate students sheltered in place for 90 minutes inside the building where a shooting occurred at Brown University. The students hid under desks in a locked office after receiving text alerts ...
Abstract: Pre-trained vision-language models (VLMs) are the de-facto foundation models for various downstream tasks. However, scene text recognition methods still prefer backbones pre-trained on a ...
Abstract: Text-supervised semantic segmentation is a novel research topic that allows semantic segments to emerge with image-text contrasting. However, pioneering methods could be subject to ...