transcv creates vision transformers for visual recognition, which can be pre-trained using self-supervised learning.
pip install transcv
transcv also requires fastai and nbdev, so it is recommended to install all three together:
pip install fastai nbdev transcv -q --upgrade
from transcv.visrectrans import VisRecTrans
vis_rec_ob = VisRecTrans('vit_small_patch16_224', 10, False)
model = vis_rec_ob.create_model()
vis_rec_ob.initialize(model)
embed_callback = vis_rec_ob.get_callback()
Now the model, along with the embed_callback, can be used with fastai's Learner class and fine-tuned on any image classification dataset. For details of the visual recognition part, please see VisRecTrans.
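As a rough illustration of the Learner step described above, the sketch below wraps the model and embed_callback in a fastai Learner. The helper name build_learner, its arguments, the train/valid folder layout, the batch size, and the choice of loss and metric are all assumptions made for this example and are not part of transcv; only the fastai calls themselves are standard.

```python
def build_learner(data_path, model, embed_callback, batch_size=32):
    """Wrap a transcv model in a fastai Learner (hypothetical helper, not part of transcv)."""
    # Imported lazily so the helper can be defined even before fastai is installed.
    from fastai.vision.all import (
        ImageDataLoaders, Learner, Resize, CrossEntropyLossFlat, accuracy,
    )

    # Assumption: an ImageNet-style layout, data_path/train/<class>/... and
    # data_path/valid/<class>/..., with as many classes as the model has outputs.
    dls = ImageDataLoaders.from_folder(
        data_path, train="train", valid="valid",
        item_tfms=Resize(224),  # 224x224 matches the 'vit_small_patch16_224' input size
        bs=batch_size,
    )

    # Assumption: plain cross-entropy and accuracy suit a single-label classification task.
    return Learner(
        dls, model,
        loss_func=CrossEntropyLossFlat(),
        metrics=accuracy,
        cbs=[embed_callback],
    )
```

With model and embed_callback created as shown earlier, calling build_learner(path, model, embed_callback) should return a Learner that can then be trained with the usual fastai methods such as fit_one_cycle.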
from torch import nn
from transcv.swin import SwinT

# Build a Swin Transformer with 10 output classes and no pre-trained weights
swint_ob = SwinT('swin_base_patch4_window7_224', pretrained=False, num_classes=10)
swin_model = swint_ob.get_model()
assert isinstance(swin_model, nn.Sequential)
For self-supervised learning tutorials, please see Self-supervised learning with ViT.