- Avoid inheritance and favor encapsulation and immutability, making the code more readable
- Remove the mask token and encode only the visible patches, inspired by MAE and MSN
- Switch randomization (shuffling and masking) from batch level to sample level, and remove the shuffle ratio for simplicity
- Compute the loss during the forward pass
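
The sample-level masking and visible-patch-only encoding described above can be sketched roughly as follows. This is a minimal NumPy illustration, not the project's actual code: the function names `sample_level_visible_indices` and `gather_visible`, the `mask_ratio` parameter, and the tensor shapes are all assumptions made for the example. The per-sample `argsort` over random noise follows the general idea used in MAE-style masking.

```python
import numpy as np


def sample_level_visible_indices(batch_size, num_patches, mask_ratio, rng):
    """Draw an independent random subset of visible patch indices per sample.

    Hypothetical helper: returns an int array of shape (batch_size, num_visible).
    """
    num_visible = int(num_patches * (1.0 - mask_ratio))
    # One row of noise per sample, so argsort yields a different random
    # permutation for every sample (not one permutation shared by the batch).
    noise = rng.random((batch_size, num_patches))
    perm = np.argsort(noise, axis=1)
    # Keep the first num_visible entries of each permutation as visible patches.
    return perm[:, :num_visible]


def gather_visible(patches, visible_idx):
    """Select only the visible patches: (B, N, D) -> (B, N_visible, D).

    No mask token is inserted; masked patches are simply dropped before
    the encoder would see them.
    """
    return np.take_along_axis(patches, visible_idx[:, :, None], axis=1)


rng = np.random.default_rng(0)
patches = rng.normal(size=(4, 16, 8))  # assumed (batch, patches, embed_dim)
visible_idx = sample_level_visible_indices(4, 16, mask_ratio=0.75, rng=rng)
visible = gather_visible(patches, visible_idx)
print(visible.shape)  # (4, 4, 8)
```

Because the encoder input here contains only the visible patches, its sequence length shrinks with the mask ratio, which is the main practical benefit of dropping the mask token.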