Training a Vision Language Model as Smartphone Assistant
