Comment by teleforce

2 years ago

The transformer architecture is not limited to text-based token input, but again, I'm no expert on how newer LLMs such as Gemini are implemented, or on whether text-based tokens are necessary for them. If Google has truly cracked native multimodal input without the limitation of text-based tokens, then Gemini really is as novel and revolutionary as they claim.