Did you even look at that? Your own link disproves your claim. It’s just a general AI model that powers a variety of tasks, and is integrated into apps.
What else would it be except an LLM? What do you think “model” means?
…what do you think LLM means?
Large language model
Large language model.
You are aware AI is used for more than just reading and generating text?
You are aware that those are often called LMMs, Large Multimodal Models? And one of the modes that makes them multimodal is language. All LMMs are or contain an LLM.
LLMs are not called LMMs, they’re called LLMs LOL
But thank you for moving the goalposts and making it clear you don’t know what you’re talking about and have no interest in an honest discussion. Goodbye.
https://github.com/haotian-liu/LLaVA
I don’t think Google actually uses LLaVA, but the concept is the same. The data gets converted into text for the model to process.
How do you convert text to images?
When are you going to admit you have no idea what you are talking about?
An LLM literally is a “general AI model that powers a variety of tasks”.
When are you going to admit you have no idea what you are talking about?
An LLM literally is not a “general AI model”, it’s a Large Language Model, as in it processes language.
I’m going to be honest, I actually know a lot more than I can say on this matter. But believe me, Gemini Nano is a multimodal LLM.
I spoke to Google engineers about this a few months ago: