TapTechNews, October 5: Tech outlet NeoWin reported yesterday (October 4) that Google is about to make the Gemini 1.5 Flash 8B model commercially available, which would make it Google's cheapest AI model.
TapTechNews reported in August this year that Google had launched three experimental Gemini models. Among them, Gemini 1.5 Flash 8B is a smaller variant of Gemini 1.5 Flash with 8 billion parameters, designed for multimodal tasks, including high-volume tasks and long-text summarization.
Compared to the original Gemini 1.5 Flash, Gemini 1.5 Flash 8B has lower latency and is especially suitable for tasks such as chatting, transcription, and long-text translation.
Another highlight of Gemini 1.5 Flash 8B is its low price. Billing takes effect on Monday, October 14. The pricing details are as follows:
Input tokens, for prompts under 128K tokens: $0.0375 per million tokens (currently about 0.26 RMB).
Output tokens, for prompts under 128K tokens: $0.15 per million tokens (currently about 1.1 RMB).
Cached prompt tokens, for prompts under 128K tokens: $0.01 per million tokens (currently about 0.071 RMB).
For comparison, Gemini 1.5 Flash costs $0.30 per million output tokens, a price that took effect on August 12, 2024. The output price of the new Gemini 1.5 Flash 8B is therefore half that of the original.
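
To make the per-million-token prices above concrete, here is a minimal Python sketch that estimates what a workload might cost. The function and constant names are hypothetical and are not part of any Google SDK. The sketch assumes all prompts stay under the 128K threshold and that cached tokens are billed only at the cached rate quoted above. The workload sizes are invented for illustration.

```python
# Prices in USD per one million tokens, copied from the figures in this article
# (prompts under 128K tokens). All names here are illustrative assumptions.
FLASH_8B_PRICES = {"input": 0.0375, "output": 0.15, "cached": 0.01}
FLASH_OUTPUT_PRICE = 0.30  # Gemini 1.5 Flash output price cited in this article

TOKENS_PER_UNIT = 1_000_000


def estimate_cost(input_tokens, output_tokens, cached_tokens=0,
                  prices=FLASH_8B_PRICES):
    """Return the estimated USD cost of one workload at the given rates."""
    total = (
        input_tokens * prices["input"]
        + output_tokens * prices["output"]
        + cached_tokens * prices["cached"]
    )
    return total / TOKENS_PER_UNIT


def output_price_ratio():
    """Ratio of the Flash 8B output price to the Flash output price."""
    return FLASH_8B_PRICES["output"] / FLASH_OUTPUT_PRICE


if __name__ == "__main__":
    # Hypothetical workload: 2M input, 0.5M output, 1M cached tokens.
    cost = estimate_cost(2_000_000, 500_000, cached_tokens=1_000_000)
    print(f"Estimated cost: ${cost:.4f}")
    print(f"Output price ratio vs. Flash: {output_price_ratio():.2f}")
```

For this hypothetical workload the input and output portions each come to $0.075 and the cached portion to $0.01, and the ratio function reproduces the halving of the output price described above. Actual bills may differ, since the article does not cover other charges such as storage for cached content.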