LiquidAI Releases New Liquid Foundation Models with Non-Transformer Architecture

TapTechNews, October 2nd news: LiquidAI, a company founded only last year, released three Liquid Foundation Models (LFMs) on September 30th: LFM-1.3B, LFM-3.1B, and LFM-40.3B. All three adopt a non-Transformer architecture and are claimed to outperform Transformer models of comparable scale in benchmark tests.

TapTechNews notes that the industry currently relies mainly on the Transformer architecture for deep learning and natural language processing. This architecture uses the self-attention mechanism to capture relationships between tokens in a sequence; models such as OpenAI's GPT, Meta's BART, and Google's T5 are all built on it.

LiquidAI, however, is going against the trend. Its Liquid Foundation Models are said to re-imagine the model architecture from the ground up, reportedly drawing heavily on concepts from signal processing systems and numerical linear algebra. The design emphasizes versatility: the models can be tailored to specific types of data and support modalities such as video, audio, text, time series, and signals.
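LiquidAI has not published the details of this architecture, but as a rough illustration of what a signal-processing-inspired sequence layer can look like, the sketch below shows a generic linear state-space recurrence (an assumption for illustration only, not LiquidAI's actual design):

```python
import numpy as np

def state_space_mixer(inputs, A, B, C, D):
    """Generic linear state-space sequence layer (illustrative only).

    Instead of attending over all previous tokens, each step folds the
    new input into a fixed-size hidden state:
        x[t] = A @ x[t-1] + B @ u[t]
        y[t] = C @ x[t]   + D @ u[t]
    Memory use is set by the state size, not the sequence length.
    """
    state_dim = A.shape[0]
    x = np.zeros(state_dim)
    outputs = []
    for u in inputs:                 # inputs: sequence of feature vectors
        x = A @ x + B @ u            # update the fixed-size state
        y = C @ x + D @ u            # emit an output for this step
        outputs.append(y)
    return np.stack(outputs)

# Toy usage: 8-step sequence of 4-dimensional features, 16-dimensional state.
rng = np.random.default_rng(0)
seq = rng.normal(size=(8, 4))
A = 0.9 * np.eye(16)                 # stable state transition
B = rng.normal(size=(16, 4)) * 0.1
C = rng.normal(size=(4, 16)) * 0.1
D = np.eye(4)
print(state_space_mixer(seq, A, B, C, D).shape)  # (8, 4)
```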

LiquidAI stated that, compared with Transformer-based models, the LFM models use less RAM, especially when handling large amounts of input. Transformer models must keep a key-value (KV) cache while processing long inputs, and this cache grows with the sequence length, so RAM usage rises as the input gets longer.

The LFM models avoid this problem: they effectively compress the incoming data, reducing hardware requirements. On the same hardware, the three models can therefore handle longer sequences than industry competitors.
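The memory argument is easy to see with a back-of-the-envelope calculation. The sketch below compares how a KV cache grows with sequence length against a fixed-size recurrent state; the layer counts and dimensions are made up for illustration and do not correspond to LFM or to any specific Transformer:

```python
def kv_cache_bytes(seq_len, n_layers=32, n_heads=32, head_dim=128,
                   bytes_per_value=2):
    """Approximate KV-cache size for a Transformer decoder.

    Each layer stores one key and one value vector per head per token,
    so memory grows linearly with the sequence length.
    """
    return seq_len * n_layers * n_heads * head_dim * 2 * bytes_per_value

def recurrent_state_bytes(state_dim=8192, n_layers=32, bytes_per_value=2):
    """Fixed-size state of a recurrent/state-space style model.

    The footprint does not depend on how many tokens have been seen.
    """
    return state_dim * n_layers * bytes_per_value

for seq_len in (1_000, 32_000, 128_000):
    kv = kv_cache_bytes(seq_len) / 2**30           # GiB
    rec = recurrent_state_bytes() / 2**20          # MiB
    print(f"{seq_len:>7} tokens: KV cache ≈ {kv:5.1f} GiB, "
          f"fixed state ≈ {rec:.1f} MiB")
```

With these hypothetical settings the KV cache climbs from about 0.5 GiB at 1,000 tokens to over 60 GiB at 128,000 tokens, while the fixed-size state stays at a fraction of a megabyte, which is the effect LiquidAI's claim is pointing at.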

Among the first batch of three models, LFM-1.3B is designed for resource-constrained environments, LFM-3.1B is optimized for edge computing, and LFM-40.3B is a mixture-of-experts (MoE) model, mainly aimed at scenarios such as mathematical computation and signal processing.

The models perform strongly on both general and domain-specific knowledge, handle long-text tasks efficiently, and can also tackle mathematical and logical reasoning. They currently support mainly English, with limited support for Chinese, French, German, Spanish, Japanese, Korean, and Arabic.

According to LiquidAI, LFM-1.3B beats other leading models in the 1B-parameter class on many benchmarks, including Apple's OpenELM, Meta's Llama 3.2, Microsoft's Phi-1.5, and Stability AI's Stable LM 2, which the company says marks the first time a non-GPT architecture has significantly surpassed Transformer-based models.

LFM-3.1B, for its part, not only surpasses various 3B-scale Transformer, hybrid, and RNN models, but in specific scenarios even outperforms previous-generation 7B and 13B models. It has so far beaten Google's Gemma 2, Apple's AFM Edge, Meta's Llama 3.2, and Microsoft's Phi-3.5, among others.

LFM-40.3B emphasizes the balance between model scale and output quality. Although it has 40.3 billion parameters in total, only about 12 billion are active during inference. LiquidAI says this restriction is acceptable because output quality is already sufficient at that level, and activating only a subset of parameters improves efficiency and lowers the hardware requirements for running the model.
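A mixture-of-experts layer routes each token through only a few of its experts, which is how total and active parameter counts can diverge. The sketch below uses made-up sizes and a hypothetical routing function (the real LFM-40.3B expert layout is not public) purely to illustrate the idea:

```python
import numpy as np

def moe_layer(token, expert_weights, router_weights, top_k=2):
    """Toy top-k mixture-of-experts layer (illustrative only).

    All experts exist in memory (total parameters), but each token is
    processed by only `top_k` of them (active parameters).
    """
    scores = router_weights @ token                    # one score per expert
    top = np.argsort(scores)[-top_k:]                  # pick the best experts
    gates = np.exp(scores[top]) / np.exp(scores[top]).sum()
    # Weighted sum of the chosen experts' outputs.
    return sum(g * (expert_weights[i] @ token) for g, i in zip(gates, top))

rng = np.random.default_rng(0)
d, n_experts = 64, 8
experts = rng.normal(size=(n_experts, d, d)) * 0.05    # 8 experts in total
router = rng.normal(size=(n_experts, d)) * 0.05
token = rng.normal(size=d)

out = moe_layer(token, experts, router, top_k=2)
print(out.shape)  # (64,)
# Only 2 of the 8 expert matrices touched this token, i.e. 25% of the
# expert parameters were "active" - analogous in spirit to roughly 12B
# active parameters out of 40.3B total.
```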
