Table 4
Comparison of mainstream AI models for scientific and industrial applications and their potential roles in LIBs research
| Model category | Core algorithm | Inference paradigm | Distinctive features | Representative applications | Data requirement | Potential applications in LIBs |
| --- | --- | --- | --- | --- | --- | --- |
| Transformer | • Self-attention, feedforward networks (FFN) | • Parallel processing, autoregressive inference | • Encoder-decoder structure, strong long-range dependency modeling, high computational cost | • Natural language processing (NLP), vision transformers (ViT), multimodal learning | • Requires large-scale datasets, applicable to diverse modalities (text, image, audio) | • Process sequential relationships in complex battery data, enhance predictive modeling for material properties, degradation patterns, and optimization strategies (see the self-attention sketch below this table) |
| BERT | • Pretraining-finetuning, masked language model (MLM) | • Context-aware masked token prediction | • Encoder-only architecture, bidirectional contextual understanding, optimized for text comprehension | • Machine reading comprehension, sentiment analysis, named entity recognition | • Trained on large-scale unsupervised text corpora, requires task-specific finetuning | • Extract key information from scientific literature, patents, and technical documents to assist in material discovery, process optimization, and R&D decision-making (see the literature-mining sketch below this table) |
| GPT series | • Autoregressive language modeling, transformer decoder | • Token-by-token generation, probabilistic sampling | • Decoder-only model, strong generative capability, constrained by prior context | • Text generation (ChatGPT), conversational AI, code synthesis, creative content generation | • Pretrained on vast web-scale text, fine-tuning enhances controllability and factual accuracy | • Generate structured technical reports, automate experimental documentation, suggest optimized experimental designs, and support AI-driven material screening |
| T5 | • Text-to-text, sequence-to-sequence (Seq2Seq) | • Converts all NLP tasks into text generation | • Full encoder-decoder framework, reformulates NLP tasks into a unified text generation problem | • Machine translation, text summarization, knowledge extraction, question answering | • Span-corruption-based pretraining, adaptable to diverse text-based applications | • Standardize battery simulation and experimental data formats, improve data interoperability, and enhance analysis consistency across different datasets |
| Hybrid models | • Transformer + CNN/RNN/GNN (multimodal fusion) | • Task-adaptive, hierarchical reasoning | • Integrates multiple architectures, enhances multimodal representation learning, improves computational efficiency | • Speech recognition (Whisper), video analysis, multimodal information retrieval | • Requires cross-domain labeled datasets, optimized for heterogeneous data fusion | • Integrate multimodal data (e.g., material composition, electrochemical performance, and manufacturing parameters) to optimize battery design and performance prediction |
| Diffusion models | • Variational inference, Markov process | • Iterative denoising-based generative modeling | • Learns data distribution via noise transformation, generates high-fidelity synthetic outputs | • Image and video generation (Stable Diffusion, Midjourney), 3D modeling, medical image synthesis | • Computationally intensive, requires extensive high-quality labeled data | • Generate novel battery material structures, enhance simulation data diversity, predict molecular configurations, and improve experimental efficiency by reducing trial-and-error cycles (see the diffusion sketch below this table) |
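
To make the self-attention mechanism named in the Transformer row concrete, the following minimal sketch (not drawn from the reviewed works) applies scaled dot-product self-attention to a hypothetical battery cycling sequence; the feature set (capacity, voltage, temperature), tensor shapes, and the regression head are illustrative assumptions rather than a production model.

```python
# Minimal sketch: scaled dot-product self-attention over a hypothetical cycling sequence.
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical input: 1 cell, 50 cycles, 3 features per cycle (capacity, voltage, temperature)
x = torch.randn(1, 50, 3)

d_model = 16
proj_in = torch.nn.Linear(3, d_model)     # embed raw cycle features
w_q = torch.nn.Linear(d_model, d_model)   # query projection
w_k = torch.nn.Linear(d_model, d_model)   # key projection
w_v = torch.nn.Linear(d_model, d_model)   # value projection

h = proj_in(x)                            # (1, 50, 16)
q, k, v = w_q(h), w_k(h), w_v(h)

# Attention weights relate every cycle to every other cycle (long-range dependencies)
scores = q @ k.transpose(-2, -1) / d_model ** 0.5   # (1, 50, 50)
attn = F.softmax(scores, dim=-1)
context = attn @ v                                  # (1, 50, 16)

# A simple head could regress remaining capacity from the final cycle's context vector
head = torch.nn.Linear(d_model, 1)
predicted_capacity = head(context[:, -1])
print(predicted_capacity.shape)  # torch.Size([1, 1])
```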
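As a hedged illustration of the literature-mining use case in the BERT row, the sketch below assumes the Hugging Face `transformers` package and a generic `bert-base-uncased` checkpoint; a real workflow would substitute a domain-adapted checkpoint and task-specific fine-tuning.

```python
# Minimal sketch: masked-token prediction on a battery-related sentence.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Hypothetical literature-mining query: rank plausible fillers for the masked token.
sentence = "The cathode was coated with a thin [MASK] layer to suppress side reactions."
for candidate in fill_mask(sentence, top_k=3):
    print(candidate["token_str"], round(candidate["score"], 3))
```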
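For the diffusion-model row, the following toy sketch shows the forward noising step and a single reverse denoising step of a DDPM-style process on a hypothetical material descriptor vector; the untrained placeholder network stands in for a real learned denoiser, and the descriptor itself is an assumption for illustration.

```python
# Minimal sketch: forward noising and one reverse step of a DDPM-style diffusion process.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

# Untrained placeholder noise predictor (a real model would be trained on material data)
denoiser = torch.nn.Sequential(torch.nn.Linear(8, 64), torch.nn.ReLU(), torch.nn.Linear(64, 8))

x0 = torch.rand(1, 8)                     # hypothetical material descriptor vector
t = 500
noise = torch.randn_like(x0)
xt = alpha_bars[t].sqrt() * x0 + (1 - alpha_bars[t]).sqrt() * noise   # forward diffusion

# One reverse step: predict the noise and move x_t toward x_{t-1}
eps_hat = denoiser(xt)
mean = (xt - betas[t] / (1 - alpha_bars[t]).sqrt() * eps_hat) / alphas[t].sqrt()
x_prev = mean + betas[t].sqrt() * torch.randn_like(xt)
print(x_prev.shape)  # torch.Size([1, 8])
```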
