Abstract

Deep neural networks (DNNs) have become indispensable in many real-life applications like natural language processing and autonomous systems. However, deploying DNNs on resource-constrained devices, e.g., RISC-V platforms, remains challenging due to the high computational and memory demands of fully connected (FC) layers, which dominate resource consumption. Low-rank factorization (LRF) offers an effective approach to compressing FC layers, but the vast design space of LRF solutions involves complex tradeoffs among FLOPs, memory size, inference time, and accuracy, making the LRF process complex and time-consuming. This article introduces an end-to-end LRF design space exploration methodology and a specialized design tool for optimizing FC layers on RISC-V processors. Using Tensor Train Decomposition (TTD) offered by the TensorFlow T3F library, the proposed work prunes the LRF design space by excluding, first, inefficient decomposition shapes and, second, solutions with poor inference performance on RISC-V architectures. Compiler optimizations are then applied to enhance custom T3F layer performance, minimizing inference time and boosting computational efficiency. On average, our TT-decomposed layers run 3× faster than IREE and 8× faster than Pluto on the same compressed model. This work provides an efficient solution for deploying DNNs on edge and embedded devices powered by RISC-V architectures.
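For context, the following is a minimal sketch (not the authors' code) of the kind of TT compression of an FC layer that the abstract describes, using the TensorFlow T3F library; the matrix size, decomposition shape, and TT-rank below are illustrative assumptions, not values taken from the article.

    import tensorflow as tf
    import t3f

    # Illustrative dense FC weight matrix: 784 inputs -> 625 outputs.
    W = tf.random.normal((784, 625))

    # Factor each dimension (784 = 7*4*7*4, 625 = 5*5*5*5) and convert the
    # dense weights into a TT-matrix, capping the TT-rank to bound the
    # number of parameters; this shape/rank choice is illustrative only.
    W_tt = t3f.to_tt_matrix(W, shape=((7, 4, 7, 4), (5, 5, 5, 5)),
                            max_tt_rank=8)

    # Inference replaces the dense matmul with a TT matmul over the cores,
    # which needs far fewer FLOPs and less memory than the full matrix.
    x = tf.random.normal((2, 784))   # a batch of two input vectors
    y = t3f.matmul(x, W_tt)          # result has shape (2, 625)

Choosing among decomposition shapes and TT-ranks such as these is precisely the design space that the article's exploration methodology prunes.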

Publication Date

2025-10-24

Publication Title

ACM Transactions on Embedded Computing Systems

Volume

24

Issue

6

ISSN

1539-9087

Acceptance Date

2025-08-31

Deposit Date

2025-12-09

First Page

1

Last Page

34
