Mehdi Goli
Principal Software Engineer - AI parallelisation
Verified email at codeplay.com
Title
Cited by
Year
Parallel patterns for heterogeneous CPU/GPU architectures: Structured parallelism from cluster to cloud
S Campa, M Danelutto, M Goli, H González-Vélez, AM Popescu, ...
Future Generation Computer Systems 37, 354-366, 2014
Cited by: 25; Year: 2014
Heterogeneous algorithmic skeletons for fast flow with seamless coordination over hybrid architectures
M Goli, H González-Vélez
2013 21st Euromicro International Conference on Parallel, Distributed, and …, 2013
Cited by: 25; Year: 2013
A new vertical fragmentation algorithm based on ant collective behavior in distributed database systems
M Goli, SMTR Rankoohi
Knowledge and Information Systems 30 (2), 435-455, 2012
Cited by: 20; Year: 2012
Streaming dynamic coarse-grained CPU/GPU workloads with heterogeneous pipelines in FastFlow
M Goli, MT Garba, H González-Vélez
2012 IEEE 14th International Conference on High Performance Computing and …, 2012
Cited by: 15; Year: 2012
Accelerated machine learning using TensorFlow and SYCL on OpenCL Devices
M Goli, L Iwanski, A Richards
Proceedings of the 5th International Workshop on OpenCL, 1-4, 2017
Cited by: 14; Year: 2017
Mapping parallel programs to heterogeneous CPU/GPU architectures using a Monte Carlo Tree Search
M Goli, J McCall, C Brown, V Janjic, K Hammond
2013 IEEE Congress on Evolutionary Computation, 2932-2939, 2013
Cited by: 12; Year: 2013
SYCL-BLAS: leveraging expression trees for linear algebra
JI Aliaga, R Reyes, M Goli
Proceedings of the 5th International Workshop on OpenCL, 1-5, 2017
Cited by: 9; Year: 2017
N‐body computations using skeletal frameworks on multicore CPU/graphics processing unit architectures: an empirical performance evaluation
M Goli, H González-Vélez
Concurrency and Computation: Practice and Experience 26 (4), 972-986, 2014
Cited by: 9; Year: 2014
VisionCpp: A SYCL-based computer vision framework
M Goli
Proceedings of the 4th International Workshop on OpenCL, 1-4, 2016
Cited by: 6; Year: 2016
Autonomic coordination of skeleton-based applications over CPU/GPU multi-core architectures
M Goli, H González-Vélez
International Journal of Parallel Programming 45 (2), 203-224, 2017
Cited by: 3; Year: 2017
Cross-Platform Performance Portability Using Highly Parametrized SYCL Kernels
J Lawson, M Goli, D McBain, D Soutar, L Sugy
arXiv preprint arXiv:1904.05347, 2019
Cited by: 2; Year: 2019
TensorFlow Acceleration on ARM Hikey Board
M Goli, L Iwanski, J Lawson, U Dolinsky, A Richards
Proceedings of the International Workshop on OpenCL, 1-4, 2018
Cited by: 2; Year: 2018
Formalised composition and interaction for heterogeneous structured parallelism
M Goli, H González-Vélez
International Journal of Parallel Programming 46 (1), 120-151, 2018
Cited by: 2; Year: 2018
Autonomic behavioural framework for structural parallelism over heterogeneous multi-core systems.
M Goli
Cited by: 2; Year: 2015
OpenCL Acceleration for TensorFlow
M Goli, L Iwanski, J Lawson, U Dolinsky, A Richards
arXiv preprint arXiv:1605.02688, 1-3, 2018
Cited by: 1; Year: 2018
SYCL-BLAS: Combining expression trees and kernel fusion on heterogeneous systems
JI Aliaga, R Reyes, M Goli
Cited by: 1; Year: 2017
Toward Performance Portability of Highly Parametrizable TRSM Algorithm Using SYCL
T Sabino, M Goli
International Workshop on OpenCL, 1-10, 2021
Year: 2021
Towards Cross-Platform Performance Portability of DNN Models using SYCL
M Goli, K Narasimhan, R Reyes, B Tracy, D Soutar, S Georgiev, ...
2020 IEEE/ACM International Workshop on Performance, Portability and …, 2020
Year: 2020
Programming Heterogeneous Parallel Machines Using Refactoring and Monte–Carlo Tree Search
C Brown, V Janjic, M Goli, J McCall
International Journal of Parallel Programming 48 (4), 583-602, 2020
Year: 2020