| Peer-Reviewed

Complex Performance Modeling of Parallel Algorithms

Received: 14 July 2014     Accepted: 18 July 2014     Published: 31 July 2014
Views:       Downloads:
Abstract

Parallel principles are the most effective way how to increase parallel computer performance and parallel algorithms (PA) too. In this sense the paper is devoted to a complex performance evaluation of chosen PA. At first the paper describes very shortly PA and then it summarized basic concepts for performance evaluation of PA. To illustrate the analyzed evaluation concepts the paper considers in its experimental part the results for real analyzed examples of discrete fast Fourier transform (DFFT). These illustration examples we have chosen first due to its wide application in scientific and engineering fields and second from its representation of similar group of PA. The basic form of parallel DFFT is the one-dimensional (1-D), unordered, radix–2 algorithm which uses divide and conquer strategy for its parallel computation. Effective PA of DFFT tends to computing one – dimensional FFT with radix greater than two and computing multidimensional FFT by using the polynomial transfer methods. In general radix - q DFFT is computed by splitting the input sequence of size s into q sequences each of them in size n/q, computing faster their q smaller DFFT’s, and then combining the results. So we do it for actually dominant asynchronous parallel computers based on Network of workstations (NOW) and Grid systems.

Published in American Journal of Networks and Communications (Volume 3, Issue 5-1)

This article belongs to the Special Issue Parallel Computer and Parallel Algorithms

DOI 10.11648/j.ajnc.s.2014030501.12
Page(s) 15-28
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2014. Published by Science Publishing Group

Keywords

Parallel Computer, NOW, Grid, Parallel Algorithm (PA), Matrix PA, Decomposition, Performance Modeling, Optimization, Issoeficiency Function, Numerical Integration, Discrete Fast Fourier Transform (DFFT), Overhead Function

References
[1] Abderazek A. B., Multicore systems on chip - Practical Software/Hardware design, Imperial college press, pp. 200, 2010
[2] Arora S., Barak B., Computational complexity - A modern Approach, Cambridge University Press, pp. 573, 2009
[3] Casanova H., Legrand A., Robert Y., Parallel algorithms, CRC Press, 2008
[4] Desel J., Esperza J., Free Choise Petri Nets, Cambridge University Press, UK, pp. 256, 2005
[5] Dubois M., Annavaram M., Stenstrom P., Parallel Computer Organization and Design, pp. 560, 2012
[6] Dubhash D.P., Panconesi A., Concentration of measure for the analysis of randomized algorithms, Cambridge University Press, UK, 2009
[7] Edmonds J., How to think about algorithms, Cambridge University Press, UK, pp. 472, 2010
[8] Gelenbe E., Analysis and synthesis of computer systems, Imperial College Press, pp. 324, 2010
[9] Hager G., Wellein G., Introduction to High Performance Computing for Scientists and Engineers, pp. 356, July 2010
[10] Hanuliak P., Analytical method of performance prediction in parallel algorithms, The Open Cybernetics and Systemics Journal, Vol. 6, Bentham, UK, pp. 38-47, 2012
[11] Hanuliak M., Unified analytical models of parallel and distributed computing, American J. of Networks and Communication, Science PG, Vol. 3, No. 1, USA, pp. 1-12, 2014
[12] Hanuliak M., Hanuliak I., To the correction of analytical models for computer based communication systems, Kybernetes, Vol. 35, No. 9, UK, pp. 1492-1504, 2006
[13] Hanuliak M., Performance modeling of Nov and Grid parallel computers, AD ALTA – Vol. 3, issue 2, Hradec Kralove, Czech republic, pp. 91-96, 2013
[14] Harchol-BalterMor, Performance modeling and design of computer systems, Cambridge University Press, UK, pp. 576, 2013
[15] Hwang K. and coll., Distributed and Parallel Computing, Morgan Kaufmann, pp. 472, 2011
[16] John L. K., Eeckhout L., Performance evaluation and benchmarking, CRC Press, 2005
[17] Kshemkalyani A. D., Singhal M., Distributed Computing, University of Illinois, Cambridge University Press, UK, pp. 756, 2011
[18] Kirk D. B., Hwu W. W., Programming massively parallel processors, Morgan Kaufmann, pp. 280, 2010
[19] Kostin A., Ilushechkina L., Modeling and simulation of distributed systems, Imperial College Press, pp. 440, 2010
[20] Kumar A., Manjunath D., Kuri J., Communication Networking , Morgan Kaufmann, pp. 750, 2004
[21] Kushilevitz E., Nissan N., Communication Complexity, Cambridge University Press, UK, pp. 208, 2006
[22] Kwiatkowska M., Norman G., and Parker D., PRISM 4.0: Verification of Probabilistic Real-time Systems, In Proc. of 23rd CAV’11, Vol. 6806, Springer, pp. 585-591, 2011
[23] Le Boudec Jean-Yves, Performance evaluation of computer and communication systems, CRC Press, pp. 300, 2011
[24] McCabe J., D., Network analysis, architecture, and design (3rd edition), Elsevier/ Morgan Kaufmann, pp. 496, 2010
[25] Meerschaert M., Mathematical modeling (4-th edition), Elsevier, pp. 384, 2013
[26] Misra Ch. S.,Woungang I., Selected topics in communication network and distributed systems, Imperial college press, pp. 808, 2010
[27] Miller S., Probability and Random Processes, 2nd edition, Academic Press, Elsevier Science, pp. 552, 2012
[28] Peterson L. L., Davie B. C., Computer networks – a system approach, Morgan Kaufmann, pp. 920, 2011
[29] Resch M. M., Supercomputers in Grids, Int. J. of Grid and HPC, No.1, pp. 1-9, 2009
[30] Riano l., McGinity T.M., Quantifying the role of complexity in a system´s performance, Evolving Systems, Springer Verlag, pp. 189–198, 2011
[31] Ross S. M., Introduction to Probability Models, 10th edition, Academic Press, Elsevier Science, pp. 800, 2010
[32] Takahashi D., Kanada Y.: High-performance radix-2, 3 and 5 parallel 1-D complex FFT algorithms for distributed-memory parallel computers, J. of Supercomputing, 15, Kluwer Academic Publishers, The Netherlands, pp. 207-228, 2000
[33] Wang L., Jie Wei., Chen J., Grid Computing: Infrastructure, Service, and Application, CRC Press, 2009 www pages
[34] www.top500.org
[35] www.intel.com
[36] www.spec.org.
Cite This Article
  • APA Style

    Peter Hanuliak, Juraj Hanuliak. (2014). Complex Performance Modeling of Parallel Algorithms. American Journal of Networks and Communications, 3(5-1), 15-28. https://doi.org/10.11648/j.ajnc.s.2014030501.12

    Copy | Download

    ACS Style

    Peter Hanuliak; Juraj Hanuliak. Complex Performance Modeling of Parallel Algorithms. Am. J. Netw. Commun. 2014, 3(5-1), 15-28. doi: 10.11648/j.ajnc.s.2014030501.12

    Copy | Download

    AMA Style

    Peter Hanuliak, Juraj Hanuliak. Complex Performance Modeling of Parallel Algorithms. Am J Netw Commun. 2014;3(5-1):15-28. doi: 10.11648/j.ajnc.s.2014030501.12

    Copy | Download

  • @article{10.11648/j.ajnc.s.2014030501.12,
      author = {Peter Hanuliak and Juraj Hanuliak},
      title = {Complex Performance Modeling of Parallel Algorithms},
      journal = {American Journal of Networks and Communications},
      volume = {3},
      number = {5-1},
      pages = {15-28},
      doi = {10.11648/j.ajnc.s.2014030501.12},
      url = {https://doi.org/10.11648/j.ajnc.s.2014030501.12},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajnc.s.2014030501.12},
      abstract = {Parallel principles are the most effective way how to increase parallel computer performance and parallel algorithms (PA) too. In this sense the paper is devoted to a complex performance evaluation of chosen PA. At first the paper describes very shortly PA and then it summarized basic concepts for performance evaluation of PA. To illustrate the analyzed evaluation concepts the paper considers in its experimental part the results for real analyzed examples of discrete fast Fourier transform (DFFT). These illustration examples we have chosen first due to its wide application in scientific and engineering fields and second from its representation of similar group of PA. The basic form of parallel DFFT is the one-dimensional (1-D), unordered, radix–2 algorithm which uses divide and conquer strategy for its parallel computation. Effective PA of DFFT tends to computing one – dimensional FFT with radix greater than two and computing multidimensional FFT by using the polynomial transfer methods. In general radix - q DFFT is computed by splitting the input sequence of size s into q sequences each of them in size n/q, computing faster their q smaller DFFT’s, and then combining the results. So we do it for actually dominant asynchronous parallel computers based on Network of workstations (NOW) and Grid systems.},
     year = {2014}
    }
    

    Copy | Download

  • TY  - JOUR
    T1  - Complex Performance Modeling of Parallel Algorithms
    AU  - Peter Hanuliak
    AU  - Juraj Hanuliak
    Y1  - 2014/07/31
    PY  - 2014
    N1  - https://doi.org/10.11648/j.ajnc.s.2014030501.12
    DO  - 10.11648/j.ajnc.s.2014030501.12
    T2  - American Journal of Networks and Communications
    JF  - American Journal of Networks and Communications
    JO  - American Journal of Networks and Communications
    SP  - 15
    EP  - 28
    PB  - Science Publishing Group
    SN  - 2326-8964
    UR  - https://doi.org/10.11648/j.ajnc.s.2014030501.12
    AB  - Parallel principles are the most effective way how to increase parallel computer performance and parallel algorithms (PA) too. In this sense the paper is devoted to a complex performance evaluation of chosen PA. At first the paper describes very shortly PA and then it summarized basic concepts for performance evaluation of PA. To illustrate the analyzed evaluation concepts the paper considers in its experimental part the results for real analyzed examples of discrete fast Fourier transform (DFFT). These illustration examples we have chosen first due to its wide application in scientific and engineering fields and second from its representation of similar group of PA. The basic form of parallel DFFT is the one-dimensional (1-D), unordered, radix–2 algorithm which uses divide and conquer strategy for its parallel computation. Effective PA of DFFT tends to computing one – dimensional FFT with radix greater than two and computing multidimensional FFT by using the polynomial transfer methods. In general radix - q DFFT is computed by splitting the input sequence of size s into q sequences each of them in size n/q, computing faster their q smaller DFFT’s, and then combining the results. So we do it for actually dominant asynchronous parallel computers based on Network of workstations (NOW) and Grid systems.
    VL  - 3
    IS  - 5-1
    ER  - 

    Copy | Download

Author Information
  • Dubnica Technical Institute, Sladkovicova 533/20, Dubnica nad Vahom, 018 41, Slovakia

  • Dubnica Technical Institute, Sladkovicova 533/20, Dubnica nad Vahom, 018 41, Slovakia

  • Sections