Python 分布式計算模塊:Parallel

jopen 11年前發布 | 48K 次閱讀 Parallel Python開發

Parallel Python是一個Python模塊,它提供了Python代碼的在SMP(系統有多個處理器或內核)和集群(計算機通過網絡連接)上并行執行的機制。能夠將計算壓力分布到多核CPU或集群的多臺計算機上,能夠非常方便的在內網中搭建一個自組織的分布式計算平臺。先從多核計算開始,普通的Python應用程序只能夠使用一個CPU進程,而通過Parallel Python能夠很方便的將計算擴展到多個CPU進程。

特性:

  • Parallel execution of python code on SMP and clusters
  • Easy to understand and implement job-based parallelization technique (easy to convert serial application in parallel)
  • Automatic detection of the optimal configuration (by default the number of worker processes is set to the number of effective processors)
  • Dynamic processors allocation (number of worker processes can be changed at runtime)
  • Low overhead for subsequent jobs with the same function (transparent caching is implemented to decrease the overhead)
  • Dynamic load balancing (jobs are distributed between processors at runtime)
  • Fault-tolerance (if one of the nodes fails tasks are rescheduled on others)
  • Auto-discovery of computational resources
  • Dynamic allocation of computational resources (consequence of auto-discovery and fault-tolerance)
  • SHA based authentication for network connections
  • Cross-platform portability and interoperability (Windows, Linux, Unix, Mac OS X)
  • Cross-architecture portability and interoperability (x86, x86-64, etc.)
  • </ul>

    Example #1: sum_primes.py  
    
    #!/usr/bin/python
    # File: sum_primes.py
    # Author: VItalii Vanovschi
    # Desc: This program demonstrates parallel computations with pp module
    # It calculates the sum of prime numbers below a given integer in parallel
    # Parallel Python Software: http://www.parallelpython.com
    
    import math, sys, time
    import pp
    
    def isprime(n):
        """Returns True if n is prime and False otherwise"""
        if not isinstance(n, int):
            raise TypeError("argument passed to is_prime is not of 'int' type")
        if n < 2:
            return False
        if n == 2:
            return True
        max = int(math.ceil(math.sqrt(n)))
        i = 2
        while i <= max:
            if n % i == 0:
                return False
            i += 1
        return True
    
    def sum_primes(n):
        """Calculates sum of all primes below given integer n"""
        return sum([x for x in xrange(2,n) if isprime(x)])
    
    print """Usage: python sum_primes.py [ncpus]
        [ncpus] - the number of workers to run in parallel, 
        if omitted it will be set to the number of processors in the system
    """
    
    # tuple of all parallel python servers to connect with
    ppservers = ()
    #ppservers = ("10.0.0.1",)
    
    if len(sys.argv) > 1:
        ncpus = int(sys.argv[1])
        # Creates jobserver with ncpus workers
        job_server = pp.Server(ncpus, ppservers=ppservers)
    else:
        # Creates jobserver with automatically detected number of workers
        job_server = pp.Server(ppservers=ppservers)
    
    print "Starting pp with", job_server.get_ncpus(), "workers"
    
    # Submit a job of calulating sum_primes(100) for execution. 
    # sum_primes - the function
    # (100,) - tuple with arguments for sum_primes
    # (isprime,) - tuple with functions on which function sum_primes depends
    # ("math",) - tuple with module names which must be imported before sum_primes execution
    # Execution starts as soon as one of the workers will become available
    job1 = job_server.submit(sum_primes, (100,), (isprime,), ("math",))
    
    # Retrieves the result calculated by job1
    # The value of job1() is the same as sum_primes(100)
    # If the job has not been finished yet, execution will wait here until result is available
    result = job1()
    
    print "Sum of primes below 100 is", result
    
    start_time = time.time()
    
    # The following submits 8 jobs and then retrieves the results
    inputs = (100000, 100100, 100200, 100300, 100400, 100500, 100600, 100700)
    jobs = [(input, job_server.submit(sum_primes,(input,), (isprime,), ("math",))) for input in inputs]
    for input, job in jobs:
        print "Sum of primes below", input, "is", job()
    
    print "Time elapsed: ", time.time() - start_time, "s"
    job_server.print_stats()
    
    # Parallel Python Software: http://www.parallelpython.com

    項目主頁:http://www.baiduhome.net/lib/view/home/1383358139896

 本文由用戶 jopen 自行上傳分享,僅供網友學習交流。所有權歸原作者,若您的權利被侵害,請聯系管理員。
 轉載本站原創文章,請注明出處,并保留原始鏈接、圖片水印。
 本站是一個以用戶分享為主的開源技術平臺,歡迎各類分享!