Archive for July, 2009

Filed Under (Pylons) by Marcin Kuźmiński on July-30-2009

In Python, the handy built-in function divmod does the trick.

For example:

def sec2min(sec):
    return divmod(sec,60)

Now sec2min(181) returns (3, 1), i.e. 3 minutes and 1 second. You can obtain hours, days, etc. using divmod in the same way :)
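
Chaining divmod gives the larger units too. A minimal sketch; the function name sec2dhms is made up for illustration:

```python
def sec2dhms(sec):
    """Split a number of seconds into (days, hours, minutes, seconds)."""
    minutes, seconds = divmod(sec, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    return days, hours, minutes, seconds

print(sec2dhms(181))    # (0, 0, 3, 1)
print(sec2dhms(93784))  # (1, 2, 3, 4)
```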



Filed Under (Django, Python) by Łukasz Balcerzak on July-29-2009

Almost 4 months late, here it comes: Django 1.1. I’ve been waiting to see this reach the final stage, as we use Django in our company, and I was sometimes forced to code things that are now bundled, like SQL aggregation functionality for instance (well, in fact I just wrote a little Django app using the full force of SQLAlchemy instead of the Django ORM). Additionally, a security patch was released yesterday, which is included in 1.1 too.

You may read more on the release notes page. Stay tuned with Django and minor swing ;-)



Filed Under (Multiprocessing, Python) by Marcin Kuźmiński on July-17-2009

I’ve started to play with the multiprocessing module that came with Python 2.6. The multiprocessing module is very similar to threading (it has almost the same functions and classes as threading).

Here are, in my opinion, three advantages over threading.

  • Multiprocessing runs on processes, not threads.
  • By using subprocesses it sidesteps the GIL (global interpreter lock) that limits threading.
  • Processes can be synchronized even remotely, so we could write concurrent calculations over the network.

I made a simple example class that uses the multiprocessing module to scan ports; just for testing, I did the same thing that Łukasz did using threading. This example is not more efficient than the one presented in http://www.python-blog.com/2009/07/01/python-threaded/, because port scanning mostly waits on the network. But if we replaced the function check_port with a more CPU-intensive function, we could end up with performance greater by up to the number of CPUs/cores we have.

For example, in one of my recent projects I made calculations for MultiMulti, a popular gambling game in Poland. I calculated the most frequently repeating numbers in the last 50 games. Calculating the combinations of 9 out of 20 numbers, 50 times, took around 1200 s with threading; with multiprocessing I was able to do it in around 700 s on my Core2Duo CPU. I wish I had a Core 2 Quad to check the performance :) So if you need to do some heavy calculations, multiprocessing can give you that performance.

Here’s the code. You can also download the port descriptions file port_list to match ports with their descriptions.

from multiprocessing import Process, Queue, cpu_count, Lock
import socket, sys

class PortScanner(object):
    ''' multiprocessing port scanner with port description'''

    def __init__(self, host = '' , port_range = (1, 100), nr_processes = cpu_count(), port_list_file = ''):
        '''
        port_range=(start,stop) default 1,100
        nr_processes = int default cpu_count() '''
        q = Queue()
        l = Lock()
        port_list = []

        try:
            #each line of the file holds tab-separated fields, port number first
            with open(port_list_file) as f:
                for line in f:
                    port_list.append([x.strip() for x in line.split('\t')])

        except IOError:
            print 'no port list file specified'

        for port in xrange(port_range[0], port_range[1]):
            q.put((host, port))

        #to stop all processes we have to put STOP to queue and break the loop for each process
        for _ in xrange(nr_processes):
            q.put('STOP')

        for _ in xrange(nr_processes):
            p = Process(target = self.check_port, args = (q, l, port_list))
            p.start()

    def check_port(self, q, l, port_list):
        ''' worker method run by each process '''
        while True:
            queue_ret = q.get()

            if queue_ret == 'STOP':
                break

            s = socket.socket()

            try:
                #queue_ret is a (host, port) tuple
                s.connect(queue_ret)

                #lock for uncorrupted printing to console
                l.acquire()
                try:
                    print "[INFO] %s on port %s is open" % queue_ret

                    #print the description of the port if we have one
                    for i in port_list:
                        if int(i[0]) == int(queue_ret[1]):
                            for field in i:
                                sys.stdout.write(field + " ")
                            print "\n\n"
                finally:
                    #release the lock even if printing fails
                    l.release()
            except socket.error:
                #print "[WARNING] %s on port %s is closed" % queue_ret
                pass
            s.close()

if __name__ == "__main__":

    PortScanner(host = 'example.com', port_range = (1, 60), nr_processes = 40, port_list_file = 'port_list.data')
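
For CPU-bound work like the MultiMulti calculation mentioned above, multiprocessing.Pool is a shorter way to spread a function over all cores. The following is only a sketch written for Python 3, not the original lottery code: count_hits is a toy stand-in for a heavy calculation, and both function names are made up for illustration.

```python
from itertools import combinations
from multiprocessing import Pool

def count_hits(draw, pick=3):
    """Toy CPU-bound task: count the `pick`-number combinations of one
    draw whose sum is even."""
    return sum(1 for combo in combinations(draw, pick) if sum(combo) % 2 == 0)

def count_hits_parallel(draws, processes=None):
    """Run count_hits once per draw on a pool of worker processes."""
    # processes=None lets Pool default to the number of CPUs
    with Pool(processes) as pool:
        return pool.map(count_hits, draws)
```

Call count_hits_parallel from under an if __name__ == "__main__": guard, as in the port scanner above, so that the worker processes can import the module safely. The result list is in the same order as the input draws.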


Filed Under (Python, Threads) by Marcin Kuźmiński on July-13-2009

I always wanted to know if there is a way to find out how many CPUs a machine has, for example to run a number of processes based on the number of cores. Recently I was exploring the multiprocessing module and found that it has a function that returns the number of CPUs/cores the machine has.

import multiprocessing

multiprocessing.cpu_count()
# this shows up 2 for my core2duo e9300

#so now if we iterate this like that

for i in xrange(multiprocessing.cpu_count()):
    print 'core',i+1

#this prints
#core 1
#core 2
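
One caveat: cpu_count() may raise NotImplementedError when the number of CPUs cannot be determined. A small sketch with a fallback; the helper name worker_count and the default of 1 are assumptions for illustration:

```python
import multiprocessing

def worker_count(default=1):
    """Return the number of CPUs, or `default` if it cannot be determined."""
    try:
        return multiprocessing.cpu_count()
    except NotImplementedError:
        return default

print("starting %d workers" % worker_count())
```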


Filed Under (Python, Threads) by Marcin Kuźmiński on July-7-2009

If you’re looking for a way to get started with threads in Python, I found a great PDF for that.

Download the Python-Threads PDF, which includes basic and some more advanced information and tutorials about threads in Python.

In the PDF you can find useful info about:

  • thread and threading modules
  • queue
  • locks
  • GIL
  • events
  • debugging threads


Filed Under (Python) by Marcin Kuźmiński on July-7-2009

I found a really great beginner’s tutorial for Python that I need to share with others. It’s Python by Example.

Found at http://www.lightbird.net/py-by-example/

I have read many tutorials and sites dedicated to Python, but this one is an absolute must if you’re a Python beginner. It’s loaded with simple examples and descriptions. It briefly shows you the basics of Python, and each function is nicely explained with a few examples. If you’re interested in learning Python, this is the place to start. I personally found a few things I did not know or use in Python, like shelve or inspect.

Here is the huge list of what’s on the page:

Regards M.



Filed Under (Python) by Marcin Kuźmiński on July-6-2009

Thanks to http://addedbytes.com I found a nice Python cheat sheet. You can download it here: python-cheat-sheet-v1

Highly recommended to keep on the desk.



Filed Under (Python, Threads) by Łukasz Balcerzak on July-1-2009

Recently I’ve run into an optimization problem which I once hoped I wouldn’t have to face. After some research on the subject I decided to split a process using threads. I didn’t have much experience in that area except for some classes at PJIIT (Polish-Japanese Institute of Information Technology), so the start was painful… but not for long (as with everything in Python).

So I decided to share my thoughts with everyone who would ever try to use threads in Python programming.

Let’s imagine a simple case: you want to check whether some host has any open port in the range 1-1000. We create a simple script (use whatever host you like):

import socket
import time

def is_port_open(host, port):
    """
    Takes host param as string and port param as int.
    Returns true if port is open and false otherwise.
    """
    #print "[DEBUG] Checking host %s:%s" % (host, port)
    s = socket.socket()
    is_open = True
    try:
        s.connect( (host, port) )
    except socket.error:
        is_open = False
    s.close()
    return is_open

def main():
    start = time.time()

    host = 'some.host'
    ports = range(1, 1001)  # ports 1-1000; the end of range() is exclusive

    print "[INFO] Will now check host %s for ports between %s and %s" % \
        (host, ports[0], ports[-1]) 

    for port in ports:
        if is_port_open(host, port):
            print "[INFO] Port %s on host %s is open." % (port, host)

    print "Elapsed Time: %s" % (time.time() - start)

if __name__ == '__main__':
    main()

How long did it take? Well, try this then:

import threading
import socket
import time
import Queue

THREAD_NUMBER = 20

def is_port_open(host, port):
    """
    Takes host param as string and port param as int.
    Returns true if port is open and false otherwise.
    """
    #print "[DEBUG] Checking host %s:%s" % (host, port)
    s = socket.socket()
    is_open = True
    try:
        s.connect( (host, port) )
    except socket.error:
        is_open = False
    s.close()
    return is_open

class PortChecker(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue

    def run(self):
        while True:
            host, port = self.queue.get()
            try:
                if is_port_open(host, port):
                    print "[INFO] Port %s on host %s is open." % (port, host)
            finally:
                # always mark the item as done so queue.join() cannot hang
                self.queue.task_done()

def main():
    start = time.time()

    host = 'some.host'
    ports = range(1, 1001)  # ports 1-1000; the end of range() is exclusive
    queue = Queue.Queue()

    for i in xrange(THREAD_NUMBER):
        pc = PortChecker(queue)
        pc.setDaemon(True)
        pc.start()

    print "[INFO] Will now check host %s for ports between %s and %s" % \
        (host, ports[0], ports[-1]) 

    for port in ports:
        queue.put( (host, port) )

    queue.join()
    print "Elapsed Time: %s" % (time.time() - start)

if __name__ == '__main__':
    main()

Try different port ranges and thread numbers. OK, this is just an example, but it should give you an idea of how to handle a process that is done inside a loop. You have to define your “handler” (a thread that will process some objects), create an empty queue and fill it with objects. Try to use only 1 thread (set THREAD_NUMBER to 1) and look at the time it takes to finish the whole process.

I know that a programmer’s time is much more expensive than a processor’s time and I try not to break this rule. But sometimes (e.g. when your script runs for over 12 hours) it is very nice if you can cut the time needed to generate results. Just remember to catch exceptions inside threads: this is very important, because if an exception is raised in a thread, that thread will just die and the rest of the program will not be notified.
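
A sketch of that advice, written for Python 3 (where the Queue module is named queue): each task runs inside try/except so a failure is recorded instead of killing the worker, and task_done() sits in finally so that join() cannot hang. The names process, worker and run_all are made up for illustration.

```python
import queue
import threading
import traceback

def process(item):
    """Stand-in for real work; fails on purpose for one input."""
    if item == 3:
        raise ValueError("cannot process %r" % item)
    return item * 2

def worker(tasks, results, errors):
    while True:
        item = tasks.get()
        try:
            results.append(process(item))
        except Exception:
            # record the failure and keep the thread alive
            errors.append((item, traceback.format_exc()))
        finally:
            # always mark the task done, otherwise tasks.join() blocks forever
            tasks.task_done()

def run_all(items, thread_number=4):
    tasks = queue.Queue()
    results, errors = [], []
    for _ in range(thread_number):
        t = threading.Thread(target=worker, args=(tasks, results, errors))
        t.daemon = True
        t.start()
    for item in items:
        tasks.put(item)
    tasks.join()
    return results, errors

results, errors = run_all(range(5))
print(sorted(results))               # [0, 2, 4, 8]
print([item for item, _ in errors])  # [3]
```

The second element of each errors entry holds the full traceback text, so it can be logged after the run.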

Well, that’s it. I will proceed tomorrow with an example using SQLAlchemy. Good night folks.



Filed Under (Python) by Marcin Kuźmiński on July-1-2009

Here is an example of how to send a mail with an attachment to multiple recipients using the built-in smtplib. It’s really very simple in Python.

Just take a look at the code below.

'simple smtp mailer with multiple recipients and file attachments'

from email.mime.multipart import MIMEMultipart
from email.mime.base import MIMEBase
from email.mime.text import MIMEText
from email.utils import COMMASPACE, formatdate
from email import encoders
import os
from smtplib import SMTP

class Mailer(object):
    ''' simple mailer  '''

    def __init__(self):

        self.mail_to = ['rec1@example.com', 'rec2@example.com', 'rec3@example.com']

        self.mail_from = 'mailer@python-blog.com'
        msg_en = "This is an automated message from python-blog.com\r\n"

        #path to file we want to attach
        msg_file_attachment = '/home/marcink/Desktop/pydev_icons.zip'
        #get the filename we need it for adding to mail header
        msg_file_name = os.path.basename(msg_file_attachment)

        smtp_serv = SMTP('mail.python-blog.com')

        smtp_serv.ehlo("simpleMailerHello.python-blog.com")

        #if server requires authorization you must provide login and password
        smtp_serv.login('mylogin', 'mypassword')
        date_ = formatdate(localtime = True)

        msg = MIMEMultipart()
        msg['From'] = self.mail_from
        msg['To'] = COMMASPACE.join(self.mail_to)
        msg['Date'] = date_
        msg['Subject'] = "example subject"

        #attach string message
        msg.attach(MIMEText(msg_en))

        #read the file, base64-encode it and attach it under its filename
        file_part = MIMEBase('application', "octet-stream")
        with open(msg_file_attachment, "rb") as attachment:
            file_part.set_payload(attachment.read())
        encoders.encode_base64(file_part)
        file_part.add_header('Content-Disposition', 'attachment; filename="%s"'
                       % msg_file_name)
        msg.attach(file_part)

        #sendmail and exit server
        smtp_serv.sendmail(self.mail_from, self.mail_to, msg.as_string())
        smtp_serv.quit()

if __name__ == "__main__":
    Mailer()
    print 'mail sent'
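
On Python 3.6 or newer, the same message can be assembled with email.message.EmailMessage, which applies the base64 encoding of attachments itself. This is a sketch under that assumption, not the code used on this blog; build_message, the addresses and the attachment bytes are placeholders.

```python
from email.message import EmailMessage
from email.utils import formatdate

def build_message(mail_from, mail_to, subject, body, filename, data):
    """Build a message for several recipients with one binary attachment."""
    msg = EmailMessage()
    msg['From'] = mail_from
    msg['To'] = ', '.join(mail_to)
    msg['Date'] = formatdate(localtime=True)
    msg['Subject'] = subject
    msg.set_content(body)
    # bytes payloads need an explicit MIME type; base64 is applied automatically
    msg.add_attachment(data, maintype='application', subtype='octet-stream',
                       filename=filename)
    return msg

msg = build_message('mailer@example.com',
                    ['rec1@example.com', 'rec2@example.com'],
                    'example subject', 'This is an automated message\n',
                    'pydev_icons.zip', b'not really a zip file')
print(msg['To'])
print([part.get_filename() for part in msg.iter_attachments()])
```

Sending still goes through smtplib as above; on Python 3, an SMTP object's send_message(msg) takes the sender and recipients from the message headers.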