Wed, 11 Jun 2008
Encoding ampersands with Python
I need to replace ampersands in a text file with the HTML entity '&amp;'. I could simply use Python's string replace
method; however, this will mess up my text if some of the ampersands have already been turned into HTML entities. The same is true
if I use a regular expression that matches a single '&'. What I really need to do is replace an ampersand provided it is
not followed by 'amp;'.
Using a negative lookahead assertion in our regular
expression is the answer. Negative lookahead is used when you want to match something not followed by something else. It starts
with (?! and finishes at the ).
Our expression now becomes: &(?!amp;) and means that the text inside the assertion, amp;, must not follow the
expression that precedes it.
In this example I also added a second assertion, (?!#), so that numeric HTML entities such as &#39; are not matched either.
>>> import re
>>> s = "<Title>Eugene&#39;s Software Emporium & Arcade</Title>"
>>> pattern = re.compile('&(?!#)(?!amp;)')
>>> if pattern.search(s):
...     iterator = pattern.finditer(s)
...     for match in iterator:
...         print(match.span())
...
(38, 39)
>>> s[match.start():match.end()]
'&'
>>>
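The session above only locates the bare ampersand. A minimal sketch of the replacement step itself, reusing the same pattern with re.sub, might look like the following. The encode_ampersands name and the sample string are made up for illustration.

```python
import re

# Same pattern as above: an ampersand not followed by '#' or 'amp;'.
pattern = re.compile(r'&(?!#)(?!amp;)')


def encode_ampersands(text):
    """Replace bare ampersands with '&amp;', leaving '&amp;' and '&#NN;' alone."""
    return pattern.sub('&amp;', text)


s = "<Title>Eugene&#39;s Software Emporium & Arcade &amp; Cafe</Title>"
encoded = encode_ampersands(s)
print(encoded)
# <Title>Eugene&#39;s Software Emporium &amp; Arcade &amp; Cafe</Title>
```

Running the function a second time on its own output changes nothing, which is the whole point of the lookahead. Note that this pattern only protects '&amp;' and numeric entities; the ampersand in any other named entity, such as '&lt;', would still be rewritten.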
posted: 22:13 | 0 comments | tags: python, programming, html
Mon, 26 May 2008
Python - Web application frameworks
The PythonInfo Wiki defines a web framework as,
a collection of packages or modules which allow developers
to write Web applications or services without having to handle such low-level
details as protocols, sockets or process/thread management.
As a testament to Python's power and simplicity, it would seem that many developers have created
their own frameworks rather than use a solution already in existence. As a result, one will find solutions
in various stages of development and feature implementation.
I have always tried to subscribe to the basic principle of using the right tool for the job.
With that in mind I have embarked on an exploratory journey to investigate some of Python's existing
Web frameworks with hopes of finding one that will work for a couple of big projects I have in the
works. My requirements are fairly simple: I do not want to learn a behemoth of an API that will take
months to figure out, yet I do not want something so simplistic that it will expect me to handle too many
low-level details. Finally, until I can bring my own server back online, the chosen framework needs to work
with my current hosting provider, DreamHost.
The five high-level frameworks I am looking at include:
- Django is a high-level Python Web framework that encourages
rapid development and clean, pragmatic design. Because Django was developed in a fast-paced newsroom
environment, it was designed to make common Web-development tasks fast and easy.
- TurboGears builds on other open source projects. In TurboGears,
CherryPy controllers sit at the hub of your project. This is the
biggest area for integration. Providing tools that allow the controllers to more easily work
with SQLObject databases, answer asynchronous calls from
MochiKit and render out completed
Kid templates is where the big win will come.
- Pylons combines the very best ideas from the worlds of Ruby,
Python and Perl, providing a structured but extremely flexible Python web framework. It's also one
of the first projects to leverage the emerging WSGI standard, which allows extensive re-use and
flexibility - but only if you need it. Out of the box, Pylons aims to make web development fast,
flexible and easy.
- Webware is a suite of Python packages and tools
for developing object-oriented, web-based applications. The suite uses well known design patterns
and includes a fast Application Server, Servlets, Python Server Pages (PSP), Object-Relational Mapping,
Task Scheduling, Session Management, and many other features. Webware is very modular and easily extended.
- Zope is an open source application server for building content
management systems, intranets, portals, and custom applications. The Zope community consists of hundreds
of companies and thousands of developers all over the world, working on building the platform and Zope
applications. Zope is written in Python.
posted: 13:48 | 0 comments | tags: python, programming, frameworks
Tue, 06 Nov 2007
Python - Recursive Directory Crawl Using Generators
I was looking for an os.walk example to crawl through a file system and found
the locate function below on ActiveState's Python Cookbook site.
I incorporated it into a simple routine that dumps the output to an XML file that can then be
transformed using XSLT to sort and tally the results.
#!/usr/bin/env python
import os
import fnmatch
import time
from xml.dom import minidom


def locate(pattern, root=os.curdir):
    """Yield the full path of every file under root that matches pattern."""
    for path, dirs, files in os.walk(os.path.abspath(root)):
        for filename in fnmatch.filter(files, pattern):
            yield os.path.join(path, filename)


def main():
    doc = minidom.Document()
    files = doc.createElement("files")
    doc.appendChild(files)
    comment = doc.createComment("Size attribute is reported in bytes.")
    files.appendChild(comment)
    for i, file in enumerate(locate("*.*", "\\\\SERVER\\Share")):
        try:
            item = doc.createElement("filename")
            item.setAttribute("id", "%s" % (i))
            item.setAttribute("path", file)
            item.setAttribute("ext", os.path.splitext(file)[1].lower())
            item.setAttribute("size", "%s" % os.stat(file).st_size)
            item.setAttribute("last_modified", time.ctime(os.stat(file).st_mtime))
            files.appendChild(item)
        except OSError as e:
            # Report files that cannot be stat'ed and keep crawling.
            print("%s => %s" % (file, e.strerror))
    fp = open('myfiles.xml', 'w')
    doc.writexml(fp, "", " ", "\n", "iso-8859-1")
    fp.close()


if __name__ == "__main__":
    main()
The locate function takes two parameters: the first is a file pattern
to match and the second is the directory to start the crawl from. Because it is a generator, it yields
matching paths one at a time as the tree is walked.
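To see locate on its own, here is a small sketch that crawls a throwaway directory instead of a network share. The temporary tree and its file names are made up for the example; only locate itself comes from the script above.

```python
import fnmatch
import os
import shutil
import tempfile


def locate(pattern, root=os.curdir):
    """Yield the full path of every file under root that matches pattern."""
    for path, dirs, files in os.walk(os.path.abspath(root)):
        for filename in fnmatch.filter(files, pattern):
            yield os.path.join(path, filename)


# Build a small temporary tree (hypothetical file names) to crawl.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "sub"))
for name in ("a.txt", "b.py", os.path.join("sub", "c.txt")):
    open(os.path.join(root, name), "w").close()

# Calling locate() only creates the generator; the walk happens as we iterate.
matches = locate("*.txt", root)
found = sorted(os.path.basename(p) for p in matches)
print(found)  # ['a.txt', 'c.txt']

shutil.rmtree(root)  # remove the temporary tree again
```

The pattern uses shell-style wildcards as understood by fnmatch, so "*.txt" picks up the file in the subdirectory as well as the one at the top level.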
posted: 23:25 | 0 comments | tags: programming, python, xml
Sat, 24 Mar 2007
Python - SOAPpy and HTTP Authentication
The other week I wanted to use a SOAP web service that was protected by HTTP
basic authentication. I could not find a way to do the authentication with SOAPpy.
I looked everywhere for an example before I stumbled upon a version of the
code below in an archived newsgroup post.
from SOAPpy import Config, HTTPTransport, SOAPAddress, WSDL


class myHTTPTransport(HTTPTransport):
    username = None
    passwd = None

    @classmethod
    def setAuthentication(cls, u, p):
        cls.username = u
        cls.passwd = p

    def call(self, addr, data, namespace, soapaction=None, encoding=None,
             http_proxy=None, config=Config):
        if not isinstance(addr, SOAPAddress):
            addr = SOAPAddress(addr, config)
        if self.username is not None:
            # Attach the credentials to the address as "user:password".
            addr.user = self.username + ":" + self.passwd
        return HTTPTransport.call(self, addr, data, namespace, soapaction,
                                  encoding, http_proxy, config)


if __name__ == '__main__':
    wsdlFile = 'http://localhost/soap/wsdl/'
    myHTTPTransport.setAuthentication('gollum', 'myprecious')
    server = WSDL.Proxy(wsdlFile, transport=myHTTPTransport)
    print(server.ApiVersion())
It works because WSDL.Proxy accepts extra keyword arguments (Python's **kw feature),
so you can pass in your own transport. The original author subclassed the default transport,
Client.HTTPTransport, and added a class method to supply the basic
authentication credentials.
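The pattern can be sketched without SOAPpy at all. Everything below is a hypothetical stand-in (Transport, AuthTransport and Proxy are not SOAPpy classes), and the assumption that the proxy instantiates the transport class it is handed is mine; the sketch only illustrates class-level credentials plus a transport passed in through **kw.

```python
import base64


class Transport(object):
    """Hypothetical stand-in for a library's default transport."""

    def call(self, addr, data):
        return "POST %s (no credentials)" % addr


class AuthTransport(Transport):
    """Subclass that keeps the credentials as class-level state."""

    username = None
    passwd = None

    @classmethod
    def set_authentication(cls, username, passwd):
        # Stored on the class, so every instance created later sees them.
        cls.username = username
        cls.passwd = passwd

    def call(self, addr, data):
        if self.username is None:
            return Transport.call(self, addr, data)
        # HTTP basic authentication sends base64("user:password").
        credentials = "%s:%s" % (self.username, self.passwd)
        token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
        return "POST %s (Authorization: Basic %s)" % (addr, token)


class Proxy(object):
    """Hypothetical proxy that takes its options as keyword arguments."""

    def __init__(self, url, **kw):
        # The caller passes a transport *class*; the proxy instantiates it.
        transport_class = kw.get("transport", Transport)
        self.url = url
        self.transport = transport_class()

    def invoke(self, data):
        return self.transport.call(self.url, data)


AuthTransport.set_authentication("gollum", "myprecious")
proxy = Proxy("http://localhost/soap/", transport=AuthTransport)
result = proxy.invoke("<payload/>")
print(result)
```

Because the proxy receives a class rather than an instance, there is no constructor call where the credentials could be passed, which is presumably why they are set through a class method beforehand.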
posted: 23:09 | 0 comments | tags: programming, python, soap, soappy
Sat, 26 Mar 2005
Python Tutorial 2.4 in eBook Format
I installed Plucker on my Palm IIIxe last week. Plucker is an offline web and ebook viewer for Palm-based handhelds. The parser that comes with Plucker can create ebooks from a number of sources, including RSS, RDF, text files and HTML, to name a few. On the Plucker web site there is a samples page which provides several example Plucker documents to download.
I was eager to try out the parser on some offline documentation I am currently reading. The current Python Tutorial seemed like a good choice as I reference it a fair amount at the moment. Following the Plucker documentation for creating an ebook I soon had a reasonable ebook copy I could upload to my PDA. After previewing it I decided that the original HTML source document would need to be modified a bit to better present the content on the PDA.
I stripped out the graphical header and footer from the original tutorial, leaving just the textual links for navigation. I ran the parser again with the following options,
Spider.py -v --no-urlinfo --noimages --zlib-compression \
-H Python-Docs-2.4/tut/tut.html -N "Python 2.4 Tutorial" -f PyTut_24
and it produced a resulting Palm Database (.PDB) format file which was only 108K in size.
If you are using Linux and have the pilot-link conduits installed, you can sync it to your Palm PDA using the following command,
pilot-xfer -p /dev/pilot -i PyTut_24.pdb
The Python Tutorial 2.4 ebook can be downloaded below.
posted: 02:00 | 0 comments | tags: ebooks, palm, python
Fri, 11 Mar 2005
Pippy: Python for the Palm
The other night I was browsing the internet searching
for some Python stuff and I stumbled upon Pippy.
Pippy is a port of Python 1.5.2+ to the PalmOS. After reading the web site I
went looking for my old Palm IIIxe handheld, dropped in some fresh AAAs and plugged it into the cradle, which was attached to my Fedora
Core 3 box. After transferring the two .prc files over to the Palm I had a working subset of Python on my old PDA. The developers even
included a feature that lets you write your own Python modules in the memo pad application and then import them into the
interpreter.
posted: 00:54 | 0 comments | tags: palm, python, software
Sat, 08 Jan 2005
Fedora Core 3 and PyOpenGL 2.0 Installation
According to the website, PyOpenGL "is a cross platform Python binding
to OpenGL and related APIs." Since I have started playing around a bit with Python this
new year, I was curious to see how the language could be used for graphics and, specifically, 3D graphics development. In the past I have
tinkered with OpenGL a bit; however, it has usually been by programming in
C and using SDL as the library.
I was a little daunted after viewing the installation page
and seeing what needed to be done; however, being new to Python, I was pleasantly surprised to find that most of the OpenGLContext dependencies could simply
be installed by entering,
python setup.py install
after untarring the archives into a temporary directory. It doesn't get much easier than that.
Pay close attention to the requirement for GLUT 3.7. On Fedora Core 3,
Red Hat has taken to installing freeglut,
which is a completely open source alternative to the OpenGL Utility Toolkit (GLUT). In theory, it should be a fairly complete drop-in replacement for GLUT;
however, I could not get PyOpenGL to install with it on my system. I read a few posts about similar issues, mostly pertaining to Fedora Core 2. With the
latest alpha release, PyOpenGL-2.0.2.01.tar.gz, the author has attempted to have
the installation procedure detect the presence of freeglut, but I was not able to get it to work.
In the end I uninstalled freeglut, downloaded and installed GLUT 3.7 from source
and then reinstalled the latest alpha release of PyOpenGL. After receiving no further errors, I went into "/usr/lib/python2.3/site-packages/OpenGL/Demo/GLUT" and typed in,
python gears.py
and voila! I had hardware-accelerated rotating gears running at 1400+ fps on my Pentium III 866.
posted: 00:54 | 0 comments | tags: fedora, linux, pyopengl, python