Wed, 11 Jun 2008

Encoding ampersands with Python

I need to replace ampersands in a text file with the HTML entity '&amp;'. I could simply use Python's string replace method, but that would mangle the text wherever an ampersand has already been turned into an entity ('&amp;' would become '&amp;amp;'). The same is true if I use a regular expression that matches a single '&'. What I really need is to replace an ampersand only when it is not followed by 'amp;'.

A negative lookahead assertion in the regular expression is the answer. Negative lookahead is used when you want to match something that is not followed by something else. It starts with (?! and ends at the closing ).

Our expression now becomes &(?!amp;), which matches an ampersand only when the text inside the assertion, amp;, does not follow it.

In this example I also added a second assertion, (?!#), so that numeric character references such as &#039; are left alone as well.

>>> import re
>>> s = "<Title>Eugene&#039;s Software Emporium & Arcade</Title>"
>>> pattern = re.compile(r'&(?!#)(?!amp;)')
>>> for match in pattern.finditer(s):
...     print(match.span())
...
(39, 40)
>>> s[match.start():match.end()]
'&'
>>>
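The session above only locates the bare ampersand. To actually do the replacement I set out to do, the same pattern can be handed to re.sub. This is a minimal sketch: the helper name encode_ampersands is made up, and the pattern only skips '&amp;' and numeric references, so any other named entity (such as '&lt;') would still be re-encoded.

```python
import re

# Match '&' unless it starts '&amp;' or a numeric reference such as '&#039;'.
# Assumption: no other named entities (e.g. '&lt;') appear in the input.
BARE_AMPERSAND = re.compile(r'&(?!#)(?!amp;)')

def encode_ampersands(text):
    """Replace bare ampersands with '&amp;' (hypothetical helper name)."""
    return BARE_AMPERSAND.sub('&amp;', text)

if __name__ == '__main__':
    s = "<Title>Eugene&#039;s Software Emporium & Arcade</Title>"
    print(encode_ampersands(s))
```

Because already-encoded ampersands are skipped, running the function twice over the same text gives the same result as running it once.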


posted: 22:13


Mon, 26 May 2008

Python - Web application frameworks

The PythonInfo Wiki defines a web framework as,

a collection of packages or modules which allow developers to write Web applications or services without having to handle such low-level details as protocols, sockets or process/thread management.

As a testament to Python's power and simplicity, it seems that many developers have created their own frameworks rather than use an existing solution. As a result, one will find frameworks in various stages of development and feature implementation.

I have always tried to subscribe to the basic principle of using the right tool for the job. With that in mind I have embarked on an exploratory journey to investigate some of Python's existing Web frameworks, in the hope of finding one that will work for a couple of big projects I have in the works. My requirements are fairly simple: I do not want to learn a behemoth of an API that will take months to figure out, yet I do not want something so simplistic that it expects me to handle too many low-level details. Finally, until I can bring my own server back online, the chosen framework needs to work with my current hosting provider, DreamHost.

The five high-level frameworks I am looking at include:

  • Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic design. Because Django was developed in a fast-paced newsroom environment, it was designed to make common Web-development tasks fast and easy.
  • TurboGears builds on other open source projects. In TurboGears, CherryPy controllers sit at the hub of your project. This is the biggest area for integration. Providing tools that allow the controllers to more easily work with SQLObject databases, answer asynchronous calls from MochiKit and render out completed Kid templates is where the big win will come.
  • Pylons combines the very best ideas from the worlds of Ruby, Python and Perl, providing a structured but extremely flexible Python web framework. It's also one of the first projects to leverage the emerging WSGI standard, which allows extensive re-use and flexibility - but only if you need it. Out of the box, Pylons aims to make web development fast, flexible and easy.
  • Webware is a suite of Python packages and tools for developing object-oriented, web-based applications. The suite uses well known design patterns and includes a fast Application Server, Servlets, Python Server Pages (PSP), Object-Relational Mapping, Task Scheduling, Session Management, and many other features. Webware is very modular and easily extended.
  • Zope is an open source application server for building content management systems, intranets, portals, and custom applications. The Zope community consists of hundreds of companies and thousands of developers all over the world, working on building the platform and Zope applications. Zope is written in Python.



posted: 13:48


Tue, 06 Nov 2007

Python - Recursive Directory Crawl Using Generators

I was looking for an os.walk example to crawl through a file system and found the locate function below on ActiveState's Python Cookbook site. I incorporated it into a simple routine that dumps the output to an XML file that can then be transformed using XSLT to sort and tally the results.

#!/usr/bin/env python

import os
import fnmatch
import time
from xml.dom import minidom

def locate(pattern, root=os.curdir):
    for path, dirs, files in os.walk(os.path.abspath(root)):
        for filename in fnmatch.filter(files, pattern):
            yield os.path.join(path, filename)
 
def main():
    doc = minidom.Document()
    files = doc.createElement("files")
    doc.appendChild(files)
    comment = doc.createComment("Size attribute is reported in bytes.")
    files.appendChild(comment)
    
    for i, filename in enumerate(locate("*.*", "\\\\SERVER\\Share")):
        try:
            # Stat the file once; this raises OSError if it cannot be read.
            info = os.stat(filename)
            item = doc.createElement("filename")
            item.setAttribute("id", str(i))
            item.setAttribute("path", filename)
            item.setAttribute("ext", os.path.splitext(filename)[1].lower())
            item.setAttribute("size", str(info.st_size))
            item.setAttribute("last_modified", time.ctime(info.st_mtime))
            files.appendChild(item)
        except OSError as e:
            print("%s => %s" % (filename, e.strerror))

    # Write the document out with two-space indentation.
    fp = open('myfiles.xml', 'w')
    doc.writexml(fp, "", "  ", "\n", "iso-8859-1")
    fp.close()

if __name__ == "__main__":
    main()

The locate function takes two parameters; the first is a file pattern to match and the second is the directory to start the crawl from.
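To see the generator on its own, here is a small sketch that runs locate against a throwaway directory tree. The file names and the temporary directory are invented for the example, and the locate function is repeated so that the block runs by itself.

```python
import fnmatch
import os
import shutil
import tempfile

def locate(pattern, root=os.curdir):
    """Yield the absolute path of every file under root matching pattern."""
    for path, dirs, files in os.walk(os.path.abspath(root)):
        for filename in fnmatch.filter(files, pattern):
            yield os.path.join(path, filename)

def demo():
    """Build a tiny tree (made-up names) and list the .txt files in it."""
    root = tempfile.mkdtemp()
    try:
        os.mkdir(os.path.join(root, "sub"))
        for rel in ("a.txt", os.path.join("sub", "b.txt"),
                    os.path.join("sub", "c.log")):
            open(os.path.join(root, rel), "w").close()
        # Only the .txt files are yielded, at any depth below root.
        return sorted(os.path.relpath(p, root) for p in locate("*.txt", root))
    finally:
        shutil.rmtree(root)

if __name__ == "__main__":
    print(demo())
```

Because locate is a generator, nothing is walked until the caller starts iterating, which keeps memory use flat even on a large share.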



posted: 23:25


Sat, 24 Mar 2007

Python - SOAPpy and HTTP Authentication

The other week I wanted to use a SOAP web service that was protected by HTTP basic authentication. I could not find a way to do the authentication with SOAPpy. I looked everywhere for an example before I stumbled upon a version of the code below in an archived newsgroup post.

from SOAPpy import Config, HTTPTransport, SOAPAddress, WSDL

class myHTTPTransport(HTTPTransport):
    username = None
    passwd = None
    
    @classmethod
    def setAuthentication(cls,u,p):
        cls.username = u
        cls.passwd = p
          
    def call(self, addr, data, namespace, soapaction=None, encoding=None,
             http_proxy=None, config=Config):
        
        if not isinstance(addr, SOAPAddress):
            addr=SOAPAddress(addr, config)
            
        if self.username is not None:
            addr.user = self.username+":"+self.passwd
            
        return HTTPTransport.call(self, addr, data, namespace, soapaction,
                                  encoding, http_proxy, config)
    

if __name__ == '__main__':
    wsdlFile = 'http://localhost/soap/wsdl/'
    myHTTPTransport.setAuthentication('gollum', 'myprecious')
    server = WSDL.Proxy(wsdlFile, transport=myHTTPTransport)
    print server.ApiVersion()

It works because WSDL.Proxy accepts extra keyword arguments (Python's **kw feature), so you can pass in your own transport. The original author subclassed the default transport, Client.HTTPTransport, and added a class method to supply the credentials for basic authentication.
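For context, HTTP basic authentication itself is nothing more than a base64-encoded user:password pair sent in an Authorization header, which is presumably why joining the two values with a colon is all the subclass needs to do. The sketch below shows that encoding using only the standard library. It does not use SOAPpy, and the helper name basic_auth_header is my own.

```python
import base64

def basic_auth_header(username, password):
    """Return the Authorization header value for HTTP basic authentication."""
    # The credentials are joined with a colon, then base64-encoded.
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")

if __name__ == "__main__":
    print(basic_auth_header("gollum", "myprecious"))
```

Note that base64 is an encoding, not encryption, so basic authentication is only safe over an encrypted connection.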



posted: 23:09


Thu, 15 Mar 2007

IBM's developerWorks - JavaScript and Ajax Tutorial Series

One of my most-visited bookmarks is IBM's developerWorks site. The site is a virtual library of technical information and tutorials.

In September 2006, Brett McLaughlin, an author and editor with O'Reilly Media, Inc., concluded his six-part series on JavaScript, Ajax and the Document Object Model (DOM). Part one of the series starts with a quick-paced introduction to what Ajax is and how it works, then covers using the XMLHttpRequest object for Web requests and interpreting the HTTP status codes it returns. The remaining parts of the series focus on how to mix JavaScript and the DOM to create interactive Ajax applications.

It is a great series that I still reference now and then when in the midst of a project.



posted: 21:23


Tue, 14 Mar 2006

Formatting Numbers in a Text Box

I recently needed to control what could be entered into a text box on a form. I needed the user to enter a currency amount, but I did not want dollar signs, commas or decimal points to be entered. I simply wanted the raw number.

This is one way to do it.

<html>
<head>
<title>Number Format</title>

<script language="JavaScript">
function formatNumber(e, field) {
  var key = e.keyCode;
  // Allow use of return, left and right cursor keys
  if ((key==13)||(key==37)||(key==39)) return;
  // Strip every non-digit character and any leading zeros
  var temp = field.value.replace(/[^0-9]|^0+/g,"");
  field.value = temp;
}
</script>

</head>

<body>
 <form name="form1" method="post" action="">
   Enter Salary:
     <input type="text" name="amount" id="amount"
onKeyUp="formatNumber(event, this)">
 </form>
</body>
</html>

The script is called on the "onKeyUp" event of the input field and strips out any character that is not a digit, along with any leading zeros.
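The regular expression has two alternatives: [^0-9] removes every non-digit, and ^0+ removes zeros at the very start of the value. The same pattern can be tried outside the browser in Python. This is only a sketch, and the function name strip_to_digits is made up.

```python
import re

def strip_to_digits(value):
    """Drop non-digit characters and leading zeros, like the JavaScript replace()."""
    return re.sub(r'[^0-9]|^0+', '', value)

if __name__ == '__main__':
    for raw in ('$1,200.50', '007', '45000'):
        print("%s -> %s" % (raw, strip_to_digits(raw)))
```

One consequence worth knowing: because the decimal point is stripped rather than rejected, typing 1,200.50 leaves 120050 in the field, not 1200.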



posted: 00:38


Mon, 13 Mar 2006

URL Rewriting in JavaScript

At work I am constantly flipping between a production and a development web server. The live site would have a URL like, "www.mysite.com" and the development site one like, "dev.mysite.com". I was getting tired of removing the "www" from the address and replacing it with "dev" and back again for all the different URLs. This script is a result of that. By simply clicking a saved bookmark on my browser toolbar I can automatically switch back and forth between the production and development servers with ease.

Right-click on the following link, DevSwitch, and select either "Bookmark This Link..." in Firefox or "Add to Favorites..." in Internet Explorer.

The Bookmarklet code is outlined below.

javascript:(function(){
    var x=location.href;
    var y=(x.indexOf('www')!=-1)?x.replace('www','dev'):x.replace('dev','www');
    location.href=y;
})()
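One caveat: indexOf and replace look at the whole URL, so a 'www' or 'dev' that appears in the path can be swapped instead of the host name. As a sketch of a stricter variant (written in Python 3 rather than as a bookmarklet, with a made-up function name), the swap can be limited to the first label of the host:

```python
from urllib.parse import urlsplit, urlunsplit

# Which leading host label maps to which.
SWAP = {"www": "dev", "dev": "www"}

def switch_host(url):
    """Swap a leading 'www' or 'dev' host label; leave everything else alone."""
    parts = urlsplit(url)
    label, sep, rest = parts.netloc.partition(".")
    if label not in SWAP:
        return url
    return urlunsplit(parts._replace(netloc=SWAP[label] + sep + rest))

if __name__ == "__main__":
    print(switch_host("http://www.mysite.com/page?x=1"))
```

The same idea carries over to the bookmarklet by working on location.hostname instead of location.href.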


posted: 23:58


Sun, 16 Oct 2005

XSLT to convert from a 24-hour timestamp

I had an XML file in which the timestamp for the file creation was in a 24-hour GMT format. I needed to convert the display date and time to be in the format of, "XXX as of 12:00 a.m., on Oct 17, 2005"

I found some of the code for converting the month name on the internet and the rest I hacked together to handle the 24-hour and GMT conversion.

Longer lines are split with a "\" character which should be removed before you use the code.

<!-- Start: XSLT date and time formatting template -->
<!-- e.g. 2005-07-18T17:59:30.187-08:00 -->

<xsl:template name="format-date-time">
  <xsl:param name="date" />
  <xsl:variable name="year" select="substring($date, 1, 4)" />
  <xsl:variable name="month" select="substring($date, 6, 2)" />
  <xsl:variable name="day" select="substring($date, 9, 2)" />
  <xsl:variable name="day2" select="number($day)" />
  <xsl:variable name="monthName" \
    select="substring('JanFebMarAprMayJunJulAugSepOctNovDec', \
    substring-before(substring-after($date,'-'),'-')*3-2,3)" />
  <xsl:variable name="hour24" select="substring($date, 12, 2) - 8" />
  <xsl:variable name="minute" select="substring($date, 15, 2)" />
  <xsl:variable name="second" select="substring($date, 18, 2)" />

  <xsl:variable name="hour12">
    <xsl:choose>
      <xsl:when test="$hour24 &lt; 0">
        <xsl:value-of select="12 + $hour24" />
      </xsl:when>
      <xsl:when test="$hour24 = 0">
        <xsl:value-of select="12" />
      </xsl:when>
      <xsl:when test="$hour24 = 12">
        <xsl:value-of select="$hour24" />
      </xsl:when>
      <xsl:when test="$hour24 > 12">
        <xsl:value-of select="$hour24 - 12" />
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$hour24" />
      </xsl:otherwise>
    </xsl:choose>
  </xsl:variable>
  
  <xsl:variable name="meridiem">
    <xsl:choose>
      <xsl:when test="$hour24 &lt; 0">p.m.</xsl:when>
      <xsl:when test="$hour24 >= 12">p.m.</xsl:when>
      <xsl:otherwise>a.m.</xsl:otherwise>
    </xsl:choose>
  </xsl:variable>

  <xsl:value-of select="concat($hour12, ':', $minute, ':', $second, ' ', \
    $meridiem, ' on ', $monthName, ' ', $day2, ', ', $year)" />
  
</xsl:template>

<!-- End: XSLT date and time formatting template -->
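
For comparison, here is a rough Python sketch of the same conversion. It is not a translation of the template: it uses datetime arithmetic, so subtracting the eight hours also rolls the date back when the result crosses midnight, which the template does not attempt. The fixed eight-hour offset comes from the template; the function name and the decision to ignore the fractional seconds and the zone suffix are my own.

```python
from datetime import datetime, timedelta

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

def format_date_time(stamp, offset_hours=8):
    """Format a timestamp such as '2005-07-18T17:59:30.187-08:00'."""
    # Keep only 'YYYY-MM-DDTHH:MM:SS'; the fraction and zone suffix are ignored.
    moment = datetime.strptime(stamp[:19], "%Y-%m-%dT%H:%M:%S")
    # Shift by the fixed offset; the date changes too if midnight is crossed.
    moment -= timedelta(hours=offset_hours)
    hour12 = moment.hour % 12 or 12
    meridiem = "p.m." if moment.hour >= 12 else "a.m."
    return "%d:%02d:%02d %s on %s %d, %d" % (
        hour12, moment.minute, moment.second, meridiem,
        MONTHS[moment.month - 1], moment.day, moment.year)

if __name__ == "__main__":
    print(format_date_time("2005-07-18T17:59:30.187-08:00"))
```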


posted: 00:24


Fri, 06 May 2005

C# XML data export from MySQL

I originally wrote this in C# at work under Visual Studio to connect to a Microsoft SQL Server on my workstation. One of the tables contained 117 records of archived articles for the web site. We needed to extract the data and output it into an XML file. Once I got it working I decided to see how well it would compile and run under Mono, an open source development platform based on the .NET framework, and have it connect instead to a MySQL database.

In order to have my .NET application connect to MySQL instead of the Microsoft SQL Server I needed to download and install the MySQL Connector/Net driver. The MySQL Connector/Net is a fully-managed ADO.NET driver written in 100% pure C#. It was then simply a case of adding the new MySQL Namespace to the code and substituting the different Microsoft specific classes for the MySQL ones.

This is the resulting code,

// Filename: ExportXMLData.cs

using System;
using System.Data;
using MySql.Data;
using MySql.Data.MySqlClient;

namespace ExportXMLData
{
  class ExportXML
  {
    static void Main(string[] args)
    {
      ExportXML exportXML = new ExportXML();
      exportXML.Run();
    }
                
    private void Run()
    {
      // Change the variables to reflect values needed for
      // your computer and database properties.
      string Database = "";
      string Server = "localhost";
      string User = "";
      string Pass = "";
      string TableName = "";
      string XMLRootNodeName = "Root";
      string OutputFileName = "output.xml";

      string conn =
        "Database=" + Database + ";" +
        "Server=" + Server + ";" +
        "Uid=" + User + ";" +
        "Pwd=" + Pass;

      MySqlConnection connection = new MySqlConnection(conn);
      MySqlDataAdapter adapter = new MySqlDataAdapter();
      adapter.TableMappings.Add("Table", TableName);
      connection.Open();
      MySqlCommand query = new MySqlCommand("SELECT * FROM "
                                            + TableName, connection);
      query.CommandType = CommandType.Text;
      adapter.SelectCommand = query;
      DataSet ds = new DataSet(XMLRootNodeName);
      adapter.Fill(ds);
      connection.Close();

      ds.WriteXml(OutputFileName, XmlWriteMode.WriteSchema);
    }
  }
}

You will need to compile the code as follows so it finds the necessary libraries,

mcs ExportXMLData.cs -r System.Data -r MySql.Data

You can then run the C# program with,

mono ExportXMLData.exe

The result should be an XML file in your working directory containing your database table structure and data.
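
To give a feel for the shape of the output without a MySQL server, here is a rough Python 3 sketch that does a similar export from an in-memory SQLite database using only the standard library. It is a different stack from the C# program above, not a port of it: the table name, columns and rows are invented, and the inline schema that XmlWriteMode.WriteSchema adds is left out.

```python
import sqlite3
import xml.etree.ElementTree as ET

def export_table(connection, table_name, root_name="Root"):
    """Return an XML string with one <table_name> element per row."""
    # table_name is pasted into the SQL, so it must come from trusted code.
    cursor = connection.execute("SELECT * FROM " + table_name)
    columns = [col[0] for col in cursor.description]
    root = ET.Element(root_name)
    for row in cursor:
        item = ET.SubElement(root, table_name)
        # One child element per column, named after the column.
        for name, value in zip(columns, row):
            ET.SubElement(item, name).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE articles (id INTEGER, title TEXT)")
    db.execute("INSERT INTO articles VALUES (1, 'First post')")
    print(export_table(db, "articles"))
```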

In a later post I will show you how you can use XSL Transformations (XSLT) to apply a stylesheet and change the display formatting of the XML file itself.



posted: 23:37


© 2008 PlatosCave.net