Converting Simple XML to JSON in Python

This post converts simple XML to JSON using a few Google-style extensions (http://code.google.com/apis/gdata/docs/json.html): text content is stored under $t, and ‘:’ in an element name is replaced with ‘$’. It is fairly simple; here is how I do it.

import json
import xml.parsers.expat

# Stack of dicts under construction; stack[0] holds the root element.
# Simple XML only: a repeated sibling name overwrites the earlier one.
stack = [{}]

# 3 handler functions
def start_element(name, attrs):
    # Attributes become keys; ':' in the element name becomes '$'.
    node = dict(attrs)
    stack[-1][name.replace(':', '$')] = node
    stack.append(node)

def end_element(name):
    stack.pop()

def char_data(data):
    # Text goes under "$t"; expat may deliver it in several chunks.
    if data.strip():
        stack[-1]['$t'] = stack[-1].get('$t', '') + data

p = xml.parsers.expat.ParserCreate()

p.StartElementHandler = start_element
p.EndElementHandler = end_element
p.CharacterDataHandler = char_data

p.Parse("""<?xml version="1.0"?>
<ns:parent id="top"><child1 name="paul">Text goes here</child1><child2 name="fred">More text</child2></ns:parent>""", True)

print(json.dumps(stack[0]))
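
If the XML repeats an element name under the same parent, a plain key/value mapping can hold only one of them. Below is a minimal sketch of one way to deal with that, assuming we are happy to collect repeated siblings into a list. The helper name xml_to_json and the list convention are my own assumptions for illustration, not something the GData conventions page is quoted on here.

```python
import json
import xml.parsers.expat

def xml_to_json(xml_text):
    """Convert a small XML document to a JSON string (hypothetical helper)."""
    root = {}
    stack = [root]

    def start(name, attrs):
        node = dict(attrs)
        key = name.replace(':', '$')
        parent = stack[-1]
        if key in parent:
            # Repeated sibling: turn the existing value into a list and append.
            if not isinstance(parent[key], list):
                parent[key] = [parent[key]]
            parent[key].append(node)
        else:
            parent[key] = node
        stack.append(node)

    def end(name):
        stack.pop()

    def chars(data):
        # Ignore whitespace-only text; join chunks of the same text node.
        if data.strip():
            node = stack[-1]
            node['$t'] = node.get('$t', '') + data

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)
    return json.dumps(root)

print(xml_to_json('<ns:list><item id="1">a</item><item id="2">b</item></ns:list>'))
```

Building a dict first and serializing it with json.dumps also takes care of quoting and escaping, which is easy to get wrong when writing the JSON text by hand.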

Google App Engine : Full Text Search support in pre-release version

The lack of full text search in Google App Engine is what kept me from adopting App Engine in the first place, and others may feel this limitation too. Then something interesting happened: two Google engineers presented an upcoming App Engine feature at Google I/O, Full Text Search. They presented exactly what I needed most. This is Google, so they had to provide search capabilities, and at least one user had reported the gap as an issue. I have not heard a word from them about Full Text Search since then. Maybe they are planning a big launch for this big feature.

But 2-3 days ago they released a pre-release version of their open source App Engine SDK, without mentioning anywhere in the changes or release notes that it contains search capabilities. Yesterday I explored the Python SDK and found new search libraries and APIs in the ext folder (google/appengine/ext/search). I then generated documentation for the Python SDK with epydoc and found this doc for the search package (google.appengine.ext.search):
Full text indexing and search, implemented in pure python.
Defines a SearchableModel subclass of db.Model that supports full text indexing and search, based on the datastore's existing indexes.

Don't expect too much. First, there's no ranking, which is a killer drawback. There's also no exact phrase match, substring match, boolean operators, stemming, or other common full text search features. Finally, support for stop words (common words that are not indexed) is currently limited to English.

To be indexed, entities must be created and saved as SearchableModel instances, e.g.:


class Article(search.SearchableModel):
    text = db.TextProperty()
    ...

article = Article(text=...)
article.save()

To search the full text index, use the SearchableModel.all() method to get an instance of SearchableModel.Query, which subclasses db.Query. Use its search() method to provide a search query, in addition to any other filters or sort orders, e.g.:


query = article.all().search('a search query').filter(...).order(...)
for result in query:
    ...

The full text index is stored in a property named __searchable_text_index.

Specifying multiple indexes and properties to index
---------------------------------------------------

By default, one index is created with all string properties. You can define multiple indexes and specify which properties should be indexed for each by overriding SearchableProperties() method of model.SearchableModel, for example:


class Article(search.SearchableModel):
    @classmethod
    def SearchableProperties(cls):
        return [['book', 'author'], ['book']]

In this example, two indexes will be maintained - one that includes 'book' and 'author' properties, and another one for 'book' property only. They will be stored in properties named __searchable_text_index_book_author and __searchable_text_index_book respectively. Note that the index that includes all properties will not be created unless added explicitly like this:


    @classmethod
    def SearchableProperties(cls):
        return [['book', 'author'], ['book'], search.ALL_PROPERTIES]

The default return value of SearchableProperties() is [search.ALL_PROPERTIES] (one index, all properties).

To search using a custom-defined index, pass its definition in 'properties' parameter of 'search':

Article.all().search('Lem', properties=['book', 'author'])

Note that the order of properties in the list matters.

Adding indexes to index.yaml
-----------------------------

In general, if you just want to provide full text search, you *don't* need to add any extra indexes to your index.yaml. However, if you want to use search() in a query *in addition to* an ancestor, filter, or sort order, you'll need to create an index in index.yaml with the __searchable_text_index property. For example:

- kind: Article
  properties:
  - name: __searchable_text_index
  - name: date
    direction: desc
  ...

Similarly, if you created a custom index (see above), use the name of the property it's stored in, e.g. __searchable_text_index_book_author.

Note that using SearchableModel will noticeably increase the latency of save() operations, since it writes an index row for each indexable word. This also means that the latency of save() will increase roughly with the size of the properties in a given entity. Caveat hacker!
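
To get a feel for what "an index row for each indexable word" implies, here is a rough pure-Python sketch of a keyword index with stop words. It is only an illustration of the idea, not the SDK's code: the stop-word list, the tokenizing regex, and the names index_words and matches are all assumptions of mine.

```python
import re

# A tiny, assumed stop-word list for illustration; the SDK's real list differs.
STOP_WORDS = {'a', 'an', 'and', 'the', 'of', 'to', 'in', 'is'}

def index_words(text):
    """Return the set of lowercase keywords that would be indexed for text."""
    words = re.findall(r'\w+', text.lower())
    return {w for w in words if w not in STOP_WORDS}

def matches(index, query):
    """True if every keyword of the query is in the index.

    There is no ranking and no phrase matching: word order is lost
    as soon as the text is reduced to a set of keywords.
    """
    return index_words(query) <= index

index = index_words('The Cyberiad is a book of stories by Stanislaw Lem')
print(len(index))                      # number of keywords stored
print(matches(index, 'lem cyberiad'))  # both keywords are present
print(matches(index, 'lem solaris'))   # 'solaris' is missing
```

The number of keywords grows with the length of the text, which is in line with the docstring's warning that save() gets slower as the indexed properties get larger.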

I hope this helps others, and encourages them to consider adopting Google App Engine as a more capable system for handling real-world problems.

Python : Create directory if directory/folder(s) don’t exist already

In Python, I quite often have to make sure that a directory path exists before I can operate on it. So here is Python code that makes sure a directory or folder path exists before we do something with it.

It creates all the directories in the path if they do not already exist.

import os, errno

def make_sure_dir_exists(path):
    if not os.path.exists(path):
        try:
            os.makedirs(path)
        except OSError as exc: # Python >2.5
            # Another process may have created the directory in the meantime.
            if exc.errno != errno.EEXIST:
                raise
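
On Python 3.2 and later, os.makedirs accepts an exist_ok argument that covers the "already exists" case, so the function can shrink to one call. A small sketch, assuming Python 3 and using a temporary directory (my choice for the demo) so it does not touch real paths:

```python
import os
import tempfile

def make_sure_dir_exists(path):
    # exist_ok=True makes the call a no-op when the directory already exists.
    os.makedirs(path, exist_ok=True)

# Demo inside a throwaway temporary directory.
base = tempfile.mkdtemp()
target = os.path.join(base, 'a', 'b', 'c')

make_sure_dir_exists(target)   # creates a, a/b and a/b/c
make_sure_dir_exists(target)   # second call does nothing and does not raise
print(os.path.isdir(target))
```

Note that exist_ok=True still raises if the path exists but is a regular file rather than a directory.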

Python : Create path or directories, if not exist

I come across this question very often: how do I create a directory structure when I know the exact path of a file or directory hierarchy? It is not difficult, but it is very time consuming if you don’t know where to look. Let's go to the code we need:


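import os

def assure_path_exists(path):
    directory = os.path.dirname(path)
    if directory and not os.path.exists(directory):
        os.makedirs(directory)

The import at the top makes the os module available.
The first line of the function defines assure_path_exists, which takes one argument, “path”, the path of a file or directory.
The second line keeps only the directory part of the path. It strips the last component (the filename, if there is one), so a path that names a directory should end with a separator.
The third line checks that this directory part is not empty and whether it already exists.
The fourth line creates the directory, including any missing parents, through os.makedirs if it does not exist.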
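
The same idea can also be written with pathlib, which is in the standard library from Python 3.4 on. This is a sketch of an alternative, not a replacement for the function above. The name assure_parent_exists and the log file path are made up for the example.

```python
import tempfile
from pathlib import Path

def assure_parent_exists(path):
    """Create the parent directory of path, with any missing ancestors."""
    parent = Path(path).parent
    parent.mkdir(parents=True, exist_ok=True)
    return parent

# Demo inside a throwaway temporary directory.
base = tempfile.mkdtemp()
log_file = Path(base) / 'logs' / '2024' / 'app.log'

assure_parent_exists(log_file)   # creates logs/ and logs/2024/
log_file.write_text('hello\n')   # now the file can be written
print(log_file.read_text())
```

Here parents=True plays the role of os.makedirs, and exist_ok=True makes repeated calls harmless.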

Done!!!