When I started learning Google App Engine, I misunderstood a fairly fundamental part of the datastore: entity groups. The documentation is not very clear, and over the weeks I've seen many questions in videos and forums that suggest I'm not the only one who misunderstood. I thought it was worth a blog post to explain.

Entity Groups are not Tables!

The misunderstanding is that people think of entity groups like tables in SQL. This is not the case. If you create two entities of the same kind, by default they do not belong to the same entity group. Unless you specifically choose to put them in the same group, each of your entities will be in a separate entity group of its own. This means if you have 1,000 entities of the same kind, you have 1,000 entity groups containing one entity each! This is not a bad thing. You shouldn't put things into the same entity group unless you need to, since an update to one entity locks the whole entity group, so concurrent writes to the group contend with each other.
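To make the counting concrete, here is a toy model in plain Python. It is not the App Engine API. It assumes a key can be pictured as a path of (kind, id) pairs with the root first, and that the root identifies the entity group. `make_key` and `entity_group` are hypothetical helpers invented for this sketch, and the `parent` argument anticipates the next section.

```python
# Toy model only: keys are tuples of (kind, id) pairs, root first.
def make_key(kind, id_, parent=None):
    """Build a key path; a child's path starts with its parent's path."""
    return (parent or ()) + ((kind, id_),)


def entity_group(key):
    """In this model, the root (first) path element identifies the group."""
    return key[0]


# 1,000 employees created without a parent: each one is its own root.
employees = [make_key('Employee', i) for i in range(1000)]
groups = set(entity_group(k) for k in employees)
print(len(groups))  # 1000

# 1,000 employees created under the same parent: one shared group.
company = make_key('Company', 1)
children = [make_key('Employee', i, parent=company) for i in range(1000)]
child_groups = set(entity_group(k) for k in children)
print(len(child_groups))  # 1
```

The kind ('Employee') plays no part in `entity_group`, which is the point: the kind does not decide the group, only the root of the key path does.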

Putting Entities in the Same Entity Group

Every entity group has a root entity. Two entities can only share an entity group through a parent relationship, in the simplest case with one of them being the root entity (parent) of the other. To put two entities into the same entity group, you pass the parent argument when you create the child (it cannot be changed afterwards), like this:

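from google.appengine.ext import db

class Company(db.Model):
    name = db.StringProperty()

class Employee(db.Model):
    name = db.StringProperty()

# Create the company; with no parent, it is the root of a new entity group
my_company = Company(name="Danny's Company")
my_company.put()

# Create the employee as a child of the company, in the same entity group
me = Employee(
    parent=my_company,
    name='Danny Tuppeny'
)
me.put()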

Because updates lock an entire entity group, you should only group entities together if you need to. If you're never going to update an employee and its company in the same transaction, you don't need to put them in the same entity group.

Entity Keys Include Their Parent Entity Keys

Something worth noting is that the key for an entity includes the key of its parent. That means you can do some clever things for performance purposes, such as "Relation Index Entities", as described by Brett Slatkin in the Google I/O 2009 video Building Scalable Complex Apps. Brett creates small child entities purely for querying, queries for their keys, converts those keys to their parents' keys using the parent() method, and then batch fetches the parent entities themselves.
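The shape of that pattern can be sketched without the datastore at all. The following is a toy model in plain Python, not Brett's code and not the App Engine API. Keys are pictured as paths of (kind, id) pairs, so a child's key literally contains its parent's key. The kinds `Message` and `MessageIndex`, the helpers `make_key` and `parent_of`, and the dict named `datastore` are all hypothetical stand-ins chosen for illustration.

```python
# Toy model only: a dict stands in for the datastore, tuples for keys.
def make_key(kind, id_, parent=None):
    """Build a key path; a child's path starts with its parent's path."""
    return (parent or ()) + ((kind, id_),)


def parent_of(key):
    """Drop the last path element to recover the parent's key."""
    return key[:-1] or None


datastore = {}
messages = [
    ('Lunch?', ['alice', 'bob']),
    ('Build is green', ['bob']),
    ('Invoice attached', ['carol']),
]
for i, (body, receivers) in enumerate(messages):
    msg_key = make_key('Message', i)
    datastore[msg_key] = {'body': body}
    # The index child holds only the list we want to query on.
    index_key = make_key('MessageIndex', 1, parent=msg_key)
    datastore[index_key] = {'receivers': receivers}

# 1. A "keys-only query" over the small index entities.
index_keys = sorted(
    k for k, v in datastore.items()
    if k[-1][0] == 'MessageIndex' and 'bob' in v['receivers']
)
# 2. Convert each index key to its parent's key; nothing is fetched here.
message_keys = [parent_of(k) for k in index_keys]
# 3. A "batch get" of the messages themselves.
bodies = [datastore[k]['body'] for k in message_keys]
print(bodies)  # ['Lunch?', 'Build is green']
```

Step 2 is only possible because the parent's key is embedded in the child's key: the index entities never have to be loaded just to find out which message they belong to.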

Hopefully this clears things up a little. If it still leaves questions (or I've missed anything), please leave a comment below and I'll try to update the post.