A Python framework for graph databases.
Bulbs is an open-source Python persistence framework for graph databases and the first piece of a larger Web-development toolkit that will be released in the upcoming weeks.
It’s like an ORM for graphs, but instead of SQL, you use the graph-traversal language Gremlin to query the database.
This means your code is portable because you can plug into different graph database backends without worrying about vendor lock-in.
Bulbs was developed in the process of building Whybase, a startup that will open for preview later this year. Whybase needed a persistence layer to model its complex relationships, and Bulbs is an open-source version of that framework.
Here’s how you model domain objects:
    # people.py
    from bulbs.model import Node, Relationship
    from bulbs.property import String, Integer, DateTime
    from bulbs.utils import current_datetime

    class Person(Node):
        element_type = "person"

        name = String(nullable=False)
        age = Integer()

    class Knows(Relationship):
        label = "knows"

        created = DateTime(default=current_datetime, nullable=False)
And here’s how you use the models to create and connect domain objects:
    >>> from people import Person, Knows
    >>> from bulbs.neo4jserver import Graph
    >>> g = Graph()
    >>> g.add_proxy("people", Person)
    >>> g.add_proxy("knows", Knows)
    >>> james = g.people.create(name="James")
    >>> julie = g.people.create(name="Julie")
    >>> g.knows.create(james, julie)
Graphs are an elegant way of storing relational data. Graphs are a fundamental data structure in computer science, and with graph databases you don’t have to mess with tables or joins: everything is explicitly joined.
For tabular data, relational databases are great, but the relational model doesn’t align well with object-oriented programming, so you need an ORM layer that adds complexity to your code. And with relational databases, the complexity of your schema grows with the complexity of the data.
The graph-database model simplifies much of this and makes working with the modern-day social graph so much cleaner. Graphs allow you to do powerful things like draw inferences from the data in ways that would be hard to do with a relational database.
While you can model a graph in a relational database, anything that’s not a real graph database requires an external index lookup for each traversal step. With graph databases, each node carries a local set of indices of its adjacent nodes, and in large-scale traversals, bypassing the external index lookup allows your queries to run much faster.
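The difference in access pattern can be sketched in plain Python. This is only an illustration, not how Neo4j or Bulbs store data: the names `edge_table`, `edge_index`, `Vertex`, and both `friends_of_friends_*` functions are hypothetical, and the toy data is made up. The sketch shows the shape of each traversal rather than its performance; a Python dict lookup is cheap, whereas an index in a real relational database is typically a larger on-disk structure.

```python
# Relational-style storage: one global edge table (hypothetical toy data).
edge_table = [("james", "julie"), ("julie", "marko"), ("julie", "peter")]

# External index over the edge table, consulted on every traversal step.
edge_index = {}
for source, target in edge_table:
    edge_index.setdefault(source, []).append(target)


def friends_of_friends_indexed(name):
    """Two-step traversal where each step goes back to the global index."""
    result = []
    for friend in edge_index.get(name, []):
        result.extend(edge_index.get(friend, []))
    return result


class Vertex:
    """Graph-style storage: each vertex holds direct references to its neighbors."""

    def __init__(self, name):
        self.name = name
        self.out = []  # local adjacency list


def friends_of_friends_local(vertex):
    """Two-step traversal that only follows references, with no global lookup."""
    return [fof.name for friend in vertex.out for fof in friend.out]


# Build the same graph with local adjacency lists.
vertices = {name: Vertex(name) for edge in edge_table for name in edge}
for source, target in edge_table:
    vertices[source].out.append(vertices[target])
```

Both functions return the same friends-of-friends for a given person; the second one never touches a shared index once it has its starting vertex.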
Some really powerful tools are beginning to emerge in this space, and Forrester predicts the graph database space will grow in 2012.
Neo4j is an open-source graph database that can contain 32 billion nodes and traverse 2 million relationships per second on conventional hardware. It has been tested in production for over 10 years, and the community edition is now free.
Traversing graphs has been made simple by Gremlin, a wonderfully expressive, Turing-complete graph-programming language.
Gremlin is a domain-specific language for graph databases (like SQL for graphs), and it’s what you use to write queries in Bulbs. With Gremlin, you can do stuff like calculate PageRank in 2 lines.
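For context on what such a query computes, here is a minimal plain-Python sketch of PageRank by power iteration. It is not Gremlin and not part of Bulbs; the function name `pagerank`, the damping factor of 0.85, the iteration count, and the toy graph are all assumptions made for illustration. It also assumes every neighbor appears as a key in the input dict.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Approximate PageRank for a dict mapping each node to its out-neighbors."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, neighbors in graph.items():
            if neighbors:
                # Split this node's damped rank evenly among its out-neighbors.
                share = damping * rank[node] / len(neighbors)
                for neighbor in neighbors:
                    new_rank[neighbor] += share
            else:
                # Dangling node: spread its damped rank evenly over all nodes.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
        rank = new_rank
    return rank


# Hypothetical three-node graph: a links to b and c, b links to c, c links to a.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(toy_graph)
```

In this toy graph, `c` ends up ranked highest because both other nodes link to it, and the ranks sum to 1.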
To see the power of Gremlin, watch this 8-minute screencast by its creator, Marko Rodriguez: