GrapheneDB Blog

Updates from the GrapheneDB team

Neo Technology Just Closed $20M in Funding. Here’s Why It’s a Big Deal!

Neo Technology, makers of Neo4j, announced their $20M Series C round yesterday, bringing the company’s total investment to date to $44M. The funding announcement was covered by popular media such as TechCrunch, VentureBeat, Gigaom, Pando and Forbes. The official press release from Neo Technology is here.

Why graph databases

Companies have identified data-driven operations, and decisions, as keys to success. But we live in an increasingly connected world, and traditional databases can’t handle the amount of connected data that holds together our social, personal and professional networks.

Graph databases, on the other hand, outperform traditional databases by a factor of thousands when managing complex, connected data. They have inherent advantages in an environment where everything is connected.

Neo Technology makes Neo4j, the market leader in graph databases. With its intuitive query language Cypher, the interactive browser interface introduced in 2.0, and the new ETL functionality introduced in 2.1, Neo4j is moving quickly, building a noticeable head start and consolidating its position as the graph database for apps.

Graph databases are here to stay

Graph databases organize information more intuitively than traditional databases. Developers can ask questions of their data that traditional databases don’t handle well. Simply put, they’re built with connectedness in mind.

Graph databases are here to stay, and will be very influential in the coming years. Forrester predicts that 25% of enterprises will be using graph databases by 2017 [1].

An ecosystem of providers is emerging around Neo4j, be it hosting, visualization, or complementary libraries. Over the coming years we’ll continue to see others emerge. Neo4j is a central piece of technology for many companies, and these providers bridge gaps in the value chain, making graph databases easier to adopt, use and extract value from.

Neo Technology knows it, and so do its investors. This funding milestone is another sign of the graph database market picking up steam.

Taking Neo4j to the Cloud

At GrapheneDB, we are big believers in graph databases. Our goal is to make Neo4j technology accessible and easy for developers to use. That’s why we created our hosted database platform.

GrapheneDB is a part of the ecosystem that’s emerging around Neo4j technology. We took the core Neo4j technology and built an automated operations layer and a user-friendly interface to manage operational aspects such as configuration, plugin management, and backups. By taking care of operations, we let our users focus on developing graph-powered apps and increase their productivity, knowing that we will make sure their database runs around the clock.

We provide different levels of service, from Hobby, to get started, to Production-grade plans with automatic backups and server monitoring. No matter the app, our platform helps you get graph database instances up and running.

Try us out for free

If data-driven operations and decisions are keys to success, then find out how graph databases can help you. Go to www.graphenedb.com and get started with one of our free plans.

[1] TechRadar(TM): Enterprise DBMS, Q1 2014. Forrester Research. 2-13-14

Importing Data Into Neo4j via CSV

At GrapheneDB, a question users often ask us is how to import data. Sample datasets are good, but loading your own data is even better. This post explains how to import data from a CSV file into Neo4j. After outlining the steps to take, we list some special considerations for GrapheneDB users.

One of the most important steps when evaluating a new technology for your stack is importing existing data. CSV is one of the most popular standards for data exchange and most of the popular database engines support exporting data in CSV format.

Starting with 2.1, Neo4j includes a LOAD CSV [Neo4j Docs] Cypher clause for data import, which is a powerful ETL tool:

  • It can load a CSV file from the local filesystem or from a remote URI (e.g. S3, Dropbox, GitHub, etc.)
  • It can perform multiple operations in a single statement
  • It can be combined with USING PERIODIC COMMIT to group the operations on multiple rows in transactions to load large amounts of data [Neo4j Docs]
  • Input data is mapped directly into a complex graph structure as outlined by the user
  • It’s possible to manipulate or compute values at runtime
  • It allows merging existing data (nodes, relationships, properties) rather than just adding it to the store

Steps

Have your graph data model ready

Before running the import process you will need to know how you want to map your data onto the graph. What are the nodes and relationships, and which properties will they have?

Tune cache and heap configuration

Make sure to increase the heap size generously, especially if importing large datasets, and also make sure the file buffer caches fit the entire dataset.

You can estimate the size of your dataset on disk after the import by using the table in the official Neo4j docs.

Let’s assume we are going to store 100K nodes, 1M relationships and one fixed-size property per node/relationship (e.g. an integer):

  • Node store: 100,000 * 15 B = 1.5 MB
  • Relationship store: 1,000,000 * 34 B = 34 MB
  • Property store: 1,100,000 * 41 B = 45.1 MB

Those are the minimum values you should use in your file buffer cache configuration.
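The arithmetic above can be reproduced with a few lines of Python. This is only a minimal sketch: the per-record sizes (15, 34 and 41 bytes) are the ones used in the example above, while the function and variable names are made up for illustration. Check the table in the Neo4j docs for the record sizes that apply to your version.

```python
# Per-record sizes in bytes, taken from the worked example above.
NODE_BYTES = 15
RELATIONSHIP_BYTES = 34
PROPERTY_BYTES = 41


def estimate_store_sizes(nodes, relationships, properties_per_record=1):
    """Return the estimated on-disk size of each store, in bytes.

    Assumes every node and every relationship carries the same number
    of fixed-size properties (one in the example above).
    """
    properties = (nodes + relationships) * properties_per_record
    return {
        "node_store": nodes * NODE_BYTES,
        "relationship_store": relationships * RELATIONSHIP_BYTES,
        "property_store": properties * PROPERTY_BYTES,
    }


sizes = estimate_store_sizes(nodes=100_000, relationships=1_000_000)
for store, size in sizes.items():
    # Report in MB (1 MB = 1,000,000 bytes, as in the example above).
    print(f"{store}: {size / 1_000_000:.1f} MB")
```

Plugging in your own node and relationship counts gives a rough lower bound for each cache setting before you start the import.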

Set up indexes and constraints

Indexes will make lookups faster during and after the load process. Make sure to include an index for every property used to locate nodes in MERGE queries.

An index can be created with the CREATE INDEX clause. Example:

CREATE INDEX ON :User(name);

If a property must be unique, adding a constraint will also implicitly create an index. For example, if we want to make sure we don’t store any duplicate user nodes, we could use a constraint for the email property.

CREATE CONSTRAINT ON (u:User) ASSERT u.email IS UNIQUE;

Loading and mapping data

The easiest way to load data from CSV is to use the LOAD CSV statement. It supports accessing fields by column header or column index, configuring the field terminator character, and other common options. Please refer to the official docs for further details.

To speed up the process, make sure to use USING PERIODIC COMMIT, which groups multiple operations (1000 by default) into transactions and reduces the number of times Neo4j has to hit the disk to commit the changes.

LOAD CSV WITH HEADERS FROM "file:///tmp/users.csv" AS csvLine FIELDTERMINATOR ';'
MERGE (u:User { email: csvLine.email }) ON CREATE SET u.username = csvLine.username, u.name = csvLine.name;

Please note that values are read as strings, so make sure you convert them where appropriate, e.g. with toInt(csvLine.column) when loading integers.
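To try the LOAD CSV statement above you need a semicolon-separated file with a header row. The following is a minimal sketch that writes one with Python's standard csv module. The column names match the statement, but the sample rows are made up, and the file is written to the system temp directory, which is often but not always /tmp. Adjust the path in the file:/// URL if yours differs.

```python
import csv
import os
import tempfile

# Hypothetical sample rows; the keys match the columns used by the
# LOAD CSV statement above (email, username, name).
rows = [
    {"email": "alice@example.com", "username": "alice", "name": "Alice"},
    {"email": "bob@example.com", "username": "bob", "name": "Bob"},
]

# The system temp directory is commonly /tmp on Linux, but not everywhere.
path = os.path.join(tempfile.gettempdir(), "users.csv")

with open(path, "w", newline="", encoding="utf-8") as f:
    # delimiter=";" matches FIELDTERMINATOR ';' in the Cypher statement.
    writer = csv.DictWriter(
        f, fieldnames=["email", "username", "name"], delimiter=";"
    )
    writer.writeheader()
    writer.writerows(rows)

# Show what was written so it can be checked before importing.
with open(path, encoding="utf-8") as f:
    print(f.read())
```

Every field in this file, including any numeric ones you add, reaches Cypher as a string, so the conversion mentioned above still applies on the Neo4j side.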

The load process can be run from the Neo4j shell, either interactively, or by loading the Cypher code from a file using the option -file filename.

Alternatively, the code can be entered manually into the shell or the browser UI.

Considerations for GrapheneDB users

A few considerations when loading data into your GrapheneDB Neo4j instance:

  • caches and heap can only be configured on the Standard plans and higher. They are fixed on the lower-end plans
  • neo4j-shell does not support authentication and thus it can’t be used to load data into an instance hosted on GrapheneDB or otherwise secured with authentication credentials
  • when running the command from the browser UI, bear in mind Neo4j won’t be able to access your filesystem. You should provide a publicly available URL instead, e.g. a file hosted on AWS S3
  • for larger datasets, we recommend running the import process locally and, once it has completed, performing a restore on your GrapheneDB instance

For a comprehensive tutorial, including tools to clean up CSV files, common pitfalls, and more advanced tools like the super-fast batch importer, please refer to this CSV import guide.

Please don’t hesitate to post any comments or contact our support team if you are having issues loading data into your GrapheneDB instance.

GrapheneDB Concludes Beta Phase, Releases New Plans

GrapheneDB has concluded the beta stage of its product development.

Leaving beta stage means that the service’s core functionality has been tested and proven. A significant user base has adopted GrapheneDB’s platform for hosting managed graph databases.

As GrapheneDB continues to improve on current features and build additional ones, customers can register now, knowing they’ll get a reliable, straightforward hosting service for their graph databases.

“Success Kid” Photograph (c) Laney Griner / Used with Permission

“Our cloud hosting platform operates the largest fleet of managed Neo4j databases,” says Alberto Perdomo, GrapheneDB co-founder. “We built this platform as developers, for developers. There was no good, reliable hosting service for graph databases, so we built GrapheneDB.”

GrapheneDB’s flexible tier and plan system means that developers can test ideas before taking an app into production.

To learn more about GrapheneDB, visit graphenedb.com. To view pricing and register a free account, visit http://www.graphenedb.com/pricing.html.

GrapheneDB Sponsors GraphConnect 2014

GrapheneDB is a proud sponsor of GraphConnect 2014, taking place October 21-22 in San Francisco. GraphConnect is the only conference that focuses on the rapidly growing world of graph databases and applications, and features Neo4j, the world’s leading graph database.

“Graph databases are the future. For highly connected data, they are thousands-of-times faster than other databases,” says Alberto Perdomo, co-founder of GrapheneDB. “GraphConnect 2014 brings together some of the most innovative developers working with graphs, so we’re very happy to be supporting it.”

GrapheneDB’s cloud hosting platform operates the largest fleet of managed Neo4j databases. Known for its reliability and flexibility, GrapheneDB’s service allows developers to build and scale apps with peace of mind.

GrapheneDB’s features and flexible plan structure support applications in any stage of the development cycle, from development to MVP, into production and then scaling as the app matures.

To get your tickets for the conference, visit graphconnect.com

To learn more about GrapheneDB, visit graphenedb.com. Getting started only takes seconds!

Our New Add-on Helps Heroku Users Build Applications on Top of Neo4j

To build and run powerful apps on top of Neo4j databases, you need a reliable partner to keep those databases running. GrapheneDB’s hosting service for the Neo4j graph database is now available to Heroku users. We’ve launched a beta version of our add-on that’s public and immediately available for use.

Access the public Docs at the Heroku Dev Center.

More about the add-on

  • It includes all the features offered by GrapheneDB.
  • By default, the add-on will provision Neo4j 2.0 instances with support for labels, the new browser and more.
  • Support for Gremlin is available for Neo4j 1.9 (Gremlin is not included in 2.0 releases anymore)

Migration from Neo4j add-on

If you want to migrate from the Neo4j add-on, follow the simple steps on our Heroku docs page.

Why Use Graphs?

Graph databases help you recognize and exploit relationships among your data that are otherwise difficult to manage. They’re built specifically to treat relationships as first-class citizens and scale with predictable query time regardless of the overall size of the data set.

Get Started

GrapheneDB makes it easy to set up and maintain your Neo4j databases, so you can focus on building a powerful app. Contact us and learn how to get a dedicated instance for your Heroku app up and running.

Maintenance Next Monday July 22nd 18:00 CEST

Affects: All customers using sandbox databases.

  • When: 2013-07-22 18:00 CEST (Central European Summer Time)
  • Duration: 20 minutes
  • Expected downtime: 10 minutes

No changes required.

Reason: We are going to deploy new software onto the servers hosting the sandbox databases. A system reboot is required.

Additional details: No action is required by users. Current configuration will work as usual.

Maintenance Next Thursday June 20th 14:00-15:00 CEST

There will be a maintenance window next Thursday, 14:00-15:00 CEST (Berlin time). The actual downtime should be only around 10 minutes within that time frame.

We will migrate all sandbox databases to a new dedicated server hosted on SoftLayer. This migration should not have any negative impact on the performance or availability.

As usual, if you run into any issues or if you have any questions please get in touch with us at support@graphenedb.com.

Maintenance 17:00-18:00 CEST. Introducing Support for Neo4j 1.9 & 2.0

There will be a maintenance window today from 17:00 to 18:00 CEST (Berlin time) while we deploy the latest changes in our backend.

These are some of the new things you can expect:

Unique domains

Sandbox databases are getting unique domains, to avoid issues with stored authentication credentials on certain browsers.

You should update the connection settings of your app, although we will ensure compatibility with the old settings for a week.

Running Neo4j Community Edition 1.9.RC2 and 2.0.0-M02

Until now we have only offered Community Edition 1.8.2 stable, but some of you have asked for support for other versions. You may be interested in trying or benchmarking your app with different versions of Neo4j.

Starting today, we offer versions 1.9.RC2 (release candidate) and 2.0.0-M02 (developer preview) in addition to 1.8.2 stable.

As usual, if you run into any issues or if you have any questions please get in touch with us at support@graphenedb.com.

April News: Updates From the Last Few Weeks

Berlin & Graph DB meetup

I took a short trip to Berlin last month. During that trip I also attended a graph DB meetup where I gave a talk about polyglot persistence with Neo4j.

I’d like to thank Michael Hunger, Stefan Plantikow, Peter Neubauer and Pernilla Lindh for organizing the event, being great hosts and, most of all, for the great conversation.

New features

This is a summary of the latest features that we have released.

Restore

We introduced the restore feature about a month ago. You can restore a DB from a zipped Neo4j DB file. This way, you can easily take the contents of a DB on your local machine, or a sample data set, and load them into your DB hosted on GrapheneDB.

We will add support for exporting your DB in the same format in the coming weeks.

Improved compatibility with drivers

We have a test suite to ensure that popular languages and drivers work seamlessly with GrapheneDB.

We have added a few new drivers to that test suite and also included configuration snippets for them in the dashboard. This doesn’t mean that other drivers won’t work. It’s simply the list of drivers we test against to make sure everything works fine.

For a complete list of our tested drivers please visit our FAQ section.

Vanilla Neo4j

The main reason we have been so busy lately, and also our most important new feature, is that we have migrated our architecture to host vanilla Neo4j instances rather than a wrapper around Neo4j.

This means you can expect the same features from GrapheneDB as you get from Neo4j Community Edition. We explicitly don’t support Gremlin and REST API Traversals because they rely on Groovy, and that could cause security issues on our shared machines for the sandbox plans. At the moment we are running Neo4j v1.8.2, but we also plan to allow other versions in the future.

Support for indexes

Index support has been on our to-do list for a while. As a nice side effect of the migration to vanilla Neo4j, indexes are now fully supported.

NoSQL matters this week

I’m attending NoSQL matters in Cologne today and tomorrow. Feel free to ping me on Twitter or mail me at alberto at graphenedb dot com if you want to talk.

Maintenance Window Today 7pm-8pm CEST

There will be a maintenance window today from 19:00 to 20:00 CEST (Berlin time) while we deploy the new architecture and migrate all existing databases to the new sandbox plans running Neo4j 1.8.2.

What this means for you

  • As part of this change in the architecture and the DB instances, the URLs to access your database’s Neo4j REST API will change. After the migration you will need to log in and get your new connection settings.
  • Your existing DB will be migrated without data loss to our new sandbox plans running vanilla Neo4j.
  • All new databases will be set up as vanilla Neo4j instances.
  • You can expect all the features of Neo4j Community Edition v1.8.2, excluding REST API Traversals and Gremlin. It’s worth mentioning that Cypher is a great replacement for both and is fully supported.
  • You will have access to the Neo4j Web Admin interface. You will find a link to it in your database dashboard after the migration is concluded.
  • Support of Neo4j REST drivers has been improved. Any driver compatible with Neo4j 1.8.2 Community Edition should work as well with GrapheneDB. If you do find an issue just let us know and we will fix it ASAP.

Documentation and support

We are in the process of updating our website and the developer docs to reflect this change. Please be patient if you can’t see the whole content yet.

As usual, if you run into any issues or if you have any questions please get in touch with us at support@graphenedb.com.