Payload size and large search functionality


#1

I have a conceptual question that someone with experience running Graphcool in production, or someone working for Graphcool, can hopefully answer.

I have a Recipe type with a many-to-many relationship to the Ingredient type, and I want to implement querying recipes by their ingredients so I can compute a list of recipes that are easy to make with a given list of ingredients. The schema is basically already set up as an inverted index, so the concept is sound, but I wonder how it will scale.
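For context, the schema looks roughly like this (simplified; the field and relation names are just illustrative, and depending on the Graphcool version the types file may need additional directives):

```graphql
type Recipe {
  id: ID! @isUnique
  name: String!
  # many-to-many: one recipe has many ingredients
  ingredients: [Ingredient!]! @relation(name: "RecipeIngredients")
}

type Ingredient {
  id: ID! @isUnique
  name: String!
  # back-relation: one ingredient appears in many recipes
  recipes: [Recipe!]! @relation(name: "RecipeIngredients")
}
```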

Are there any known upper thresholds on payload size? Say I end up querying a list of 30 ingredients that each have thousands of connected recipes. Or will the shared and private cluster memory be the only limiting factor?

I know I can turn to things like Elasticsearch to scale this type of application linearly, but I would like to know how far I can push the Graphcool service.


#2

As long as you can express your queries with the filter API, you should have no problem querying that amount of data with Graphcool. We have customers operating on millions of nodes without issue. Is that the case for your queries?
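For your ingredient search, a filter along these lines should work (a sketch, assuming an `ingredients` relation field and a `name` field on Ingredient; `ingredients_every` keeps only recipes whose ingredients all appear in the given list, whereas `ingredients_some` would return recipes containing at least one of them):

```graphql
query RecipesFromPantry {
  # recipes that can be made using only the listed ingredients
  allRecipes(filter: {
    ingredients_every: {
      name_in: ["flour", "eggs", "milk"]
    }
  }) {
    id
    name
  }
}
```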

If for some reason you have to download all the data to the client to perform more complex calculations, then you will see performance degrade as the data size grows.

In either case, I would suggest generating a large body of test data for your data structure and experimenting a little to get a feeling for the performance. You should also be aware that there is a limit of 1000 nodes returned in a single request on the shared cluster, so if you really want to return all the data you will have to implement pagination.
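Pagination can be done with the `first`/`skip` arguments, for example (a sketch: page through by increasing `skip` in steps of 1000 until fewer than 1000 nodes come back):

```graphql
query RecipesPage {
  # third page of 1000 nodes
  allRecipes(first: 1000, skip: 2000) {
    id
    name
  }
}
```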


#3

Thanks, the 1000-node limit is good to know, and I will take your advice on generating test data.