[Network error]: FetchError: request to http://localhost:4466/app failed, reason: socket hang up

prisma

#1

Hi,

I have a custom express endpoint in my index.js, along the lines of (pseudocode):

server.express.post('/upload', async function(req, res) {

  // req contains a file with a list of records,
  // which are inserted into the Prisma backend
  // through the Prisma API, like so:

  for (const record of req.data) {
    db.mutation.updateStock(record) // one mutation per record, not awaited
  }

  res.sendStatus(200)
});

so I’m able to upload data through my app using e.g. curl:

curl -X POST 'http://localhost:4000/upload' -F "data=@file.tsv;" --header "Connection: keep-alive"

This works if the file is small, e.g. 1K records. For larger files, such as 5K records, the Prisma server returns the error in the topic description. However, if I restart Prisma, it works the first time and then fails with the same error for subsequent files. The Java process consumes about 0.8 GB of memory (why are you doing this in Java??). Seems like a memory leak.

Thanks
Marco


#2

prisma cluster logs show this:

prisma-database_1 | {"key":"error/unhandled","requestId":"api:api:cjfwfqq1z1ylo09528iztfr6e","clientId":"inventory@dev","payload":{"exception":"java.util.concurrent.RejectedExecutionException: Task slick.basic.BasicBackend$DatabaseDef$$anon$2@56b17ecc rejected from slick.util.AsyncExecutor$$anon$2$$anon$1@564fb656[Running, pool size = 10, active threads = 10, queued tasks = 1000, completed tasks = 137185]","query":"mutation ($_data: StockUpdateInput!, $_where: StockWhereUniqueInput!) {\n updateStock(data: $_data, where: $_where) {\n id\n date\n onHand\n onOrder\n avgPVP\n }\n}\n","variables":"{\"_data\":{\"date\":\"2018-04-10T23:00:00.000Z\",\"onHand\":3},\"_where\":{\"id\":\"2195782_3266192\"}}","code":"0","stack_trace":"java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)\n java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)\n java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)\n slick.util.AsyncExecutor$$anon$2$$anon$3.execute(AsyncExecutor.scala:120)\n slick.basic.BasicBackend$DatabaseDef.runSynchronousDatabaseAction(BasicBackend.scala:233)\n slick.basic.BasicBackend$DatabaseDef.runSynchronousDatabaseAction$(BasicBackend.scala:231)\n slick.jdbc.JdbcBackend$DatabaseDef.runSynchronousDatabaseAction(JdbcBackend.scala:38)\n slick.basic.BasicBackend$DatabaseDef.runInContext(BasicBackend.scala:210)\n slick.basic.BasicBackend$DatabaseDef.runInContext$(BasicBackend.scala:142)\n slick.jdbc.JdbcBackend$DatabaseDef.runInContext(JdbcBackend.scala:38)\n slick.basic.BasicBackend$DatabaseDef.runInternal(BasicBackend.scala:78)\n slick.basic.BasicBackend$DatabaseDef.runInternal$(BasicBackend.scala:77)\n slick.jdbc.JdbcBackend$DatabaseDef.runInternal(JdbcBackend.scala:38)\n slick.basic.BasicBackend$DatabaseDef.run(BasicBackend.scala:75)\n slick.basic.BasicBackend$DatabaseDef.run$(BasicBackend.scala:75)\n slick.jdbc.JdbcBackend$DatabaseDef.run(JdbcBackend.scala:38)\n com.prisma.api.connector.mysql.database.DataResolverImpl.$anonfun$batchResolveByUnique$1(DataResolverImpl.scala:113)\n com.prisma.metrics.TimerMetric.timeFuture(Metrics.scala:111)\n com.prisma.api.connector.mysql.database.DataResolverImpl.performWithTiming(DataResolverImpl.scala:34)\n com.prisma.api.connector.mysql.database.DataResolverImpl.batchResolveByUnique(DataResolverImpl.scala:113)\n com.prisma.api.connector.mysql.database.DataResolverImpl.resolveByUnique(DataResolverImpl.scala:71)\n com.prisma.api.mutations.Update.dataItem$lzycompute(Update.scala:36)\n com.prisma.api.mutations.Update.dataItem(Update.scala:36)\n com.prisma.api.mutations.Update.prepareMutactions(Update.scala:39)\n com.prisma.api.mutations.ClientMutationRunner$.run(ClientMutationRunner.scala:17)\n com.prisma.api.schema.SchemaBuilderImpl.$anonfun$updateItemField$2(SchemaBuilder.scala:147)\n sangria.execution.Resolver.resolveField(Resolver.scala:1024)\n sangria.execution.Resolver.resolveSingleFieldSeq(Resolver.scala:236)\n sangria.execution.Resolver.$anonfun$resolveSeq$2(Resolver.scala:216)\n scala.concurrent.Future.$anonfun$flatMap$1(Future.scala:302)\n scala.concurrent.impl.Promise.$anonfun$transformWith$1(Promise.scala:37)\n scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)\n akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)\n akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:91)\n scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)\n scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:81)\n akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:91)\n akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)\n akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:43)\n akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)\n akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)\n akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)\n akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)","message":"Task slick.basic.BasicBackend$DatabaseDef$$anon$2@56b17ecc rejected from slick.util.AsyncExecutor$$anon$2$$anon$1@564fb656[Running, pool size = 10, active threads = 10, queued tasks = 1000, completed tasks = 137185]"}}


#3

This is not a memory leak.

Mutations and queries result in “tasks” for the database/database driver. Tasks that aren’t worked on right now are put into a queue.

To protect the database, that queue can hold at most 1000 elements. If you try to add more tasks than that, they will be rejected. That’s pretty common 🙂

Note that one mutation or query most likely results in more than 1 task.

The solution is not to “bombard” your API with 5k mutations within milliseconds. Rather, implement an approach that restricts the maximum number of concurrent mutations.

For example, using Bluebird:

const Bluebird = require('bluebird')

const LIMIT = 8 // example
await Bluebird.map(
  req.data,
  record => db.mutation.updateStock(record),
  { concurrency: LIMIT }
)
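
If you don’t want to add Bluebird, a plain-Promise sketch with the same effect (reusing db and req.data from your handler above) is to process the records in fixed-size batches:

const BATCH_SIZE = 8 // at most this many mutations in flight at once

async function updateInBatches(records) {
  for (let i = 0; i < records.length; i += BATCH_SIZE) {
    const batch = records.slice(i, i + BATCH_SIZE)
    // wait for the whole batch to finish before starting the next one
    await Promise.all(batch.map(record => db.mutation.updateStock(record)))
  }
}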

#4

Hi nilan, thanks for the reply.

I see. I still don’t understand why restarting the docker container would allow for “bombarding” the database driver once without issues. Maybe failed tasks are not being removed from the queue?

Anyway, thanks for the workaround!

Marco


#5

How long are you waiting between two runs of the same script? It might be a memory/connection leak after all! Can you please prepare a reproduction?


#6
  1. Restart Docker
  2. Call script once (OK)
  3. Call script again (failed)
  4. Restart Docker
  5. Call script once (OK)

There is no waiting between any of the steps.

I’ll try to make an example for you, but it won’t be easy since I cannot disclose the data, so I’ll have to make it up.

Thanks


#7

Thanks!

Did you upgrade the dependencies in package.json, and are you on the latest Prisma?
Especially graphql-yoga and prisma-binding are of interest here.


#8

Here it is:

"prisma-binding": "1.4.0"
"prisma": "1.0.8"
"graphql-yoga": "1.2.0",
"graphql-cli": "2.12.4",

I’m currently thinking that this might be a problem on the MySQL side. I’ve noticed a big improvement when wrapping the updates in a BEGIN…END transaction (I had to use the mysql2 bindings for that).
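
For reference, a minimal sketch of what I mean by wrapping the updates in a transaction with mysql2 (connection settings, table and column names are made up here; this talks to MySQL directly and bypasses the Prisma API):

const mysql = require('mysql2/promise')

async function bulkUpdateStock(records) {
  const conn = await mysql.createConnection({
    host: 'localhost',
    user: 'root',
    password: 'secret',
    database: 'inventory',
  })
  try {
    await conn.beginTransaction() // BEGIN
    for (const r of records) {
      await conn.execute('UPDATE Stock SET onHand = ? WHERE id = ?', [r.onHand, r.id])
    }
    await conn.commit() // END
  } catch (err) {
    await conn.rollback()
    throw err
  } finally {
    await conn.end()
  }
}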

I will try to dig a bit more into this when I find the time.

Thanks


#9

Ok!

The latest versions are as follows:

"prisma-binding": "1.5.17",
"prisma": "1.6.3",
"graphql-yoga": "1.8.5",
"graphql-cli": "2.15.9",

So upgrading should already result in improved performance. There are a couple of relevant performance improvements that come to mind, for example this one in Prisma. Also, there is an upcoming performance improvement in prisma-binding which you should check out once it’s released.