I'm not seeing titles appear in my application. Is the API service down, or has it changed?
Message edited by Brent 3 years ago
Brent – 3 years ago
Looking more closely, all calls are returning a 500 code, Internal Error.
mikey – 3 years ago
Yeah, it looks like one of our servers is misbehaving. We are restarting it, and it should be fixed shortly. Thanks for the heads up!
=-mikey-=
Brent – 3 years ago
It might be good to have a status page, or some automatic monitoring of the number of 500 errors occurring. When the API servers act up, it's us they blame, not Netflix.
mikey – 3 years ago
We do have automatic monitoring in place. I'm not sure why it didn't pick this up.
Michael Hart – 3 years ago
Brent, we actually have redundant monitoring. One system monitors server error messages and the other pings API endpoints from the outside. These errors were intermittent enough that they shouldn't trigger the external monitoring. They were also below the thresholds for our internal monitoring to kick in. We're adjusting the thresholds, as we think this level of errors should have initiated an investigation.
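For anyone who wants similar threshold-based checking on the client side, here is a rough sketch of the idea, not the monitoring described above. It assumes you alert when the share of 5xx responses in a sliding window of recent calls reaches a cutoff; the class name `ErrorRateMonitor`, the window size, and the threshold values are all hypothetical.

```python
from collections import deque


class ErrorRateMonitor:
    """Track the share of HTTP 5xx responses among the most recent requests.

    Hypothetical helper for illustration; window_size and threshold are
    assumed values, not settings taken from any real service.
    """

    def __init__(self, window_size=100, threshold=0.05):
        # deque with maxlen silently drops the oldest status when full
        self.statuses = deque(maxlen=window_size)
        self.threshold = threshold

    def record(self, status_code):
        """Store the status code of one completed request."""
        self.statuses.append(status_code)

    def error_rate(self):
        """Return the fraction of 5xx responses in the current window."""
        if not self.statuses:
            return 0.0
        errors = sum(1 for s in self.statuses if 500 <= s <= 599)
        return errors / len(self.statuses)

    def should_alert(self):
        """True once the windowed error rate reaches the threshold."""
        return self.error_rate() >= self.threshold


# One failure in ten calls stays under a 20% threshold, so no alert fires.
monitor = ErrorRateMonitor(window_size=10, threshold=0.2)
for status in [200] * 9 + [500]:
    monitor.record(status)
print(monitor.error_rate(), monitor.should_alert())
```

With a threshold set too high, intermittent failures like the one in the example never raise an alert, which is the kind of gap that lowering the threshold is meant to close.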