Search Results
6 total results found
Concepts in Fault Tolerance
Concepts relevant to Fault Tolerance, Reliability and Resiliency.
gRPC and Fault Tolerance
Using gRPC to create fault-tolerant and resilient systems
etcd
etcd is a fault-tolerant, fully consistent distributed key-value store
Network Partition
Consider a set of machines connected in a network of some arbitrary topology, with the implicit expectation that every machine in the set can talk to any other machine in the set. Note we normally refer to a process in one machine sending messages to a proces...
gRPC deadlines and retries
To have transparent retries on RPCs from a gRPC client, it is desirable to configure a deadline for the RPC, e.g., 10 seconds, either programmatically or via a service_config.json file (timeout parameter). Enabling automated retries during that period, so that ...
Introduction to etcd
etcd is a distributed, reliable key-value store. It can be used as a cornerstone service to implement highly available distributed systems such as Kubernetes. It is open source and available from GitHub. Automated failover, consensus, and etcd Historically...
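
The "gRPC deadlines and retries" entry above mentions setting a timeout in a service_config.json file and enabling retries within that deadline. A minimal sketch of what such a config might look like, built here in Python, follows. The function name `build_service_config`, the service name `example.Greeter`, and the specific backoff values are assumptions for illustration only; they do not come from the linked page.

```python
import json


def build_service_config(service, timeout_s=10.0, max_attempts=4):
    """Return a gRPC service config (JSON string) with a timeout and retry policy.

    All values are illustrative assumptions, not settings taken from the wiki page.
    """
    config = {
        "methodConfig": [
            {
                # Apply to every method of the named service.
                "name": [{"service": service}],
                # Per-RPC deadline, written as a duration string such as "10s".
                "timeout": f"{timeout_s:g}s",
                # Retries happen only while the deadline has not expired.
                "retryPolicy": {
                    "maxAttempts": max_attempts,
                    "initialBackoff": "0.1s",
                    "maxBackoff": "1s",
                    "backoffMultiplier": 2,
                    "retryableStatusCodes": ["UNAVAILABLE"],
                },
            }
        ]
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # "example.Greeter" is a hypothetical service name.
    print(build_service_config("example.Greeter"))
```

With the grpcio package, a string like this can typically be passed to a channel through the `grpc.service_config` channel option, alongside `grpc.enable_retries`; check the gRPC documentation for your version before relying on that.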