As a truly distributed system, Cassandra does not use a master. All clusters have equal permissions and can process every database request, which significantly increases performance. Data is distributed across nodes. The system can also be easily scaled by simply adding more nodes. After installing Cassandra, all you have to do is distribute the configuration files to the new nodes. Cassandra provides tools for this.
Apache Cassandra features a configurablereplication system to ensure resilience and recovery of data in the event of a failure. Fault tolerance is minimized because the data is automatically replicated between the nodes. Failed nodes can be easily replaced. The system remains available for requests at all times.
Cassandra also offers high availability and partition tolerance. According to the CAP theorem in computer science, it is impossible to guarantee consistency, availability, and partition tolerance at the same time. Consistency, meaning that all nodes see the same data at all times, has the lowest priority in many big data systems. After a failure, consistency can be quickly restored through data recovery, whereas the other two properties must be ensured at all times.
Cassandra databases support the MapReduce programming model developed by Google for calculations involving large amounts of data in distributed systems. The proprietary query language CQL (Cassandra Query Language) is designed especially for the data structures of Cassandra.