{"id":160,"date":"2020-08-30T23:07:58","date_gmt":"2020-08-30T23:07:58","guid":{"rendered":"https:\/\/www.ridgeline-analytics.com\/?p=160"},"modified":"2020-09-30T02:54:25","modified_gmt":"2020-09-30T02:54:25","slug":"what-is-apache-cassandra","status":"publish","type":"post","link":"https:\/\/www.ridgeline-analytics.com\/index.php\/2020\/08\/30\/what-is-apache-cassandra\/","title":{"rendered":"What is Apache Cassandra?"},"content":{"rendered":"\n<p>Apache Cassandra is a distributed, highly scalable,&nbsp; high performance NoSQL database.&nbsp; It offers several advantages over traditional relational database management systems (RDBMS), particularly in write-intensive, globally distributed, high availability situations that span geographies and datacenters.<\/p>\n\n\n\n<p>Apache Cassandra is Open Source and distributed under the Apache 2.0 license.&nbsp; Originally developed by Facebook, Cassandra is used by many global enterprises including Apple, Cisco, and Netflix.<\/p>\n\n\n\n<p>Unlike a traditional RDBMS, Apache Cassandra is a NoSQL database.&nbsp; Rather than relying on related tables to describe data, Cassandra uses a simplified data storage architecture known as \u2018wide column\u2019.&nbsp; This keeps the simplicity of key-value storage while letting the data stored vary from row to row, which allows for the flexibility of tabular data storage.&nbsp; Cassandra also has the concept of \u2018Column Families\u2019, which group columns into tables, but rows do not all need to contain the same columns.&nbsp; Key-value storage is fundamental to many NoSQL databases because it allows for fast indexing, writes, and retrieval.<\/p>\n\n\n\n<p>NoSQL databases typically forgo complex transactions and always-guaranteed consistency in favor of a highly scalable model that is well suited to internet-scale applications.&nbsp; In Cassandra, the consistency level is tunable per operation, ranging from eventual to strong consistency.<\/p>\n\n\n\n<p>Apache Cassandra is masterless, meaning that all nodes in a Cassandra cluster are active and communicating with each 
other. Any node in the cluster can accept and serve requests, and if a given node fails, traffic can be automatically redirected to another active node with no need for complex master-slave replication schemes.&nbsp; Cassandra automatically distributes and maintains data across the cluster with no need for complex sharding and disk partitioning.<\/p>\n\n\n\n<p>Additionally, Cassandra\u2019s replication approach is much simpler than multi-master or master-slave architectures.&nbsp; Once a replication schema is created, it is automatically managed across all nodes of the cluster without the need for any additional administration.<\/p>\n\n\n\n<p>Cassandra also exposes an SQL-like query and management interface called Cassandra Query Language (CQL), which lets developers and administrators work with the system using familiar SQL-style queries.<\/p>\n\n\n\n<p><strong>Why use Apache Cassandra?<\/strong><\/p>\n\n\n\n<p>Due to its highly distributed and fault-tolerant nature, Apache Cassandra is well suited to globally distributed, write- and read-intensive applications that require high availability and high scalability.&nbsp; A few examples include social media data, IoT sensor data, user tracking, and messaging applications.<\/p>\n\n\n\n<p>A common reason to use Apache Cassandra is to locate highly available database clusters close to end users in a globally available application.&nbsp; Since Cassandra nodes can be replicated across any type of infrastructure, including private, public, and hybrid clouds, Cassandra is well suited to geographically distributed applications. 
&nbsp; Reads and writes can be delivered with low latency, close to the end user, and replicated throughout the cluster from any node.&nbsp; This is especially important in high throughput scenarios where locating data infrastructure close to the end user can result in a significant reduction in bandwidth costs.<\/p>\n\n\n\n<p>Additionally, Apache Cassandra is well suited for applications that may require significant scaling up or down.&nbsp; Adding and removing nodes in a Cassandra cluster is simple and requires no downtime.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Apache Cassandra is a distributed, highly scalable,&nbsp; high performance NoSQL database.&nbsp; It offers several advantages over traditional relational database management systems (RDBMS), particularly in write-intensive, globally distributed, high availability situations that span geographies and datacenters. Apache Cassandra is Open Source and distributed under the Apache 2.0 license.&nbsp; Originally developed by Facebook, Cassandra is used by &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/www.ridgeline-analytics.com\/index.php\/2020\/08\/30\/what-is-apache-cassandra\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;What is Apache 
Cassandra?&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/www.ridgeline-analytics.com\/index.php\/wp-json\/wp\/v2\/posts\/160"}],"collection":[{"href":"https:\/\/www.ridgeline-analytics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.ridgeline-analytics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.ridgeline-analytics.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.ridgeline-analytics.com\/index.php\/wp-json\/wp\/v2\/comments?post=160"}],"version-history":[{"count":1,"href":"https:\/\/www.ridgeline-analytics.com\/index.php\/wp-json\/wp\/v2\/posts\/160\/revisions"}],"predecessor-version":[{"id":161,"href":"https:\/\/www.ridgeline-analytics.com\/index.php\/wp-json\/wp\/v2\/posts\/160\/revisions\/161"}],"wp:attachment":[{"href":"https:\/\/www.ridgeline-analytics.com\/index.php\/wp-json\/wp\/v2\/media?parent=160"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.ridgeline-analytics.com\/index.php\/wp-json\/wp\/v2\/categories?post=160"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.ridgeline-analytics.com\/index.php\/wp-json\/wp\/v2\/tags?post=160"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}