The syntax for creating a keyspace in Cassandra is: CREATE KEYSPACE <identifier> WITH <properties>
In Cassandra, a collection of rows is referred to as a "column family".
DataStax OpsCenter: A web-based management and monitoring solution for Cassandra clusters and DataStax. It is free to download and includes an additional edition of OpsCenter.
SPM: SPM primarily monitors Cassandra metrics along with various OS and JVM metrics. Besides Cassandra, it also monitors Hadoop, Spark, Solr, Storm, ZooKeeper, and other big data platforms.
ALTER KEYSPACE can be used to change keyspace properties such as the number of replicas and the durable_writes setting.
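As an illustration of what the <properties> part usually contains, here is a minimal Python sketch that assembles a CREATE KEYSPACE statement as a string. The helper name create_keyspace_cql and the keyspace name demo are made up for this example, and the choice of SimpleStrategy is an assumption; the sketch only formats text and does not talk to a cluster.

```python
import re


def create_keyspace_cql(name, replication_factor=1, durable_writes=True):
    """Return a CREATE KEYSPACE statement (hypothetical helper, SimpleStrategy assumed)."""
    # Unquoted CQL identifiers start with a letter, followed by letters, digits or underscores.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
        raise ValueError(f"invalid keyspace name: {name!r}")
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")

    # The replication property is a map literal with single-quoted keys and values.
    replication = "{'class': 'SimpleStrategy', 'replication_factor': %d}" % replication_factor
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {name} "
        f"WITH replication = {replication} "
        f"AND durable_writes = {str(durable_writes).lower()};"
    )


print(create_keyspace_cql("demo", replication_factor=3))
```

The printed statement can be pasted into cqlsh; production clusters with several datacenters would more likely use NetworkTopologyStrategy instead.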
Node: A node is a single machine running Cassandra.
Cluster: A cluster is a collection of nodes that together store related data.
Datacenter: A datacenter is useful when serving customers in different geographical areas. The nodes of a cluster can be grouped into different datacenters.
A CQL collection in Cassandra is used to store multiple values in a single column. Cassandra supports the following CQL collection types:
* List: A list is used when the order of the data needs to be maintained and a value may be stored multiple times (holds an ordered list of elements that may repeat).
* Set: A set is used to store a group of elements that are returned in sorted order (holds unique elements).
* Map: A map is a data type used to store key-value pairs of elements.
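The difference between the three collection types can be sketched with plain Python containers. This is only an analogy under the assumption that Python's list, set and dict behave closely enough for illustration; it is not driver code, and the phone numbers are invented sample data.

```python
# Sample values with one deliberate duplicate.
phones_in_order = ["555-0101", "555-0199", "555-0101"]

# list: keeps insertion order and allows the same value more than once.
as_list = list(phones_in_order)

# set: keeps each value once; sorted here because Cassandra returns set elements in sorted order.
as_set = sorted(set(phones_in_order))

# map: key-value pairs; sorted by key here to mirror how Cassandra returns map entries.
as_map = dict(sorted({"work": "555-0199", "home": "555-0101"}.items()))

print(as_list)
print(as_set)
print(as_map)
```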
Apache Hadoop: file storage and grid compute processing via MapReduce.
Apache Hive: SQL-like interface on top of Hadoop.
Apache HBase: column family storage built like BigTable.
Apache Cassandra: column family storage built like BigTable, with Dynamo topology and consistency.
Database replication is the frequent electronic copying of data from a database on one computer or server to a database on another, so that all users share the same level of information.
Cassandra stores replicas on multiple nodes to ensure reliability and fault tolerance. A replication strategy determines the nodes where replicas are placed. The total number of replicas across the cluster is referred to as the replication factor. A replication factor of 1 means that there is only one copy of each row on one node. A replication factor of 2 means two copies of each row, where each copy is on a different node. All replicas are equally important; there is no primary or master replica. As a general rule, the replication factor should not exceed the number of nodes in the cluster. However, you can increase the replication factor and then add the desired number of nodes later.
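To make the idea of a replication factor concrete, here is a simplified Python sketch of ring-based placement: the first replica goes to the node that owns the key's token, and the remaining replicas go to the next nodes clockwise. This roughly follows the idea behind SimpleStrategy, but the MD5-based token function, the node names, and the helper replicas_for are assumptions made for this example, not Cassandra's actual partitioner or API.

```python
import bisect
import hashlib


def token(value):
    """Map a string to an integer position on the ring (MD5 used only for illustration)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


def replicas_for(key, nodes, replication_factor):
    """Return the nodes that would hold copies of `key` in this simplified ring."""
    if not nodes:
        raise ValueError("at least one node is required")
    ring = sorted((token(node), node) for node in nodes)
    tokens = [t for t, _ in ring]
    # First replica: the first node whose token is >= the key's token, wrapping around.
    start = bisect.bisect_left(tokens, token(key)) % len(ring)
    # There cannot be more distinct replicas than nodes currently in the cluster.
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


nodes = ["node-a", "node-b", "node-c", "node-d"]
print(replicas_for("user:42", nodes, replication_factor=2))
```

With a replication factor of 2 the function returns two different nodes for every key, and no node in the result is treated as a primary.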
A datacenter is a collection of related nodes.
The commit log is a crash-recovery mechanism in Cassandra. Every write operation is written to the commit log.
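The crash-recovery role of a commit log can be sketched in a few lines of Python: each write is appended to a log file before the in-memory state is updated, and a restarted process rebuilds its state by replaying the log. The class TinyStore, the JSON line format, and the in-memory dictionary are assumptions for illustration only; Cassandra's real commit log format and memtables are considerably more involved.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store that logs every write before applying it (illustrative only)."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.memtable = {}
        self._replay()

    def _replay(self):
        # On startup, rebuild the in-memory state from whatever the log contains.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                self.memtable[record["key"]] = record["value"]

    def write(self, key, value):
        # Append to the log and force it to disk before touching memory.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self.memtable[key] = value


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "commit.log")
        store = TinyStore(path)
        store.write("user:1", "alice")
        store.write("user:2", "bob")
        # Simulate a crash: discard the in-memory state and start again from the log.
        recovered = TinyStore(path)
        return recovered.memtable


print(demo())
```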
A Bloom filter is an off-heap, probabilistic data structure used to check whether an SSTable may contain the requested data before performing any disk I/O.
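A small Python sketch shows why such a filter is a cheap pre-check: it can answer "definitely not present" without touching disk, at the cost of occasional false positives. The class name, the bit-array size, and the use of SHA-256 with a seed prefix are assumptions for this example and do not reflect Cassandra's internal implementation.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, key):
        # Derive several bit positions per key by hashing the key with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means the key was never added; True means it may have been added.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


sstable_filter = BloomFilter()
for row_key in ["user:1", "user:2", "user:3"]:
    sstable_filter.add(row_key)

print(sstable_filter.might_contain("user:2"))
```

A read path would consult might_contain first and skip the SSTable entirely when it returns False.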
With zero consistency, write operations are handled asynchronously in the background. It is the fastest way to write data.
Kundera is an object-relational mapping (ORM) implementation for Cassandra that is written using Java annotations.
Starting Cassandra involves connecting to the machine where it is installed with the proper security credentials, and invoking the cassandra executable from the installation's binary directory. An example of starting Cassandra on a Mac could be: sudo /Applications/Cassandra/apache-cassandra-1.1.1/bin/cassandra