CouchDB



1. What is CouchDB?  

Short Answer

CouchDB is an open source NoSQL document store developed under the Apache Software Foundation. Its focus is on ease of use and on embracing the web.

It stores data as JSON documents, uses JavaScript as its query language to transform the documents, and exposes an HTTP API, so documents and indexes can be accessed and queried from a web browser. Damien Katz started the project in 2005, the same year it was first released, and it became an Apache project in 2008. It supports multi-master replication.

Long Answer

CouchDB is a database that completely embraces the web. Store your data with JSON documents. Access your documents and query your indexes with your web browser, via HTTP. Index, combine, and transform your documents with JavaScript. CouchDB works well with modern web and mobile apps. You can even serve web apps directly out of CouchDB. And you can distribute your data, or your apps, efficiently using CouchDB’s incremental replication.

CouchDB supports master-master setups with automatic conflict detection.
CouchDB comes with a suite of features, such as on-the-fly document transformation and real-time change notifications, that make web app development a breeze. It even comes with an easy-to-use web administration console. You guessed it, served up directly out of CouchDB!

We care a lot about distributed scaling. CouchDB is highly available and partition tolerant, but is also eventually consistent. And we care a lot about your data. CouchDB has a fault-tolerant storage engine that puts the safety of your data first.



2. What Language is CouchDB Written in?  

Short Answer

Erlang.

Long Answer

Erlang, a concurrent, functional programming language with an emphasis on fault tolerance. Early work on CouchDB was started in C++ but was later replaced by an implementation on the Erlang/OTP platform. Erlang has so far proven an excellent match for this project.

CouchDB’s default view server uses Mozilla’s SpiderMonkey JavaScript library, which is written in C. It also supports easy integration of view servers written in any language.



3. History of CouchDB?  

Couch is an acronym for "cluster of unreliable commodity hardware". The CouchDB project was created in April 2005 by Damien Katz, a former Lotus Notes developer at IBM. He self-funded the project for almost two years and released it as an open source project under the GNU General Public License.

In February 2008, it became an Apache Incubator project and was offered under the Apache License instead. A few months after, it graduated to a top-level project. This led to the first stable version being released in July 2010.

In early 2012, Katz left the project to focus on Couchbase Server.

Since Katz's departure, the Apache CouchDB project has continued, releasing 1.2 in April 2012 and 1.3 in April 2013. In July 2013, the CouchDB community merged the codebase for BigCouch, Cloudant's clustered version of CouchDB, into the Apache project. The BigCouch clustering framework was then prepared for inclusion in a later release of Apache CouchDB.

Native clustering has been supported since version 2.0.0. The Mango query server, introduced in the same release, provides a simple JSON-based way to perform CouchDB queries without JavaScript or MapReduce.
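
As a rough illustration, a Mango query is a JSON body sent to a database's _find endpoint. The sketch below builds such a body. The document fields (type, product_key) are hypothetical, and the matchesEquality helper is only a toy stand-in that shows what a flat selector means; it is not how CouchDB evaluates selectors.

```javascript
// Body for POST /{db}/_find. Field names are hypothetical examples.
const query = {
  selector: { type: "inventory_ticket", product_key: "hammer" },
  fields: ["_id", "_rev"],
  limit: 5,
};

// Toy matcher for flat equality selectors only. Real Mango also supports
// operators such as $gt, $in and $or, which this sketch ignores.
function matchesEquality(doc, selector) {
  return Object.keys(selector).every((key) => doc[key] === selector[key]);
}

const docs = [
  { _id: "hammer-1", type: "inventory_ticket", product_key: "hammer" },
  { _id: "saw-1", type: "inventory_ticket", product_key: "saw" },
];
const hits = docs.filter((doc) => matchesEquality(doc, query.selector));

console.log(JSON.stringify(query));
console.log(hits.map((doc) => doc._id)); // only the hammer ticket matches
```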



4. Basics of CouchDB?  

  1. Items stored in the database are documents.
  2. Each document is made up of units called fields, and each field is a key and value pair.
  3. Each document has two kinds of fields: metafields and datafields. Metafields hold information about the document itself, such as its id and revision number; datafields are user defined. The two mandatory metafields are the id and the revision, stored under the keys "_id" and "_rev" respectively.
  4. Metafields are identified by the prefix _; datafields are not allowed to start with _.
  5. Each document has a unique id that identifies it. The key is "_id", and its value must be unique within the database.
  6. Each document has a revision. The key is "_rev", and the value is generated by CouchDB: a number that increases by one each time the document is modified, followed by a hyphen and a hash. For example, 1-967a00dff5e02add41819138abb3284d is the 1st revision and 2-d9bca076e962cb389b7601ca9cfda2f9 is the 2nd revision.
  7. A document can have many metafields and datafields.
  8. Each document is a JSON object.
  9. Documents in CouchDB can have attachments. For example, to store an image with a document, you can include it inline as a base64-encoded value under the "_attachments" key.
  10. Database operations are done through REST-style HTTP requests.
  11. The requests can be made by any HTTP client that supports GET, PUT, DELETE and POST.
  12. CouchDB ships with a web interface called Futon (replaced by Fauxton as of version 2.0), with which you can create, update, retrieve and delete databases and their documents in place.
  13. CouchDB's responses are in JSON format.
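
To make the points above concrete, here is a minimal sketch of what a document could look like, plus two small helpers. The document contents and the helper names (parseRev, splitFields) are made up for illustration; only _id, _rev and _attachments are names CouchDB itself defines.

```javascript
// A hypothetical document. Underscore-prefixed keys are metafields.
const doc = {
  _id: "hammer-1",
  _rev: "2-d9bca076e962cb389b7601ca9cfda2f9",
  type: "inventory_ticket",
  claimed_by: null,
  _attachments: {
    "note.txt": {
      content_type: "text/plain",
      data: Buffer.from("hello").toString("base64"), // inline attachment body
    },
  },
};

// Split "N-hash" into the update counter and the hash suffix.
function parseRev(rev) {
  const dash = rev.indexOf("-");
  return { generation: Number(rev.slice(0, dash)), hash: rev.slice(dash + 1) };
}

// Separate underscore-prefixed metafields from user-defined datafields.
function splitFields(document) {
  const meta = {};
  const data = {};
  for (const [key, value] of Object.entries(document)) {
    (key.startsWith("_") ? meta : data)[key] = value;
  }
  return { meta, data };
}

const { meta, data } = splitFields(doc);
console.log(parseRev(doc._rev).generation); // 2
console.log(Object.keys(meta)); // [ '_id', '_rev', '_attachments' ]
console.log(Object.keys(data)); // [ 'type', 'claimed_by' ]
```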



5. Why CouchDB?  

CouchDB has an HTTP-based REST API, which makes it easy to communicate with the database. The simple structure of HTTP resources and methods (GET, PUT, DELETE) is easy to understand and use.

Because data is stored in a flexible document-based structure, there is no need to define a fixed schema up front.

CouchDB provides map/reduce views, a powerful data mapping mechanism that allows querying, combining, and filtering the stored information.
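
As a rough sketch of how operations line up with HTTP methods and resources, the helper below returns the method and path for a few common operations without sending anything. The function name and the operation names are invented for this example; the paths follow CouchDB's /{db} and /{db}/{docid} layout.

```javascript
// Map a few common operations to the HTTP request CouchDB expects.
// The operation names are invented for this sketch.
function couchRequest(operation, { db, id, rev, doc } = {}) {
  const dbPath = "/" + encodeURIComponent(db);
  const docPath = () => dbPath + "/" + encodeURIComponent(id);
  switch (operation) {
    case "createDatabase":
      return { method: "PUT", path: dbPath };
    case "deleteDatabase":
      return { method: "DELETE", path: dbPath };
    case "readDocument":
      return { method: "GET", path: docPath() };
    case "saveDocument": // doc must carry its current _rev when updating
      return { method: "PUT", path: docPath(), body: JSON.stringify(doc) };
    case "deleteDocument":
      return { method: "DELETE", path: docPath() + "?rev=" + encodeURIComponent(rev) };
    default:
      throw new Error("unknown operation: " + operation);
  }
}

console.log(couchRequest("createDatabase", { db: "shop" }));
console.log(couchRequest("deleteDocument", { db: "shop", id: "hammer-1", rev: "1-abc" }));
```

Any HTTP client can then send the described request to the server's address.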

CouchDB provides easy-to-use replication, using which you can copy, share, and synchronize the data between databases and machines.
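
For illustration, a replication is typically started by posting a small JSON body to the server's _replicate endpoint. The sketch below only builds that body. The helper name and the URLs are placeholders, not values from a real deployment.

```javascript
// Build the JSON body for POST /_replicate. The URLs are placeholders.
function replicationBody(source, target, { continuous = false, createTarget = false } = {}) {
  const body = { source, target };
  if (continuous) body.continuous = true; // keep syncing as changes arrive
  if (createTarget) body.create_target = true; // create the target database if missing
  return body;
}

const body = replicationBody(
  "http://localhost:5984/shop",
  "http://backup.example.com:5984/shop",
  { continuous: true, createTarget: true }
);
console.log(JSON.stringify(body, null, 2));
```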



6. Data Models of CouchDB?  

  • Database is the outermost data structure/container in CouchDB.
  • Each database is a collection of independent documents.
  • Each document maintains its own data and self-contained schema.
  • Document metadata contains revision information, which makes it possible to merge any differences that occurred while the databases were disconnected.
  • CouchDB implements multi-version concurrency control (MVCC), so that writes do not need to lock the database.


7. What is CouchDB Not?  

  • A relational database.
  • A replacement for relational databases.
  • An object-oriented database. Or, more specifically, it is not meant to function as a seamless persistence layer for an OO programming language.


8. Is CouchDB ready for Production?  

Yes. There are many companies using CouchDB.



9. Why Does CouchDB Not Use Mnesia?  

Several reasons:
  • The first is a storage limitation of 2 GB per file.
  • The second is that it requires a validation and fixup cycle after a crash or power failure, so even if the size limitation is lifted, the fixup time on large files is prohibitive.
  • Mnesia replication is suitable for clustering, but not disconnected, distributed edits. Most of the “cool” features of Mnesia aren’t really useful for CouchDB.
  • Also Mnesia isn’t really a general-purpose, large scale database. It works best as a configuration type database, the type where the data isn’t central to the function of the application, but is necessary for the normal operation of it. Think things like network routers, HTTP proxies and LDAP directories, things that need to be updated, configured and reconfigured often, but that configuration data is rarely very large.


10. How do I use transactions with CouchDB?  

Short Answer

CouchDB uses an “Optimistic concurrency” model. In the simplest terms, this just means that you send a document version along with your update, and CouchDB rejects the change if the current document version doesn’t match what you’ve sent.

You can re-frame many normal transaction based scenarios for CouchDB. You do need to sort of throw out your RDBMS domain knowledge when learning CouchDB, though.

It’s helpful to approach problems from a higher level, rather than attempting to mold Couch to a SQL based world.

Long Answer

Keeping track of inventory

The problem you outlined is primarily an inventory issue. If you have a document describing an item, and it includes a field for “quantity available”, you can handle concurrency issues like this:
  • Retrieve the document, take note of the _rev property that CouchDB sends along
  • Decrement the quantity field, if it’s greater than zero
  • Send the updated document back, using the _rev property
  • If the _rev matches the currently stored number, be done!
  • If there’s a conflict (when _rev doesn’t match), retrieve the newest document version

In this instance, there are two possible failure scenarios to think about. If the most recent document version has a quantity of 0, you handle it just like you would in an RDBMS and alert the user that they can’t actually buy what they wanted to purchase. If the most recent document version has a quantity greater than 0, you simply repeat the operation with the updated data, and start back at the beginning. This forces you to do a bit more work than an RDBMS would, and could get a little annoying if there are frequent, conflicting updates.
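
The steps above can be sketched as a small retry loop. To keep the example runnable without a server, ToyStore is a hypothetical in-memory stand-in for one database: it only mimics the rule that a write is accepted when the supplied _rev matches the stored one, and its revision strings are simplified.

```javascript
// In-memory stand-in for one CouchDB database (hypothetical, for illustration).
class ToyStore {
  constructor() {
    this.docs = new Map();
  }
  get(id) {
    const doc = this.docs.get(id);
    return doc ? { ...doc } : null;
  }
  // Accept the write only if the supplied _rev matches the stored one.
  put(doc) {
    const current = this.docs.get(doc._id);
    const currentRev = current ? current._rev : undefined;
    if (doc._rev !== currentRev) return { ok: false, error: "conflict" };
    const generation = current ? Number(current._rev.split("-")[0]) + 1 : 1;
    const stored = { ...doc, _rev: generation + "-toy" };
    this.docs.set(doc._id, stored);
    return { ok: true, rev: stored._rev };
  }
}

// Retrieve, decrement, send back; on conflict, fetch again and retry.
function buyOne(store, id, maxAttempts = 5) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const doc = store.get(id);
    if (doc.quantity <= 0) return "sold out";
    doc.quantity -= 1;
    if (store.put(doc).ok) return "purchased";
  }
  return "gave up";
}

const store = new ToyStore();
store.put({ _id: "hammer", quantity: 1 });
console.log(buyOne(store, "hammer")); // purchased
console.log(buyOne(store, "hammer")); // sold out
```

Against a real server, get and put would be HTTP requests, and a rejected put would show up as a conflict response rather than a return value.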

Now, the answer I just gave presupposes that you’re going to do things in CouchDB in much the same way that you would in an RDBMS. I might approach this problem a bit differently:

I’d start with a “master product” document that includes all the descriptor data (name, picture, description, price, etc). Then I’d add an “inventory ticket” document for each specific instance, with fields for product_key and claimed_by. If you’re selling a model of hammer, and have 20 of them to sell, you might have documents with keys like hammer-1, hammer-2, etc, to represent each available hammer.

Then, I’d create a view that gives me a list of available hammers, with a reduce function that lets me see a “total”. These are completely off the cuff, but should give you an idea of what a working view would look like.

Map
function (doc) {
  // Emit one row per unclaimed inventory ticket, keyed by product.
  if (doc.type == 'inventory_ticket' && doc.claimed_by == null) {
    emit(doc.product_key, { 'inventory_ticket': doc._id, '_rev': doc._rev });
  }
}

This gives me a list of available “tickets”, by product key. I could grab a group of these when someone wants to buy a hammer, then iterate through sending updates (using the id and _rev) until I successfully claim one (previously claimed tickets will result in an update error).
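
A minimal sketch of that claiming loop follows. The rows mimic the shape of the view output described above, and tryUpdate is a hypothetical callback standing in for the HTTP update: it returns true when the update is accepted and false when CouchDB would report a conflict.

```javascript
// Try each candidate ticket in turn until one update is accepted.
function claimTicket(rows, buyer, tryUpdate) {
  for (const row of rows) {
    const update = {
      _id: row.value.inventory_ticket,
      _rev: row.value._rev,
      type: "inventory_ticket",
      product_key: row.key,
      claimed_by: buyer,
    };
    if (tryUpdate(update)) return update._id;
  }
  return null; // every candidate was claimed by someone else first
}

// Simulated view rows and a fake updater in which hammer-1 is already taken.
const rows = [
  { key: "hammer", value: { inventory_ticket: "hammer-1", _rev: "1-aaa" } },
  { key: "hammer", value: { inventory_ticket: "hammer-2", _rev: "1-bbb" } },
];
const alreadyClaimed = new Set(["hammer-1"]);
const claimed = claimTicket(rows, "alice", (doc) => !alreadyClaimed.has(doc._id));
console.log(claimed); // hammer-2
```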

Reduce
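function (keys, values, rereduce) {
  // On rereduce, values holds partial counts; otherwise it holds emitted rows.
  return rereduce ? sum(values) : values.length;
}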

This reduce function simply returns the total number of unclaimed inventory_ticket items, so you can tell how many “hammers” are available for purchase.
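
To see the map and reduce steps work together, here is a small harness that runs them over sample documents in plain JavaScript. It is a simplification under stated assumptions: runView is a made-up helper, emit is passed in as a second argument (in CouchDB it is provided by the view server), and grouping and rereduce passes are left out.

```javascript
// Tiny stand-in for a view run: apply map to every document, then reduce
// all emitted rows in one pass.
function runView(docs, map, reduce) {
  const rows = [];
  const emit = (key, value) => rows.push({ key, value });
  for (const doc of docs) map(doc, emit);
  const keys = rows.map((row) => row.key);
  const values = rows.map((row) => row.value);
  return { rows, reduced: reduce(keys, values, false) };
}

const map = (doc, emit) => {
  if (doc.type === "inventory_ticket" && doc.claimed_by == null) {
    emit(doc.product_key, { inventory_ticket: doc._id, _rev: doc._rev });
  }
};

// Count rows on the first pass; add up partial counts on a rereduce pass.
const reduce = (keys, values, rereduce) =>
  rereduce ? values.reduce((total, n) => total + n, 0) : values.length;

// Hypothetical sample data: one product document and three tickets.
const docs = [
  { _id: "hammer", type: "product", name: "Hammer" },
  { _id: "hammer-1", _rev: "1-aaa", type: "inventory_ticket", product_key: "hammer", claimed_by: null },
  { _id: "hammer-2", _rev: "2-bbb", type: "inventory_ticket", product_key: "hammer", claimed_by: "bob" },
  { _id: "hammer-3", _rev: "1-ccc", type: "inventory_ticket", product_key: "hammer", claimed_by: null },
];

const result = runView(docs, map, reduce);
console.log(result.rows.map((row) => row.value.inventory_ticket)); // [ 'hammer-1', 'hammer-3' ]
console.log(result.reduced); // 2
```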

Caveats

This solution represents roughly 3.5 minutes of total thinking for the particular problem you’ve presented. There may be better ways of doing this! That said, it does substantially reduce conflicting updates, and cuts down on the need to respond to a conflict with a new update. Under this model, you won’t have multiple users attempting to change data in the primary product entry.



11. How do you compare MongoDB, CouchDB and CouchBase?  

MongoDB and CouchDB are both document-oriented databases, and they are among the most typical representatives of open source NoSQL databases.

Beyond storing data as documents, they have little in common. Their data model interfaces, object storage, and replication methods differ in many ways.



12. How is PouchDB different from CouchDB?  

PouchDB is also a CouchDB client, and you should be able to switch between a local database and an online CouchDB instance without changing any of your application’s code.

However, there are some minor differences to note:

View Collation – CouchDB uses ICU to order keys in a view query; in PouchDB they are ASCII ordered.

View Offset – CouchDB returns an offset property in the view results. In PouchDB, offset just mirrors the skip parameter rather than returning a true offset.



13. So is CouchDB now going to be written in Java?  

Erlang is a great fit for CouchDB and I have absolutely no plans to move the project off its Erlang base. IBM/Apache’s only concern is that we remove license-incompatible 3rd party source code bundled with the project, a fundamental requirement for any Apache project. So some things may have to be replaced in the source code (possibly Mozilla SpiderMonkey), but the core Erlang code stays.

An important goal is to keep interfaces in CouchDB simple enough that creating compatible implementations on other platforms is feasible. CouchDB has already inspired the database projects RDDB and Basura. Like SQL databases, I think CouchDB needs competition and an ecosystem to be viable long term. So Java or C++ versions might be created and I would be delighted to see them, but it likely won’t be me who does it.



14. What does IBM’s involvement mean for CouchDB and the community?  

The main consequences of IBM’s involvement are:
  • The code is now being Apache licensed, instead of GPL.
  • Damien is going to be contributing much more time!


15. Mention the main features of CouchDB?  

JSON Documents – Everything stored in CouchDB boils down to a JSON document.

RESTful Interface – From creation to replication to data insertion, every management and data task in CouchDB can be done via HTTP.

N-Master Replication – You can make use of an unlimited number of ‘masters’, making for some very interesting replication topologies.

Built for Offline – CouchDB can replicate to devices (like Android phones) that can go offline and handle data sync for you when the device is back online.

Replication Filters – You can filter precisely the data you wish to replicate to different nodes.
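
As a hedged sketch of how a replication filter can look: a filter is a JavaScript function that receives each document and the request, and returns true for documents that should be replicated. The owner field, the design document name app, and the URLs below are hypothetical. In a real design document the function is stored as a string under its filters key.

```javascript
// Filter function: replicate only documents owned by the requested user.
function byOwner(doc, req) {
  return doc.owner === req.query.owner;
}

// Replication body that refers to the filter as "designdoc/filtername".
const replication = {
  source: "http://localhost:5984/invoices",
  target: "http://localhost:5984/invoices_alice",
  filter: "app/byOwner",
  query_params: { owner: "alice" },
};

// Local dry run of the filter against sample documents.
const sample = [
  { _id: "inv-1", owner: "alice" },
  { _id: "inv-2", owner: "bob" },
];
const req = { query: replication.query_params };
const selected = sample.filter((doc) => byOwner(doc, req)).map((doc) => doc._id);
console.log(selected); // [ 'inv-1' ]
```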



16. What is the use of CouchDB?  

CouchDB allows you to write a client-side application that talks directly to the Couch without the need for a server-side middle layer, significantly reducing development time. With CouchDB, you can handle growing demand by adding more replication nodes. CouchDB allows you to replicate the database to your client, and with filters you could even replicate only that specific user’s data.

Having the database stored locally means your client side application can run with almost no latency. CouchDB will handle the replication to the cloud for you. Your users could access their invoices on their mobile phone and make changes with no noticeable latency, all whilst being offline. When a connection is present and usable, CouchDB will automatically replicate those changes to your cloud CouchDB.

CouchDB is a database designed to run on the internet of today for today’s desktop-like applications and the connected devices through which we access the internet.



17. What is CouchdbKit?  

Couchdbkit’s goal is to provide a framework for your Python application to access and manage CouchDB. It provides a full-featured and easy-to-use client that lets you manage a CouchDB server, its databases and documents, and access views. Most objects mirror Python objects for convenience. Server and Database objects, for example, can be used as easily as a dict.



18. How much stuff can be stored in Couchdb?  

With node partitioning, basically unlimited. The practical scaling limits for a single database instance are not yet known.



19. What does IBM’s involvement mean for Couchdb and community?  

The main impacts of IBM’s involvement are:
  • Damien is going to be contributing much more time.
  • Instead of GPL, the code is now being Apache licensed.


20. What does Couch mean?  

It's an acronym: Cluster of Unreliable Commodity Hardware. This is a statement of Couch's long-term goals of massive scalability and high reliability on fault-prone hardware. The distributed nature and flat address space of the database will enable node partitioning for storage scalability (with a map/reduce style query facility) and clustering for reliability and fault tolerance.


