<?xml version="1.0"?>
<rss version="2.0" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:media="http://search.yahoo.com/mrss/" xmlns:yt="http://gdata.youtube.com/schemas/2007" xmlns:atom="http://www.w3.org/2005/Atom">
   <channel>
      <title>DataStax</title>
      <description>Pipes Output</description>
      <link>http://pipes.yahoo.com/pipes/pipe.info?_id=7f54951efec56fa0eb58da70d542b1c9</link>
      <atom:link rel="next" href="http://pipes.yahoo.com/pipes/pipe.run?_id=7f54951efec56fa0eb58da70d542b1c9&amp;_render=rss&amp;page=2"/>
      <pubDate>Sun, 26 May 2013 07:38:21 +0000</pubDate>
      <generator>http://pipes.yahoo.com/pipes/</generator>
      <item>
         <title>DataStax Enterprise 3.0.2 Now Available</title>
         <link>http://www.datastax.com/dev/blog/datastax-enterprise-3-0-2-now-available</link>
         <description>&lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.datastax.com/products/enterprise&quot;&gt;DataStax Enterprise&lt;/a&gt; 3.0.2 is now available for &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.datastax.com/download&quot;&gt;download&lt;/a&gt;. Please see the &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.datastax.com/docs/datastax_enterprise3.0/dse_release_notes#datastax-enterprise-3-0-2&quot;&gt;release notes&lt;/a&gt; for specific information about bug fixes and improvements.</description>
         <guid isPermaLink="false">http://www.datastax.com/?post_type=dev-post&amp;p=15883</guid>
         <pubDate>Thu, 23 May 2013 10:00:12 +0000</pubDate>
         <content:encoded><![CDATA[<p><a rel="nofollow" target="_blank" href="http://www.datastax.com/products/enterprise">DataStax Enterprise</a> 3.0.2 is now available for <a rel="nofollow" target="_blank" href="http://www.datastax.com/download">download</a>.  Please see the <a rel="nofollow" target="_blank" href="http://www.datastax.com/docs/datastax_enterprise3.0/dse_release_notes#datastax-enterprise-3-0-2">release notes</a> for specific information about bug fixes and improvements.</p>]]></content:encoded>
      </item>
      <item>
         <title>DataStax Enterprise 3.0.2 Now Available</title>
         <link>http://www.datastax.com/2013/05/datastax-enterprise-3-0-2-now-available</link>
         <description>&lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.datastax.com/products/enterprise&quot;&gt;DataStax Enterprise&lt;/a&gt; 3.0.2 is now available for &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.datastax.com/download&quot;&gt;download&lt;/a&gt;. Please see the &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.datastax.com/docs/datastax_enterprise3.0/dse_release_notes#datastax-enterprise-3-0-2&quot;&gt;release notes&lt;/a&gt; for specific information about bug fixes and improvements.</description>
         <guid isPermaLink="false">http://www.datastax.com/?p=15885</guid>
         <pubDate>Thu, 23 May 2013 08:00:58 +0000</pubDate>
         <content:encoded><![CDATA[<p><a rel="nofollow" target="_blank" href="http://www.datastax.com/products/enterprise">DataStax Enterprise</a> 3.0.2 is now available for <a rel="nofollow" target="_blank" href="http://www.datastax.com/download">download</a>.  Please see the <a rel="nofollow" target="_blank" href="http://www.datastax.com/docs/datastax_enterprise3.0/dse_release_notes#datastax-enterprise-3-0-2">release notes</a> for specific information about bug fixes and improvements.</p>]]></content:encoded>
         <category>Blog Post - Corporate</category>
      </item>
      <item>
         <title>Ideology and Testing of a Resilient Driver</title>
         <link>http://www.datastax.com/dev/blog/ideology-and-testing-of-a-resilient-driver</link>
         <description>Recently, DataStax released version 1.0.0 of our Java Driver for Apache Cassandra...</description>
         <guid isPermaLink="false">http://www.datastax.com/?post_type=dev-post&amp;p=15825</guid>
         <pubDate>Mon, 20 May 2013 23:55:32 +0000</pubDate>
         <content:encoded><![CDATA[<p>Recently, DataStax released version 1.0.0 of our Java Driver for Apache Cassandra. For the Java Driver, we focused heavily on ensuring that optimizations were handled gracefully, that data was successfully written and verified against a Cassandra instance, and most importantly, that the driver could handle real-world cluster changes and issues as intended.</p>
<p>We began testing our work in this area by starting with our load balancing policies. For the Java Driver, we use the LoadBalancingPolicy interface and provide three load balancing policies by default: RoundRobinPolicy (default), TokenAwarePolicy, and DCAwareRoundRobinPolicy.</p>
<h2>RoundRobinPolicy</h2>
<p>For the RoundRobinPolicy, testing was fairly straightforward. We tested not only that successive requests were spread across all active nodes, but also that the active node list was accurate and promptly updated after topology changes. This can be done without worrying about where the data will finally be located, since Cassandra implements <a rel="nofollow" target="_blank" href="http://www.datastax.com/docs/1.2/cluster_architecture/about_client_requests">coordinator</a> roles when issuing requests.</p>
<p>The client can connect to any cluster member, which acts as a coordinator for the requests it receives. In the event that the request fails, or the acting coordinator node dies, a subsequent request using the RoundRobinPolicy will choose another peer node as a coordinator and should complete the request successfully once the specified ConsistencyLevels are met.</p>
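<p>The rotation and fail-over behavior described above can be illustrated with a minimal sketch. It assumes a fixed host list and uses a hypothetical class named RoundRobinSketch; it is not the RoundRobinPolicy implementation shipped in the driver, only a stand-in that shows successive requests rotating across live nodes and skipping a coordinator that has gone down.</p>

```java
import java.util.Arrays;

// Hypothetical stand-in for a round-robin policy; not the driver API.
final class RoundRobinSketch {
    private final String[] hosts; // known cluster members
    private final boolean[] up;   // liveness flag per host
    private int next = 0;         // index of the next candidate coordinator

    RoundRobinSketch(String... hosts) {
        this.hosts = hosts.clone();
        this.up = new boolean[hosts.length];
        Arrays.fill(this.up, true);
    }

    // Record a topology change for one host.
    void setUp(String host, boolean isUp) {
        for (int i = 0; i < hosts.length; i++) {
            if (hosts[i].equals(host)) {
                up[i] = isUp;
            }
        }
    }

    // Return the next live host as coordinator, or null if none is live.
    String nextCoordinator() {
        for (int tries = 0; tries < hosts.length; tries++) {
            int i = next;
            next = (next + 1) % hosts.length;
            if (up[i]) {
                return hosts[i];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        RoundRobinSketch policy = new RoundRobinSketch("10.0.0.1", "10.0.0.2", "10.0.0.3");
        System.out.println(policy.nextCoordinator()); // 10.0.0.1
        policy.setUp("10.0.0.2", false);              // simulate a node failure
        System.out.println(policy.nextCoordinator()); // 10.0.0.3, skipping the dead node
    }
}
```

<p>Any live node is an acceptable answer here because, as noted above, whichever node receives the request acts as its coordinator.</p>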
<h2>TokenAwarePolicy</h2>
<p>The TokenAwarePolicy is a more interesting case, since this policy also requires a child policy that dictates which nodes are local and which are remote, as far as the driver is concerned. For cases with a single datacenter, the TokenAwarePolicy chooses the primary replica as the coordinator in hopes of cutting down latency by avoiding the typical coordinator-replica hop.</p>
<p>To test this policy, we tried multiple setups with different child policies as well as multiple datacenters, some of which were programmed to disappear altogether, simulating full datacenter outages. Although we did find possible minor issues in this area, fail-over worked as expected. Further fail-over optimizations may still be made, but two things are certain: token-aware optimization works, and fail-over scenarios continued without a hiccup.</p>
<p>Cassandra&#8217;s <a rel="nofollow" target="_blank" href="http://www.datastax.com/docs/1.1/cluster_architecture/partitioning#data-distribution-in-the-ring">token</a> policy is handled innately, abstracted away from the user, in order to provide a simple and robust environment for both operations and developer teams. It&#8217;s this same simple and robust <a rel="nofollow" target="_blank" href="http://www.datastax.com/docs/1.2/cluster_architecture/data_distribution#consistent-hashing">consistent hashing</a> that allows for the TokenAwarePolicy to optimize requests by contacting the ideal node. And of course, if this ever fails, the specified child policy becomes the active LoadBalancingPolicy as the search for a replica that is both alive and responsive continues.</p>
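<p>As a rough illustration of the routing idea, the sketch below assumes the token for a key has already been computed and that each node owns the ring range ending at its token. The class TokenAwareSketch and its methods are hypothetical names, and the child policy is reduced to a plain round robin; the real TokenAwarePolicy is more involved.</p>

```java
import java.util.Arrays;

// Hypothetical illustration of token-aware routing; names are not the driver API.
final class TokenAwareSketch {
    private final long[] tokens;   // sorted ring positions
    private final String[] owners; // owners[i] owns the range ending at tokens[i]
    private final boolean[] up;    // liveness flag per node
    private int childCursor = 0;   // state of the round-robin child policy

    TokenAwareSketch(long[] sortedTokens, String[] owners) {
        if (owners.length == 0 || owners.length != sortedTokens.length) {
            throw new IllegalArgumentException("need one owner per token");
        }
        this.tokens = sortedTokens.clone();
        this.owners = owners.clone();
        this.up = new boolean[owners.length];
        Arrays.fill(up, true);
    }

    void setUp(int index, boolean isUp) {
        up[index] = isUp;
    }

    // Index of the primary replica: first token >= keyToken, wrapping to 0.
    int primaryIndex(long keyToken) {
        int pos = Arrays.binarySearch(tokens, keyToken);
        if (pos < 0) {
            pos = -pos - 1; // convert to the insertion point
        }
        return pos == tokens.length ? 0 : pos;
    }

    // Prefer the primary replica; otherwise defer to the child policy.
    String coordinatorFor(long keyToken) {
        int primary = primaryIndex(keyToken);
        if (up[primary]) {
            return owners[primary];
        }
        return childPolicyNext();
    }

    // Child policy: round robin over live nodes, null if none is live.
    private String childPolicyNext() {
        for (int tries = 0; tries < owners.length; tries++) {
            int i = childCursor;
            childCursor = (childCursor + 1) % owners.length;
            if (up[i]) {
                return owners[i];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TokenAwareSketch policy = new TokenAwareSketch(
                new long[] {0L, 100L, 200L}, new String[] {"n1", "n2", "n3"});
        System.out.println(policy.coordinatorFor(150L)); // n3 owns the range ending at 200
        policy.setUp(2, false);                          // primary replica goes down
        System.out.println(policy.coordinatorFor(150L)); // n1, chosen by the child policy
    }
}
```

<p>Contacting the primary replica directly is what saves the coordinator-replica hop; the child policy only comes into play when that replica is unavailable.</p>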
<h2>DCAwareRoundRobinPolicy</h2>
<p>The DCAwareRoundRobinPolicy builds on the same fundamentals as the RoundRobinPolicy, but introduces support for <a rel="nofollow" target="_blank" href="http://www.datastax.com/docs/1.2/cluster_architecture/data_distribution#networktopologystrategy">multiple datacenters</a>. To test this policy, we implemented full datacenter outage cases and fail-over assertions, much like our TokenAwarePolicy tests. Although this may not initially seem highly beneficial, the case where datacenters exist in different parts of the world highlights why this is the best policy when dealing with remote datacenters.</p>
<p>Let&#8217;s take the use case of a Cassandra cluster that is split up into 3 Amazon regions: US-East, US-West, and EU-West with a keyspace replication_strategy of {&#8216;US-East&#8217;: 3, &#8216;US-West&#8217;: 1, &#8216;EU-West&#8217;: 1}, where US-West and EU-West are designed to primarily be for backup or fail-over scenarios for the given keyspace.</p>
<p>In the ideal situation, you wouldn&#8217;t want to spread your requests across a WAN since a climb in latency will be apparent for requests that have to cross the Atlantic Ocean. Instead, you would want to constrain your application to contact only the nodes on the LAN to cut down on this latency.</p>
<p>This does not mean that data will only stay in US-East, for this specific replication_strategy, but as far as your driver is concerned, it&#8217;s the primary datacenter. In the background, data is being migrated in batch for every write without developer interaction. (Although from an operations standpoint, <a rel="nofollow" target="_blank" href="http://www.datastax.com/docs/1.0/operations/cluster_management#running-routine-node-repair">routine repairs</a> are required to guarantee consistency.)</p>
<p>Even if datacenters are not physically separated by great distances, it&#8217;s best to choose a single datacenter and continue to write to just that one. This ensures all requests made to the cluster, for a given session, will work over the same data. This is the way that fail-over works with the DCAwareRoundRobinPolicy as well. If US-East ever goes down, a single datacenter is guaranteed to be chosen as the interim primary datacenter until your primary datacenter returns, thus providing a higher level of consistency at a lower latency. The DCAwareRoundRobinPolicy takes all this into consideration so that even new users can easily store data efficiently, without worrying too much about &#8220;how&#8221; it&#8217;s being stored, since this has already been implemented and provided as part of the driver.</p>
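<p>The local-first behavior and the single interim datacenter during an outage can be sketched as follows. This is a simplified model under stated assumptions: datacenters are listed in a fixed preference order with the local one first, and the class DcAwareSketch and its methods are hypothetical rather than the DCAwareRoundRobinPolicy API.</p>

```java
import java.util.Arrays;

// Hypothetical sketch of datacenter-aware round robin; not the driver API.
final class DcAwareSketch {
    private final String[] dcs;     // dcs[0] is the local (primary) datacenter
    private final String[][] hosts; // hosts[d] lists the nodes in dcs[d]
    private final boolean[][] up;   // liveness flag per node
    private final int[] cursor;     // round-robin position per datacenter

    DcAwareSketch(String[] dcs, String[][] hosts) {
        if (dcs.length != hosts.length) {
            throw new IllegalArgumentException("need one host list per datacenter");
        }
        this.dcs = dcs.clone();
        this.hosts = hosts;
        this.up = new boolean[dcs.length][];
        this.cursor = new int[dcs.length];
        for (int d = 0; d < dcs.length; d++) {
            up[d] = new boolean[hosts[d].length];
            Arrays.fill(up[d], true);
        }
    }

    // Simulate a full datacenter outage or recovery.
    void setDatacenterUp(String dc, boolean isUp) {
        for (int d = 0; d < dcs.length; d++) {
            if (dcs[d].equals(dc)) {
                Arrays.fill(up[d], isUp);
            }
        }
    }

    // Round robin inside the first datacenter that still has a live node.
    String nextCoordinator() {
        for (int d = 0; d < dcs.length; d++) {
            for (int tries = 0; tries < hosts[d].length; tries++) {
                int i = cursor[d];
                cursor[d] = (i + 1) % hosts[d].length;
                if (up[d][i]) {
                    return hosts[d][i];
                }
            }
        }
        return null; // no live node anywhere
    }

    public static void main(String[] args) {
        DcAwareSketch policy = new DcAwareSketch(
                new String[] {"US-East", "US-West", "EU-West"},
                new String[][] {{"east-1", "east-2", "east-3"}, {"west-1"}, {"eu-1"}});
        System.out.println(policy.nextCoordinator()); // east-1
        policy.setDatacenterUp("US-East", false);     // full outage of the local datacenter
        System.out.println(policy.nextCoordinator()); // west-1, the interim primary
    }
}
```

<p>Because the search always starts from the local datacenter, requests return to US-East as soon as one of its nodes is marked up again.</p>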
<h2>What&#8217;s Next?</h2>
<p>For our next major release, we&#8217;ve already begun our Jenkins integration which should help us ensure that our test suite, which already has a runtime of more than an hour, is run daily against our master branch. This will ensure continuous stability as we grow both our test code and our product code.</p>
<p>We will also be building a long-running duration test to show that the driver continues to be accurate, performant, and stable. Even though our collection of 81 tests, covering 76% of the code, takes a full hour to run, each test runs against a small disposable cluster and exercises a very specific scenario. When we start running 72+ hour duration tests against our driver, in conjunction with simulated chaotic environments, we will continue to ensure that data is always valid on both sides of the pipe and that our fault-tolerant database has a fault-tolerant client driver that runs on the new binary protocol.</p>
<p>Do continue to look out for updates around our client drivers as we continue to grow our documentation, examples, and our <a rel="nofollow" target="_blank" href="http://datastax.com/careers">great team</a>.</p>
<p>And if all that&#8217;s not enough, you can always meet with us face-to-face at the <a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013">Cassandra Summit 2013&#8217;s</a> &#8220;Birds of a Feather&#8221; and &#8220;Stump the Experts&#8221; sessions!</p>]]></content:encoded>
      </item>
      <item>
         <title>Three Must-See Exec Talks at the Cassandra Summit</title>
         <link>http://www.datastax.com/2013/05/three-must-see-exec-talks-at-the-cassandra-summit</link>
         <description>The move from “Why?” to “How do I?” is an interesting transition that happens with new technology...</description>
         <guid isPermaLink="false">http://www.datastax.com/?p=15776</guid>
         <pubDate>Mon, 13 May 2013 20:48:31 +0000</pubDate>
         <content:encoded><![CDATA[<p>The move from “Why?” to “How do I?” is an interesting transition that happens with new technology. NoSQL is making that shift now, with some IT leaders still needing an understanding of what NoSQL can do for them, while others are more concerned with wanting an implementation strategy for rolling out NoSQL in their enterprise.</p>
<p>The good news is that we’ve got both bases covered in the executive track of our upcoming <a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013">Cassandra Summit</a>. This is new for us this year and is happening for a very important reason: you asked for it! At last year’s Summit, we fielded numerous requests for a track dedicated to the business side of NoSQL technology, and I guarantee that what we have in store for you at this year’s event won’t disappoint.</p>
<p>The fact that Cassandra has gone mainstream can be seen in the <a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#speakers">speaker lineup</a> that makes up this year’s Summit. In the executive track, we have a great set of tech execs who will be telling you both why they chose NoSQL and how they successfully implemented it in their organization.</p>
<p>Although <a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#schedule">all the talks</a> look great, I thought I’d call out three of the ones I won’t be missing:</p>
<ol>
<li><b>Taking Risks Without Risking Your Career, Christos Kalantzis, Netflix</b>. Netflix was just christened the <a rel="nofollow" target="_blank" href="http://www.zdnet.com/the-biggest-cloud-app-of-all-netflix-7000014298/">biggest cloud application in the world</a>. Netflix also stores 95% of its data on Cassandra today, having made the switch from Oracle a few years ago. How does a smart IT exec safely go from tried-and-true but old-and-won’t-do technology to new technology that can both transform how a company does business and save a significant sum of money in the process? Few have had that kind of success or tell the story better than Christos and Netflix, so who better to learn from?</li>
<li><b>Stop Crippling Your Business: Fundamental Considerations Everyone Needs to Know, Vincent Dell’Anno, Accenture; John Whittaker, Dell</b>. Few execs know how to evade the perils of not creating proper NoSQL standards and how to avoid the unnecessary red tape that can bog down IT projects better than the guys from Accenture and Dell.</li>
<li><b>It&#8217;s like your parents: Relational and NoSQL can co-exist, Sean Knapp, Ooyala</b>. Having a co-existence strategy for new and legacy technology is key when it comes to implementing something like NoSQL: use the right tech in the right place at the right time. When you distribute video content for the likes of ESPN and Rolling Stone, plus take in and analyze literally ¼ of all the video views on the Internet like Ooyala does, you learn a thing or two about how to smartly put the right puzzle pieces together so relational and non-relational play nice.</li>
</ol>
<p>Of course, there are other great exec talks from Splunk, Cowen, Jaspersoft, and more that will help you answer both the “Why?” and “How do I?” questions that IT leaders are asking about NoSQL. Check out the <a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#schedule">full schedule</a> and make plans to be at this year’s Summit, which will be here before you know it. <a rel="nofollow" target="_blank" href="http://datastax.regsvc.com/E2">Register now</a> and save 25% with the promo code <b>SFSummit25</b>.</p>
<p>See you there!</p>]]></content:encoded>
      </item>
      <item>
         <title>Using the DataStax ODBC Driver for Apache Cassandra</title>
         <link>http://www.datastax.com/dev/blog/using-the-datastax-odbc-driver-for-apache-cassandra</link>
         <description>DataStax is pleased to make available an ODBC driver for &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.datastax.com/what-we-offer/products-services/datastax-enterprise/apache-cassandra&quot;&gt;Apache Cassandra&lt;/a&gt; that can be used free of charge with both open source Cassandra and &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.datastax.com/what-we-offer/products-services/datastax-enterprise&quot;&gt;DataStax Enterprise&lt;/a&gt;.</description>
         <guid isPermaLink="false">http://www.datastax.com/?post_type=dev-post&amp;p=15752</guid>
         <pubDate>Fri, 10 May 2013 13:29:05 +0000</pubDate>
         <content:encoded><![CDATA[<p>DataStax is pleased to make available an ODBC driver for <a rel="nofollow" target="_blank" href="http://www.datastax.com/what-we-offer/products-services/datastax-enterprise/apache-cassandra">Apache Cassandra</a> that can be used free of charge with both open source Cassandra and <a rel="nofollow" target="_blank" href="http://www.datastax.com/what-we-offer/products-services/datastax-enterprise">DataStax Enterprise</a>. Using the DataStax ODBC driver for Cassandra, you can connect to a database cluster with your favorite BI tools (e.g. Tableau, Microsoft Excel, etc.) or other development software and access data stored on Cassandra nodes.</p>
<p>The DataStax ODBC driver for Cassandra was developed by ODBC software leader Simba Corporation, is compliant with the ODBC 3.52 specification, and runs on both 32- and 64-bit platforms.</p>
<p>Let’s take a quick walkthrough of how it works.</p>
<h3>Installing the DataStax ODBC Driver for Cassandra</h3>
<p>The DataStax ODBC driver for Cassandra is currently provided for Windows platforms, and can be <a rel="nofollow" target="_blank" href="http://www.datastax.com/download/clientdrivers">downloaded</a> from the DataStax website.</p>
<p>Installing the DataStax ODBC driver for Cassandra is simple. The only prerequisite on Windows is the Microsoft Visual C++ 2010 runtime, installed for the appropriate platform (either 32- or 64-bit).</p>
<p>The installation of the driver is completed in just a few steps.</p>
<p><a rel="nofollow" target="_blank" href="http://www.datastax.com/wp-content/uploads/2013/05/c-odbc-install-start.png"><img class="aligncenter size-medium wp-image-15762" alt="c odbc install start" src="http://www.datastax.com/wp-content/uploads/2013/05/c-odbc-install-start-250x193.png"/></a></p>
<p><a rel="nofollow" target="_blank" href="http://www.datastax.com/wp-content/uploads/2013/05/c-odbc-install-directory.png"><img class="aligncenter size-medium wp-image-15761" alt="c odbc install directory" src="http://www.datastax.com/wp-content/uploads/2013/05/c-odbc-install-directory-250x193.png"/></a></p>
<p>Note that you may need to install both the 32 and 64-bit versions of the driver if, for example, you have certain tools that are 32-bit but are running on a 64-bit box.</p>
<h3>Configuring the DataStax ODBC Driver for Cassandra</h3>
<p>Configuring the DataStax ODBC driver for Cassandra is easy. On Windows, you invoke the ODBC Data Source Administrator utility and first validate that the driver is present:</p>
<p><a rel="nofollow" target="_blank" href="http://www.datastax.com/wp-content/uploads/2013/05/c-odbc-define-datasource-start.png"><img class="aligncenter size-medium wp-image-15760" alt="c odbc define datasource start" src="http://www.datastax.com/wp-content/uploads/2013/05/c-odbc-define-datasource-start-250x187.png"/></a></p>
<p>Then, you create either a User or System DSN (data source name) that will be used by front-end business intelligence tools. You’ll need to enter a name for the datasource as well as the IP or host name of the Cassandra node you want to connect to, along with a keyspace that will be referenced by the driver. Once the information is entered, you can test your connection to ensure it connects to your cluster.</p>
<h3>Using the DataStax ODBC Driver for Cassandra</h3>
<p>Note that the current driver does not yet support version 3.0 of the Cassandra Query Language (CQL3), which is the default for Cassandra 1.2 and higher (an upcoming version of the driver will support CQL3). However, CQL2 is supported, so you will want to make sure that any objects you want to reference with the driver are created with CQL2.</p>
<p>The Windows installer for Cassandra 1.2 and higher uses CQL3 by default and creates a shortcut for the CQL utility that can be easily clicked on and run. To create a CQL utility shortcut that uses CQL2, just create a new shortcut with the following target executable (substituting your install location if you changed it from the default location):</p>
<p>&#8220;C:&#92;Program Files (x86)&#92;DataStax Community&#92;python&#92;python.exe&#8221; &#8220;C:&#92;Program Files (x86)&#92;DataStax Community&#92;apache-cassandra&#92;bin&#92;cqlsh&#8221; <b>-2</b></p>
<p>Once you’ve successfully configured your ODBC datasource, you can use it in your preferred BI tools to connect to and pull data back from Cassandra. For example, to use the DataStax ODBC driver for Cassandra with Microsoft Excel, you can use the data connection wizard to select your new ODBC datasource:</p>
<p><a rel="nofollow" target="_blank" href="http://www.datastax.com/wp-content/uploads/2013/05/c-odbc-excel-start.png"><img class="aligncenter size-medium wp-image-15758" alt="c odbc excel start" src="http://www.datastax.com/wp-content/uploads/2013/05/c-odbc-excel-start-250x186.png"/></a></p>
<p><a rel="nofollow" target="_blank" href="http://www.datastax.com/wp-content/uploads/2013/05/c-excel-choose-odbc.png"><img class="aligncenter size-medium wp-image-15757" alt="c excel choose odbc" src="http://www.datastax.com/wp-content/uploads/2013/05/c-excel-choose-odbc-250x171.png"/></a></p>
<p>And then, select one or more data objects to pull back data:</p>
<p><a rel="nofollow" target="_blank" href="http://www.datastax.com/wp-content/uploads/2013/05/c-excel-choose-odbc-dsn-name.png"><img class="aligncenter size-medium wp-image-15756" alt="c excel choose odbc dsn name" src="http://www.datastax.com/wp-content/uploads/2013/05/c-excel-choose-odbc-dsn-name-250x173.png"/></a></p>
<p><a rel="nofollow" target="_blank" href="http://www.datastax.com/wp-content/uploads/2013/05/c-excel-choose-odbc-table.png"><img class="aligncenter size-medium wp-image-15755" alt="c excel choose odbc table" src="http://www.datastax.com/wp-content/uploads/2013/05/c-excel-choose-odbc-table-250x174.png"/></a></p>
<p><a rel="nofollow" target="_blank" href="http://www.datastax.com/wp-content/uploads/2013/05/c-excel-populate.png"><img class="aligncenter size-medium wp-image-15754" alt="c excel populate" src="http://www.datastax.com/wp-content/uploads/2013/05/c-excel-populate-250x154.png"/></a></p>
<p>The ODBC driver also works with generic ODBC dev/query tools like Query Tool ODBC:</p>
<p><a rel="nofollow" target="_blank" href="http://www.datastax.com/wp-content/uploads/2013/05/c-tqodbc-tool1.png"><img class="aligncenter size-medium wp-image-15765" alt="c tqodbc tool" src="http://www.datastax.com/wp-content/uploads/2013/05/c-tqodbc-tool1-250x171.png"/></a></p>
<h3>Conclusions</h3>
<p>Currently the DataStax ODBC Driver for Cassandra is in beta, but will be GA shortly. For more information, visit the <a rel="nofollow" target="_blank" href="http://www.datastax.com/download/clientdrivers">client drivers downloads</a> page on the DataStax website.</p>]]></content:encoded>
         <category>Blog Post</category>
      </item>
      <item>
         <title>DataStax Java Driver: A new face for Cassandra</title>
         <link>http://www.datastax.com/dev/blog/new-datastax-drivers-a-new-face-for-cassandra</link>
         <description>Cassandra has always benefited from a great architecture, designed from the very beginning for scalability, performance and availability...</description>
         <guid isPermaLink="false">http://www.datastax.com/?post_type=dev-post&amp;p=15738</guid>
         <pubDate>Thu, 09 May 2013 22:32:02 +0000</pubDate>
         <content:encoded><![CDATA[<p>Cassandra has always benefited from a great architecture, designed from the very beginning for scalability, performance and availability. This is surely what has driven its tremendous success so far. Unfortunately, early versions of Cassandra also came with a rather complex interface and data model that negatively impacted the learning curve for developers. To solve this issue the Apache Cassandra team came up with a great new language and abstraction: <a rel="nofollow" target="_blank" href="http://www.datastax.com/dev/blog/cql3-for-cassandra-experts">CQL3</a>. This clearly renewed the face of Cassandra, bringing a consistent interface across all languages and tools.</p>
<p>Thrift, the transport layer that Cassandra traditionally used for client-server communication, was a great asset for the database in its early days, since Thrift clients were readily available for most programming languages. This allowed the Apache Cassandra project to focus on the server side without spreading its resources across the client side. Nevertheless, as Cassandra matured, Thrift turned out to be a limitation: communication limited to the request-response paradigm, no notifications, no streaming, client side interface made of generated code, etc. In Cassandra 1.2, a solution to this second problem was introduced with the <a rel="nofollow" target="_blank" href="http://www.datastax.com/dev/blog/binary-protocol">CQL Native Protocol</a>, a protocol designed exclusively for CQL3 and with enough flexibility to enable new features in Cassandra for the years to come.</p>
<p>With these two major changes, an update was obviously necessary on the client side. The fact that we had a renewed interface and transport layer, together with the need for an environment free of any Thrift concepts, made a strong case for a brand new line of drivers aligned with the long-term strategy of CQL.</p>
<p><strong>Today, DataStax announces version 1.0.0 of a new <a rel="nofollow" target="_blank" href="https://github.com/datastax/java-driver">Java Driver</a></strong>, designed for CQL and based on years of experience within the Cassandra community. This Java driver is a first step: an object mapping and a JDBC extension will be available soon, and drivers for C# and other major languages are on their way. Besides this new interface and API, this new driver comes with:</p>
<ul>
<li>Node discovery, load balancing, and fail-over, implemented in a standardized way across all languages</li>
<li>Asynchronous API, making it simple to send multiple requests in parallel</li>
<li>Query builder, which makes it possible to create queries programmatically</li>
<li>Tracing, a new feature in Cassandra 1.2, that will quickly become part of everyday tools for developers working with CQL</li>
</ul>
<p>In the weeks to come, several articles on this blog will cover some particular features of this Java Driver. Meanwhile, its <a rel="nofollow" target="_blank" href="http://www.datastax.com/doc-source/developer/java-driver/index.html">documentation</a> is readily available and you can start to use it right away using the following Maven dependency:</p>
<pre>&lt;dependency&gt;
   &lt;groupId&gt;com.datastax.cassandra&lt;/groupId&gt;
   &lt;artifactId&gt;cassandra-driver-core&lt;/artifactId&gt;
   &lt;version&gt;1.0.0&lt;/version&gt;
&lt;/dependency&gt;</pre>]]></content:encoded>
      </item>
      <item>
         <title>Free ODBC Drivers for Cassandra and Hadoop Now Available</title>
         <link>http://www.datastax.com/dev/blog/free-odbc-drivers-for-cassandra-and-hadoop-now-available</link>
         <description>I’m pleased to let you know that we’re now providing free ODBC drivers for Hadoop/Hive as well as Cassandra...</description>
         <guid isPermaLink="false">http://www.datastax.com/?post_type=dev-post&amp;p=15720</guid>
         <pubDate>Thu, 09 May 2013 18:22:18 +0000</pubDate>
         <content:encoded><![CDATA[<p>I’m pleased to let you know that we’re now providing free ODBC drivers for Hadoop/Hive as well as Cassandra.</p>
<p>Our <a rel="nofollow" target="_blank" href="http://www.datastax.com/docs/datastax_enterprise3.0/solutions/hive_odbc#hive-odbc">ODBC driver for Hive</a> has been available for a while now. While we charged a small fee for it in the past, we’re now making it completely free for everyone to use.</p>
<p>Something new is our ODBC driver for Cassandra, which we’re providing in <strong>beta</strong> form right now for the Windows platform. Our new Cassandra driver conforms to the ODBC 3.52 standard, which in practice means you can use it to connect to any open source Cassandra or <a rel="nofollow" target="_blank" href="http://www.datastax.com/what-we-offer/products-services/datastax-enterprise">DataStax Enterprise</a> cluster from many open source and proprietary BI, query, and ETL tools (e.g. Microsoft Excel, Tableau, MicroStrategy, etc.), and work with data in Cassandra.</p>
<p>Again, note that the Cassandra driver is beta at the moment, so be sure to read the accompanying documentation that comes in the installation package to understand the current limitations (e.g. no support for CQL3 at the moment; a future release will work with CQL3).</p>
<p>You can <a rel="nofollow" target="_blank" href="http://www.datastax.com/download/clientdrivers">download</a> the ODBC drivers now on our drivers downloads page, and don’t hesitate to contact us if you have any questions.</p>]]></content:encoded>
         <category>Blog Post</category>
      </item>
      <item>
         <title>The native CQL Java driver goes GA</title>
         <link>http://www.datastax.com/dev/blog/the-native-cql-java-driver-goes-ga</link>
         <description>Today DataStax announces the general availability of the native &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.datastax.com/docs/1.2/cql_cli/using_cql&quot;&gt;CQL&lt;/a&gt; driver for Java.</description>
         <guid isPermaLink="false">http://www.datastax.com/?post_type=dev-post&amp;p=15711</guid>
         <pubDate>Thu, 09 May 2013 13:59:36 +0000</pubDate>
         <content:encoded><![CDATA[<p>Today DataStax announces the general availability of the native <a rel="nofollow" target="_blank" href="http://www.datastax.com/docs/1.2/cql_cli/using_cql">CQL</a> driver for Java.  This is a production-ready driver for Cassandra 1.2+ with no legacy baggage from Thrift or JDBC concepts that don&#8217;t translate well to Cassandra.</p>
<p>Highlights include:</p>
<ul>
<li>Full <a rel="nofollow" target="_blank" href="http://www.datastax.com/doc-source/developer/java-driver/index.html">documentation</a></li>
<li>Out-of-the-box best practices for node discovery, load balancing and fail over</li>
<li>An asynchronous architecture provides simpler concurrency without any thread pool tuning</li>
<li><a rel="nofollow" target="_blank" href="http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2">Tracing</a> support</li>
<li>CQL with prepared statements is <a rel="nofollow" target="_blank" href="http://www.datastax.com/wp-content/uploads/2012/08/C2012-CQL-EricEvans.pdf">about 10% faster</a> than Thrift, and we expect that gap to widen as the native protocol matures</li>
</ul>
<p>More qualitatively but perhaps even more important, this addresses the <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/The_Paradox_of_Choice:_Why_More_Is_Less">paradox of choice</a> we&#8217;ve had in the Cassandra Java world: multiple driver choices present another barrier to newcomers, each of whom must evaluate the options for applicability to their project.  For people who have just done such an evaluation to settle on Cassandra itself, this is the last thing they want to spend time on.</p>
<p>And that&#8217;s the best-case scenario.  More often, a fragmented landscape leads to <a rel="nofollow" target="_blank" href="http://www.scsh.net/docu/post/sre.html">many solutions, each of which solves a different 80% of the problem</a>.  Better to have a single, well-thought-out solution that lets people get started writing their application immediately.  The native CQL driver provides exactly that.</p>
<p>Get the native CQL driver <a rel="nofollow" target="_blank" href="https://github.com/datastax/java-driver">here</a>.  Want to learn more?  Check out Patrick McFadin&#8217;s <a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#sessions">talks on data modeling and the new drivers</a> at the 2013 Cassandra Summit, June 11-12.  <a rel="nofollow" target="_blank" href="http://datastax.regsvc.com/E2">Register today</a> with the code <tt>SFSummit25</tt> for a 25% discount, and check out <a rel="nofollow" target="_blank" href="http://www.datastax.com/dev/blog/ten-talks-you-shouldnt-miss-at-the-cassandra-summit">the ten talks I&#8217;m most looking forward to.</a></p>]]></content:encoded>
      </item>
      <item>
         <title>You should try and get to the Cassandra summit next month — no, really.</title>
         <link>http://www.datastax.com/2013/05/you-should-try-and-get-to-the-cassandra-summit-next-month-no-really</link>
         <description>With all the hype and smoke-and-mirrors around big data, it is very hard to get the signal through the noise...</description>
         <guid isPermaLink="false">http://www.datastax.com/?p=15704</guid>
         <pubDate>Thu, 09 May 2013 07:00:32 +0000</pubDate>
         <content:encoded><![CDATA[<p>With all the hype and smoke-and-mirrors around big data, it is very hard to get the signal through the noise. At this year&#8217;s <a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013">Cassandra Summit</a>, we&#8217;re going to help you get that signal loud and clear as you see firsthand how Cassandra is going mainstream.</p>
<p>I&#8217;ve been to a lot of conferences in my 20+ years in the database industry, and I cannot remember one that had a more impressive lineup of speakers, including: Accenture, Barracuda Networks, Blue Mountain Capital, Comcast, Constant Contact, Dell, eBay, Fusion-io, Intuit, Microsoft, Netflix, Sony, Splunk, Spotify, Walmart, and <a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#speakers">many more</a>, all at one event to share their Cassandra experience.</p>
<p>And for the first time, we&#8217;ve created an &#8220;<a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#executive">Executive Track</a>&#8221; that will deal specifically with business and personal/career implications associated with implementing a new technology.</p>
<p>The audience will be diverse (developers, architects, DBAs, executives), the topics will be meaningful, and the talk will be straight.  We would love to have you there!  You can <a rel="nofollow" target="_blank" href="http://datastax.regsvc.com/E2">register here</a> and use the code <strong>SFSummit25</strong> for a 25% discount!</p>]]></content:encoded>
      </item>
      <item>
         <title>Announcing OpsCenter 3.1</title>
         <link>http://www.datastax.com/dev/blog/announcing-opscenter-3-1</link>
         <description>DataStax OpsCenter 3.1 is now available for &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.datastax.com/download/register/versions&quot;&gt;download&lt;/a&gt;.  This release includes support for monitoring column families created with CQL3, the ability to provision clusters with vnodes enabled, many bug fixes, and other minor enhancements.</description>
         <guid isPermaLink="false">http://www.datastax.com/?post_type=dev-post&amp;p=15694</guid>
         <pubDate>Wed, 08 May 2013 23:21:50 +0000</pubDate>
         <content:encoded><![CDATA[<p>DataStax OpsCenter 3.1 is now available for <a rel="nofollow" target="_blank" href="http://www.datastax.com/download/register/versions">download</a>.  This release includes support for monitoring column families created with CQL3, the ability to provision clusters with vnodes enabled, many bug fixes, and other minor enhancements.<br />
<span id="more-15694"></span><br />
For more information, check out the <a rel="nofollow" target="_blank" href="http://www.datastax.com/docs/opscenter/release_notes#opscenter-3-1">release notes</a>.</p>
<p>OpsCenter Enterprise edition is free to try in development environments, so <a rel="nofollow" target="_blank" href="http://www.datastax.com/download/register">try it now</a>.</p>]]></content:encoded>
      </item>
      <item>
         <title>Ten talks you shouldn’t miss at the Cassandra Summit</title>
         <link>http://www.datastax.com/dev/blog/ten-talks-you-shouldnt-miss-at-the-cassandra-summit</link>
         <description>After &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.datastax.com/company/news-and-events/events/cassandrasummit2012/presentations&quot;&gt;last year&amp;#8217;s Summit&lt;/a&gt;, the main improvement attendees wanted to see was an expansion to two days.</description>
         <guid isPermaLink="false">http://www.datastax.com/?post_type=dev-post&amp;p=15676</guid>
         <pubDate>Wed, 08 May 2013 18:49:45 +0000</pubDate>
         <content:encoded><![CDATA[<p>After <a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2012/presentations">last year&#8217;s Summit</a>, the main improvement attendees wanted to see was an expansion to two days.  We listened, and we have hands down the most impressive lineup of talks I can remember seeing at <i>any</i> conference.  Accenture, Barracuda Networks, Blue Mountain Capital, Comcast, Constant Contact, Dell, eBay, Fusion-io, Intuit, Microsoft, Netflix, Sony, Splunk, Spotify, Walmart, and more, all at one event to share their Cassandra experience.</p>
<p>Here are some of the talks that I&#8217;m personally most excited to see:</p>
<ol>
<li><a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#intuit">Time for a new relationship &#8211; Intuit&#8217;s journey from RDBMS to Cassandra</a>: I&#8217;m leading with this one on purpose, because I can&#8217;t wait to see Intuit&#8217;s description from the trenches of migrating from an RDBMS to Cassandra. Remember when the conventional wisdom was that you couldn&#8217;t trust financial data to anything but a relational database?  Step one is not losing data, and durability has been <a rel="nofollow" target="_blank" href="http://www.datastax.com/docs/1.2/dml/about_writes">designed in</a> to Cassandra from the start, unlike <a rel="nofollow" target="_blank" href="http://www.datastax.com/dev/blog/2012-in-review-performance">much of the competition</a>.  Step two is living without the rest of ACID, and Cassandra experts like Matt Dennis have been <a rel="nofollow" target="_blank" href="http://www.slideshare.net/mattdennis/durability-durability-durability">showing how to apply eventual consistency to financial data</a> for some time. Understanding the advantages this generates for business is <a rel="nofollow" target="_blank" href="http://highscalability.com/blog/2013/5/1/myth-eric-brewer-on-why-banks-are-base-not-acid-availability.html">starting to hit the mainstream</a>.  Intuit&#8217;s talk will help lay this myth to rest for good.
<li><a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#building-a-scalable-time-series">Building a Scalable Time-Series Database with Cassandra</a>: Expect deep technical information on how to get the most performance possible out of a production workload, updated from Jake and Carl&#8217;s well-done <a rel="nofollow" target="_blank" href="http://www.datastax.com/dev/blog/my-top-five-talks-from-nyc-big-data-tech-day">talk from NYC*</a> earlier this year.  This talk at the Summit will include more performance tuning results, a discussion of upgrading Cassandra live, and generalizing their data store to be a generic object store instead of just time series data.  Also of note: this is one of the first production applications built entirely on <a rel="nofollow" target="_blank" href="http://www.datastax.com/docs/1.2/cql_cli/using_cql">CQL</a>.
<li><a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#buy-it-now">Buy It Now! Cassandra at eBay</a>: Jay gave one of the best talks at <a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2012/presentations">last year&#8217;s Summit</a>.  I&#8217;m excited to see what his team has done since then and what lessons he has to share.  Related: <a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#managing-cassandra-at-ebay-scale">Managing Cassandra at eBay Scale</a>.
<li><a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#darpa">Suicide Prevention Using Social Media and Cassandra</a>.  Big data has been dismissively summarized as figuring out how to get people to click on more ads.  Definitely a feel-good story for me to see a counterpoint like this.
<li><a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#mysql-to-cassandra">Migrating from MySQL to Cassandra</a>: Michael upgraded his production cluster at Barracuda Networks to Cassandra 1.2.0 the day it was released.  I&#8217;m pretty sure that&#8217;s what Nietzsche had in mind when he said, &#8220;That which does not kill me makes me stronger;&#8221; I&#8217;m looking forward to Michael&#8217;s war stories.
<li><a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#cassandra-at-a-net-shop">The Perils and Triumphs of using Cassandra at a .NET/Microsoft Shop</a>: I never thought I&#8217;d see the words &#8220;Hector,&#8221; &#8220;IKVM,&#8221; and &#8220;production&#8221; in the same sentence.  Someone should make this into a T-shirt.  It&#8217;s almost a shame that the <a rel="nofollow" target="_blank" href="https://github.com/datastax/csharp-driver">Native CQL .NET driver</a> is making this stack obsolete.
<li><a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#pushing-cassandras-boundaries">Pushing Cassandra&#8217;s Boundaries: (Re)-Building the Social Grid for Global Telcos @ 1/10th the Market Cost</a>: Openwave has been pushing Cassandra hard with their messaging suite for a couple years now.  They&#8217;ve been especially active in pushing the envelope on dense, many-TB-per-machine deployments.  This is a space that&#8217;s seeing increasing attention this year, so I plan to pay attention to pioneers like Openwave.
<li><a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#cassandra-and-spark-shark">Real-time Analytics Using Cassandra and Spark/Shark</a>: Ooyala is another experienced Cassandra shop, and has built their own inputformat that understands their custom indexing solution.  The Shark driver then uses that.  Very interesting, especially if you&#8217;re looking for examples of what you can do with open APIs and some custom code.  Ooyala is also presenting the talk on <a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#linux-and-cassandra-tuning">Linux and Cassandra Tuning</a>.
<li><a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#state-of-cql">The State of CQL: A deep dive into modern CQL</a>, by Sylvain Lebresne, the man who has written most of it.  This is the <a rel="nofollow">future of Cassandra</a>.  Also recommended: Patrick McFadin&#8217;s talk, <a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#worlds-next-top-data-model">The World&#8217;s Next Top Data Model</a>.  Patrick has been teaching CQL data modeling for longer than anyone and is on a mission to demystify CQL from a user&#8217;s point of view; after this session, you will be ready to be dangerous with the new CQL drivers.
<li>Last but not least, <a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#distributed-graph-computing">Distributed Graph Computing with Titan and Faunus</a>: graph computing is still a niche, but a growing one.  <a rel="nofollow" target="_blank" href="http://thinkaurelius.github.io/titan/">Titan</a> is a distributed graph database built on Cassandra, integrated with the popular <a rel="nofollow" target="_blank" href="http://www.tinkerpop.com/">TinkerPop</a> graph stack.  Titan is <a rel="nofollow" target="_blank" href="https://github.com/thinkaurelius/titan/wiki/Release-Notes">making rapid progress</a> and already has production users.
</ol>
<p>Cassandra is going mainstream, and we expect an even more diverse crowd at <a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013">this year&#8217;s Summit</a>.  Besides Cassandra experts, we will see many new developers as well as relational DBAs, managers, and executives.  Check out the <a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#sessions">sessions</a> and <a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013#schedule">schedule</a> pages to see the content we have lined up for these audiences as well.  <a rel="nofollow" target="_blank" href="http://datastax.regsvc.com/E2">Register today</a> with the code <tt>SFSummit25</tt> for a 25% discount!</p>]]></content:encoded>
      </item>
      <item>
         <title>My five favorite talks from NYC* Big Data Tech Day</title>
         <link>http://www.datastax.com/dev/blog/my-top-five-talks-from-nyc-big-data-tech-day</link>
         <description>I flew to New York in March for the &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://datastax.com/nycassandra2013/&quot;&gt;NYC* Big Data Tech Day&lt;/a&gt;.</description>
         <guid isPermaLink="false">http://www.datastax.com/?post_type=dev-post&amp;p=15634</guid>
         <pubDate>Wed, 01 May 2013 19:29:15 +0000</pubDate>
         <content:encoded><![CDATA[<p>I flew to New York in March for the <a rel="nofollow" target="_blank" href="http://datastax.com/nycassandra2013/">NYC* Big Data Tech Day</a>.  <a rel="nofollow" target="_blank" href="http://planetcassandra.org/Learn/CassandraSummit">All the talks are online</a>.  Here are my five favorites:</p>
<ul>
<li><a rel="nofollow" target="_blank" href="http://www.youtube.com/watch?v=Tg3dP2fZGSM">Graph-based Recommendation Systems at eBay</a> manages to give both a good overview of recommendation systems and an interesting Cassandra use case.  Frankly, my experience is that most &#8220;Cassandra in Field X&#8221; talks manage to give a good Cassandra talk, or a good introduction to X, but not both.  Thomas Pinckney&#8217;s talk proved a happy exception.  (During the Q&#038;A, you can hear me ask why eBay doesn&#8217;t use <a rel="nofollow" target="_blank" href="http://glinden.blogspot.com/2011/02/youtube-uses-amazons-recommendation.html">item-to-item collaborative filtering</a>, which to my novice&#8217;s perspective appears to be more popular currently.  I&#8217;ll leave you to the talk for Thomas&#8217;s answer.)
<li>If you&#8217;re at all interested in Storm, Kafka, or Elastic Search, I&#8217;d recommend <a rel="nofollow" target="_blank" href="http://www.youtube.com/watch?v=jBauMkzSgRQ">Brian O&#8217;Neill&#8217;s talk on integrating all three with Cassandra</a>.  I think Brian&#8217;s just a little ahead of the curve here, and you&#8217;re going to see a lot more from Storm and Kafka in particular this year.
<li>DataStax lost Jake Luciani as an employee to Blue Mountain Capital last year.  I was sorry to see him go, but this does leave the world with one more Cassandra committer building applications in the trenches. I love seeing that because these people have the mentality not just of &#8220;how can I work around this problem I ran into with Cassandra,&#8221; but also &#8220;how can I make Cassandra better to solve this problem for everyone?&#8221;  <a rel="nofollow" target="_blank" href="http://www.youtube.com/watch?v=nHes8XW1VHw">Jake&#8217;s talk with Carl Yeksigian</a> is a great example of applying this to time series financial data on the cutting edge of Cassandra 1.2.  (And for a good overview of what&#8217;s new in 1.2, I&#8217;ll refer you to <a rel="nofollow" target="_blank" href="http://www.youtube.com/watch?v=psLjMsTkDIA">my own talk</a> from NYC*.)
<li><a rel="nofollow" target="_blank" href="http://www.youtube.com/watch?v=eCCRhMdQGkY">Nathan Milford&#8217;s talk on Cassandra administration</a> was a good change of pace from the talks focused on architects and developers.  Even if you aren&#8217;t at all interested in sysadmin procedures, though, you <i>need</i> to watch this for Nathan&#8217;s unique introduction.  I&#8217;ve never seen anything like it at a tech conference.
<li>Last but not least, you should definitely check out <a rel="nofollow" target="_blank" href="http://www.youtube.com/watch?v=6zv2wLklK0k">Michaël Figuière&#8217;s talk</a> on <a rel="nofollow" target="_blank" href="https://github.com/datastax/">DataStax&#8217;s new CQL drivers</a> for Java and C#.  CQL is a <a rel="nofollow" target="_blank" href="http://www.datastax.com/dev/blog/schema-in-cassandra-1-1">huge</a> <a rel="nofollow" target="_blank" href="http://www.datastax.com/dev/blog/cql3-for-cassandra-experts">step</a> <a rel="nofollow" target="_blank" href="http://www.datastax.com/dev/blog/cql3_collections">forward</a> and having drivers that speak it natively is going to make a night-and-day difference in Cassandra development productivity.  Check it out.
</ul>
<p>We already have more great talks lined up for the <a rel="nofollow" target="_blank" href="http://www.datastax.com/company/news-and-events/events/cassandrasummit2013">two-day Cassandra Summit in June</a>.  <a rel="nofollow" target="_blank" href="http://datastax.regsvc.com/E2">Register</a> with the code <tt>SFSummit25</tt> for a 25% discount!</p>]]></content:encoded>
      </item>
      <item>
         <title>Multi-threaded indexing in DSE Search 3.0.1</title>
         <link>http://www.datastax.com/dev/blog/multi-threaded-indexing-in-dse-search-3-0-1</link>
         <description>Until version 3.0, DSE Search provided the same index concurrency model as plain Apache Solr: each index update request is handled by a single thread, which synchronously writes into Cassandra and then into the Lucene index via the integrated Cassandra secondary index mechanism.</description>
         <guid isPermaLink="false">http://www.datastax.com/?post_type=dev-post&amp;p=15628</guid>
         <pubDate>Wed, 01 May 2013 18:29:30 +0000</pubDate>
         <content:encoded><![CDATA[<p dir="ltr">Until version 3.0, DSE Search provided the same index concurrency model as plain Apache Solr: each index update request is handled by a single thread, which synchronously writes into Cassandra and then into the Lucene index via the integrated Cassandra secondary index mechanism. This works well in general: the client is responsible for providing the desired concurrency level by issuing multiple concurrent requests, and DSE Search follows by processing each request on its own thread.</p>
<p dir="ltr">But there are cases where the model above doesn’t provide enough performance, specifically:</p>
<ol>
<li>Reindexing: DSE Search data can be <a rel="nofollow" target="_blank" href="http://www.datastax.com/docs/datastax_enterprise3.0/solutions/dse_search_upload#using-reload-command-options">reindexed</a> “in place”, without forcing the client to resubmit data; this happens by going through Cassandra SSTables and sequentially indexing all rows.</li>
<li>Repairing: Cassandra data can be repaired via the <a rel="nofollow" target="_blank" href="http://www.datastax.com/docs/1.1/references/nodetool">nodetool</a> command, which then reindexes the rows that were actually repaired, going through them sequentially.</li>
<li>Bulk loading: many users implement home-made bulk loading solutions, which usually provide limited concurrency.</li>
</ol>
<p dir="ltr">Multi-threaded indexing in DSE Search 3.0.1 helps in such cases, and also provides a general, improved, concurrent indexing model.</p>
<p dir="ltr">The new indexing model decouples writes to Cassandra from writes to the Lucene index, making the latter asynchronous. Conceptually speaking, it simply works as follows: when a document is inserted, its data is written into Cassandra and the indexing request is queued-up to be asynchronously processed by a pool of worker threads; at given intervals, when Memtable data is flushed from inside Cassandra, or a commit happens from inside Solr, the indexing queue is synchronously flushed, so that all in-flight indexing requests become visible and all committed data up to that point can be queried.</p>
<p dir="ltr">As simple as it sounds, when implementing asynchronous work models a few important problems have to be addressed:</p>
<ol>
<li>Concurrent execution of the logically same work unit: i.e., what if two asynchronous threads try to index the same document?</li>
<li>Flow control between work producers and consumers: i.e., what if indexing requests are submitted faster than they are processed?</li>
<li>Visibility and management: how can I see what my producers and consumers are doing, and how can I tune them?</li>
</ol>
<p dir="ltr">Let’s see how DSE Search 3.0.1 solves all of them.</p>
<p dir="ltr">First, we implement <strong>per-document thread affinity</strong> to avoid concurrent indexing of the same document: we hash the document identifier and assign the document to an in-memory queue serving one and only one indexing thread; by doing so, we partition the work between different threads, and make sure to always process indexing requests for the same document from the same thread.</p>
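<p dir="ltr">To make the affinity idea concrete, here is a minimal Python sketch. It is not DSE Search code: the class name <em>AffinityIndexer</em>, the CRC32 hash, and the <em>index_fn</em> callback are all assumptions made for illustration. It shows only the routing rule described above, where a hash of the document identifier selects one in-memory queue and each queue is drained by exactly one thread.</p>

```python
import queue
import threading
import zlib


class AffinityIndexer:
    """Toy model of per-document thread affinity (hypothetical, not DSE code)."""

    def __init__(self, num_workers, index_fn):
        # One FIFO queue per worker thread; a queue is never shared by workers.
        self.queues = [queue.Queue() for _ in range(num_workers)]
        # index_fn(doc_id, fields) stands in for the real index write and is
        # assumed not to raise; a real system would also handle failures.
        self.index_fn = index_fn
        for q in self.queues:
            threading.Thread(target=self._run, args=(q,), daemon=True).start()

    def queue_for(self, doc_id):
        # A stable hash, so the same identifier always maps to the same queue.
        return zlib.crc32(doc_id.encode("utf-8")) % len(self.queues)

    def submit(self, doc_id, fields):
        # Enqueue and return immediately: indexing happens asynchronously.
        self.queues[self.queue_for(doc_id)].put((doc_id, fields))

    def _run(self, q):
        # Each worker drains only its own queue, in submission order.
        while True:
            doc_id, fields = q.get()
            try:
                self.index_fn(doc_id, fields)
            finally:
                q.task_done()

    def flush(self):
        # Block until every request queued so far has been processed.
        for q in self.queues:
            q.join()
```

<p dir="ltr">Because every request for a given document lands on the same FIFO queue, two updates to that document are applied in submission order and never run concurrently, while updates to different documents can still proceed in parallel.</p>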
<p dir="ltr"><strong>Flow control</strong> is implemented via automatic back-pressure: at Cassandra flush or Solr commit time, we compute some heuristics representing the current load of the indexing system, based on index processing time and indexing queue depth, and if a configurable threshold is met, we pause producers (that is, incoming indexing requests) until all accumulated in-flight indexing requests are processed. Please note that this is different from a normal “flush situation”, when newly arrived indexing requests are allowed to queue up and in-flight requests are flushed only up to the current point in time.</p>
<p dir="ltr"><strong>Visibility and management</strong> are implemented via configuration switches and JMX MBeans. First, you can configure the maximum number of indexing threads per core with the following dse.yaml setting: <em>max_solr_concurrency_per_core</em>; the default value is computed based on the available number of CPUs (which is usually a good value to stick with), but you can configure it to best suit your indexing volume, or set it to 1 to get back to the old synchronous behavior. Then, for each Solr core, you have several configuration knobs and monitoring gauges available via the <em>IndexPool-core_name</em> MBean (where <em>core_name</em> is an actual Solr core name) in the <em>com.datastax.bdp</em> domain, the most relevant ones being:</p>
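<p dir="ltr">The back-pressure decision can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual DSE heuristic: it looks only at the average queue depth (the real heuristic also considers index processing time), and the names <em>BackPressureGate</em>, <em>on_flush</em>, and <em>drain</em> are hypothetical.</p>

```python
import threading


class BackPressureGate:
    """Toy back-pressure gate (hypothetical names, simplified heuristic)."""

    def __init__(self, threshold):
        # Assumed meaning: average number of queued documents per queue
        # at which producers are paused.
        self.threshold = threshold
        self._open = threading.Event()
        self._open.set()  # producers may enqueue initially

    def overloaded(self, queue_depths):
        # queue_depths must be a non-empty list, one depth per indexing queue.
        average = sum(queue_depths) / len(queue_depths)
        return average >= self.threshold

    def on_flush(self, queue_depths, drain):
        """Call at flush or commit time; drain() must block until queues are empty."""
        if not self.overloaded(queue_depths):
            return False
        self._open.clear()  # pause producers
        try:
            drain()  # process all accumulated in-flight requests
        finally:
            self._open.set()  # let producers continue
        return True

    def wait_until_open(self, timeout=None):
        # Producers call this before enqueuing; returns True when allowed.
        return self._open.wait(timeout)
```

<p dir="ltr">Closing the gate before draining is what distinguishes this from an ordinary flush: no new requests can join the queues until the backlog is gone, which bounds how much memory the pending work can consume.</p>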
<ul>
<li><em>BackPressureThreshold</em>: the (average) max number of queued documents that will trigger the back-pressure mechanism discussed above; in other words, this is a way to control and limit memory consumption of the whole indexing system.</li>
<li><em>MaxConcurrency</em>: the maximum number of indexing threads, between 1 and the previously mentioned <em>max_solr_concurrency_per_core</em>; this way you can dynamically adjust your concurrency level, and even get back to the old synchronous model, without restarting.</li>
<li><em>QueueDepth</em>: current depth of all queues.</li>
<li><em>TaskProcessingTime</em>: time it took to process the latest indexing request, per queue/thread.</li>
<li><em>ProcessedTasks</em>: total number of processed tasks, per queue/thread.</li>
</ul>
<p dir="ltr">Finally, some performance considerations: in our tests we saw a 30%-40% performance increase in sequential indexing use cases like repair and reindex, but it’s all about <em>your</em> use case in the end. With the visibility and configurability discussed above you have (hopefully) all the right tools for a correct analysis and effective tuning, but if you need any help, do not hesitate to reach us via our <a rel="nofollow" target="_blank" href="http://www.datastax.com/support-forums/">public forums</a>.</p>]]></content:encoded>
      </item>
      <item>
         <title>Hadoop, Security, and the Enterprise</title>
         <link>http://www.datastax.com/2013/04/hadoop-security-and-the-enterprise</link>
         <description>eWeek recently published an article/slide deck on &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.eweek.com/security/slideshows/hadoop-poses-a-big-data-security-risk-10-reasons-why/&quot;&gt;10 reasons why Hadoop poses a big data security risk&lt;/a&gt;.</description>
         <guid isPermaLink="false">http://www.datastax.com/?p=15617</guid>
         <pubDate>Tue, 30 Apr 2013 13:34:45 +0000</pubDate>
         <content:encoded><![CDATA[<p>eWeek recently published an article/slide deck on <a rel="nofollow" target="_blank" href="http://www.eweek.com/security/slideshows/hadoop-poses-a-big-data-security-risk-10-reasons-why/">10 reasons why Hadoop poses a big data security risk</a>. As I mentioned a few months ago in a <a rel="nofollow" target="_blank" href="http://www.datastax.com/2013/02/a-closer-look-at-datastax-enterprise-3-0-part-1">blog post</a> that talked about our release of <a rel="nofollow" target="_blank" href="http://www.datastax.com/what-we-offer/products-services/datastax-enterprise">DataStax Enterprise</a> (DSE) 3.0, the fact that NoSQL databases are lax on security was something <a rel="nofollow" target="_blank" href="http://reports.informationweek.com/abstract/2/8758/Business-Continuity/strategy-why-nosql-equals-nosecurity*.html">getting attention</a> last year in the tech media. I’m happy to say that, with DSE 3.0, enterprise-quality security in the NoSQL world is no longer an afterthought.</p>
<p>But the eWeek article demonstrates that the same concerns exist where Hadoop implementations are concerned. The article says: “It [Hadoop] was not written to support hardened security, compliance, encryption, policy enablement and risk management.”</p>
<p>Because DSE is a single integrated platform that includes <a rel="nofollow" target="_blank" href="http://www.datastax.com/what-we-offer/products-services/datastax-enterprise/apache-cassandra">Apache Cassandra</a> for online application use cases, <a rel="nofollow" target="_blank" href="http://www.datastax.com/what-we-offer/products-services/datastax-enterprise/apache-solr">Solr</a> for enterprise search, and <a rel="nofollow" target="_blank" href="http://www.datastax.com/what-we-offer/products-services/apache-hadoop">Hadoop</a> for batch analytics, we wanted to make sure we had the security bases covered in our platform for each technology. The good news for Hadoop users is that many of the security concerns called out by eWeek are handled in DSE.</p>
<p>For example, eWeek says, “Hadoop also doesn&#8217;t support encryption on nodes or on data in transit between nodes”. That’s not true in DSE. Because we use <a rel="nofollow" target="_blank" href="http://www.datastax.com/wp-content/uploads/2012/09/WP-DataStax-HDFSvsCFS.pdf">Cassandra for storage vs. HDFS</a>, the <a rel="nofollow" target="_blank" href="http://www.datastax.com/docs/datastax_enterprise3.0/security/ondisk_encryption">transparent data encryption</a> we offer in DSE applies to Hadoop data. Moreover, DSE also supplies <a rel="nofollow" target="_blank" href="http://www.datastax.com/docs/datastax_enterprise3.0/security/ssl_transport">client-to-node</a> and <a rel="nofollow" target="_blank" href="http://www.datastax.com/docs/datastax_enterprise3.0/security/ssl_node_to_node">node-to-node</a> encryption of data for Hadoop as well as Cassandra and Solr.</p>
<p>eWeek also states, “The distributed nature of Hadoop clusters also renders many traditional backup and recovery methods and policies ineffective. Companies using Hadoop need to replicate, back up and store data in a separate, secured environment.” In the same vein, they state later: “Traditional data security technologies have been built on the concept of protecting a single physical entity (like a database or server), not the uniquely distributed big data computing environments characterized by Hadoop clusters. Traditional security technologies are not effective in this type of distributed, large-scale environment.”</p>
<p>One of the nice things about the Hadoop component of DSE is that automatic redundancy and replication are built in to the platform itself, so all of the <a rel="nofollow" target="_blank" href="http://www.datastax.com/wp-content/uploads/2012/08/WP-IntrotoCassandra.pdf">goodness of Cassandra</a> – which is architected specifically for distributed, large-scale environments – is inherited on the Hadoop (and Solr) side. This translates into Hadoop data being easily replicated in one location or many; across one datacenter or multiple centers; across one cloud availability zone or several zones. Further, it means no single point of failure or write bottleneck as data can be written to and read in any location.</p>
<p>Backups aren’t hard either as all data is stored in Cassandra column families / tables, so typical snapshot backups and recovery tasks are uniform across a cluster.</p>
<p>So if you’re interested in easily integrating Hadoop batch analytics with your modern line-of-business applications and want to ensure both are secured, you should give DSE 3.0 a try. <a rel="nofollow" target="_blank" href="http://www.datastax.com/download">Download DSE</a>, which is completely free to use without restrictions in development environments (note that production deployments do require a software subscription) and see how it can satisfy both your big data needs and your requirements for security.</p>]]></content:encoded>
      </item>
      <item>
         <title>Can Cassandra Handle Your Cloud App? Ask Netflix.</title>
         <link>http://www.datastax.com/2013/04/can-cassandra-handle-your-cloud-app-ask-netflix</link>
         <description>A recent &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.zdnet.com/the-biggest-cloud-app-of-all-netflix-7000014298/&quot;&gt;article in ZDNet&lt;/a&gt; called out Netflix as having the largest cloud app in the world.</description>
         <guid isPermaLink="false">http://www.datastax.com/?p=15542</guid>
         <pubDate>Mon, 22 Apr 2013 13:42:01 +0000</pubDate>
         <content:encoded><![CDATA[<p>A recent <a rel="nofollow" target="_blank" href="http://www.zdnet.com/the-biggest-cloud-app-of-all-netflix-7000014298/">article in ZDNet</a> called out Netflix as having the largest cloud app in the world. In addition to being the biggest single internet traffic source, Netflix is also #1 when it comes to being a pure cloud service.</p>
<p>Netflix delivers more than one billion video instances each month to its subscribers, and it does so with an architecture that expects failure and is designed from the ground up to handle it.  And what database does the largest cloud app on the planet use to power its media business and keep things running no matter what happens?</p>
<p>Apache Cassandra from DataStax.</p>
<p>ZDNet’s article describes Netflix’s architecture from a high-level perspective, but to understand what Netflix does in more detail, you can view the <a rel="nofollow" target="_blank" href="https://www.youtube.com/watch?v=Wo-zkUH1R8A&amp;feature=youtu.be">presentation</a> that Adrian Cockcroft, director of architecture for Netflix, gave at our last conference. In addition, you should review their <a rel="nofollow" target="_blank" href="http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html">benchmark tests in the cloud</a> that demonstrate the linear scalability of Cassandra (also see <a rel="nofollow" target="_blank" href="http://www.datastax.com/wp-content/uploads/2013/02/WP-Benchmarking-Top-NoSQL-Databases.pdf">End Point’s benchmarks</a> that compare Cassandra to MongoDB and HBase in the cloud).</p>
<p>Whether it’s performance or continuous availability, Cassandra gives Netflix what it needs. As an example of the latter where database uptime is concerned, when Amazon experienced its much publicized outage last October, Cassandra helped Netflix to <a rel="nofollow" target="_blank" href="http://techblog.netflix.com/2012/10/post-mortem-of-october-222012-aws.html">never miss a beat</a>: “We configure all our clusters to use a replication factor of three, with each replica located in a different Availability Zone.  This allowed Cassandra to handle the outage remarkably well.  When a single zone became unavailable, we didn&#8217;t need to do anything.  Cassandra routed requests around the unavailable zone and when it recovered, the ring was repaired.”</p>
<p>So if you’re wondering if DataStax can tackle an application you’re considering for the cloud, you don’t have to look any further than Netflix for your answer. To understand how DataStax and Apache Cassandra can handle your cloud needs, download our <a rel="nofollow" target="_blank" href="http://www.datastax.com/wp-content/uploads/2013/02/WP-Benchmarking-Top-NoSQL-Databases.pdf">cloud white paper</a> and see the short video below.</p>
]]></content:encoded>
      </item>
      <item>
         <title>NoSQL or Not?</title>
         <link>http://www.datastax.com/2013/04/nosql-or-not</link>
         <description>For some years now, I’ve employed what some might consider a very unorthodox approach to presenting the technology solutions that I help design and deliver: I always tell people – right up front – why they may not need the software I’m talking to them about...</description>
         <guid isPermaLink="false">http://www.datastax.com/?p=15534</guid>
         <pubDate>Fri, 19 Apr 2013 13:15:37 +0000</pubDate>
         <content:encoded><![CDATA[<p>For some years now, I’ve employed what some might consider a very unorthodox approach to presenting the technology solutions that I help design and deliver: <i>I always tell people – right up front – why they may not need the software I’m talking to them about.</i></p>
<p>Maybe it’s because I’ve been a database geek for so long and have always appreciated advice on where not to step when it comes to trying shiny new technology. Or, maybe it’s because I handled so many database software reviews for magazines in the past that I know one size never fits all, and that the claims trumpeted by software vendors about their technology making all your dreams come true just aren’t legit.</p>
<p>For these reasons and more, whenever I get a chance to talk to a group made up of both business and tech folks, I always try and tell them what I’d want to be told – and that equates to telling them why they may not need the software I’m representing.</p>
<p>This brings me to NoSQL. By all standards of measure, the NoSQL market is booming. <a rel="nofollow" target="_blank" href="http://siliconangle.com/blog/2013/03/28/oracles-is-in-big-trouble-big-data-is-to-blame/">Some reports</a> have the NoSQL market growing at an average of 60% per year.  Such information naturally causes IT professionals to wonder if they’re missing something by not implementing NoSQL technology in their environments.</p>
<p>Maybe they are, but maybe they’re not. The $64,000 question is: <i>how do you know?</i></p>
<p>Having been on the RDBMS side of the fence for so long, and having worked with and watched DataStax customers smartly implement NoSQL technology that actually makes a difference, I think I have a pretty good handle on the why’s and why not’s of NoSQL.</p>
<p>I’d like to invite you to join me for an upcoming webinar I’ve entitled “<a rel="nofollow" target="_blank" href="http://learn.datastax.com/WebinarHowToTellifYourBusinessNeedsNoSQL.html">How to Tell if Your Business Needs NoSQL</a>” where I’ll go over a series of questions that will help you determine whether you can benefit from NoSQL.</p>
<p>No fairies. No pixie dust. No magic. Just some honest scoop that will help you get your head around whether you can use NoSQL right now or not, with clear examples from real customers that demonstrate when NoSQL is actually needed.</p>]]></content:encoded>
      </item>
      <item>
         <title>DataStax 3.0.1 Now Available</title>
         <link>http://www.datastax.com/2013/04/datastax-3-0-1-now-available</link>
         <description>&lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.datastax.com/products/enterprise&quot;&gt;DataStax Enterprise&lt;/a&gt; 3.0.1 is now available for &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.datastax.com/download&quot;&gt;download&lt;/a&gt;. This release includes many improvements to DSE Search, such as new Solr data type to Cassandra Validator mappings.</description>
         <guid isPermaLink="false">http://www.datastax.com/?p=15520</guid>
         <pubDate>Thu, 18 Apr 2013 20:33:04 +0000</pubDate>
         <content:encoded><![CDATA[<p><a rel="nofollow" target="_blank" href="http://www.datastax.com/products/enterprise">DataStax Enterprise</a> 3.0.1 is now available for <a rel="nofollow" target="_blank" href="http://www.datastax.com/download">download</a>. This release includes many improvements to DSE Search such as new Solr data type to Cassandra Validator mappings. Please see the <a rel="nofollow" target="_blank" href="http://www.datastax.com/docs/datastax_enterprise3.0/dse_release_notes#datastax-enterprise-3-0-1">release notes</a> for specific information about bug fixes and improvements.</p>]]></content:encoded>
         <category>Blog Post - Corporate</category>
      </item>
      <item>
         <title>Your enterprise is global and so are we. DataStax launches into Europe, Middle East and Africa</title>
         <link>http://www.datastax.com/2013/03/your-enterprise-is-global-and-so-are-we-datastax-launches-into-europe-middle-east-and-africa</link>
         <description>Today we announced DataStax EMEA, which offers sales, marketing and support services for DataStax Enterprise, our Apache Cassandra-based big data platform throughout Europe, the Middle East and Africa...</description>
         <guid isPermaLink="false">http://www.datastax.com/?p=15298</guid>
         <pubDate>Thu, 28 Mar 2013 16:37:40 +0000</pubDate>
         <content:encoded><![CDATA[<p>Today we announced DataStax EMEA, which offers sales, marketing and support services for DataStax Enterprise, our Apache Cassandra-based big data platform throughout Europe, the Middle East and Africa.</p>
<p>DataStax has expanded quickly over the past two years, growing from 26 employees and 27 customers at the end of 2011 to nearly 100 employees and 270+ customers by the end of 2012, along with many thousands of organizations and individuals working with open source Cassandra.</p>
<p>We had some fun with the <a rel="nofollow">press this week</a> and they seemed to like the story…</p>
<p>With a strong base of customers in the EMEA region, combined with exceptional demand from the Cassandra community, opening an EMEA subsidiary headquartered in London to support this demand was a natural next step.</p>
<p>We have also seen strong interest from system integrators, solution providers and consultants in the region and working with this partner ecosystem is a key component of our go-to-market and customer services plan.</p>
<p>Here are some things you can do immediately if you have an interest in Cassandra.</p>
<ul>
<li>Cassandra groups have been established in all major cities and you should sign up to the one nearest to you at <a rel="nofollow" target="_blank" href="http://www.meetup.com">www.meetup.com</a>. You are going to get a chance to network with your peers and learn how Cassandra can help you with the data driven solutions you are looking to deploy yourself or at one of your customers.</li>
<li>If you would like to talk directly to the DataStax EMEA team about our commercial offering called DataStax Enterprise, its broad set of features and 24&#215;7 support, please contact us at customersevices@datastax.com.</li>
<li>If you are a system integrator, consulting firm or solution provider and would like to deliver DataStax Enterprise and Cassandra services and solutions please email the EMEA team at <a rel="nofollow" target="_blank" href="mailto:emea-info@datastax.com">emea-info@datastax.com</a>.</li>
<li>Certification training will soon be available in the EMEA region. If you would like to register your interest, please email <a rel="nofollow" target="_blank" href="mailto:training@datastax.com">training@datastax.com</a>.</li>
<li>We are always on the lookout for talent. If you would like to enquire about career opportunities within the EMEA team, please send an email to <a rel="nofollow" target="_blank" href="mailto:jobs@datastax.com">jobs@datastax.com</a>.</li>
</ul>
<p>It is exciting to see so many established enterprises and young start-ups in the region already relying on Cassandra and DataStax Enterprise for their mission critical data driven applications and services and we look forward to being able to now work closely with them as they continue their journey in this new data driven world.</p>]]></content:encoded>
         <category>Blog Post - Corporate</category>
      </item>
      <item>
         <title>The Five Minute Interview – Datafiniti</title>
         <link>http://www.datastax.com/2013/03/the-five-minute-interview-datafiniti</link>
         <description>This article is one in a series of quick-hit interviews with companies using &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.datastax.com/what-we-offer/products-services/datastax-enterprise/apache-cassandra&quot;&gt;Apache Cassandra&lt;/a&gt; and/or &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.datastax.com/what-we-offer/products-services/datastax-enterprise&quot;&gt;DataStax Enterprise&lt;/a&gt; (DSE) for key parts of their business.</description>
         <guid isPermaLink="false">http://www.datastax.com/?p=15254</guid>
         <pubDate>Wed, 27 Mar 2013 12:18:08 +0000</pubDate>
         <content:encoded><![CDATA[<p>This article is one in a series of quick-hit interviews with companies using <a rel="nofollow" target="_blank" href="http://www.datastax.com/what-we-offer/products-services/datastax-enterprise/apache-cassandra">Apache Cassandra</a> and/or <a rel="nofollow" target="_blank" href="http://www.datastax.com/what-we-offer/products-services/datastax-enterprise">DataStax Enterprise</a> (DSE) for key parts of their business.  For this interview, we talked with Shion Deysarkar who is the CEO at Datafiniti, along with Phil Coleman who’s the data team lead and Michael Pellon who runs operations.</p>
<p><b>DataStax</b>: Guys, thanks for making the time to talk with us today. Can you give us a quick overview of Datafiniti?</p>
<p><b>Datafiniti</b>:  Datafiniti is a search engine for data.  We’ve built a catalog of all structured data available on the Web; things like businesses, people, products, and more.  We keep a massive database built on DSE of all this information that can be searched and used to generate custom output that meets our customers’ inputs and criteria.</p>
<p>We were originally focused on Web crawling technologies at a different company for about three years and then we pivoted that into our new company, which is Datafiniti.  Our customers subscribe to our service and pay us based on the amount of data they receive from us.</p>
<p><b>DataStax</b>: How do you technically make all that happen?</p>
<p><b>Datafiniti</b>:  There are two major components to our stack. One is the crawling infrastructure and the other is the search part, all of which is hosted in our data center. Where the crawling part is concerned, we have a number of servers that connect to volunteer computers all over the world that help collect our data for us.</p>
<p>Our search component makes use of DataStax Enterprise with Cassandra and Solr, with an API that sits on top of that for customers to query our database.</p>
<p><b>DataStax</b>: Did you guys start out using NoSQL for your database or transition from an RDBMS?</p>
<p><b>Datafiniti</b>:  We didn’t consider relational technology, but started out by looking at all the various NoSQL options like HBase and search software like Elasticsearch.  We ended up deciding on Cassandra for its non-centralized approach and easy scaling. Cassandra allows us to store the big amounts of data that we need to consume and manage.</p>
<p>What was missing in Cassandra was the Google-type search functionality that we needed. When we saw that DataStax had integrated Solr with Cassandra in DataStax Enterprise, it was just a natural evolution for us to use it as our database.</p>
<p>Also, when we tested Solr in DataStax Enterprise, we saw that it worked and performed better than open source Solr, which was also a win for us.</p>
<p><b>DataStax</b>: Did anything else come into play with your decision making process?</p>
<p><b>Datafiniti</b>:  At the time, manageability was something important to us – we wanted a database that would be easy to install, manage, and grow. DataStax Enterprise was just the best option for what we do and need.</p>
<p>We also looked at various benchmarks and saw that Cassandra ran faster than the other options we were considering.</p>
<p><b>DataStax</b>: What are some of the business benefits you’ve experienced with DataStax Enterprise?</p>
<p><b>Datafiniti</b>:  The primary benefit is that we’re able to deliver much faster search operations to our customers with DSE, and as everyone knows, customers don’t like to wait long when it comes to searching for what they want.</p>
<p><b>DataStax</b>: What advice would you give to people who are just starting out with NoSQL and/or DataStax Enterprise?</p>
<p><b>Datafiniti</b>:  Pay close attention to the demos and samples that ship with DSE because they will help you quickly get things set up and understand how things work. Also, make sure you understand how the individual components of DSE – Cassandra, Hadoop, and Solr – work independently of each other.</p>
<p>Lastly, it’s good to know up front how you go from Cassandra, which has a very flexible and fluid schema model, to one that’s more restrictive, like Solr.</p>
<p><b>DataStax</b>: Guys, thanks for the time.</p>
<p><b>Datafiniti</b>:  You bet.</p>
<p>For more information on Datafiniti, please visit: <a rel="nofollow" target="_blank" href="http://datafiniti.net/">http://datafiniti.net/</a>.</p>]]></content:encoded>
      </item>
      <item>
         <title>The Five Minute Interview – See.me</title>
         <link>http://www.datastax.com/2013/03/the-five-minute-interview-see-me</link>
         <description>This article is one in a series of quick-hit interviews with companies using &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.datastax.com/technologies/cassandra&quot;&gt;Apache Cassandra&lt;/a&gt; and/or &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.datastax.com/products/enterprise&quot;&gt;DataStax Enterprise&lt;/a&gt; (DSE) for key parts of their business.</description>
         <guid isPermaLink="false">http://www.datastax.com/?p=15055</guid>
         <pubDate>Wed, 20 Mar 2013 18:23:21 +0000</pubDate>
         <content:encoded><![CDATA[<p>
This article is one in a series of quick-hit interviews with companies using <a rel="nofollow" target="_blank" href="http://www.datastax.com/technologies/cassandra">Apache Cassandra</a> and/or <a rel="nofollow" target="_blank" href="http://www.datastax.com/products/enterprise">DataStax Enterprise</a> (DSE) for key parts of their business.  For this interview, we talked with Stephen Broner who is a senior developer at see.me.</p>
<p><b>DataStax</b>: We’re doing this interview live at our Cassandra New York Big Data event. Stephen, what are your impressions of the show so far?</p>
<p><b>Stephen</b>:  It&#8217;s been great. I&#8217;ve gotten to meet with a lot of unique representatives from some major brands like eBay and others. I&#8217;ve learned a lot from the folks over on the front lines working with Cassandra directly and I&#8217;ve had a chance to send some ideas across and get some very helpful feedback.</p>
<p><b>DataStax</b>: What does see.me do?</p>
<p><b>Stephen</b>:  See.me is an online community for creatives. So whether you&#8217;re a photographer, model, musician or designer you can join our community and support other creatives, be supportive and pretty soon earn money from your passions.</p>
<p><b>DataStax</b>: So let&#8217;s talk a little about Cassandra, I understand you have a MySQL background; why the switch?</p>
<p><b>Stephen</b>:  At see.me we&#8217;re being proactive about the needs of our users and future user base, which is growing. Right now we&#8217;re at around 700,000 users and growing by a few thousand a day. We foresee that MySQL can&#8217;t handle our data ingestion needs and it also doesn&#8217;t address things like search and analytics.</p>
<p>But I would say the biggest motivator for the switch is that Cassandra scales readily and is designed for scale. We need it for resilience, stability and to handle the growth.</p>
<p><b>DataStax</b>: You mentioned search and analytics, and I understand you&#8217;re interested in <a rel="nofollow" target="_blank" href="http://www.datastax.com/what-we-offer/products-services/datastax-enterprise">DataStax Enterprise</a> and are doing some research and getting to grips with it. What are some of the features that you find attractive in that platform?</p>
<p><b>Stephen</b>:  For Cassandra itself, I initially did the research for a different company years back and I was interested in the scalability and performance for reads and writes. Specifically, with DataStax Enterprise, I&#8217;d say that we really love the idea of an out-of-the-box solution that combines, without ETL, Cassandra for storage, Solr for search (which we&#8217;re already using with MySQL to great effect), and also Hadoop for analytics. And we have work to do on designing our algorithms and getting smarter for not just matching up creators, but also with images and aesthetics. Bringing this all together, being a small business without a big budget, we can do a lot for our users with DSE.</p>
<p><b>DataStax</b>: What advice do you have to pass on to people who come from a relational background and are getting started with NoSQL and DataStax Enterprise?</p>
<p><b>Stephen</b>:  I would say take advantage of the fact that Cassandra has, from very early on, been very open sourced and community engaged.  There are lots of resources including webinars and videos that convey to you, faster than reading the manual in some cases, how you can use this to get up and running quickly and that was important to me.  Participating in a webinar allowed me to understand not just which version of Cassandra was right for my company, but which version of Cassandra I needed to sell to my teammates to make the transition from MySQL and to really sell the importance/impact of it for people who are not on the front lines of data.</p>
<p><b>DataStax</b>: Steve, thank you very much for taking a few minutes today at NYC* Big Data Tech Day to talk to us. Enjoy the rest of the show and we look forward to having your participation in the community moving forward!</p>
<p><b>Stephen</b>:  Thanks!</p>
<p>For more information on see.me, visit: <a rel="nofollow" target="_blank" href="http://www.see.me/">http://www.see.me/</a></p>]]></content:encoded>
         <category>Blog Post - Corporate</category>
      </item>
   </channel>
</rss>
