- State of current NoSQL databases: A very detailed post about many NoSQL solutions. A lot of work went into this one.
- Truth about joins: Google App Engine datastore's limitation of not allowing joins might soon be a thing of the past. Simple joins may now be possible on GAE if you are using Java. It's still beta, but the fact that this is being tested is very encouraging.
- Should you switch to NoSQL too?
- Notes on MongoDB: A very nice summary of MongoDB.
- Redis real-life examples [More here]: I've been seeing a lot of discussions around Redis lately. Here are some use cases I've gathered from a couple of posts. I haven't yet seen it being used by a large organization.
- Replication technologies
- Using Varnish to assist with A/B testing – Testing new, uncooked features with external customers gets more difficult as products become more mature and stable. Tools like "varnish" could be used to test different pages/features.
- Redundant Array of Independent Datacenters
- Twitter at 50M Daily Tweets
- Funny: Hitler rants about cloud security and updates his Facebook page
February 27, 2010
February 26, 2010
A few weeks ago, while I was mulling over what kind of service registry/discovery system to use for a scalable application deployment platform, I realized that for mid-size organizations with a complex set of services, building one from scratch may be the only option.
I also found out that many AWS/EC2 customers have already been using S3 and SimpleDB to publish/discover services. That discussion eventually led me to investigate Cassandra as the service registry datastore in an enterprise network.
Here are some of the observations I made as I played with Cassandra for this purpose. I welcome feedback from readers if you think I'm doing something wrong or if you think I can improve the design further.
- The biggest issue I noticed with Cassandra was the absence of an inverted index, which could be worked around as I have blogged here. I later realized there is something called Lucandra as well, which I need to look at at some point.
- The keyspace structure I used was very simple… (I skipped some configuration lines to keep it simple)
- Using an "OrderPreservingPartitioner" seemed important for doing "range scans". An order-preserving partitioner keeps objects with similar-looking keys together to allow bulk reads and writes. By default Cassandra randomly distributes the objects across the cluster, which works well if you only have a few nodes.
- I eventually plan to use this application across two datacenters. The best way to mirror data across datacenters in Cassandra is by using "RackAwareStrategy". If you select this option, it tells Cassandra to try to pick replicas of each token from different datacenters/racks. The default algorithm uses IP addresses to determine if two nodes are part of the same rack/datacenter, but there are other interesting ways to do it as well.
- Some of the APIs changed significantly between the versions I was playing with. Cassandra developers will remind you that this is expected in a product which is still at version 0.5. What amazes me, however, is the fact that Facebook, Digg and now Twitter have been using this product in production without bringing down everything.
- I was eventually able to build a thin Java webapp to front Cassandra, which provided the REST/JSON interface for the registry/discovery service. This is also the app which managed the inverted indexes.
- Direct Cassandra access from remote services was disabled for security/stability reasons.
- The app used DNS to load balance queries across multiple servers.
- My initial performance tests on this cluster performed miserably because I forgot that all of my requests were hitting the same node. The right way to test Cassandra's capacity is by load balancing requests across all Cassandra nodes.
- I also realized that, by default, the logging mode was set to "DEBUG", which is very verbose. Turning that down seemed to speed up response times as well.
- Playing with different consistency levels for reading and writing was also an interesting experience, especially when I started killing nodes just to see the app break. This is what tweaking CAP is all about.
- Due to an interesting problem related to "eventual consistency", Cassandra doesn't immediately delete data which was marked for deletion or was intentionally changed. In the default configuration that data is kept around for 10 days before it is completely removed from the system.
- Some documentation on the core operational aspects of Cassandra exists, but it would be nice if there were more.
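As a rough illustration of the keyspace and partitioner settings described above: this is not my actual configuration, the keyspace and column family names here are made up, and element placement shifted between 0.5-era releases, but a storage-conf.xml along these lines captures the idea:

```xml
<Storage>
  <ClusterName>RegistryCluster</ClusterName>
  <!-- Keeps similar-looking keys together so range scans work -->
  <Partitioner>org.apache.cassandra.dht.OrderPreservingPartitioner</Partitioner>
  <!-- Tries to place replicas of each token in different racks/datacenters -->
  <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackAwareStrategy</ReplicaPlacementStrategy>
  <ReplicationFactor>3</ReplicationFactor>
  <Keyspaces>
    <Keyspace Name="ServiceRegistry">
      <ColumnFamily Name="Services" CompareWith="UTF8Type"/>
    </Keyspace>
  </Keyspaces>
</Storage>
```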
Cassandra was designed as a scalable, highly available datastore. But because of its interesting self-healing and "RackAware" features, it can become an interesting communication medium as well.
- Scalability is not about improving latency, but about increasing throughput
- But overall performance shouldn't degrade
- Throw hardware, not people, at the problem
- Traditional databases use b-tree indexes. But that requires the entire index to be in memory in the same place.
- Easy bandaid #1 – SSD storage is better for b-tree indexes which need to hit disk
- Easy bandaid #2 – Buy a faster server every 2 years. Works as long as your userbase doesn't grow faster than Moore's law
- Easy bandaid #3 – Use (distributed) caching to handle hotspots
- Memcache server failures can change where hashed keys are kept
- Consistent hashing solves the problem by mapping keys to tokens. The tokens can move around as servers are added or removed, and apps can figure out which keys are where.
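The consistent-hashing note above can be sketched in a few lines. This is an illustrative toy, not the algorithm any particular memcached client ships; real implementations (libketama, for instance) work along the same lines with many virtual nodes per server:

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # Map any string to a fixed point on a large ring
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Toy consistent-hash ring: keys map to tokens, tokens to servers."""

    def __init__(self, servers, vnodes=100):
        # Each server owns many "virtual node" tokens for smoother balance
        self._ring = sorted(
            (_hash(f"{server}-{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._tokens = [token for token, _ in self._ring]

    def lookup(self, key):
        # The first token clockwise from the key's hash owns the key
        i = bisect.bisect(self._tokens, _hash(key)) % len(self._ring)
        return self._ring[i][1]
```

The payoff over naive `hash(key) % len(servers)`: when a server dies, only the keys that lived on its tokens move; everything else stays put.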
February 25, 2010
Syslog is a commonly used transport mechanism for system logs. But people sometimes forget it could be used for a lot of other purposes as well.
Take, for example, the interesting challenge of aggregating web server logs from 100 different servers onto one server and then figuring out how to merge them. If you have built your own tool to do this, you will have figured out by now how expensive it is to poll all the servers and how out-of-date these logs can get by the time you process them. If you are not inserting them into some kind of datastore which sorts the rows by timestamp, you also have to take up the challenge of building a merge-sort script.
There is nothing which stops applications from using syslog as well. If your apps are in Java, you should try out the Syslog appender for log4j [Ref 1] [Ref 2]. Not only do you get central logging, you also get to see a real-time "tail -f" of events as they happen in a merged file. If there are issues anywhere in your network, you have just one place to look. If your logging volume is high, you would have to use other tools (or build your own) to do log analysis.
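If you go the log4j route, the wiring is mostly a configuration change. A minimal sketch of a log4j 1.x properties file, where the hostname, facility and pattern are placeholders, not values from any particular setup:

```properties
# Send all INFO-and-above events to a central syslog server over UDP
log4j.rootLogger=INFO, SYSLOG
log4j.appender.SYSLOG=org.apache.log4j.net.SyslogAppender
log4j.appender.SYSLOG.SyslogHost=loghost.example.com
log4j.appender.SYSLOG.Facility=LOCAL0
log4j.appender.SYSLOG.layout=org.apache.log4j.PatternLayout
log4j.appender.SYSLOG.layout.ConversionPattern=myapp %-5p [%t] %c: %m%n
```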
Here are some things you might have to think about if you plan to use syslog for your environment.
- Set up different syslog servers for each of your datacenters, using split DNS or by using different hostnames.
- Try not to send logs across WAN links
- Rotate logs on a nightly basis, or depending on the log volume
- Reduce the amount of logging (don't do "debug" in production, for example)
- Write tools to detect changes in logging volume in dev/qa environments. If you follow good logging practices, you should be able to identify the components responsible for an increase very quickly.
- Identify log patterns which could be causes of concern and set up some kind of alerting using your regular monitoring service (nagios, for example). Don't be afraid to use 3rd party tools which do this very well.
- Syslog over UDP is non-blocking, but the syslog server can get overloaded if logging volume is not controlled. The most expensive part of logging is disk I/O. If you notice high I/O on the syslog server, look at reducing log volume or spreading the load across multiple servers.
- UDP doesn't guarantee that every log event will make it to the syslog server. Find out if that level of uncertainty in logging is OK for your environment.
Other interesting observations
- The amount of change required in a Java app which is already using log4j to log to a syslog server is trivial
- Logging to local files can be disabled, which means you don't have to worry about disk storage on each server.
- If you are using or want to use tools like splunk or hadoop/hbase for log analysis, syslog is probably the easiest way to get there.
- You can always load balance syslog servers by using DNS load balancing.
- Apache webservers can't do syslog out of the box, but you can still make it happen
- I personally like haproxy more, and it does do syslog out of the box.
- If you want to log events from startup/shutdown scripts, you can use the "logger" *nix command to send events to the syslog server.
How are logs aggregated in your environment?
February 24, 2010
We discussed Brewer's Theorem a few days ago and how it's challenging to obtain Consistency, Availability and Partition tolerance in any distributed system. We also discussed that many distributed datastores allow CAP to be tweaked to attain certain operational goals.
Amazon SimpleDB, which was released as an "Eventually Consistent" datastore, today launched a few features to do just that.
- Consistent reads: Select and GetAttributes requests now include an optional Boolean flag "ConsistentRead" which asks the datastore to return consistent results only. If you have noticed scenarios where a read right after a write returned an old value, that shouldn't happen anymore.
- Conditional put/puts, delete/deletes: By providing "conditions" in the form of a key/value pair, SimpleDB can now conditionally execute or discard an operation. This might look like a minor feature, but it can go a long way toward providing reliable datastore operations.
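To illustrate the semantics of a conditional put, here is a local toy, not the SimpleDB API itself: the write is a compare-and-set that only lands if the item still looks the way the caller expects.

```python
def conditional_put(store, item_name, attrs, expected=None):
    """Apply attrs to an item only if `expected` (name, value) still holds.

    `store` is just a dict of dicts standing in for the datastore.
    """
    if expected is not None:
        name, value = expected
        if store.get(item_name, {}).get(name) != value:
            return False  # condition failed; the write is discarded
    store.setdefault(item_name, {}).update(attrs)
    return True
```

The classic use is an optimistic version counter: read `version=1`, then write with `expected=("version", "1")` and a bumped version in `attrs`. Of two concurrent writers, exactly one wins; the other's condition fails cleanly instead of silently clobbering data.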
Even though SimpleDB now enables operations that support a stronger consistency model, under the covers SimpleDB remains the same highly-scalable, highly-available, and highly durable structured data store. Even under extreme failure scenarios, such as complete datacenter failures, SimpleDB is architected to continue to operate reliably. However when one of these extreme failure conditions occurs it may be that the stronger consistency options are briefly not available while the software reorganizes itself to ensure that it can provide strong consistency. Under those conditions the default, eventually consistent read will remain available to use.
February 22, 2010
NoSQL solutions have one thing in common: they are generally designed for horizontal scalability. So it's no wonder that a lot of applications in the "twitter" world have picked NoSQL-based datastores for their persistence layer. Here is a collection of these apps from the MyNoSQL blog.
- Twitter uses Cassandra
- MusicTweets used Redis [Ref] – The site is dead, but you can still read about it
- Tstore uses CouchDB
- Retwis uses CouchDB
- Retwis-RB uses Redis and Sinatra?? - No idea what Sinatra is. Will have to look into it. [Update: Sinatra is not a DB store]
- Floxee uses MongoDB
- Twidoop uses Hadoop
- Swordfish, built on top of Tokyo Cabinet, comes with a twitter clone app
- Tweetarium uses Tokyo Cabinet
Do you know of any more?
February 21, 2010
So there is someone who thinks "eventual consistency is just caching". Though I liked the idea of discussing this, I don't agree with Udi's views on it.
A "cache" is generally used to store data which is more expensive to obtain from the primary location. For example, caching MySQL queries is ideal for queries which could take more than a fraction of a second to execute. Another example is caching queries to S3, SimpleDB or Google's datastore, which can cost money and introduce network latency into the mix. Though most applications are built to use such caches, they are also designed to be responsive in the absence of the caching layer.
The most important difference between a "cache" and a "datastore" is that the data generally flows from "datastore" to "cache" rather than the other way round. Though one could queue data in the "cache" first and then update the datastore later (for performance reasons), that is not the way one should use it. If you are using a "cache" to queue data for slower storage, you are using the wrong product. There are better "queuing" solutions (activemq, for example) that can do it for you in a more reliable way.
In most "eventually consistent" systems, there is no concept of primary and secondary nodes. Most nodes in such systems are considered equal and have similar performance characteristics.
Since "caching" solutions are designed for speed, they generally don't have a concept of "replicas" or allow persistence to disk. Synchronizing between replicas or to disk can be expensive and counterproductive, which is why it's rare to find these features in "caching" products. But many "eventually consistent" systems do provide a way for developers to request the level of "consistency" (or disk persistence) desired.
Do you have an opinion on this? Please share examples if you have seen a "caching" layer being used as an "eventually consistent datastore".
Update: Udi mentioned on twitter that "write-through caches" are eventually consistent. Sure, they are, if you are talking about a caching layer on top of a persistence layer. I think there is an argument to be made that "caches" are eventually consistent, but the reverse may not be true, which is what his original post claimed.
February 17, 2010
Some interesting links for today
- A very good post about the need for an event-driven Cloud API model for monitoring. I think it's a matter of time before this happens. Just like feed crawlers are embracing event-driven publication notification using protocols like Pubsubhubbub, we need something similar to snmp traps for the cloud notification world.
- Translate SQL to MongoDB MapReduce
- Real-time web for web developers: An example of how the problem of polling huge number of websites for updates was transformed by simply using an event driven push model.
- Logging: unsexy, important and now usable
- Comparing Pig Latin and SQL for Constructing Data processing pipelines
- Cassandra backend for Lucene? This seems to solve the problem of building a reverse index on Cassandra, which I previously blogged about.
- Cloud MR: A Map/Reduce framework over Amazon's S3/SQS/EC2 services.
- Interesting NoSQL Categorization
- Writing twitter service on App engine
If you missed the AWS S3 versioning webcast, I have a copy of the video here. And here are the highlights…
- You can enable and disable this at the bucket level
- They don't think there is a performance penalty for turning on versioning (though it seems obvious S3 would be doing slightly more work to figure out which is the latest version of any object you have)
- There isn't any additional cost for using versioning itself. But you do have to pay for the extra copies of each object.
- MFA (multi-factor authentication) for deleting objects is not mandatory when versioning is turned on; it has to be enabled separately. This was slightly confusing in the original email I got from AWS.
- If you are planning to use this, please watch this video. There is a part where they explain what happens if you disable versioning after using the feature. This is something you might like to know about.
- They use GUID for versioning of each object
- You can iterate over objects and figure out how many versions you have of each object, but currently it's not possible to find all objects which have versions older than X date. This is important if you are planning to do garbage collection (cleaning up older copies of data) at a later time.
February 14, 2010
The CAP theorem (Brewer's theorem) states that though it's desirable to have Consistency, High Availability and Partition tolerance in every system, unfortunately no system can achieve all three at the same time.
Consistent: A fully consistent system is one which can guarantee that once you store a state (let's say "x=y") in the system, it will report the same state in every subsequent operation until the state is explicitly changed by something outside the system. [Example 1] A single MySQL database instance is automatically fully consistent, since there is only one node keeping the state. [Example 2] If two MySQL servers are involved, and if the system is designed in such a way that all keys starting "a" to "m" are kept on server 1 and keys "n" to "z" are kept on server 2, then the system can still easily guarantee consistency. Let's now set up the DBs as master-master replicas [Example 3]. If one of the databases gets a "row insert" request, that information has to be committed to the second system before the operation is considered complete. To require 100% consistency in such a replicated environment, communication between nodes is paramount. The overall performance of such a system could drop as the number of replicas required goes up.
Available: The databases in [Example 1] and [Example 2] are not highly available. In [Example 1], if the node goes down, there would be 100% data loss. In [Example 2], if one node goes down, you will have 50% data loss. [Example 3] is the only setup which solves that problem. A simple MySQL replication [multi-master mode] setup could provide 100% availability. Increasing the number of nodes with copies of the data directly increases the availability of the system. Availability is not just protection from hardware failure. Replicas also help in load balancing concurrent operations, especially read operations. "Slave" MySQL instances are a perfect example of such "replicas".
Partition-tolerance: So you got consistency and availability by replicating data. Let's say you had the two MySQL servers from Example 3 in two different datacenters, and you lose the network connectivity between the two datacenters, making both databases incapable of synchronizing state with each other. Would the two DBs be fully functional in such a scenario? If you somehow do manage to allow read/write operations on these two databases, it can be shown that the two servers won't be consistent anymore. A banking application which keeps the "state of your account" at all times is a perfect example where it's bad to have inconsistent bank records. If I withdraw 1000 bucks in California, it should be instantly reflected in the NY branch so that the system accurately knows how much more I can withdraw at any given time. If the system fails to do this, it could potentially cause problems which would make some customers very unhappy. If the bank decides consistency is very important and disables write operations during the outage, it will lose "availability" of the cluster, since all the bank accounts at both branches will now be frozen until the network comes up again.
This gets more interesting when you realize that the C.A.P. properties don't have to be applied in an "all or nothing" fashion. Different systems can choose various levels of consistency, availability or partition tolerance to meet their business objectives. Increasing the number of replicas, for example, increases high availability, but it could at the same time reduce partition tolerance or consistency.
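One concrete way systems expose this dial is quorum tuning: with N replicas, a write acknowledged by W nodes and a read that consults R nodes must overlap on at least one up-to-date replica whenever R + W > N. A sketch of this Dynamo-style rule (illustrative, not any particular product's API):

```python
def read_sees_latest_write(n, r, w):
    # With n replicas, any r-node read intersects any w-node write
    # when r + w > n, so the read touches at least one fresh copy.
    return r + w > n

# Common configurations for n = 3 replicas:
#   w=3, r=1 -> consistent reads, slow writes
#   w=1, r=3 -> fast writes, slow consistent reads
#   w=2, r=2 -> balanced quorum
#   w=1, r=1 -> fast both ways, but only eventually consistent
```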
When you are discussing distributed systems (or distributed storage), you should try to identify which of the three properties the system is trying to achieve. BigTable, used by Google App Engine, and HBase, which runs over Hadoop, claim to be always consistent and highly available. Amazon's Dynamo, which is used by the S3 service, and datastores like Cassandra instead sacrifice consistency in favor of availability and partition tolerance.
The CAP theorem doesn't just apply to databases. Even simple web applications which store state in session objects have this problem. There are many "solutions" out there which allow you to replicate session objects and make session data "highly available", but they all suffer from the same basic problem, and it's important for you to understand it.
Recognizing which of the "C.A.P." properties your business really needs should be the first step in building any successful distributed, scalable, highly available system.
February 10, 2010
Lots of interesting updates today.
But I would first like to mention the fantastic work the Cloud Computing group at UCSB is doing to make the App Engine framework more open. They have done significant work at making AppScale "work" with different kinds of data sources, including HBase, Cassandra, Voldemort, MongoDB, Hypertable, MySQL and MemcacheDB. AppScale is actively looking for folks interested in working with them to make it stable and production ready.
- GAE 1.3.1 released: I think the biggest news about this release is that the 1000-row limit has now been removed. You still have to deal with the 30-second processing limit per http request, but at least the row limit is not there anymore. They have also introduced support for automatic, transparent datastore api retries for most operations. This should dramatically increase the reliability of datastore queries, and it reduces the amount of work developers have to do to build this auto-retry logic.
- Elastic Search is a lucene-based indexing product which seems to do what Solr does, with the exception that it can scale across multiple servers. Very interesting product. I'm going to try this out soon.
- MemcacheDB: A distributed key-value store which is designed to be persistent. It uses the memcached protocol, but it's actually a datastore (using Berkeley DB) rather than a cache.
- Nasuni seems to have come up with NAS software which uses cloud storage as the persistent datastore. It has the capability to cache data locally for faster access to frequently accessed data.
- The guys at Flickr have two interesting posts you should glance over. "Using, Abusing and Scaling MySQL at Flickr" seems to be the first in a series of posts about how Flickr scales using MySQL. The next one in the series is "Ticket Servers: Distributed Unique Primary Keys on the Cheap"
- Finally a fireside chat by Mike Schroepfer, VP of Engineering, about Scaling Facebook.
February 08, 2010
One of the problems with Amazon's S3 was the inability to take a "snapshot" of the state of S3 at any given moment. This is one of the most important DR (disaster recovery) steps before any major upgrade which could potentially corrupt data during a release. Until now, applications using S3 would have had to manage versioning of data themselves, but Amazon has now launched a versioning feature built into S3 itself to do this particular task. In addition, they have made it possible to require that delete operations on versioned data be done using MFA (multi-factor authentication).
Versioning allows you to preserve, retrieve, and restore every version of every object in an Amazon S3 bucket. Once you enable Versioning for a bucket, Amazon S3 preserves existing objects any time you perform a PUT, POST, COPY, or DELETE operation on them. By default, GET requests will retrieve the most recently written version. Older versions of an overwritten or deleted object can be retrieved by specifying a version in the request.
The way the AWS blog describes the feature, it looks like a version is created every time an object is modified, and each object in S3 could have a different number of copies depending on the number of times it was modified.
This reminds me of SVN/CVS-like version control systems, and I wonder how long it will take for someone to build a source code versioning system on S3.
BTW, data requests to a versioned object are priced the same way as regular data requests, which basically means you are getting this feature for free.
February 06, 2010
Cassandra is the only NoSQL datastore I'm aware of which is a scalable, distributed, self-replicating, eventually consistent, schema-less key-value store running on Java with no single point of failure. HBase could also match most of these requirements, but Cassandra is easier to manage due to its tiny footprint.
The one thing Cassandra doesnâ€™t do today is indexing columns.
Let's take a specific example to explain the problem. Let's say there are 100 rows in the datastore, with 5 columns each. If you want to find the row which says "Service=app2", you will have to iterate one row at a time, which is like a full database scan. In a 100-row datastore where only one row has that particular column value, it would take scanning about 50 rows on average before you find your data.
While I'm sure there is a good reason why this doesn't exist yet, the application inserting the data could build such an inverted index itself even today. Here is an example of how an inverted index table would look.
If you want to find the "status" of all rows where "Service=app2", all you have to do is find the list of keys by making a single call to this table. The second call would get all the column values for that row. Even if you have 100 different rows in a table, finding that one particular row matching your search query can now be done in two calls.
Of course, there is a penalty you have to pay. Every time you insert one row of data, you also have to insert multiple rows to build the inverted index. You also have to update the inverted index yourself if any of the column values are updated or deleted. Cassandra 0.5.0, which was recently released, has been benchmarked at about 10000 row inserts per second on a 4-core server with 2GB of RAM. If you have an average of 5 columns per row, that is about 1.5k actual row inserts per second (including the 5 rows of inserts/updates required for the inverted index). For more throughput you always have the option to add more servers.
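The bookkeeping involved is simple enough to show with plain dictionaries standing in for the two column families; the names and layout here are illustrative, not an actual Cassandra schema:

```python
rows = {}   # row_key -> {column: value}          (the data "column family")
index = {}  # "column=value" -> set of row keys   (the inverted index)

def insert(row_key, columns):
    rows[row_key] = dict(columns)
    # every data insert also writes one index entry per column
    for col, val in columns.items():
        index.setdefault(f"{col}={val}", set()).add(row_key)

def find(col, val):
    # call 1: look up the matching keys in the inverted index
    keys = index.get(f"{col}={val}", set())
    # call 2: fetch the full rows for those keys
    return {k: rows[k] for k in keys}
```

As noted above, updates and deletes have to touch `index` as well, or the index silently drifts out of sync with the data.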
Facebook and Digg are both extensively using Cassandra in their architectures. Here are some interesting reading materials on Cassandra if youâ€™d like to explore more.
- Digg: Looking to the future of Cassandra
- Facebook: Structured storage system on a P2P Network
- Jonathan Ellis' Cassandra reading list
- An example of how a "delicious" schema would look in Cassandra: asenchi
- Cassandra : Articles and presentations
- Getting started
- WTF is a supercolumn
- Cassandra internals
February 05, 2010
Last year Dealnews.com unexpectedly got listed on the front page of yahoo.com for a couple of hours. No matter how optimistic one is, unexpected events like these can take down a regular website with almost no effort at all. What is your plan if you get slashdotted? Are you OK with a short outage? What is the acceptable level of service for your website anyway?
One way to handle such unexpected traffic is having multiple layers of cache. A database query cache is one; generating and caching dynamic content is another (maybe using a cronjob). Tools like memcached, varnish and squid can all help reduce the load on application servers.
Proxy servers (or webservers) in front of application servers play a special role at dealnews. They understood the limitations of the application servers they were using, and the fact that slow client connections mean longer-lasting tcp sessions to the application servers. Proxy servers like varnish can offload that job and take care of content delivery without keeping application servers busy. In addition, Varnish also acts as a content caching service, which further reduces load on the application servers.
Dealnews' content is extremely dynamic, which is why it uses a very low TTL of 5 minutes for most of its pages. It may not sound like a lot, but at thousands of pages per second such a cache can do miracles. While caching is great, the one thing every loaded website has to figure out is how to avoid the "cache stampede" when the TTL expires. A "cache stampede" is what happens when 100s of requests for the same resource hit the server at the same time, forcing the webserver to forward all 100 requests to the app server and the database server because the cache had expired.
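One common way to blunt a stampede, separate from how dealnews does it, is to let exactly one request regenerate an expired entry while everyone else keeps serving the stale copy. A single-process sketch of that idea:

```python
import threading
import time

class StampedeGuardedCache:
    def __init__(self):
        self.data = {}           # key -> (value, expires_at)
        self.rebuilding = set()  # keys currently being regenerated
        self.lock = threading.Lock()

    def get(self, key, ttl, regenerate):
        now = time.time()
        value, expires = self.data.get(key, (None, 0))
        if now < expires:
            return value  # fresh hit
        with self.lock:
            if key in self.rebuilding and value is not None:
                return value  # stale, but someone else is already refreshing
            self.rebuilding.add(key)
        try:
            value = regenerate()  # only one caller pays this cost
            self.data[key] = (value, now + ttl)
        finally:
            with self.lock:
                self.rebuilding.discard(key)
        return value
```

The 100 simultaneous requests then cost one backend query instead of 100; the rest get a slightly stale page, which for a 5-minute TTL is usually fine.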
Dealnews solves this problem by separating content generation from content delivery. They run a process which converts data from more than 300 tables of normalized data into 30 tables of highly redundant, de-normalized data. This data is kept in such a way that the application servers only make queries using primary keys or unique keys. With such a design, a cluster of MySQL DB servers shouldn't have any problem handling 1000s of queries per second from the front-end application servers.
Twitter drives a lot of traffic, and since a lot of that data is redundant, it relies heavily on caches. So much so that the site could completely go down if a few memcached servers went down. Dealnews explicitly tested their application with the memcached servers disabled to see what the worst-case scenario was for reinitializing the cache. They then optimized their app to the point where the response time only doubled, from about 0.75 seconds to 1.5 seconds per page, without memcached servers.
Handling 3rd-party content can be tricky. Dealnews treats 3rd-party content as a lower-class citizen. They not only load 3rd-party content at the very end of the page, they also try to use iframes wherever possible to keep the loading of those objects from blocking the loading of dealnews.com content.
If you are interested in the video recording or the slides from the talk, click on the following links.
February 04, 2010
Most of the newer, successful web startups have one thing in common: they release smaller changes more often. Being in operations, I am often surprised how these organizations manage such a feat without breaking their website. Here are some notes from someone at Flickr about how they do it. The two most important parts of this talk are the observation that Dev, QA and Operations teams have to slightly blend into each other to achieve deployments at such a velocity, and the fact that they are not afraid to break the website by deploying code from trunk.
- Donâ€™t be afraid to do releases
- Automate infrastructure (hardware/OS and app deployment)
- Share version control system
- Enable one step build
- Enable one step build and deploy
- Do small frequent changes
- Use feature flags (branching without source code branching)
- Always ship trunk
- Do private betas
- Share metrics
- Provide applications the ability to talk back (IM/IRC) - build logs, deploy logs, alert monitors and real-time traffic can all go here
- Share runbooks and escalation plans
- Prepare/Plan for failures
- Do firedrills
- No fingerpointing…
February 03, 2010
While PHP is very popular, it unfortunately doesn't perform as well as some of its competitors. One way to make things faster is to write PHP extensions in C++. In this post we will describe two different ways developers can solve this problem; the mileage you get from either model may vary.
Since Facebook mostly runs PHP, it noticed this problem pretty early. But instead of asking its developers to move from PHP to C++, one of their developers hacked up a solution to transform PHP code into C++.
Yesterday, Facebook announced they are opening up HipHop, a source code transformer which changes PHP code into more optimized C++ code and uses g++ to compile it. With some minor sacrifices (no eval support) they noticed they were able to get a 50% performance improvement. And since they serve 400 billion page views every month, that kind of saving can free up a lot of servers.
More info on HipHop
Quercus on the other hand is a 100% java implementation of PHP 5. What makes this more interesting is that Quercus can now run in Google App Engine pretty much the same way JSPs can.
Cauchoâ€™s Quercus presents a new mixed Java/PHP approach to web applications and services where Java and PHP tightly integrate with each other. PHP applications can choose to use Java libraries and technologies like JMS, EJB, SOA frameworks, Hibernate, and Spring. This revolutionary capability is made possible because 1) PHP code is interpreted/compiled into Java and 2) Quercus and its libraries are written entirely in Java. This architecture allows PHP applications and Java libraries to talk directly with one another at the program level. To facilitate this new Java/PHP architecture, Quercus provides and API and interface to expose Java libraries to PHP.
The demo of Quercus running on GAE was very impressive. Any pure PHP code which doesn't need to interact with external services works beautifully on GAE without any issues. But the absence of MySQL on GAE means SQL queries have to be mapped to the datastore (Bigtable), which might require a major rewrite of parts of the application. It's not impossible, though, as they have shown by making WordPress run on GAE (crawl might be a better word).
While Quercus is open source and, in interpreted mode, as fast as regular PHP, the compiler, which is much faster, is not free. Regardless, Quercus is a step in the right direction, and I sincerely hope PHP support on GAE is here to stay.
February 02, 2010
Networking devices on the edge have become smarter over time. So have the firewalls and switches used internally within networks. Whether we like it or not, web applications have grown to depend on them.
It's impossible to build a flawless product, which is why it's standard practice to disable all unused services on a server. Most organizations today follow an n-tier approach, creating different logical security zones with the core asset inside the most secure zone. The objective is to make it difficult for an attacker to reach the core asset without breaching multiple sets of firewalls.
Doing frequent system patches, auditing file system permissions and setting up intrusion detection (host or network based) are some of the other mundane ways of keeping web applications safe from attacks.
Though the cloud has made deployment of on-demand infrastructure simpler, it's no longer easy to build an efficient walled garden around a customer's cluster of servers. The absence of such walled gardens and logical security zones means there are more points of entry into the infrastructure that could be exploited. If you replace 10 powerful internal servers with 100 small servers on the cloud, you suddenly have to worry about protecting 100 individual servers instead of a couple of edge devices. In the worst case, one weak server in the cluster could expose the entire cluster to an attacker. Here are a few other things to think about...
- Host-based firewalls should allow only traffic that is required/expected
- Non-essential services should be shut off on the server
- Some kind of intrusion detection might be important to have
- Keys/passwords should be changed periodically
- System patches (updated OS images) need to be applied periodically
- Authenticate/authorize all inter-server communication
- Maintain an audit trail for all changes to images/servers if possible
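The first couple of items above could start as something like the following sketch. The ports and the admin network CIDR are assumptions for illustration, and the script only prints the rules instead of applying them; pipe the output to a root shell once you've adapted it to your stack.

```shell
#!/bin/sh
# Minimal host-based firewall policy sketch: default-deny inbound, then
# explicitly allow only the traffic this server is expected to serve.
ADMIN_NET="10.0.0.0/8"   # hypothetical management network -- adjust

rule() { echo "iptables $*"; }   # echo rules rather than apply them

rule -P INPUT DROP                                          # default-deny inbound
rule -A INPUT -i lo -j ACCEPT                               # allow loopback
rule -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
rule -A INPUT -p tcp --dport 80 -j ACCEPT                   # expected app traffic only
rule -A INPUT -p tcp --dport 22 -s "$ADMIN_NET" -j ACCEPT   # SSH from admin net only
```

Non-essential services get no ACCEPT rule at all, so shutting them off at the process level and here at the firewall level are complementary.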
An organization which is completely on the cloud may not have an IT department in its current form, but it will probably still have an operations team that sets security policies, updates OS images, manages billing, monitors system health (and IDS) and trains developers to do things the right way.
If your infrastructure is on the cloud, do write back with a note about what you do to protect your applications.
Image source: AMagill
February 01, 2010
Windows Azure is an application platform provided by Microsoft that lets others run applications on Microsoft's "cloud" infrastructure. It's finally open for business (as of Feb 1, 2010). Below are some links about Azure for those who are still catching up.
Wikipedia: Windows Azure has three core components: Compute, Storage and Fabric. As the names suggest, Compute provides a computation environment with Web Roles and Worker Roles, while Storage focuses on providing scalable storage (Blobs, Tables, Queues) for large-scale needs.
The hosting environment of Windows Azure is called the Fabric Controller, which pools individual systems into a network that automatically manages resources, load balancing, geo-replication and application lifecycle without requiring the hosted apps to explicitly deal with those requirements. It also provides other services that most applications require, such as the Windows Azure Storage Service, which gives applications the capability to store unstructured data such as binary large objects, queues and non-relational tables. Applications can also use other services that are part of the Azure Services Platform.
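For the curious, talking to Azure Storage from outside the .NET world boils down to its REST API, where each request carries a SharedKey signature: an HMAC-SHA256 over a canonicalized request string, keyed with the base64-decoded account key. The sketch below uses a made-up account name and key, and a deliberately simplified string-to-sign (the real canonicalization covers more headers; consult Microsoft's documentation before using this shape):

```shell
#!/bin/sh
# Sketch of computing an Azure Storage "SharedKey" Authorization header.
# Account name, key and string-to-sign are dummies for illustration.
account="myaccount"
key_b64="c2VjcmV0LWFjY291bnQta2V5"          # base64 of a dummy key
date="Mon, 01 Feb 2010 12:00:00 GMT"
string_to_sign="GET\n\n\n${date}\n/${account}/mycontainer/blob.txt"

# HMAC-SHA256 with the base64-decoded key, then base64-encode the digest.
hexkey=$(printf '%s' "$key_b64" | base64 -d | od -An -tx1 | tr -d ' \n')
sig=$(printf "$string_to_sign" | openssl dgst -sha256 -mac HMAC \
        -macopt "hexkey:$hexkey" -binary | base64)

echo "Authorization: SharedKey ${account}:${sig}"
```

Libraries like JAzure (linked below) wrap exactly this kind of plumbing so you don't have to hand-roll it.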
- MS: Microsoft – Azure Services Platform
- MS: How much does it cost?
- MS-Dev: Cloud Computing Tools
- MS-Dev: Windows Azure SDK
- MS-Dev: Windows Azure Tools for Microsoft Visual Studio 1.1
- MS-Dev: Windows Azure platform AppFabric SDK 1.0
- MS-Dev: Windows Azure Platform Training Kit
- Blog: Everything you want to know about Azure
- News: Microsoft doesn't want private Azure clouds
- Learn: Wikipedia – Azure Services Platform
- Learn: Introducing the Azure Services platform, David Chappell
- Learn: Virtual Lab: Windows Azure
- Library: JAzure – A Java API for Microsoft Windows Azure Storage Services
- Library: AppFabric SDK for Java Developers
- Library: AppFabric SDK for Ruby Developers
- Slides: Windows Azure Cases
- Slides: Understanding Azure – David Gristwood
- Slides: Azure for Science
- Slides: Considering Windows Azure
- Slides: Microsoft Cloud in 5 minutes
- Slides: Storing Data in the cloud
- Slides: Windows in the cloud
- Slides: Windows Azure
- Slides: What is Windows Azure
- Slides: Azure real world
- Slides: Introduction to SQL services
- Slides: Data in the Azure cloud
- Slides: Architecting for the Windows Azure Platform
- Slides: 15 minute overview of Microsoft's Cloud Database
- Slides: Windows Azure and a little SQL Data Services
- Slides: Windows Azure storage
- Slides: Building Applications with SQL Data services and Windows Azure