One of the trends of the moment is "Big Data," and in many organizations it is a fair bet that some CEO is asking the CIO to come up with a briefing book on it. That message will go down to the CTO, the Chief Architect and so on. Eventually someone will come back with a note to the effect that
What is it? Well, corporations constantly create and acquire bits of unstructured data, and they are beginning to experiment with private cloud computing and associated data approaches, using Hadoop and MapReduce to do high-speed data analysis across compute clusters. Cloud advocates Google and Amazon - companies whose use of technology has disrupted the status quo of modern business - are big on Big Data, and they, along with a small handful of other successes, represent the lodestar for Big Data quests. The mere fact that open source dominates much of Big Data means that, if more companies go this route, application development managers will be busy for a long time.
A recent Big Data piece in a PricewaterhouseCoopers Technology Forecast considered revisions to the CIO's data playbook. In the course of this came a sampler of skills for the new-age Big Data analytics team. Required skills include natural language processing and text mining, and familiarity with Clojure, Scala, Python, Hadoop and Java; also useful are data mining skills with tools like R and MATLAB; add to that scripting and functional language skills with the likes of Erlang and LISP, and new database development skills with such rarities as Cassandra and CouchDB.
It is hard to imagine that such individuals actually exist - in some ways they resemble the classic Google job applicant of 2002: a really smart mathematician who programs in Python and Java and who, given a lever and one firm spot on which to stand, can move the Earth.