Channel: Paul Miller - The Cloud of Data » Open Source
Viewing all 10 articles

Talking to Simon Wardley about Ubuntu and the Cloud


Most readers of this blog are probably well aware that a new version of the Ubuntu Linux distribution is coming this week, and that it will be putting code from the Open Source EUCALYPTUS Project to work in simplifying the creation of private Clouds that look remarkably like Amazon’s EC2. You’ve probably also read RightScale’s announcements with respect to Ubuntu, and heard that Sun Microsystems were also making supportive noises about EUCALYPTUS and the EC2 API before their recent change in circumstances.

Earlier today I spoke with Simon Wardley of Canonical (the commercial organisation that sells support and consultancy for Ubuntu) to hear a little more about what those downloading Ubuntu will get… and what it might mean for the rapidly shifting Cloud landscape.

Production of this podcast was supported by Talis, and show notes are available on their Nodalities blog.


EUCALYPTUS Project closes $5.5 Million Series A with Benchmark, moves out of UC Santa Barbara’s Ivory Tower


It’s only a few short weeks since I last spoke with Rich Wolski, Director of the open source EUCALYPTUS Project at the University of California, Santa Barbara.

“EUCALYPTUS — Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems — is an open-source software infrastructure for implementing Elastic/Utility/Cloud computing using computing clusters and/or workstation farms. The current interface to EUCALYPTUS is interface-compatible with Amazon.com’s EC2 (arguably the most commercially successful Cloud computing service), but the infrastructure is designed to be modified and extended so that multiple client-side interfaces can be supported. In addition, EUCALYPTUS is implemented using commonly-available Linux tools and basic web service technology making it easy to install and maintain.”

Since then, we’ve seen EUCALYPTUS embraced by Sun Microsystems in their Cloud announcements and potentially downloaded to millions of machines around the world as part of the latest update to Linux’s popular Ubuntu distribution.

Today the UCSB research project takes the next step, announcing a successful Series A investment round led by Benchmark Capital that moves the team out of the University and onto a professional footing with $5.5 Million to spend. Project Director Wolski becomes CTO, with Woody Rollins as CEO and Matt Reid as VP Sales & Marketing rounding out Eucalyptus Systems’ fledgling management team. Wolski reports that the entire UCSB development team is moving across to the newly capitalised company, which is licensing IP from UCSB in return for an undisclosed equity stake. Benchmark’s Kevin Harvey takes a seat on the Board, which is chaired by former AOL Europe CEO Andreas Von Blottnitz.

Wolski is quick to stress that Eucalyptus Systems is an open source company; there is no intention to start charging for software that is freely available for download today, and this will be actively maintained and developed moving forward. A point release is expected ‘in about a week’ to resolve minor issues with respect to Ubuntu 9.04, and version 1.6 of Eucalyptus will follow ‘in 6-8 weeks.’

Speaking of the company’s proposition to new customers, Wolski suggests that

“Eucalyptus Systems will enable businesses of any size to leverage their own IT resources to get the benefits of cloud computing without the concerns of lock-in, security ambiguity, and unexpected storage costs that can be associated with public clouds.”

Interestingly, Wolski describes the revenue model traditionally employed by companies seeking to make money from open source as no more than a way of boot-strapping Eucalyptus Systems toward their real goal. Rather than simply concentrating on the provision of for-profit support and consultancy, Wolski has his sights set on the sale of new Eucalyptus-powered software solutions directly addressed to Enterprise customers. It’s not yet clear whether these solutions will be SaaS offerings or for on-premise installation, but Wolski is confident that the company’s early customers will begin to receive their new software during the third quarter of 2009.

Although best known for providing an on-premise equivalent to Amazon’s Elastic Compute Cloud (EC2), Wolski stresses that the Eucalyptus architecture is API agnostic and could be extended to emulate other Cloud solutions relatively easily. This, allied with Eucalyptus’ industry-leading support for both the KVM and Xen hypervisors, raises the prospect of enterprise customers integrating their own (Eucalyptus powered) internal Cloud with different public Clouds, seamlessly and at will.


Zmanda CEO talks about Backup and the Cloud


Chander Kant is Founder and CEO of Zmanda, a provider of Open Source backup solutions that exploit the capabilities of Cloud storage services (such as Amazon’s S3) to supplement the traditional on-premise backup practices of their business customers.

I spoke with Chander recently to learn more about the company, and to understand the particular issues associated with moving backup data to the Cloud.

Production of this podcast was supported by Talis, and show notes are available on their Nodalities blog.

In addition to leveraging Cloud storage as part of a rounded backup strategy for on-premise data, we also discussed Zmanda’s approach to providing useful backups of data already stored in the Cloud.


Sun moves their Cloud forward at CommunityOne


Sun Microsystems used the CommunityOne East event in New York City this past March to unveil their Cloud Computing offering. I spoke with the company’s Juan Carlos Soto recently, to learn more.

Today, David Douglas (Senior VP, Cloud Computing) opened CommunityOne West in San Francisco discussing ‘Communities, Open Source Platforms, and Clouds.’ I joined the live webcast to see what he had to say.

Dave Douglas kicks off by speaking to the importance of ‘community’. He stresses the underlying value of open: source code, protocols, formats, ideas.

“‘Open’ lowers barriers to adoption and innovation.”

A lot of the ideas he’s highlighting are similar to Tim O’Reilly’s call to ‘do stuff that matters;’ but oddly Dave doesn’t mention this.

Lew Tucker, Sun’s Cloud CTO, gets up on stage to talk about Sun’s Cloud Computing with Dave. Their opening gambit is around the on-demand nature of the Cloud, with its ability to pull up (and shut down) Cloud resources on demand, with a credit card. Lew argues that the Cloud doesn’t create lock-in, as it’s based upon open software such as Apache, Solaris and Linux.

Sun’s Storage Service, announced in March, is still on track to be available this summer… so no surprise unveiling from the stage today.

Lew shows some demonstrations of the Sun Compute and Storage Services, building upon those we saw in March to manage resources in the data centre via GUI.

Dave mentioned that ‘several thousand’ Sun staff currently use the Sun Cloud internally, every day, “in Open Office” and elsewhere. Is this ‘just’ Cloud-based file storage, or something more?

Other examples from Sun partners, demonstrated on an intriguing mix of laptops, include Vertica and webappVM. The examples definitely leaned towards the sysadmin and developer crowd, and I look forward to seeing some user-facing apps down the line. Dave cites ‘dozens and dozens’ of partners, as their logos flash up on screen behind him.

Lew suggests that the Cloud introduces a change from ‘Download -> Install -> Config’ to ‘Deploy,’ with the implication that this will always be easier.

Turning to Security, Lew points to a new ‘secure hardened VM for OpenSolaris,’ available on Amazon S3 today. The Center for Internet Security has assessed this new VM and verified it as secure.

Eric Baldeschwieler from Yahoo! gets up on stage, to talk about the ways in which Apache Hadoop is being used at Yahoo! – and their use of the Sun Cloud.

I look forward to hearing more, face to face, during June’s Semantic Technology and Cloud Computing tour around Silicon Valley; Menlo Park is already on my itinerary, along with sojourns to San Jose, San Francisco and Sunnyvale. Anyone else got things they want to show me, June 14-21?


Garlik releases Open Source RDF triple store, claims capacity for 60 billion triples


Garlik CEO Tom Ilube is increasingly coming to represent a voice of reason in the UK’s ongoing angst about Identity, with many a hysterically gibbering Home Office official put in their place by Tom’s more reasoned words in debates on the Today programme and across the UK’s mainstream media.

As the company’s press materials note,

“Garlik, the online identity expert, was founded by Mike Harris, founding CEO of Egg plc, former Egg CIO Tom Ilube and former British Computer Society president Professor Nigel Shadbolt. As the first company to develop a web-scale commercial application of semantic technology, Garlik enables consumers to protect themselves against identity theft and financial fraud.”

According to Wikipedia, ‘Egg… is now the world’s largest internet bank,’ so effective management of identity information is clearly nothing new to Ilube and his team.

Founded in 2005, Garlik has secured some £4.5 million from 3i, Doughty Hanson and Noble Venture Finance to offer products such as their DataPatrol solution for tracking sensitive personal information online, and the less ‘serious’ measure of online status, QDOS.

Behind the scenes, data is aggregated from across the open Web and various proprietary databases, and stored in Garlik’s own RDF triple store.

Now the company is releasing their triple store, 4store, under the GNU GPL and making it available for download. Capable of scaling to handle as many as 60 billion triples (perhaps three times more than their closest competitors), 4store has the potential to address many concerns about the scalability of triple store technology.
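For readers new to the term, a triple store holds facts as (subject, predicate, object) statements and answers queries by matching patterns against them. The sketch below is a toy illustration of that idea only; it is not how 4store is implemented, and the class name, the `ex:` identifiers and the sample data are all hypothetical.

```python
from typing import List, Optional, Set, Tuple

# A triple is a (subject, predicate, object) statement.
Triple = Tuple[str, str, str]


class TinyTripleStore:
    """Toy in-memory store; real stores add indexes, persistence and SPARQL."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        """Record one statement; duplicates are ignored by the set."""
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[Triple]:
        """Return all triples matching the pattern; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            t
            for t in self._triples
            if all(want is None or want == got for want, got in zip(pattern, t))
        )


# Hypothetical sample data, invented purely for illustration.
store = TinyTripleStore()
store.add("ex:alice", "ex:name", "Alice Example")
store.add("ex:alice", "ex:email", "alice@example.org")
store.add("ex:bob", "ex:name", "Bob Example")

print(store.match(predicate="ex:name"))   # every recorded name
print(store.match(subject="ex:alice"))    # everything known about one subject
```

This toy version scans every triple on each query. Avoiding that cost is a large part of what makes scaling to billions of triples hard.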

I took the opportunity to talk with Garlik’s Tom Ilube and 4store’s designer, Steve Harris, before the launch and the result has just been released as a podcast.


David Eaves talks about Vancouver’s Open Data initiative


Back in May, ReadWriteWeb reported on a Motion put before legislators in the Canadian city of Vancouver. Duly passed, the Motion commits the city to three closely related ‘open’ agendas:

  • the City of Vancouver will move as quickly as possible to adopt prevailing open standards for data, documents, maps, and other formats of media;
  • the City of Vancouver, when replacing existing software or considering new applications, will place open source software on an equal footing with commercial systems during procurement cycles;
  • the City of Vancouver will freely share with citizens, businesses and other jurisdictions the greatest amount of data possible while respecting privacy and security concerns.

Last week I spoke with David Eaves, a co-author of the Motion, both to understand the city’s rationale, and to explore intentions for the third area — Open Data — in a little more depth. The result has just been released as a podcast, which is available below.

Production of this podcast was supported by Talis, and show notes are available on their Nodalities blog.

As more and more data become available as a matter of course, the examples set by organisations such as MySociety become increasingly attainable for us all. Other than ensuring that it is ‘open,’ do we need to be asking for more from those making data available? And once it’s there, will its use and scrutiny move beyond the enthusiasts and activists to encompass the population at large?

David shares his views on these and other questions during our conversation.


If Government is a Platform, what are people building?


I’ve written and spoken before about a recent upsurge in enthusiasm for exposing data from Government in ways that facilitate use and re-use, and will doubtless be returning to this topic in the ‘Government Data’ panel session at the Linked Data Meetup in London on Wednesday.

Tim Berners-Lee has been amongst those rallying to the cause, and working with Governments here and overseas to realise the opportunities in — first — simply getting data out and — second — ensuring the structure and linkages required if Government data is to form a useful foundation upon which others really can build.

Tim O’Reilly, too, has been pushing his notion of ‘Government as a Platform’ for some time, driving toward next week’s gov2.0 summit in Washington, DC. His arguments reached a broader audience with yesterday’s guest spot on TechCrunch.

O’Reilly points to some of the uses of Government data in his TechCrunch post, and I’m continuing my own efforts to secure podcast interviews with some of the more interesting examples that I come across.

One of the things, I think, that is most interesting about this ‘platform’ that O’Reilly describes is the myriad ways in which it potentially benefits so many different constituencies.

Government itself should certainly become more efficient, making better use of its own information ‘simply’ because it’s so much easier to see what’s there. John Sheridan of the UK Government’s Office of Public Sector Information (OPSI) touched on some of these issues in our recent conversation, and the potential for discovering synergies across the Departmental and Agency divide is surely just beginning to be realised.

‘Activists’ of various kinds will have easier access to information in support of their various causes. Some of this information will, undoubtedly, be used to embarrass the authorities, and much of it will be skewed to present the truth in rather odd ways. Rather than simply leading to more informed activism by a vocal minority, however, David Eaves in Vancouver and Sunlight Labs’ David James both argue that better availability of data will make it easier for everybody to hold Government to account.

Researchers, such as Jim Hendler’s team at Rensselaer, are turning to resources like Data.gov in search of interesting technological problems and large pools of data upon which to test new techniques and ideas. The resulting data exhaust (in Hendler’s case, RDF versions of Data.gov resources) is then available for others to use in further innovation.

And (perhaps) most interestingly of all, the well understood trinity of open source software, commodity hardware and near-ubiquitous connectivity is coming together with increasingly available data to make the public sector information space — for far too long the expensive preserve of the big Consultancy firms and their ilk — interesting to start-ups and innovators.

Sunlight Labs’ Apps for America 2 contest is into its final stage (don’t forget to vote!), and the range of applications received is a clear indication that innovation in and around Government information is both possible and long overdue. Amongst the three finalists, This We Know appeals to my Semantic Web interests because of the particular technological approach they’ve adopted, but there’s plenty to admire in all three.

The test will come down the line, when working with Government data is no longer incentivised by competition glory and prize money, when it’s no longer the hot new source of Big Data for academic exploration, and when the activist arms race levels out at a new plateau of comparable informedness. When that day arrives, will enthusiastic entrepreneurs still be competing to extract additional capability from the Data.gov and Guardian APIs, or will all of the hits to these sites emanate from the cubicles of Accenture, IBM and EDS?

Whilst I certainly hope that these mainstream exploiters of public sector information embrace the possibilities, it would be a sad day if independent grass-roots innovation atop the Platform of Government were so short-lived.

As such, I’ll be watching next week’s proceedings in Washington with interest.

Image of the Scottish Parliament by Stéphane Goldstein, shared under Creative Commons license on Flickr.


Talking with Jim Curry about OpenStack and the Cloud


In my latest podcast I talk with Jim Curry, VP Corporate Development at Rackspace and Chief Stacker at OpenStack.

The OpenStack activity was unveiled by Rackspace, NASA, and their partners back in July, and is on track to deliver functional initial releases in the next few weeks. We discuss the relationship between OpenStack’s deliverables and earlier developments from Rackspace and NASA’s Nebula project, and begin to explore the implications of an Open Source Cloud Computing stack for the wider industry.

This podcast was recorded on Friday 10 September, 2010.

During our conversation we referred to the following resources;


Open is good – but encouragement better than mandate


Openness is undeniably cool right now, at least if you move in the slightly odd circles that I do. Openly available scientific papers are disrupting the world of scholarly publishing (which may not be all good, but that’s a post for another day). Openly available university courses are finally beginning to work out how to offer meaningful accreditation to students. Openly accessible data from government agencies around the world bulks out almost every data marketplace, and anchors many an analysis. Openly available code for cloud infrastructure or networking is challenging the hold of the tech world’s giants. Everywhere you look, ‘incumbents’ are apparently being ‘challenged’ and ‘disrupted’ by the power of open.

The truth, of course, is a little more complex and a lot more nuanced, as business models shift and evolve just like they always have. In sustainable systems, some people still need to be rewarded (often through being paid) for their effort. And in sustainable systems, paying someone can often be a pretty straightforward means of ensuring that you have a throat to choke if something breaks; big companies adopting open source often seek a proper financial relationship with someone who installs and maintains the ‘free’ software or hardware they’re depending upon.

One area of openness that I’ve been involved with for about ten years is that of open licensing for both creative works and data. And it’s come a very long way.

Here in Europe, for example, the (badly flawed) 2003 Public Sector Information Directive is under review, and there’s every likelihood that the replacement will make a number of sensible moves toward greater openness, transparency, and reusability for publicly funded data. As the EPSI Platform site notes today, Andrés Nin proposes going a step further than the European Commission is currently contemplating, by instituting a common open license across Europe:

“The creation of a single public information re-use space in Europe requires much more, it requires a common European OpenData license applicable to all data generated by European public administrations.”

I would certainly welcome a model license that European member states might be enabled to use. I’d also welcome — and support — vigorous efforts to dissuade individual member states or ministries from their usual practice of tweaking and otherwise modifying perfectly good documents in order to demonstrate how ‘special’ or ‘different’ their circumstances apparently are. When will they all realise that they are neither as special nor as different as they like to think?

But — and it’s a big but — it seems unwise, premature, and unhelpful to even begin to suggest that such a license might be mandated across Europe. It isn’t required, and attempts to develop a single document that everyone could accept would be an unhelpful distraction that would result in something so bureaucratic, so ringed in opt-outs and prevarications, as to be utterly worthless. It would also, in all likelihood, be one of those exercises in which the process very quickly subsumed the point. A prime candidate for, in the words of an old boss, being too busy to be effective.

Data Market Chat: Stephen O’Grady of RedMonk examines the bigger picture


RedMonk don’t offer a data marketplace and, so far as I know, they have no intention of doing so. Nevertheless, this series of data market podcasts would not have been complete without an opportunity to hear what RedMonk co-founder Stephen O’Grady had to say. A blog post of his, from late last year, was the thing that finally persuaded me to stop thinking about a series on data markets and actually get on and do it.

So blame Stephen. Or thank him. Depending upon how you feel about this series so far.

Drawing upon some thinking about markets, and a lot of experience from the development of sustainable models for open source, Stephen packs a lot into this half hour conversation. If you’ve been wondering about the space, or trying to understand whether or not it’s viable, this should be compulsory listening.

Following up on a blog post that I wrote at the start of 2012, this is the sixth in an ongoing series of podcasts with key stakeholders in the emerging category of Data Markets.
