Making the Case for Open Source Hardware, Open Data, and Open AI

This is a longish article.  TL;DR; If we don’t get a handle on open hardware, open data policies and open access to intelligence, we’ll be back to buttoned-up, closed ecosystems in no time.  It may already be too late.

Open Source has won in the software world.

Let that sink in for a minute.  Do you agree?  Disagree?  Is that statement too bold?  Look across the spectrum of tools that we use to build the software and services that run the world.  Now, look at how many of them are open.  From the operating system up to protocols and service infrastructure to libraries to the toolkits we use to build user experiences, you’ll find open source and open standards everywhere you look.  Open has become the de facto way we build important pieces of software.

All this open source love in the software sphere is great, and it is changing the world, but I can’t help but feel like we are just beginning this journey.  In the early days of technology, it was open by default because so much of the innovation was coming from universities and government projects which were (arguably) used to open collaboration models.  Even your electronics came with a schematic (basically the hardware source code) so they could be repaired.  Commercialization and disposable products led to much of the software that ran the world becoming closed, and hardware designs that were no longer user serviceable.  Closed was harder to reverse engineer and imitate, so it was an attractive way to distribute technology for a commercial product.

This transition was not without its pain when engineers and technicians that were used to being able to crack open a product and fix it discovered that they could no longer do so.  It is said that this pain is, in part, what led to the creation of the FSF and GNU project, the history of which needs a book, not a blog post.  During the dark days of the 1980s and 90s (when I started my career as a developer) closed products were how most work got done.  We got poorly documented APIs into a black box of logic, and it was a Bad Thing.

The Internet turned that tide back.  Suddenly two developers on opposite ends of the earth with an idea could collaborate.  They could communicate.  They could rally others to their cause.  We needed common platforms to build things on in this new world of connectivity.  We wanted to tinker, to experiment, and closed systems are directly opposed to those goals.  We got GNU/Linux.  We got Apache and all the projects that the foundation incubated.  Open was back in the mainstream.

Fast forward to 2018, and for building stuff on the web, open is the default choice.

While the Internet brought new life to open source, it also created a time of rapidly contracting, overlapping technology cycles.  Terminal or standalone apps were replaced with client-server architectures, which rapidly gave way to the web, which in turn quickly shifted to mobile as the dominant way to interact.  The next disruption is already underway: the shift to IoT and ubiquitous computing, led by voice platforms like Alexa, Google Home, and others.  What does open source mean in this world?

In a world of ubiquitous computing, technology is everywhere and nowhere at the same time.  It isn’t something you need to consciously pick up and use, it is baked into the things you already touch every day.  Your car.  Your glasses.  Your home.  Your clothing and shoes.  Your watch.  Your kid’s toys.  Your stereo.  Your appliances.  Computing will respond to your voice, gestures, expressions, location, and data from all your other sensors as well as broader sets of data about the world.  Done right, it will be a seamless experience that brings you the information you need when and where you need it and will let you affect your world from wherever you happen to be.  Achieving this means a boatload of new hardware, it means unfathomable volumes of data being captured and made available across all the platforms you use, and it means intelligence that can make sense of it all.

While we may have open source dominance in many parts of the software world, that won’t be enough in the very near future.  Moving into a world of ubiquitous computing while maintaining open source relevance means we need open components up and down the stack.  This starts at the bottom, with hardware.  We have a plethora of options today, thanks to projects and platforms like the Arduino, the Raspberry Pi, BeagleBoard, Particle, and others. This is not intended to be an exhaustive list of open hardware platforms, just a few examples.  If I left out your favorite, please leave a comment! These aren’t all fully open source, as some depend on hardware that may be patent encumbered or proprietary.  They are open in other important ways though, from board designs to developer toolchains to the software that runs on them.  With these types of components and a good idea, a modestly funded group or dedicated individual can build and launch something meaningful.

Am I saying that in some kind of open hardware utopia that everybody will hack together their own smartwatch in their kitchen?  No, no I’m not.  What I am saying is that these open source hardware options drop the barrier to entry for Internet-connected hardware down to where it needs to be (as low as possible), just like the Apache HTTP server did for people serving up HTML in the early days of the web.  These new tinkerers will birth tomorrow’s products or find other ways to participate in ubiquitous computing through their experiences.  If the barriers to entry aren’t kept low, only larger companies will be able to play in this space, and that would be a Bad Thing.  Say hello to the 1980s all over again.  Hopefully the hair doesn’t come back too.

If open hardware is the foundation, what about the data that it generates?  The data and the actions driven by that data are where the value exists, after all.  What good is an Internet-connected shower head if it can’t text you when you are about to jump in and scald yourself, while also queueing up care instructions for first-degree burns on your television and ordering a new bottle of aloe lotion from Amazon?  Again, there is a robust ecosystem of open source tools for collecting and managing the data so we have somewhere to start.  You can even run these open source tools on your cloud provider of choice, which again keeps the barrier to entry nice and low.  Say what you will about utility computing, but it sure makes it cheap and easy to try out a new idea.

This is all well and good for the things we might build ourselves, but those are not going to be the only things that exist in our world of ubiquitous computing.  We’ll have devices and services from many vendors and open projects running in our lives.  Given that the data has such value, how can we ensure that data is open as well?  It is (or should be) ours, after all.  We should get access to it as if it were a bank balance, and decide when and how we share it with others.  Open source can inform this discussion as well through the development and application of open data policies.  I envision a future where these policies are themselves code that runs across providers, and they can be forked and merged and improved by the community of people to whom they apply.  The policy then becomes both a mechanism for opening data up to a myriad of uses that we control and a form of open source code itself.  This could enable the emergence of new marketplaces, where we set prices for the various types of data that we generate and companies bid with something of value (services, cash, whatever) to access it.  This happens today, albeit with limited scope.  If you use Facebook, Gmail, Instagram, LinkedIn or any other freebie like these, you are already buying a service with your data in a siloed sort of way.  Your data is your currency and the product that these companies resell.  Their service is the price they pay you to use your data.
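To make the "policy as code" idea a little more concrete, here is a minimal sketch in Python.  It is only an illustration of the concept, not an existing standard or library: the names `Rule`, `DataPolicy`, `fork`, `merge`, and `allows`, along with the example services, data types, and prices, are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Rule:
    """One grant: who may use which kind of data, for what, at what price."""
    data_type: str    # hypothetical label, e.g. "heart_rate"
    consumer: str     # hypothetical service asking for access
    purpose: str      # what the consumer says it will do with the data
    min_price: float  # lowest acceptable bid, in arbitrary units


@dataclass
class DataPolicy:
    """A personal data policy that can be forked and merged like source code."""
    owner: str
    rules: List[Rule] = field(default_factory=list)

    def fork(self, new_owner: str) -> "DataPolicy":
        # Copy the rule list so changes to the fork never touch the original.
        return DataPolicy(owner=new_owner, rules=list(self.rules))

    def merge(self, other: "DataPolicy") -> None:
        # Pull in rules from another policy, skipping exact duplicates.
        for rule in other.rules:
            if rule not in self.rules:
                self.rules.append(rule)

    def allows(self, data_type: str, consumer: str, purpose: str, bid: float) -> bool:
        # Access is granted only if some rule matches and the bid is high enough.
        return any(
            r.data_type == data_type
            and r.consumer == consumer
            and r.purpose == purpose
            and bid >= r.min_price
            for r in self.rules
        )


# A community-maintained template that an individual forks and extends.
community = DataPolicy(
    "community-template",
    [Rule("heart_rate", "wellness-app", "coaching", 1.0)],
)
mine = community.fork("me")
mine.merge(
    DataPolicy("friend", [Rule("listening_history", "playlist-app", "recommendations", 0.5)])
)

print(mine.allows("heart_rate", "wellness-app", "coaching", bid=1.5))
print(mine.allows("heart_rate", "wellness-app", "advertising", bid=10.0))
```

In this sketch the same request is accepted for one purpose and refused for another, no matter how high the bid, which is the kind of control the policy is meant to give the person the data belongs to.  A real system would also need identity, enforcement across providers, and auditing, none of which is modeled here.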

The final piece of the puzzle is intelligence.  That is, the tools to sift through all the data our lives generate, extract meaningful patterns and insights, and to act on the same.  Because so much of the AI world today is still straddling the academic and the practical, the open mentality has a strong foothold here much like it did when the Internet itself was emerging from research projects and becoming a commercial force.  Take a quick look around the software projects used in companies building their future on AI.  You’ll quickly find that many of the most important are open source.  That’s great and all, but without data and models the tools themselves are of little use.  Combining the open source intelligence tooling with an open data policy framework creates a future where open source matters.

The combination of programmatic open data policies and open intelligence is powerful.  Open data policies would make it possible for new competitors to create something amazing without needing to generate huge sets of data themselves; all they need are users excited enough about what they are building to agree to share the data that already exists.  Much like the market for open data, this could create a market for intelligence.  Instead of being tied to what our existing providers build and decide we need, we might opt to use intelligence services that are tailored to our lives.  Interested in health and wellness?  Use your data as currency to buy an intelligence service that pulls together all your data from across your other providers to suggest personalized ways to be healthier.  Music nut?  Maybe a different intelligence service that looks at parts of your data that correspond to your mood and puts together the perfect playlist.  Trying to save money?  How about an intelligence that analyzes all the waste in your life and suggests ways to reduce it?  None of these things will reach their full potential without the ability to use data from across your ubiquitous computing experience.  Importantly, with open data policies, you are in control of how your data is used and you can shape that control collaboratively with others that want the same thing you do.

What happens if we don’t do this?  What happens if we continue to allow our data to be locked up in proprietary silos owned by what will rapidly become legacy providers?  If that trend continues we’ll be back where we started.  Closed ecosystems, black boxes of logic with an API, no ownership of our own data or our own future, and a bar set so high that new entrants into the market will need pockets deeper than most possess to get a foothold.  It is this last point that worries me the most, and where a continued commitment to open source will have the biggest impact.  As I pointed out earlier, I don’t expect that most people will have the time, skills or resources to build their own solutions end to end.  That’s not the point.  The point is keeping the barriers to entry as low as they can be so the next generation of innovations can be born in places other than the R&D labs of the world’s biggest companies.  The democratization of technology birthed the web as we know it and it would be a shame to lose that now.

What’s next?  How do we make this better future happen?  Fortunately, many of the pieces are already falling into place, and more are coming.  Groups like the Open Source Hardware Association (OSHWA) are defining what it means to be open source hardware.  The non-profit research company OpenAI has backing from the industry and publishes public papers and open source tools in the intelligence space.  The European General Data Protection Regulation (GDPR) contains important language about right of access (Article 15), right to erasure (Article 17), and data portability (Article 20) that puts ownership of your data back where it belongs.  With you.  Open source projects around big data, IoT, intelligence and other key technologies continue to thrive, and with a choice of utility computing providers you can spin them up without much upfront investment.  If an open future matters to you (and it should!), seek out and support these organizations.  Find a way to participate in the open source ecosystem around the work you do.  Support legislation that gives you control over your data.

This article isn’t meant to be gloomy.  I think the future of open source is brighter than ever.  Realizing that future means we need to look carefully at ways to ensure things other than just the code are open.

Image Credit: “Teaching Open Source Practices, Version 4.0” by Libby Levi is licensed under CC BY 2.0

Open Source in an AI World. Open Matters More Now Than Ever.


Technological unemployment is about to become a really big problem.  I don’t think the impact of automation on jobs is in any doubt at this point; the remaining questions are mostly around magnitude and timeline.  How many jobs will be affected, and how fast will it happen?  One of the things that worries me the most is the inevitable consolidation of wealth that will come from automation.  When you have workers building a product or providing a service, a portion of the wealth generated by those activities always flows to the people that do the work.  You have to pay your people, provide them benefits, time off, etc.  Automation changes the game, and the people that control the automation are able to keep a much higher percentage of the wealth generated by their business.

When people talk about technological unemployment, they often talk about robots assuming roles that humans used to do.  Robots to build cars, to build houses, to drive trucks, to plant and harvest crops, etc.  This part of the automation equation is huge, but it isn’t the only way that technology is going to make some jobs obsolete.  Just as large (if not larger) are the more ethereal ways that AI will take on larger and more complex jobs that don’t need a physical embodiment.  Both of these things will affect employment, but they differ in one fundamental way:  Barrier to entry.

High barriers

Building robots requires large capital investments for machining, parts, raw materials and other physical things.  Buying robots from a vendor frees you from the barriers of building, but you still need the capital to purchase them as well as an expensive physical facility in which you can deploy them.  They need ongoing physical maintenance, which means staff where the robots are (at least until robots can do maintenance on each other).  You need logistics and supply chain for getting raw materials into your plant and finished goods out.  This means that the financial barrier to entry for starting a business using robots is still quite high.  In many ways this isn’t so different from starting a physical business today.  If you want to start a restaurant you need a building with a kitchen, registers, raw materials, etc.  The difference is that you can make a one time up-front investment in automation in exchange for a lower ongoing cost in staff.  Physical robots are also not terribly elastic.  If you plan to build an automated physical business, you need to provision enough automation to handle your peak loads. This means idle capacity when you aren’t doing enough business to keep your machines busy.  You can’t just cut a machine’s hours and reduce operating costs in the same way you can with people.  There are strategies for dealing with this like there are in human-run facilities, but that’s beyond the scope of this article.

Low barriers

At the other end of the automation spectrum is AI without a physical embodiment.  I’ve been unable to find an agreed-upon term for this concept of a “bodiless” AI.  Discorporate AI?  Nonmaterial AI?  The important point is that this category includes automation that isn’t a physical robot.  Whatever you want to call it, a significant amount of technological unemployment will come from this category of automation.  AI that is an expert in a given domain will be able to provide meaningful work delivered through existing channels like the web, mobile devices, voice assistants like Alexa or Google Home, IoT devices, etc.  While you still need somewhere for the AI to run, it can be run on commodity computing resources from any number of cloud providers or on your own hardware.  Because it is simply applied compute capacity, it is easier to scale up or down based on demand, helping to control costs during times of low usage.  Most AI relies on large data sets, which means storage, but storage costs continue to plummet to varying degrees depending on your performance, retrieval time, durability and other requirements.  In short, the barrier to entry for this type of automation is much lower.  It takes a factory and a huge team to build a complete market-ready self-driving car.  You can build an AI to analyze data and provide insights in a small domain with a handful of skilled people working remotely.  Generally speaking, the capital investment will be smaller, and thus the barrier to entry is lower.
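The elasticity difference between the two kinds of automation can be shown with some back-of-the-envelope arithmetic.  The sketch below is a deliberately simplified model with made-up numbers: the demand figures, the single shared cost per unit-hour, and the function names are all assumptions for illustration, not measurements from any real business.

```python
def fixed_capacity_cost(hourly_demand, cost_per_unit_hour):
    """Physical automation: capacity is provisioned for the peak and paid for every hour."""
    peak = max(hourly_demand)
    return peak * cost_per_unit_hour * len(hourly_demand)


def elastic_cost(hourly_demand, cost_per_unit_hour):
    """Elastic compute: pay only for the capacity actually used in each hour."""
    return sum(hourly_demand) * cost_per_unit_hour


# Hypothetical demand, in units of work per hour, with one busy spike.
demand = [2, 2, 3, 10, 4, 3]

fixed = fixed_capacity_cost(demand, cost_per_unit_hour=1.0)
elastic = elastic_cost(demand, cost_per_unit_hour=1.0)
idle_share = 1 - elastic / fixed  # fraction of the fixed capacity that sits idle

print(f"fixed: {fixed}, elastic: {elastic}, idle share: {idle_share:.0%}")
```

With this spiky demand pattern, most of the peak-provisioned capacity goes unused, while the elastic version only pays for what it consumes.  Real cost structures are messier (cloud capacity usually costs more per unit than owned hardware, and robots depreciate whether or not they run), so treat this as an illustration of the shape of the argument rather than a pricing model.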

Open source democratizes AI

I don’t want to leave you with the impression that AI is easy.  It isn’t.  The biggest players in technology have struggled with it for decades.  Many of the hardest problems are yet to be solved.  On the individual level, anybody that has tried Siri, Google Assistant, or Alexa can attest to the fact that while these devices are a huge step forward, they get a LOT wrong.  Siri, for example, was never able to respond correctly when I asked it to play a specific genre of music.  This is a task that a 10-year-old human can do with ease.  It still requires a lot of human smarts to build out fairly basic machine intelligence.

Why does open source matter more now than ever?  That was the title of this post, after all, and it’s taking an awfully long time to get to the point.  The short version is that open source AI technologies further lower the barriers to entry for the second category of automation described above.  This is a Good Thing because it means that the wealth created by automation can be spread across more people, not just those that have the capital to build physical robots.  It opens the door for more participation in the AI economy, instead of restricting it to a few companies with deep pockets.

Whoever controls automation controls the future of the economy, and open source puts that control in the hands of more people.

Thankfully, most areas of AI are already heavily colonized by open source technologies.  I’m not going to put together a list here; Google can find you more comprehensive answers.  Machine learning / deep learning, natural language processing, and speech recognition and synthesis all have robust open source tools supporting them.  Most of the foundational technologies underpinning these advancements are also open source.  The most popular languages for doing AI research are open.  The big data and analytics technologies used for AI are open (mostly).  Even robotics and IoT have open platforms available.  What this means is that the tools for using AI for automation are available to anybody with the right skills to use them and a good idea for how to apply them.  I’m hopeful that this will lead to broad participation in the AI boom, and will help mitigate to a small degree the trend toward wealth consolidation that will come from automation.  It is less a silver bullet, more of a silver lining.

Image Credit: By Johannes Spielhagen, Bamberg, Germany [CC BY-SA 3.0], via Wikimedia Commons